MEASURING THE MEANING OF FRAGRANCE 757

don't get much that is useful. Most respondents will tell you that it is "very nice" or "fragrant" or maybe that "I don't like it much" and if you are lucky, you get some more specific descriptions, such as "smells like something you eat," "like my grandmother's garden" or "like a drugstore." If, as is the case, every respondent who says anything at all says something different, how can you combine and interpret the comments?

There are, then, real difficulties. Still, if fragrance is, in essence, a message, it is highly important to search for techniques which will reliably measure the meaning of this message to the consumer. And it is a milestone in perfumery that such techniques have recently been developed and are beginning to get used.

ODOR PROFILES AND THE SEMANTIC DIFFERENTIAL

In 1960, Paukner (4) obtained from respondents unschooled in perfumery descriptions of the odorants citral, p-methyl quinoline, eugenol, geraniol, menthone, and hexenyl formate, which were very interesting since they partly confirmed but also partly conflicted with the meanings which these materials hold for professional perfumers. More important, these descriptions could be said to have statistical reliability.

There are several noteworthy features about Paukner's approach. For one thing, he did not use just a handful of people in his test; he used 287 respondents. In tests of this type, there is real value in large numbers. People disagree about such questions as "To what extent is the odor of citral stimulating?", but if you take a sufficiently large group, the opinions of those who find the odor extremely stimulating will be counterbalanced by a group of people who hardly find it stimulating at all. Averaging all the votes, you arrive at a value which is valid in the sense that a very similar value can be obtained by posing the same question about citral to a different group of respondents.
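The stabilizing effect of group size can be made explicit with a short calculation. This is an illustrative sketch and not part of Paukner's analysis: it assumes that the n respondents rate independently and that individual ratings scatter around a population mean with standard deviation σ. The symbols x_i, σ and n are introduced here only for illustration.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i ,
\qquad
\mathrm{SD}(\bar{x}) = \frac{\sigma}{\sqrt{n}} ,
\qquad
n = 287 \;\Rightarrow\; \mathrm{SD}(\bar{x}) \approx \frac{\sigma}{16.9} \approx 0.059\,\sigma .
```

Under these assumptions, the averages obtained from two separate groups of 287 respondents would be expected to differ by only a small fraction of the spread between individual opinions, which is the sense in which the averaged value is reproducible.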
Poffenberger, in 1932, conducted an experiment which, although it does not deal with odor, nicely illustrates this point (5). He presented his test subjects with ten different shapes and asked them to rank these in order of decreasing area (Fig. 1). Poffenberger then scored each individual's performance by calculating the rank correlation coefficient between the actual order and the order guessed by the subject. Lining up the shapes in perfectly correct order would result in a coefficient of +1.00, doing it all wrong (reverse order) would give a coefficient of -1.00, and guessing at random would usually give values between +0.40 and -0.40. Judging the area of these complex shapes is not easy.

758 JOURNAL OF THE SOCIETY OF COSMETIC CHEMISTS

Figure 1. Shapes of different areas, labeled A through J (after Poffenberger)

When seven people were asked to rank the shapes, their scores ranged from -0.03 to +0.67, with an average score of +0.36. The interesting point is that when the judgments of these seven people were combined by addition, a ranking was obtained which scored +0.79, which is not only very much better than the average score of the respondents but is even distinctly better than the score of the best judge. When a ranking was obtained by combining the judgment of 20 individuals it scored +0.92, which is remarkably close to perfect.

There was nothing unique about this experiment and about the results obtained. It has been repeated many times and you can repeat it at home, if you wish. You always find that the judgment of the group will be closer to the truth than that of the individual judges. The reason is that the mistakes of different judges tend to go in different directions. Among our seven judges there may have been three who underestimated the size of shape A, two who overestimated it, and two who judged it correctly. When all the answers were combined, the negative and the positive errors largely canceled one another, so that the group answer was close to the truth. In experiments where you can't establish objectively how correct an answer is, you will find that the average judgments of larger groups are generally in better agreement with one another than the opinions of individuals. This is the reason why Paukner worked with 287 respondents.

Another important feature of his experiment was that he did not allow his respondents to choose freely the words with which to describe the odors. If 287 people start associating freely with drugstores, grandmother's gardens, things they eat, etc., it becomes impossible to combine their votes.
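The scoring and pooling procedure in Poffenberger's experiment, described above, can be sketched in a few lines of Python. The sketch rests on assumptions that the text does not confirm: that the rank correlation coefficient is Spearman's, that "combining by addition" means summing the ranks each judge gave to a shape and re-ranking the sums, and that no ties occur. The function names and the five-item data are hypothetical and are not Poffenberger's.

```python
from typing import List, Sequence


def spearman_rho(true_rank: Sequence[int], guessed_rank: Sequence[int]) -> float:
    """Spearman rank correlation between two untied rankings of the same items.

    Assumption: the text's "rank correlation coefficient" is Spearman's rho.
    """
    n = len(true_rank)
    d_squared = sum((t - g) ** 2 for t, g in zip(true_rank, guessed_rank))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def pooled_ranking(judge_ranks: Sequence[Sequence[int]]) -> List[int]:
    """Combine judges by adding the ranks given to each item, then re-ranking.

    Assumption: this is what "combined by addition" means. Tied sums are
    broken by item order, a simplification made for this sketch.
    """
    n_items = len(judge_ranks[0])
    rank_sums = [sum(judge[i] for judge in judge_ranks) for i in range(n_items)]
    order = sorted(range(n_items), key=lambda i: rank_sums[i])
    pooled = [0] * n_items
    for rank, item in enumerate(order, start=1):
        pooled[item] = rank
    return pooled


# Hypothetical data: five items whose true ranks are 1..5, and three judges
# who each swap a different pair of neighbouring items.
true_rank = [1, 2, 3, 4, 5]
judges = [
    [2, 1, 3, 4, 5],
    [1, 3, 2, 4, 5],
    [1, 2, 3, 5, 4],
]

individual_scores = [spearman_rho(true_rank, judge) for judge in judges]
group_score = spearman_rho(true_rank, pooled_ranking(judges))
```

In this made-up example every judge makes one mistake, but the mistakes fall on different items, so the summed ranks reproduce the true order and the group scores higher than any individual judge.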
He used a questionnaire in which the respondents were given words such as delicate, bitter, cold; the instructions were to indicate on a clearly defined scale (0 = not at all appropriate, 4 = completely appropriate) the extent to which each word fits each odor.* With such

* In similar experiments, many investigators prefer to use word pairs (delicate-rough, cold-warm) rather than single words; each procedure has certain advantages.