Sensory testing - a statistician's approach

W. A. Pridmore

224 JOURNAL OF THE SOCIETY OF COSMETIC CHEMISTS

figures are taken from a test on an alternative denaturant used for the alcohol in an aftershave. From this we can deduce the following: of the total panel of 30, 16 (53%) were able to discriminate (got six or five correct, significant at P 0.001, see Table IV), and of those discriminating, nine preferred A and two preferred B (with five declaring no preference), showing a clear preference for A.

Long experience shows that panel members not merely do these tests but enjoy doing them. The task gives an intellectual challenge and there is no problem of motivation: panel members ask eagerly if they have "got the test right", and it is possible to tell them, as an incentive to do better. A straight preference comparison between two samples has no "right" or "wrong" answer, and the subject who asks whether his or her answer is "right" in a paired comparison is under a misapprehension about the purpose of preference questions of this kind. But the challenge of the n+m type of test, with its right and wrong answers which may be communicated to the subject without invalidating further testing, leads to considerable enthusiasm and maintains a high degree of motivation over long years of testing.

One question which is frequently asked about this form of test concerns sensory fatigue: it is suggested that sensory fatigue rules out the possibility of a subject smelling as many as twelve samples and correctly distinguishing between them. Our experience is otherwise: in the field of odour we have no difficulty in getting subjects to distinguish between sets of twelve with quite trivial differences between them. Indeed, for an investigation of this very point, we arranged for a panel of some 3:3 members to repeat the same test four times in succession (forty-eight jars smelled altogether), and the rate of discrimination (52%, 58%, 50%, 50%) remained effectively constant over the four tests.
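
As a rough check on the figures quoted for the aftershave test, the chance probability of scoring six or five correct in a 6+6 test can be sketched as follows. The sketch assumes that a subject who cannot discriminate simply picks six of the twelve jars at random as sample A, so that the number correctly placed follows a hypergeometric distribution. Table IV is not reproduced here and may rest on a different scoring convention, and the function names are illustrative only.

```python
from math import comb


def chance_of_score(n: int, m: int, k: int) -> float:
    """Probability that exactly k of the n jars picked as 'A' really are A,
    assuming (hypothetically) the subject picks n of the n+m jars at random."""
    return comb(n, k) * comb(m, n - k) / comb(n + m, n)


def chance_of_discriminating(n: int = 6, m: int = 6, min_correct: int = 5) -> float:
    """Probability of scoring min_correct or more by guessing alone."""
    return sum(chance_of_score(n, m, k) for k in range(min_correct, n + 1))


def binomial_upper_tail(successes: int, panel_size: int, p: float) -> float:
    """Probability that `successes` or more panel members reach the score
    by chance, treating panel members as independent guessers."""
    return sum(
        comb(panel_size, j) * p**j * (1 - p) ** (panel_size - j)
        for j in range(successes, panel_size + 1)
    )


# Chance of six or five correct in a 6+6 test under the guessing assumption.
p_chance = chance_of_discriminating(6, 6, 5)  # equals 37/924

# Chance that 16 or more of a panel of 30 score six or five correct by guessing.
panel_tail = binomial_upper_tail(16, 30, p_chance)

print(f"chance of six or five correct: {p_chance:.4f}")
print(f"chance of 16 or more of 30 doing so: {panel_tail:.3g}")
```

Under these assumptions a guessing subject scores six or five correct with probability 37/924, about 4%, so a panel in which 16 of 30 members do so lies far beyond anything guessing alone would be expected to produce.
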
On the other hand, there are limitations, not so much of physical sensory fatigue as of the ability to get rid of one substance before sensing the next, which limit the ability to taste long series of samples. 2+1 tests (triangle tests) are usually the upper limit to the size of test that can be offered to the taste panellist. Pungent tastes which linger in the mouth and deaden the taste buds are difficult to classify; whether they represent true sensory fatigue is perhaps arguable. It would be wrong to suggest that sensory fatigue was not present in the larger-scale 6+6 type testing of smells, however constant the response remained over time. It would be arguable that it makes discrimination of small differences more effective, if its effect is to cancel out the common elements of two odours and leave more obvious those portions of the odours which were not common to both samples. There is some support for this hypothesis in the finding that the 6+6 test is relatively insensitive to large differences in perfume concentration (where the ratio may be of the order of 100:150) in cases where we are talking about identical perfumes at different levels, but is extremely sensitive to quite small contaminant traces of extraneous odour. However, it is arguable that if sensory fatigue does enter into the judgements of panel members assessing smells under these test conditions, then this is entirely appropriate since, particularly with personal cosmetic products, it is the smell after continued exposure which requires to be judged.

FURTHER DEVELOPMENTS

This document has confined itself to discussing in some detail the statistician's approach to certain forms of sensory testing in which two different samples are compared together. Much work has now been carried out on an extension to these techniques in the odour testing field in which many more different products are used (up to 10 or 12) and in which the subject is given the opportunity to create whatever groups he or she thinks appropriate. This approach has proved exceedingly valuable in the quality control field, where there are a number of batch samples to compare with some form of control sample. This particular application is built upon the structure of the 6+6 test described in the earlier part of this report, which is used to provide known and measured differences to be inserted in amongst the set of smells to be evaluated. The procedure is to take the series of batch samples and to bulk them together; having done this, the bulk sample is divided in half, and one half has a known amount of a contaminating substance mixed in with it.
This difference is checked by a 6+6 test, which is expected to yield a difference of about 30% scoring six and five correct. This level was hit upon empirically as the result of long testing on a variety of different formula changes of greater or lesser degree: 30% was established to be the kind of test difference found between samples available in the retail trade at the end of two years, at a time when no complaints whatsoever about smell were being received. This and similar arguments led to the establishment of a 30% discrimina-


Volume 22 No 4 Page 26 (26 of 85)
