J. Soc. Cosmetic Chemists, 18, 323-331 (May 27, 1967)

Pitfalls and Problems in Predictive Testing

MATTHEW J. BRUNNER, M.D.*

Presented September 20-21, 1966, Seminar, New York City

* 910 Via De La Paz, Pacific Palisades, Calif. 90272.

Synopsis--Predictive tests of cosmetics for sensitization potential and irritancy, as now constituted, serve as guides rather than absolute criteria. Alterations in the conditions employed in these test procedures can increase or decrease the number of sensitized subjects, but the crucial question of the relationship between sensitizations in the laboratory test and in actual usage remains unanswered. Properly supervised consumer use tests are still required to supplement the laboratory studies. Further study of the basic mechanisms of sensitization is required before tests can be significantly improved.

Since the most common reactions to cosmetics involve untoward effects on the skin, a number of test procedures for new products have been suggested to aid in predicting the probable incidence of these skin reactions, as produced either by direct irritancy or by sensitization of the contact dermatitis type. The multiplicity of these procedures suggests that none is perfectly satisfactory. A good predictive test should, in advance of consumer use, be able to determine the irritating or sensitizing powers of a new formulation when these are at such a low level that they may not be revealed in a small-scale trial of ordinary usage. Even very low reaction rates may present a problem when multiplied by the millions of usages of a nationally sold product. To determine the actual rate of reaction in these low ranges, the test procedure must in some way exaggerate exposure conditions so that reactions become frequent enough for one to compare new formulations with controls in the small test groups which are practical to use. Otherwise, the "test" would consist
only in the supervised normal use of a product. To make sure with 95% certainty that the reaction rate in the general population is no more than 1:1000, it would be necessary to have some 3000 subjects use the product without finding a reaction. Groups of such size are impractical for preliminary tests. On the other hand, when a test involves exaggeration of exposure conditions, one cannot directly apply the results to normal consumer usage, since irritancy and sensitization rates depend to a great extent on conditions of contact.

In evaluating the irritancy of new cosmetics, predictive tests are usually considered to be useful only for screening out the more violently reactive agents. Animal test procedures, such as the Draize (1) test for irritancy, which involves a simple occlusive application, can be helpful for range finding. Such tests serve for preliminary evaluation of entirely new agents, the toxicity and irritancy of which are unknown. Even in those tests employing human subjects, the single application of a substance to the skin by the usual twenty-four to forty-eight hour patch test technic is an unreliable means for predicting irritancy. Both false positive and false negative reactions are possible. The greater the difference between use conditions and patch test conditions, the more potentially misleading are the results. Some substances need multiple applications before irritancy results. With other agents, such as nail polish remover, a single covered patch application will give misleading positive results. It is necessary to design tests specifically for each class of formulations, changing the factors of occlusion and repetition or adding some damaging stimulus. By such means the test can be made stringent enough to produce relatively low level but measurable irritant reactions with a control formulation which in actual use has produced a tolerable level of irritancy.
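The figure of some 3000 subjects quoted earlier for a 1:1000 reaction rate can be checked with a short calculation. The sketch below is not taken from the paper; it assumes that each subject reacts independently with the same probability (a simple binomial model) and that the bound is one-sided. The function names are hypothetical and chosen only for illustration.

```python
import math


def subjects_needed(max_rate: float, confidence: float = 0.95) -> int:
    """Smallest panel size n for which observing zero reactions rules out,
    at the given confidence, a true reaction rate above max_rate.

    Assumption (not from the paper): independent subjects, so that the
    chance of zero reactions is (1 - max_rate) ** n.
    """
    if not 0.0 < max_rate < 1.0:
        raise ValueError("max_rate must lie strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    alpha = 1.0 - confidence
    # Solve (1 - max_rate) ** n <= alpha for n.
    return math.ceil(math.log(alpha) / math.log(1.0 - max_rate))


def upper_bound_zero_reactions(n: int, confidence: float = 0.95) -> float:
    """One-sided upper confidence bound on the reaction rate when
    n subjects have used the product and none has reacted."""
    if n <= 0:
        raise ValueError("n must be positive")
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n)


if __name__ == "__main__":
    n = subjects_needed(0.001)
    print(f"Subjects needed for a 1:1000 bound at 95%: {n}")
    print(f"Upper bound with 3000 clean subjects: "
          f"{upper_bound_zero_reactions(3000):.6f}")
```

Under these assumptions the required panel comes out just under 3000, consistent with the text, and it shows why a panel of a few hundred subjects with no reactions cannot by itself exclude rates in the 1:1000 range.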
Then, if a new formulation produces no more irritation than the control by paired comparison, it can be assumed that the two probably will also give comparable results in actual usage. This is a safe assumption only if the usage conditions are similar. The test used must be reliable, that is, give reproducible results on repetition, and the results must, of course, be tested for statistical significance. It is necessary to choose the test panel carefully, since individuals vary considerably in their general level of reactivity to irritants, and some individuals rarely show irritancy even with the more reactive substances. The composition of panels should be such that equivalent numbers of strong and weak reactors are present in different panels. There is also an important seasonal effect on irritancy, with wintertime increases in reactivity. Absolute values are therefore not comparable for tests performed at dif-