168 JOURNAL OF THE SOCIETY OF COSMETIC CHEMISTS

The test designs are fairly standard throughout the industry. When two treatments are tested, each subject receives both treatments, one on the left axilla and the other on the right. Treatments are assigned to an equal number of left and right axillae in the panel. The reason for this balanced assignment is that, in the general population, right axillae tend to have slightly higher sweat rates than left axillae (known as the laterality or "sides" effect). Test designs are also available for three or more treatments in the same panel.

In antiperspirant efficacy testing it is common practice to use the same cadre of human subjects repeatedly. Baseline testing can be used as a monitoring tool for clinical test quality assurance. A major deviation of a panelist's baseline result from a historical norm can reveal protocol violations (nonabstinence from antiperspirant use) or physical changes that might affect the test outcome.

Baseline data have also been utilized in the statistical analysis to adjust the post-treatment data. Although treatments are randomly assigned to subjects, the mean sweat rates of the axillae destined to receive each treatment may be different at baseline. If baselines are run, it is sensible to extract whatever information exists in the baseline data to improve the test precision.

A number of statistical methods have been proposed and are in use for analyzing the results of these tests, and this profusion has led to some controversy. The FDA (3) recommended the use of nonparametric tests based on rankings of the sweat rates, either with or without baseline data. Majors and Wild (4) described a simple baseline adjustment procedure in which the post-treatment sweat rates were divided by the baseline, or pretreatment, sweat rates. Wooding (5) and Wooding and Finklestein (6) demonstrated the need for a logarithmic transformation of the sweat rates prior to normal-theory statistical analysis.
They described the use of this transformation for both uncorrected and baseline-corrected data, but they recommended against the use of simple baseline adjustment.

SUMMARY OF RESULTS AND CONCLUSIONS

Our investigations of statistical methods have shown that baseline information can best be utilized by analysis of covariance (ANCOVA), a technique that has been used extensively in pharmaceutical clinical trials and was briefly mentioned by Wooding and Finklestein (6). In data from 70 recent clinical studies, the test precision of the ANCOVA procedure was always better than that obtained either with post-treatment data alone (POSTRT) or with the simple change-from-baseline method (CHGBAS).

The basic principle behind baseline correction is that post-treatment sweat rates are correlated with baseline rates. In ANCOVA, a regression line relating post-treatment to baseline data is used for the adjustment. The slope of the regression line thus determines the degree of baseline adjustment required, which may vary from study to study. In the CHGBAS method, the slope is fixed at the value 1.0.

Averaged over all tests, the POSTRT and CHGBAS methods were roughly comparable in precision, the CHGBAS method being about 4% lower in test variance. In 60% of the tests, the CHGBAS method gave better precision than the POSTRT method. The ANCOVA method was superior in precision to the other two methods, averaging 14%
lower in variance than CHGBAS and 17% lower than POSTRT. Over the 70 studies, the variance of either POSTRT or CHGBAS ranged from approximately equal to almost double the ANCOVA variance. Our conclusion is that the ANCOVA method guarantees maximum precision for any given test and justifies the use of baseline measurement.

The amount of baseline correction needed was found to vary from study to study. The typical difference between POSTRT and CHGBAS comparisons of treatment pairs was ±5% in percent reduction, which is large when related to differences commonly seen among commercial products. In general, the ANCOVA result was numerically between the POSTRT and CHGBAS results, sometimes agreeing more with one method or the other, the typical difference being about ±2.5% sweat reduction from either POSTRT or CHGBAS. Averaged over all studies, the differences among the three methods were essentially zero, empirically confirming that all three methods were mutually unbiased.

Two critical assumptions of the ANCOVA analysis are (a) that the regression line is linear and (b) that slopes are equal across treatment groups within a study. Through examination of many antiperspirancy studies, we have found no evidence that would refute either assumption.

In this work, we transformed sweat rates to their logarithms prior to data analysis, as was so convincingly advocated by Wooding (5). Our data diagnostics have confirmed the need for log transformation to achieve the normal error distribution essential for valid normal-theory statistical testing. The arithmetic handling of sweat rates or their ratios in statistical analyses will lead to misleading results and should be avoided.

Our mathematical model was based on the fact that the experimental design has two sizes of experimental units, subjects and axillae.
This model was slightly more complex than the one suggested by Wooding and Finklestein (6) and allowed for proper testing of the sides-by-treatment interaction. We found this interaction effect to be nonexistent, leading to a simpler statistical model and analysis, which will be described later.

We have had extensive experience with multitreatment studies of three to seven treatments for comparing many developmental products and/or competitive products in a single test. For these studies, we recommend a round robin design in which all treatment pairs are allocated equally among test subjects. We present a simple model and analysis for this test design. The ANCOVA method has been found to be appropriate for analysis of multitreatment studies.

Other findings were related to variants of the test protocol and their repeated-measures characteristics. Baseline results taken on adjacent days were observed to be highly correlated and thus did not appreciably increase baseline precision, leading to a recommendation of a single baseline per study. Similarly, the two sweat measurements taken during a test day were shown to be highly correlated and did not greatly increase measurement precision. However, the duplicates were found to be useful diagnostics to identify questionable data.

The panel size required to declare specific numerical differences between treatments as statistically significant is also explored. The panel size is a function of the test variation level, the size of the difference between treatments, and the levels of efficacy of the two treatments. The panel size is also dependent on test design in multiple-treatment studies. A sample size table is presented for the two-treatment case.
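
The relationship among the three baseline-handling methods summarized above can be illustrated with a short numerical sketch. It is not the authors' analysis: it assumes a simplified two-treatment paired layout, works with within-subject differences of log sweat rates, ignores the sides effect, and uses simulated data rather than any of the 70 studies. The function name `compare_baseline_methods` and all simulation parameters are hypothetical. Under these assumptions, POSTRT corresponds to a baseline slope of 0, CHGBAS to a slope fixed at 1.0, and ANCOVA to the least-squares slope, so the ANCOVA residual sum of squares cannot exceed that of the other two.

```python
import numpy as np


def compare_baseline_methods(log_base_a, log_base_b, log_post_a, log_post_b):
    """Compare POSTRT, CHGBAS and ANCOVA on within-subject log differences.

    Simplifying assumption (not from the paper): each subject contributes one
    difference A minus B, and the sides effect is ignored.
    """
    d_base = np.asarray(log_base_a, float) - np.asarray(log_base_b, float)
    d_post = np.asarray(log_post_a, float) - np.asarray(log_post_b, float)

    # Least-squares slope of post-treatment difference on baseline difference.
    x_centered = d_base - d_base.mean()
    slope_hat = float(x_centered @ (d_post - d_post.mean()) / (x_centered @ x_centered))

    results = {}
    for name, slope in (("POSTRT", 0.0), ("CHGBAS", 1.0), ("ANCOVA", slope_hat)):
        adjusted = d_post - slope * d_base           # baseline-adjusted differences
        estimate = float(adjusted.mean())            # treatment effect on the log scale
        sse = float(((adjusted - estimate) ** 2).sum())  # residual sum of squares
        results[name] = {"slope": slope, "estimate": estimate, "sse": sse}
    return results


# Simulated panel (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
n_subjects = 40
subject_level = rng.normal(6.0, 0.4, n_subjects)        # subject-level log sweat rate
base_a = subject_level + rng.normal(0.0, 0.2, n_subjects)
base_b = subject_level + rng.normal(0.0, 0.2, n_subjects)
post_a = (subject_level + 0.6 * (base_a - subject_level)
          + np.log(0.8) + rng.normal(0.0, 0.15, n_subjects))
post_b = (subject_level + 0.6 * (base_b - subject_level)
          + rng.normal(0.0, 0.15, n_subjects))

results = compare_baseline_methods(base_a, base_b, post_a, post_b)
for name, r in results.items():
    pct = 100.0 * (1.0 - np.exp(r["estimate"]))  # percent reduction of A relative to B
    print(f"{name}: slope={r['slope']:.2f}  reduction={pct:.1f}%  SSE={r['sse']:.3f}")
```

Working on the log scale follows the transformation advocated in the text; the back-transformation to a percent reduction in the last loop is a convenience for reading the output. A full analysis would also account for the sides effect and for the loss of one degree of freedom when the slope is estimated.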











































































