META-ANALYSIS 285

STUDY DESIGN

All four studies were conducted according to a forearm controlled application technique (FCAT) protocol (17). The FCAT has been shown to yield relative mildness rankings that correlate with consumer experience. Subjects were healthy female volunteers, aged 18-55, with Fitzpatrick skin types I-IV and forearm visual redness and dryness grades less than 3.0 on a 0-6 scale with 0.5-point increments. The studies were limited to female subjects because both bars compared were intended to be marketed to women who use beauty bars. All subjects provided informed consent.

Each of the three screening studies enrolled 20-25 subjects, whereas the larger study enrolled 105 subjects. The screening studies were designed only to show directional rankings of product mildness. The larger study was sized to detect a difference as large as 0.17 in visual dryness (the estimate from the meta-analysis), with at least 95% confidence and 80% power.

Each study was randomized according to a Latin square design. In this design, subjects received all treatments, and treatments were represented approximately the same number of times on each of the eight treatment sites on the volar forearms. This well-recognized design has been employed since the early 1950s (18,19). It accounts for biological differences among subjects and between application sites on the arm of a given subject, and for the order of treatment application, so that these differences do not bias treatment outcomes.

TEST PRODUCTS

The test products of interest were two commercially available mild cleansing bars. Each of the three screening studies also included other personal cleansing bars and liquids that addressed objectives outside the scope of this publication. Only treatment estimates for the two mild cleansing bars of interest were common to all three screening studies and were included in the meta-analyses.
The larger study included only the two mild cleansing bars of interest, with a total of four replicates of each assigned to each subject.

EVALUATIONS

The skin condition on each treatment area was evaluated by an expert grader at baseline and three hours after the final study wash. Dryness was scored on a 0-6 scale with half-point increments. Skin capacitance readings were also collected in duplicate at each of these time points, using a skin surface hydrometer manufactured by I.B.S. Co., Ltd., Japan. Skin capacitance provides an indirect measure of the moisture content of the skin.

DATA ANALYSIS

In each study, the individual study data were analyzed with an analysis of variance model that accounted for subject-to-subject, side-to-side, and site-to-site variability and for the treatment effect. These sources of variability have been noted historically (20). The general model for an observed response in this case can be expressed by:

$$\mathrm{response}_{ijklm} = \mu + T_i + S_j + A_k + P_l + SA_{jk} + e_{ijklm}$$

286 JOURNAL OF THE SOCIETY OF COSMETIC CHEMISTS

where $\mu$ is the grand mean; $T_i$, the effect due to the $i$th treatment; $S_j$, the effect due to the $j$th treatment site; $A_k$, the effect due to the side (right or left) to which the treatment is applied; $P_l$, the effect due to the $l$th subject; $SA_{jk}$, the site-by-side interaction term; and $e_{ijklm}$, an error term that includes experimental error and error due to uncontrolled factors. From this model, least-squares means for treatment effects, with associated standard errors, were estimated. Skin capacitance data were log transformed prior to analysis to stabilize the variance and correct the skewness of the distribution. A logarithmic relationship between skin capacitance and moisture content had been presented previously (21).

To combine the three screening studies, both unweighted (equally weighted) and weighted meta-analyses were performed on the dryness and skin capacitance data, according to documented techniques, to estimate overall mean differences between the products (3,4). Appropriate tests for homogeneity were performed prior to pooling the data. For the weighted analyses, weights were derived from the variability associated with the treatment differences. Simpler techniques, such as Fisher's technique and the inverse normal technique (4), exist for establishing overall p-values over a series of tests. However, the techniques chosen had the advantage of estimating the actual difference between treatments as well as overall p-values, whereas the simpler procedures do not provide estimates of treatment differences. The estimates from the weighted and unweighted analyses were compared with each other and with the results of the larger study for consistency.

The protocol for meta-analysis was more straightforward than those usually encountered when pooling treatment estimates from literature sources. All the data that had been generated by the FCAT protocol comparing the two bars were available and were included in the meta-analyses.
Thus, publication bias and selective inclusion of studies were not issues. Publication bias is the phenomenon whereby positive results that show treatment differences are more likely to be published than those that do not. All data were generated through a contract research organization. Each of the studies was randomized, blinded, and conducted in compliance with Good Clinical Practices guidelines. Thus, the quality of the studies included in the meta-analysis is not an issue.

RESULTS AND DISCUSSION

Weighted meta-analysis combining data from the screening studies shown in Table I estimated the visual dryness difference between the two mild cleansing bars, denoted bars 1 and 2, respectively, to be 0.167 ± 0.079 (p-value = 0.035). The p-value is the probability that a difference as large as the one observed would be detected by random

Table I
Day 5 Changes From Baseline in Visual Dryness Estimates From Screening Studies

                              Bar 1    Std.     Bar 2    Std.     Mean
Clinical trial          n     mean     error    mean     error    delta    P-value
Screening study 1       22    0.317    0.075    0.519    0.104    0.202    0.1189
Screening study 2       20    0.467    0.073    0.698    0.127    0.231    0.1183
Screening study 3       20    0.567    0.098    0.637    0.098    0.070    0.9564
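The pooling of the Table I estimates can be illustrated with a minimal sketch. It rests on assumptions that the text does not state: the variance of each study's difference is taken as the sum of the two squared standard errors, the weights are the inverses of those variances, the p-value uses a two-sided normal approximation, and Cochran's Q stands in for the unspecified homogeneity test. The names `TABLE_I`, `pooled_difference`, and `cochran_q` are hypothetical and are used here only for illustration.

```python
import math

# Day 5 changes from baseline in visual dryness, copied from Table I:
# (study, bar 1 mean, bar 1 std. error, bar 2 mean, bar 2 std. error)
TABLE_I = [
    ("Screening study 1", 0.317, 0.075, 0.519, 0.104),
    ("Screening study 2", 0.467, 0.073, 0.698, 0.127),
    ("Screening study 3", 0.567, 0.098, 0.637, 0.098),
]


def _deltas_and_variances(rows):
    """Return the bar 2 minus bar 1 differences and their assumed variances.

    ASSUMPTION: the variance of a difference is the sum of the two squared
    standard errors; the paper does not spell this out.
    """
    deltas = [b2 - b1 for _, b1, _, b2, _ in rows]
    variances = [se1 ** 2 + se2 ** 2 for _, _, se1, _, se2 in rows]
    return deltas, variances


def pooled_difference(rows, weighted=True):
    """Pool study differences; return (estimate, std. error, two-sided p)."""
    deltas, variances = _deltas_and_variances(rows)
    # Inverse-variance weights for the weighted analysis, equal weights otherwise.
    weights = [1.0 / v for v in variances] if weighted else [1.0] * len(rows)
    total = sum(weights)
    estimate = sum(w * d for w, d in zip(weights, deltas)) / total
    # Standard error of a weighted mean of independent estimates.
    std_error = math.sqrt(sum(w * w * v for w, v in zip(weights, variances))) / total
    z = estimate / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # normal approximation
    return estimate, std_error, p_value


def cochran_q(rows):
    """Cochran's Q homogeneity statistic (ASSUMED test, not named in the paper).

    Under homogeneity, Q is compared with a chi-square distribution on
    (number of studies - 1) degrees of freedom.
    """
    deltas, variances = _deltas_and_variances(rows)
    weights = [1.0 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, deltas)) / sum(weights)
    return sum(w * (d - pooled) ** 2 for w, d in zip(weights, deltas))


if __name__ == "__main__":
    for label, flag in (("weighted", True), ("unweighted", False)):
        est, se, p = pooled_difference(TABLE_I, weighted=flag)
        print(f"{label}: difference = {est:.3f}, std. error = {se:.3f}, p = {p:.3f}")
    print(f"Cochran's Q = {cochran_q(TABLE_I):.3f}")
```

Under these assumptions, the weighted call gives a difference of about 0.167 with a standard error of about 0.079 and a p-value near 0.035, in line with the values reported in the text. That agreement suggests, but does not confirm, that the published analysis used a similar weighting.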