Scale development
Dr. James W. Varni (JWV) permitted the translation of the PedsQL-I [11] into Japanese. Two researchers separately translated the original scale using an approved translation procedure [17]. We reconciled these translations into a single Japanese version, keeping the wording consistent with the Japanese version of the PedsQL Generic Core Scales [18]. This consistency between the PedsQL-I and the PedsQL Generic Core Scales Toddler version (PedsQL-T) is necessary for tracking HRQOL over time. A native English translator, who was blinded to the original version, back-translated the reconciled version into English. Seven Japanese health professionals (pediatric nurses, clinical psychologists, and a nurse-midwife, including bilingual/bicultural researchers) compared the back-translated version with the original English version and made minor amendments to the reconciled Japanese version to produce a pilot questionnaire [19].
Ten native Japanese-speaking parents who nurtured infants aged 1–24 months participated in the pilot test between November 2013 and January 2014 [19]. The data obtained from the pilot test were used to produce a final Japanese version of the PedsQL-I. No words or phrases required modification after the pilot test. In addition, JWV confirmed the conceptual and linguistic equivalence between the Japanese version and the original version.
Study population
Parents with children aged 1–30 months were recruited from eight day care centers and one pediatric clinic located in Tokyo using distributed flyers. Recruitment was carried out between July and September 2014. Parents whose ability to read, understand and communicate in Japanese was judged as insufficient by the day care center/pediatric clinic staff were excluded. The present study included parents with children aged 25–30 months to examine transitional validity at the age boundary.
We calculated sample size based on known-group validity. Assuming a 1:5 ratio between those with and without a disease, it was calculated that 38 infants and 190 infants, respectively, were needed to detect a moderate difference (effect size d = 0.5) with a power of 0.8 at a significance level of 0.05.
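The group sizes above can be reproduced with the usual normal-approximation formula for comparing two means with unequal allocation. The following is a minimal sketch under that assumption; the source does not state which formula or software was used for this step, and the function name `two_group_sample_size` is illustrative.

```python
import math
from statistics import NormalDist


def two_group_sample_size(d, alpha=0.05, power=0.80, ratio=5.0):
    """Return (n_small, n_large) for a two-sided comparison of two means.

    d     : standardized effect size (Cohen's d)
    ratio : n_large / n_small (5.0 for a 1:5 allocation)
    Uses the normal approximation, which is an assumption of this sketch.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = z.inv_cdf(power)           # quantile for the target power
    n_small = math.ceil((1 + 1 / ratio) * (z_alpha + z_beta) ** 2 / d ** 2)
    return n_small, math.ceil(n_small * ratio)


n_with_disease, n_without_disease = two_group_sample_size(d=0.5)
print(n_with_disease, n_without_disease)  # prints: 38 190
```

An exact calculation based on the noncentral t distribution may give slightly larger numbers than this approximation.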
Procedure
The study collaborators distributed the questionnaires and return envelopes. Participants received a detailed information sheet describing the content and extent of the study, and completion of the questionnaire was taken as informed consent. The questionnaires were either returned by mail or collected from designated questionnaire return boxes placed in the day care centers and the pediatric clinic. Parents with children aged 1–24 months were informed of the retest procedure during the first test. Participants who consented to the retest were sent retest questionnaires with self-addressed envelopes 1–2 weeks after the initial questionnaire and were asked to return the retest within a week of receipt. Upon completion of the study, a summary of the study results was sent to the heads of the day care centers and the pediatric clinic.
Measurement
The PedsQL-I has two age-appropriate versions, for ages 1–12 months and 13–24 months [11]. The 1–12 months version and the 13–24 months version include the same five scales (36 and 45 items, respectively): Physical Functioning (6 and 9 items, respectively), Physical Symptoms (both include 10 items), Emotional Functioning (both include 12 items), Social Functioning (4 and 5 items, respectively), and Cognitive Functioning (4 and 9 items, respectively). The PedsQL-I also includes the Physical Health Summary score (the mean of the item scores included in the Physical Functioning and Physical Symptoms Scales) and the Psychosocial Health Summary score (the mean of the item scores included in the Emotional, Social, and Cognitive Functioning Scales). Respondents are asked to describe the extent to which each item had troubled their children over the past 1 month (the PedsQL’s standard recall period). A 5-point Likert response scale is used (0 = never a problem; 1 = almost never a problem; 2 = sometimes a problem; 3 = often a problem; 4 = almost always a problem). Items are reverse-scored and linearly transformed to a 0–100 scale (0 = 100, 1 = 75, 2 = 50, 3 = 25, 4 = 0), where higher scores indicate a better HRQOL. Scale scores and the total score are computed as the sum of the items divided by the number of items answered. If more than 50% of the items are missing or incomplete, the scale score is not computed.
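The scoring rule described above (reverse-scoring to 0–100, averaging over answered items, and withholding the score when more than 50% of items are missing) can be sketched as follows. The function names and the use of `None` to mark a missing response are assumptions of this illustration.

```python
def transform(raw):
    """Reverse-score a 0-4 response onto the 0-100 scale (0 -> 100, 4 -> 0)."""
    return 100 - 25 * raw


def scale_score(raw_items):
    """Mean of the transformed answered items, or None if not computable.

    raw_items: list of responses coded 0-4, with None for a missing item.
    The score is withheld when more than 50% of the items are missing.
    """
    answered = [transform(r) for r in raw_items if r is not None]
    n_missing = len(raw_items) - len(answered)
    if not answered or n_missing > len(raw_items) / 2:
        return None
    return sum(answered) / len(answered)


# Six-item scale with one missing response
print(scale_score([0, 1, 2, None, 4, 0]))  # prints: 65.0
```

Dividing by the number of items answered, rather than the number of items in the scale, is what makes the result a mean of the available responses.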
The PedsQL-T consists of 21 items, each belonging to one of four scales: Physical Functioning (8 items), Emotional Functioning (5 items), Social Functioning (5 items), and School Functioning (3 items) [18]. The PedsQL-T also includes the Physical Health Summary score (same as the Physical Functioning scale) and the Psychosocial Health Summary score (the mean of the item scores included in the Emotional, Social, and School Functioning Scales). These scale scores and the total score are computed in the same manner as for the PedsQL-I.
Kessler-6 (K6) consists of six items and was used to screen parents for psychological distress [20]. Respondents rate how often they felt (1) nervous, (2) hopeless, (3) restless or fidgety, (4) so depressed that nothing could cheer them up, (5) that everything was an effort, and (6) worthless over the past one month. A 5-point Likert response scale was used (0 = none of the time; 1 = a little of the time; 2 = some of the time; 3 = most of the time; 4 = all of the time). Responses to the six items were summed up to yield a K6 score between 0 and 24, with higher scores indicating a greater tendency towards psychological distress.
Participants were asked about their age, gender, familial relations to their child, and economic status. They also answered the items regarding characteristics of their children (age, gender, place where a child spends daytime, and presence of acute, chronic illness and/or other disease). We also added one question to the retest questionnaire: ‘Has a significant event affecting you happened since responding to the initial questionnaire?’.
Participants with children aged 1–18 months completed the PedsQL-I and participants with children aged 19–30 months completed the PedsQL-I and PedsQL-T. All participants answered the items of K6.
Statistical analyses
All analyses were performed using IBM SPSS software, version 21 (SPSS, Inc., Chicago, IL, USA) and R version 3.2.1 (R Foundation for Statistical Computing, Vienna, Austria) [21]. The level of significance was set at 0.05.
Score distributions for the PedsQL-I were summarized as mean, standard deviation, minimum and maximum scores, and percentages of floor (0) and ceiling (100) scores, separately for children aged 1–12 months and 13–24 months. We defined a high floor/ceiling effect as more than 20% of scores at the floor/ceiling and an especially high floor/ceiling effect as more than 50%. We assessed correlations between the PedsQL-I subscales in each age group by calculating Pearson’s product–moment correlation coefficients.
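The floor and ceiling criteria can be illustrated with a short sketch. The function name and the label "acceptable" for effects at or below 20% are assumptions of this example, not terms used in the study.

```python
def floor_ceiling(scores):
    """Percentage of scores at the floor (0) and ceiling (100), with labels."""
    n = len(scores)

    def label(pct):
        # Thresholds follow the definitions in the text: >20% high, >50% especially high
        if pct > 50:
            return "especially high"
        if pct > 20:
            return "high"
        return "acceptable"  # hypothetical label for effects at or below 20%

    floor_pct = 100 * sum(s == 0 for s in scores) / n
    ceiling_pct = 100 * sum(s == 100 for s in scores) / n
    return {
        "floor": (floor_pct, label(floor_pct)),
        "ceiling": (ceiling_pct, label(ceiling_pct)),
    }


print(floor_ceiling([100, 100, 100, 50, 0]))
```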
Feasibility was determined based on the percentage of missing values. Independence of easily missed items was assessed by Cochran’s Q test.
Reliability was assessed based on internal consistency and test–retest reliability among children aged 1–24 months. Good internal consistency was defined as a Cronbach’s alpha exceeding 0.70. To determine test–retest reliability, intraclass correlation coefficients (ICC) between the initial test and retest scores in a one-way random effects model were calculated; an ICC value of 0.40 represented moderate, 0.60 good, and 0.80 high agreement [22]. A paired t test between the initial test and retest scores was used to check whether or not the PedsQL scores had changed.
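Both reliability statistics can be computed directly from their definitions. The sketch below assumes complete cases and, for the ICC, the single-measure form of the one-way random effects model, often written ICC(1,1); the text does not say whether the single-measure or average-measure form was used, and the function names are illustrative.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance(col) for col in zip(*rows)]  # variance of each item
    total_var = variance([sum(r) for r in rows])       # variance of the sum score
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def icc_one_way(rows):
    """Single-measure ICC from a one-way random effects ANOVA.

    rows holds one list of repeated scores per respondent, e.g. [test, retest].
    """
    n, k = len(rows), len(rows[0])
    grand = mean(x for r in rows for x in r)
    ms_between = k * sum((mean(r) - grand) ** 2 for r in rows) / (n - 1)
    ms_within = sum((x - mean(r)) ** 2 for r in rows for x in r) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical test and retest scores for four respondents
pairs = [[80, 82], [60, 58], [90, 88], [70, 72]]
print(round(icc_one_way(pairs), 3))
```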
Validity was assessed based on factorial validity, known-groups validity, concurrent validity, convergent and discriminant validity, and relative validity. Factorial validity and known-groups validity were assessed among children aged 1–24 months. Concurrent validity, and convergent and discriminant validity were assessed among children aged 19–24 months. Relative validity was assessed among children aged 19–30 months.
We used multi-trait analysis to assess factorial validity [23]. The Pearson’s product–moment correlation coefficients of an item of PedsQL-I with its own scale (corrected for overlap) and other scales were calculated. We used pairwise correlations because only participants with infants aged 13–24 months answered 3 items of the Physical Functioning Scale, one item of the Social Functioning Scale, and 5 items of the Cognitive Functioning Scale. We hypothesized that an item correlates more strongly with its own scale than other scales. Scaling success for any scale was also assessed as the number of convergent correlation coefficients significantly higher than the discriminant correlation coefficient divided by the total number of correlations.
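The item-scale comparison can be sketched as follows, using pairwise deletion and a correction for overlap (the item is dropped from its own scale before correlating). This sketch counts a success whenever the convergent correlation is simply larger than the discriminant one; the significance criterion applied in the study is not reproduced here, and all function and variable names are illustrative.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation with pairwise deletion of missing (None) values."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / (sxx * syy) ** 0.5


def scale_mean(rows, items, exclude=None):
    """Per-respondent mean over answered items, leaving out `exclude`."""
    out = []
    for row in rows:
        vals = [row[i] for i in items if i != exclude and row.get(i) is not None]
        out.append(mean(vals) if vals else None)
    return out


def scaling_success(rows, scales):
    """Return (successes, comparisons) over all item-by-other-scale pairs.

    rows   : one dict of item scores per respondent
    scales : dict mapping scale name to its list of item keys
    """
    wins = total = 0
    for own, items in scales.items():
        for item in items:
            x = [row.get(item) for row in rows]
            # Convergent correlation, corrected for overlap
            r_own = pearson(x, scale_mean(rows, items, exclude=item))
            for other, other_items in scales.items():
                if other == own:
                    continue
                # Discriminant correlation with a competing scale
                r_other = pearson(x, scale_mean(rows, other_items))
                wins += r_own > r_other
                total += 1
    return wins, total
```

The scaling success rate is then `wins / total`, reported per scale or overall depending on how the comparisons are grouped.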
To assess known-groups validity, we calculated the regression coefficients for the presence of acute or chronic illness with the Physical Health Summary, the Psychosocial Health Summary, and the total scores of the PedsQL-I using a linear mixed model. We controlled for the child’s age and gender and the parent’s age, gender, and K6 score as fixed effects, and for the day care center/pediatric clinic as a random effect. The presence of acute illness was defined as a child having only acute illness, and the presence of chronic illness was defined as a child having chronic illness regardless of whether acute illness was also present. We hypothesized that the PedsQL-I would show lower Physical Health Summary and total scores under acute illness, and lower Physical Health Summary, Psychosocial Health Summary, and total scores under chronic illness. Before conducting the analysis, we checked the interaction between infant age and disease presence for both the 1–12 months and the 13–24 months versions using a regression model, but it was not statistically significant. Therefore, we conducted this analysis using the combined versions.
To assess concurrent validity, we calculated Pearson’s product–moment correlation coefficients to confirm that the Physical Health Summary, the Psychosocial Health Summary, and the total score on the PedsQL-I were positively correlated with the same scale scores on the PedsQL-T. Correlation coefficients of 0.10 represent small, 0.30 medium, and 0.50 large correlations [24].
Convergent and discriminant validity were examined by calculating Pearson’s product–moment correlation coefficients between the scales of the PedsQL-I and the PedsQL-T. We hypothesized that the Physical Functioning and Physical Symptoms Scales of the PedsQL-I would correlate most strongly with the Physical Functioning Scale of the PedsQL-T, the Emotional Functioning Scale of the PedsQL-I with that of the PedsQL-T, and the Social Functioning Scale of the PedsQL-I with that of the PedsQL-T.
Relative validity was assessed via the ratio of squared t values. Among infants, the PedsQL-I should discriminate HRQOL between known groups better than the PedsQL-T does. If the PedsQL-T differentiated HRQOL between known groups better than the PedsQL-I, the suitability of the PedsQL-I for infants would be questionable (because the PedsQL-T could take its place). We calculated the t value for the difference in PedsQL-I Scale scores between 19–24-month-old infants with and without any (acute, chronic, and/or other) disease, and compared it with the t value for the corresponding difference in PedsQL-T scores. A ratio of squared t values (PedsQL-I to PedsQL-T) larger than 1 indicates that the PedsQL-I Scale is relatively valid as an HRQOL scale for infants compared to the PedsQL-T. As a control analysis, we also checked the ratio of squared t values among 25- to 30-month-old toddlers.
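The ratio of squared t values can be sketched as below. The pooled-variance (Student) form of the t statistic is an assumption of this example, since the text does not specify which t test was used, and the scores shown are hypothetical.

```python
from statistics import mean, variance


def t_value(group_a, group_b):
    """Two-sample t statistic with pooled variance (assumed form)."""
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / (pooled * (1 / na + 1 / nb)) ** 0.5


def relative_validity(infant_with, infant_without, toddler_with, toddler_without):
    """Squared t for the PedsQL-I divided by squared t for the PedsQL-T."""
    t_infant = t_value(infant_with, infant_without)
    t_toddler = t_value(toddler_with, toddler_without)
    return t_infant ** 2 / t_toddler ** 2


# Hypothetical scores for children with and without disease on each instrument
ratio = relative_validity([60, 70, 80], [80, 90, 100], [70, 80, 90], [80, 90, 100])
print(round(ratio, 2))  # prints: 4.0
```

A ratio above 1 means the PedsQL-I separated the two groups more sharply than the PedsQL-T in the same children.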
We also asked whether scores measured by the PedsQL-I could be carried over to the PedsQL-T as an infant attending a follow-up clinic (or participating in a longitudinal study) grows older. Transitional validity was determined by estimating two linear models of the PedsQL score as a function of age: one for the PedsQL-I Scale scores among 19–24-month-old infants and one for the PedsQL-T Scale scores among 25–30-month-old toddlers. From these models, we obtained two estimates of the PedsQL score at 24.5 months of age. We hypothesized that the two estimates would coincide and tested this hypothesis by bootstrapping. Similarly, we checked the difference at 12.5 months of age between the scores estimated from 1–12-month-old and 13–24-month-old infants.
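One way to carry out such a comparison is a percentile bootstrap of the difference between the two predictions at the age boundary. The sketch below assumes simple least-squares lines, resampling of respondents within each group, and a 95% percentile interval; the bootstrap variant actually used is not described in the text, and all names are illustrative.

```python
import random
from statistics import mean


def predict_at(ages, scores, age0):
    """Least-squares prediction of the score at age0 from a simple linear fit."""
    ma, ms = mean(ages), mean(scores)
    sxx = sum((a - ma) ** 2 for a in ages)
    slope = sum((a - ma) * (s - ms) for a, s in zip(ages, scores)) / sxx
    return ms + slope * (age0 - ma)


def bootstrap_boundary_difference(infant, toddler, age0=24.5, n_boot=2000, seed=0):
    """95% percentile interval for the difference in predicted scores at age0.

    infant, toddler: lists of (age_in_months, score) pairs for each group.
    """
    rng = random.Random(seed)
    diffs = []
    while len(diffs) < n_boot:
        bi = rng.choices(infant, k=len(infant))    # resample with replacement
        bt = rng.choices(toddler, k=len(toddler))
        # A line cannot be fitted if a resample contains a single distinct age
        if len({a for a, _ in bi}) < 2 or len({a for a, _ in bt}) < 2:
            continue
        diffs.append(predict_at(*zip(*bi), age0) - predict_at(*zip(*bt), age0))
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]
```

Under these assumptions, an interval that contains 0 is consistent with the two estimates coinciding at the boundary. The same function applies to the 12.5-month boundary by passing `age0=12.5` with the corresponding groups.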