Hyperhidrosis quality of life index (HidroQoL): further validation by applying classical test theory and item response theory using data from a phase III clinical trial

Abstract

Background

The Hyperhidrosis Quality of Life Index (HidroQoL ©) is a well-developed and validated 18-item patient-reported outcome measure assessing the quality-of-life impacts of hyperhidrosis. Our aim was to extend the existing validity evidence for the HidroQoL, especially in relation to structural validity. In particular, Rasch analysis has not previously been applied to the final 18-item HidroQoL.

Methods

Data from a phase III clinical trial were used. Confirmatory factor analysis was conducted to confirm the two a priori HidroQoL scales within classical test theory. Furthermore, the assumptions of the Rasch model (model fit, monotonicity, unidimensionality, local independence) and Differential Item Functioning (DIF) were assessed using item response theory.

Results

The sample included 529 patients with severe primary axillary hyperhidrosis. The two-factor structure was confirmed by the confirmatory factor analysis (SRMR = 0.058). The item characteristic curves showed mainly optimally functioning response categories, indicating monotonicity. The overall fit to the Rasch model was adequate, and unidimensionality of the HidroQoL overall scale was confirmed, since the first component of the principal component analysis of the residuals had an eigenvalue of 2.244 and accounted for 18.7% of the variance. Residual correlations stayed below the assumed threshold for local dependence (all ≤ 0.26), supporting local independence. DIF was found for four items by gender and for three items by age. However, this DIF could be explained.

Conclusion

Using classical test theory and item response theory/Rasch analyses, this study provided further evidence for the structural validity of the HidroQoL. This study confirmed several specific (measurement) properties of the HidroQoL questionnaire in patients with physician-confirmed severe primary axillary hyperhidrosis: the HidroQoL is a unidimensional scale allowing the summation of scores to generate a single score, and simultaneously it has a dual structure, also allowing the calculation of separate domain scores for daily activities and psychosocial impacts. With this study, we provided new evidence of the structural validity of the HidroQoL in the context of a clinical trial.

Trial registration The study was registered (ClinicalTrials.gov identifier: NCT03658616, 05 September 2018, https://clinicaltrials.gov/ct2/show/NCT03658616?term=NCT03658616&draw=2&rank=1).

Background

Hyperhidrosis (HH) is a clinical condition causing excessive sweating that exceeds the physiological needs of the person concerned [1]. This condition can be classified either as primary HH, due to an overactivity of the sympathetic nerves, or as secondary HH if the excessive sweating results from a medical condition or the consumption of medications [2]. In the US, approximately 2.8% of the population are affected by this condition, half of whom suffer from axillary hyperhidrosis. Furthermore, more than 10% of four million individuals affected by axillary HH rated their disease as intolerable and stated that it interferes with their day-to-day activities [3]. HH can range from dampness of parts of the body to severe dripping; it can therefore have a substantial impact on the patient’s life [2] and can be detrimental to patients’ social, psychological, professional, and physical well-being [4].

These individual impacts can be captured using Patient-Reported Outcome Measures (PROMs), which are self-completed questionnaires capturing the individual perspective of the patients themselves rather than that of their physicians. As there are many PROMs regarding hyperhidrosis, Gabes et al. [5] conducted a systematic review of the quality of existing PROMs. As a result, three PROMs were rated as category A, meaning that these questionnaires have sufficient measurement properties and can be recommended for future use. These three PROMs were the Hyperhidrosis Questionnaire (HQ) [6], the Sweating Cognitions Inventory (SCI) [7] and the Hyperhidrosis Quality of Life Index (HidroQoL) [8, 9]. Of these three PROMs, the HidroQoL proved to be the most convincing in the systematic review, as it had a higher level of evidence for content validity (moderate) and internal consistency (high) than the HQ and SCI. Its strong measurement properties were also supported in terms of structural validity, reliability, construct validity, and responsiveness, all of which received sufficient ratings and high-quality evidence and were based on larger study populations. In this study, we focused on the HidroQoL and aimed to evaluate its psychometric properties (especially the structural validity) in patients with primary axillary HH, thereby extending existing validity evidence [9, 10]. Modern test theory, especially Rasch analysis, has not been performed on the final 18-item HidroQoL before.

Patients and methods

In a phase III (a/b) clinical trial, patients with primary axillary HH were asked to complete the HidroQoL at several timepoints (baseline, after 4, 8, 12, 28, 52, and 72 weeks). This clinical trial investigated the effects of a topical cream containing 1% glycopyrronium bromide, for which safety and efficacy were reported recently [11]. Ethical approval was obtained from the corresponding ethics committees of the different countries and the study was registered (ClinicalTrials.gov identifier: NCT03658616). It was a multi-national (UK, Sweden, Denmark, Germany, Poland and Hungary), multi-center (n = 37) trial [11]. The study was sponsored by Dr. August Wolff GmbH & Co. KG Arzneimittel.

Data from this clinical trial (phase III a) have already been used for previous validation analyses [10]. In the validation analyses of this manuscript, the baseline (pooled) data of the phase III b clinical trial were used for the assessment of structural validity, since large sample sizes are required when performing Rasch analysis. The manuscript was prepared in accordance with the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) Reporting guideline for studies on measurement properties of PROMs (Appendix 1) [12].

The HidroQoL

The HidroQoL was developed in 2014 with qualitative patient and expert input. An 18-item questionnaire with three response options resulted. Following qualitative development, the initial validation was based on two observational studies. Overall, the instrument showed very good measurement properties supporting its use in clinical practice in order to assess the impact of HH on Quality of Life (QoL) [8,9,10].

The HidroQoL is thus a well-developed and validated PROM that measures the QoL impacts of HH. With two main domains and 18 items in total, the questionnaire is short enough to exclude irrelevant topics but is still able to comprehensively assess the impact of HH. The first domain, with six items, evaluates the impact of the condition on daily life activities (such as hobbies). The second domain captures the psychosocial life of the affected individuals (such as personal relationships) (see Fig. 1). The participants can choose between three response options (0: No, not at all; 1: A little; 2: Very much). The items refer to the past seven days. The total score ranges from 0 to 36. Before the clinical trial started, different language versions of the HidroQoL had been linguistically validated, including forward–backward translations and cognitive debriefing.
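The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `score_hidroqol` is hypothetical, and the assignment of items 1–6 to the daily life domain and items 7–18 to the psychosocial domain is an assumption that should be checked against the official questionnaire.

```python
from typing import Dict, Sequence

N_ITEMS = 18
VALID_RESPONSES = (0, 1, 2)  # 0: No, not at all; 1: A little; 2: Very much

# ASSUMPTION: the six daily-life items are items 1-6 and the remaining
# twelve items form the psychosocial domain.
DAILY_LIFE_ITEMS = range(1, 7)
PSYCHOSOCIAL_ITEMS = range(7, 19)


def score_hidroqol(responses: Sequence[int]) -> Dict[str, int]:
    """Return domain scores and the total score for one completed form."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    if any(r not in VALID_RESPONSES for r in responses):
        raise ValueError("each response must be 0, 1 or 2")
    daily = sum(responses[i - 1] for i in DAILY_LIFE_ITEMS)
    psychosocial = sum(responses[i - 1] for i in PSYCHOSOCIAL_ITEMS)
    return {
        "daily_life": daily,
        "psychosocial": psychosocial,
        "total": daily + psychosocial,
    }
```

A respondent answering "Very much" to every item would obtain the maximum total score of 36 under this sketch.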

Fig. 1 Main domains of the Hyperhidrosis Quality of Life Index (HidroQoL)

Data analysis

Psychometric analyses based on classical test theory (CTT) and item response theory (IRT) analyses including Rasch analysis were performed to evaluate the structural validity and other psychometric properties of the HidroQoL. All statistical analyses were performed using IBM SPSS Statistics 25, MPlus 8.4 software (Muthen & Muthen, Los Angeles, CA), SAS 9.4 and Winsteps.

Distribution of responses

In order to evaluate whether data is missing at random, the pattern of missing data was assessed. Furthermore, we investigated floor and ceiling effects.

Using CTT: confirmatory factor analysis

Confirmatory factor analysis (CFA, using maximum likelihood as an estimator) was performed to verify the two a priori scales of the HidroQoL. According to the COSMIN initiative, the structural validity of an instrument is rated as sufficient if one of the following requirements is met: either the Comparative Fit Index (CFI) or Tucker–Lewis Index (TLI) is > 0.95 or the Root Mean Square Error of Approximation (RMSEA) is < 0.06 or the Standardized Root Mean Square Residual (SRMR) is < 0.08 [13].
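The COSMIN decision rule quoted above is a simple disjunction, which can be made explicit as follows. This is a minimal sketch; the function name is hypothetical and the thresholds are those stated in the text.

```python
from typing import Optional


def cosmin_cfa_sufficient(
    cfi: Optional[float] = None,
    tli: Optional[float] = None,
    rmsea: Optional[float] = None,
    srmr: Optional[float] = None,
) -> bool:
    """True if at least one fit index meets its COSMIN threshold."""
    checks = [
        cfi is not None and cfi > 0.95,
        tli is not None and tli > 0.95,
        rmsea is not None and rmsea < 0.06,
        srmr is not None and srmr < 0.08,
    ]
    return any(checks)
```

Because the rule is an "or" over the indices, a model can be rated sufficient on the SRMR alone even when the CFI, TLI and RMSEA miss their thresholds.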

Using IRT: analysis of the response categories and performing Rasch analysis

We also performed Rasch analysis in order to confirm unidimensionality, and to determine the model fit, the monotonicity, the local independence and the Differential Item Functioning (DIF). In general, a minimum of ten observations per category per item is recommended in order to reach a stable estimation of category thresholds [14]. It is necessary to include a large and heterogeneous sample of patients reflecting varying levels of disease severity (based on their Hyperhidrosis Disease Severity Scale (HDSS)-Score). If this requirement is fulfilled, it is ensured that the respondents reflect the entire continuum of the construct (from the highest possible QoL impairment to the minimum possible impairment). Therefore, it was aimed to recruit a sample of at least 243 participants in order to achieve precision even in heavily skewed data [14, 15].
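The recommendation of at least ten observations per category per item can be checked directly on the response data. The following sketch assumes a hypothetical layout (one row per respondent, one column per item, `None` for a missing answer) and is not the procedure implemented in the software used for this study.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence


def sparse_categories(
    data: Sequence[Sequence[Optional[int]]],
    categories: Sequence[int] = (0, 1, 2),
    min_obs: int = 10,
) -> Dict[int, Dict[int, int]]:
    """Map item number (1-based) to the categories observed fewer than
    `min_obs` times, together with their observed counts."""
    flagged: Dict[int, Dict[int, int]] = {}
    n_items = len(data[0])
    for j in range(n_items):
        counts = Counter(row[j] for row in data if row[j] is not None)
        low = {c: counts.get(c, 0) for c in categories if counts.get(c, 0) < min_obs}
        if low:
            flagged[j + 1] = low
    return flagged
```

Items returned by this function are those for which category threshold estimates may be unstable.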

According to the COSMIN criteria for a sufficient structural validity rating the following requirements must be fulfilled: an adequate model fit, no violation of monotonicity, unidimensionality and local independence [13].

Model fit

Model fit was assessed for the entire scale, the individual items, and the persons. An overall model fit is reflected by a mean fit residual value of 0 and a standard deviation (SD) of 1–1.5 [15, 16]. For the individual item and person level, infit and outfit mean squares were analyzed. These should be ≥ 0.5 to avoid overfit (redundancy) and ≤ 1.5 to avoid underfit (too much measurement error) [13]. To check the adequacy in spread of the items along the breadth of the latent variable the item-person map was visually examined. Ideally, there should be no large gaps between items [17] and the mean location of persons should be close to 0 to match the item mean location centered at 0 logits [18]. Furthermore, the Person Separation Index (PSI) was calculated to assess the ability of the instrument to differentiate persons according to disease severity. Here, a PSI of 0.8 reflects capability to reliably distinguish patients into at least two groups of severity [17, 19].
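To illustrate how the infit and outfit statistics relate to the 0.5–1.5 range, the sketch below uses the standard definitions: outfit is the unweighted mean of the squared standardized residuals, and infit is the information-weighted mean square. The model-expected scores and variances are taken as given inputs; the function names are hypothetical.

```python
from typing import Sequence, Tuple


def infit_outfit(
    observed: Sequence[float],
    expected: Sequence[float],
    variance: Sequence[float],
) -> Tuple[float, float]:
    """Infit and outfit mean squares for one item across persons.

    `expected` and `variance` are the model-implied mean and variance of
    each person's response to the item (assumed to come from a fitted
    Rasch model).
    """
    squared = [(x - e) ** 2 for x, e in zip(observed, expected)]
    outfit = sum(s / w for s, w in zip(squared, variance)) / len(squared)
    infit = sum(squared) / sum(variance)
    return infit, outfit


def classify_fit(mnsq: float, lower: float = 0.5, upper: float = 1.5) -> str:
    """Label a mean square using the thresholds given in the text."""
    if mnsq < lower:
        return "overfit"
    if mnsq > upper:
        return "underfit"
    return "acceptable"
```

Infit down-weights responses far from a person's level, so it is less sensitive than outfit to isolated unexpected answers.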

Monotonicity

We created item characteristics curves (ICCs) in order to assess the functioning of the response options. Here, the category thresholds should monotonically increase with the category and each response category should have a distinct peak on the graph [14, 15].
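The link between ordered thresholds and distinct category peaks can be illustrated with the category probabilities of a polytomous Rasch model. The sketch below assumes a rating scale parameterization (item difficulty plus category thresholds); the text does not state which polytomous model was fitted, so this choice and all parameter values are illustrative only.

```python
import math
from typing import List, Sequence


def category_probabilities(
    theta: float, difficulty: float, thresholds: Sequence[float]
) -> List[float]:
    """Probabilities of categories 0..m for a person at `theta`."""
    # Exponent of category k is the sum over j <= k of
    # (theta - difficulty - tau_j); category 0 has exponent 0.
    exponents = [0.0]
    for tau in thresholds:
        exponents.append(exponents[-1] + (theta - difficulty - tau))
    shift = max(exponents)  # subtract the maximum for numerical stability
    weights = [math.exp(e - shift) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


def thresholds_ordered(thresholds: Sequence[float]) -> bool:
    """True if the thresholds increase strictly with the category."""
    return all(a < b for a, b in zip(thresholds, thresholds[1:]))
```

With ordered thresholds such as (−1, 1), each of the three categories is the most probable response somewhere along the scale; with reversed thresholds, the middle category never is.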

Unidimensionality

Unidimensionality was assessed by conducting a principal component analysis (PCA) on the residuals of the Rasch model. For unidimensionality, the first factor must not account for more than 30% of the variance in the data and must have an eigenvalue of 3 or less [14]. Furthermore, unidimensionality refers to a factor analysis per subscale. Therefore, the CFA was carried out for the HidroQoL as a single scale. Unidimensionality is not violated if the CFI or TLI is > 0.95 or the RMSEA is < 0.06 or the SRMR is < 0.08 according to the COSMIN criteria [13].
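The residual PCA criterion can be sketched as follows. The first eigenvalue of a residual correlation matrix is approximated here by power iteration, which is a stand-in for the PCA routine of the statistical software, and the share of variance is passed in as an input because its denominator depends on the software. Function names are hypothetical.

```python
import math
from typing import List, Sequence


def first_eigenvalue(matrix: Sequence[Sequence[float]], iterations: int = 500) -> float:
    """Largest eigenvalue of a symmetric correlation matrix (power iteration)."""
    n = len(matrix)
    vec: List[float] = [float(i + 1) for i in range(n)]  # non-uniform start vector
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    mv = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
    return sum(a * b for a, b in zip(vec, mv))  # Rayleigh quotient


def supports_unidimensionality(
    eigenvalue: float,
    variance_share: float,
    max_eigenvalue: float = 3.0,
    max_share: float = 0.30,
) -> bool:
    """Apply the two thresholds stated in the text to the first component."""
    return eigenvalue <= max_eigenvalue and variance_share <= max_share
```

A first component with an eigenvalue of 2.244 and a variance share of 18.7% would pass both thresholds in this sketch.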

Local independence

For testing the local independence, the correlation matrix of the item residuals was examined. A violation of this assumption is reflected by residual correlations exceeding 0.2–0.3 [15, 20].
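Screening the residual correlation matrix for locally dependent item pairs can be sketched as below. The input layout (a dictionary mapping item labels to per-person residuals) and the function names are assumptions made for illustration; only positive correlations above the chosen cut-off are flagged.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def flag_dependent_pairs(
    residuals: Dict[str, Sequence[float]], threshold: float = 0.3
) -> List[Tuple[str, str, float]]:
    """Item pairs whose residual correlation exceeds `threshold`."""
    flagged = []
    for a, b in combinations(residuals, 2):
        r = pearson(residuals[a], residuals[b])
        if r > threshold:
            flagged.append((a, b, r))
    return flagged
```

Running the function with `threshold=0.2` and again with `threshold=0.3` gives the two ends of the range mentioned in the text.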

Differential item functioning (DIF)

DIF was assessed for the key demographic factors gender and age using a two-way ANOVA test. DIF by country was not assessed given the huge difference in sample sizes (e.g. Germany: n = 156 vs. United Kingdom: n = 10). Invariance testing on small sample sizes was considered as problematic. For a significant DIF on an item the probability must be ≤ 0.05 and the difference in the item difficulty must exceed 0.43 logits. Based on these thresholds, we used the following categorization of DIF sizes (Table 1) [21].

Table 1 Categorization of the DIF sizes
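The two-part significance criterion for DIF stated above (probability ≤ 0.05 and a difficulty difference exceeding 0.43 logits) can be expressed as a small check. This is a sketch with hypothetical function names; it deliberately does not encode the category boundaries of Table 1.

```python
def dif_contrast(difficulty_group_a: float, difficulty_group_b: float) -> float:
    """Difference in item difficulty (logits) between two groups."""
    return difficulty_group_a - difficulty_group_b


def is_significant_dif(
    p_value: float,
    contrast: float,
    alpha: float = 0.05,
    min_contrast: float = 0.43,
) -> bool:
    """True only if both the statistical and the size criterion are met."""
    return p_value <= alpha and abs(contrast) > min_contrast
```

Requiring both conditions avoids flagging differences that are statistically detectable in a large sample but too small to matter in practice.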

Results

The sample consisted of n = 529 participants with severe primary axillary hyperhidrosis, represented by a HDSS score of 3 or 4. Of these, 283 of the patients were female (53.5%) and 246 were male (46.5%). The mean age of the study participants was 35.61 years (SD = 11.68), with a median of 33 years. The age range was from 18 to 65 years.

Distribution of responses

Only item 18 had a single missing entry at baseline. The test for normal distribution over the subjects' sum scores was significant, indicating a left-skewed distribution with some ceiling effects. These effects can be explained as a result of the homogeneous study population, which included only patients with severe hyperhidrosis. The percentage of participants selecting the highest response option (very much) across the items ranged from 16.8 to 88.7%.
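The per-item percentage of participants choosing the highest response option, used above to describe ceiling effects, can be computed as in the following sketch. The data layout (rows of respondents, `None` for missing answers) is an assumption, and missing answers are excluded from the denominator.

```python
from typing import List, Optional, Sequence


def percent_highest(
    data: Sequence[Sequence[Optional[int]]], highest: int = 2
) -> List[float]:
    """Percentage of valid responses equal to `highest`, per item."""
    result = []
    n_items = len(data[0])
    for j in range(n_items):
        answered = [row[j] for row in data if row[j] is not None]
        share = sum(1 for r in answered if r == highest) / len(answered)
        result.append(100.0 * share)
    return result
```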

Using CTT: confirmatory factor analysis

Confirmatory factor analysis confirmed the a-priori assumed two-factor structure of the HidroQoL. With a value of 0.058, SRMR fulfilled the COSMIN criteria and thus supported sufficient structural validity. Other key values are listed in Table 2.

Table 2 Goodness-of-fit indices obtained by the confirmatory factor analyses at baseline (n = 528)

Using IRT: analysis of the response categories and performing Rasch analysis

Model fit

Overall model fit was adequate with a mean fit residual of 0 (SD = 1.37). The mean squares of infit and outfit presented in Table 3 were above 0.5 (infit) and below 1.5 (outfit) for all items, indicating that the items are not redundant with each other. The correlations and expected correlations were close to each other. Thus, an adequate model fit for the HidroQoL overall scale was given. Visual examination using the person-item map (Fig. 2) revealed an adequate spread of item difficulty centered around zero on the scale. The plot of the person measures, however, reflected the left-skewed distribution of the data. Person measures and item difficulty were thus slightly shifted against each other along the scale, which implies that persons with higher severity of hyperhidrosis in our sample might not be well differentiated by the questionnaire.

Table 3 Infit and outfit mean square for the 18 items of the HidroQoL (Baseline)

Fig. 2 Person-item map

Monotonicity

All items of the HidroQoL showed adequate looking graphs, indicating optimally functioning response categories (Fig. 3). However, it should be noted that for both item 1 (“My choice of clothing is affected”) and item 10 (“I feel uncomfortable physically expressing affection (e.g. hugging)”), the recommended minimum number of ten observations per response category was not reached for the lowest category in either case. Item 8 (“I feel embarrassed”) is slightly ambiguous, as the middle response category could not be assigned a distinct range along the scale. Overall, the criterion of monotonicity can be assumed.

Fig. 3 Item characteristic curves for items 1–18

Unidimensionality

The criterion of unidimensionality was fulfilled, as demonstrated by the PCA of the residuals. The first component of the PCA had an eigenvalue of 2.244 and accounted for 18.7% of the variance. In addition, structural validity was demonstrated by confirmatory factor analysis for the HidroQoL overall scale. With a CFI of 0.811, a TLI of 0.786, an RMSEA of 0.099, and an SRMR of 0.063, the scale met at least one COSMIN criterion for sufficient structural validity.

Local independence

Regarding local independence, residual correlations were above 0.2 in five cases but did not exceed 0.3 in any case. Correlations and corresponding items are shown in Table 4.

Table 4 Largest standardized residual correlations

In terms of reliability, the HidroQoL achieved a PSI of 0.85 and showed a person separation of 2.36, meaning that the PROM was able to differentiate between at least two statistically significantly different severity groups. Item separation had a value of 12.99 with a high item reliability of 0.99, reflecting almost 13 levels of item difficulty in the data. Thus, the difficulty hierarchy of the items could be verified as an indicator of the construct validity of the instrument.
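The reported separation and reliability values are linked by the standard Rasch relationship R = G² / (1 + G²), where G is the separation. The sketch below reproduces this arithmetic; the strata formula (4G + 1) / 3 is the conventional one from the Rasch literature and is included as an assumption, since the text does not say how the number of distinguishable groups was derived.

```python
import math


def reliability_from_separation(separation: float) -> float:
    """Reliability implied by a separation index G: G^2 / (1 + G^2)."""
    return separation ** 2 / (1.0 + separation ** 2)


def separation_from_reliability(reliability: float) -> float:
    """Separation index implied by a reliability R: sqrt(R / (1 - R))."""
    return math.sqrt(reliability / (1.0 - reliability))


def strata(separation: float) -> float:
    """Number of statistically distinct levels: (4G + 1) / 3."""
    return (4.0 * separation + 1.0) / 3.0
```

A person separation of 2.36 corresponds to a reliability of about 0.85, and an item separation of 12.99 to a reliability of about 0.99, matching the reported figures.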

Differential item functioning (DIF)

Finally, when comparing subjects by gender, four items (Item 1, Item 8, Item 9 and Item 15) showed differential item functioning. Items 1 (“My choice of clothing is affected”), 8 (“I feel embarrassed”), and 9 (“I feel frustrated”), with DIF contrasts of 0.70, 0.77, and 0.60, respectively, had a moderate to large impact (category C). Item 15 (“I avoid meeting new people”), with a DIF contrast of − 0.48, was in category B, indicating a mild to moderate impact. This item refers to avoiding new people due to the condition and was more symptomatic for men.

Before testing for differential item functioning by age, we divided the sample into two subgroups at the median age of 33 years, one representing younger patients and one composed of older patients. Each group accounted for approximately 50% of the original sample, so the two were suitable for a valid comparison of response behavior. When testing for differential item functioning by age, three items were significant. Item 7 (“I feel nervous”) and Item 12 (“I worry about my future health”) were both in category B, with DIF contrasts of − 0.54 and 0.53, respectively. Item 14 (“I worry about leaving sweat marks on things”) had a DIF contrast of − 1.12 (category C).

Discussion

Applying CTT, we were able to reconfirm the two-factor structure of the HidroQoL. Moreover, structural validity was supported by further psychometric analyses using IRT and the Rasch model, which mainly reflected adequate fit. On the one hand, the dual structure of the HidroQoL allows the questionnaire to be interpreted as a measure of a unidimensional underlying construct, namely the quality of life of affected individuals. On the other hand, each domain can be further explored to investigate the differential impact of hyperhidrosis. DIF by gender and by age was found for a few items. However, this DIF could be explained in terms of content: for DIF by gender, Items 1 (“My choice of clothing is affected”), 8 (“I feel embarrassed”), and 9 (“I feel frustrated”) showed a moderate to large impact. For all three items, there was a tendency for women to choose higher response categories, which can possibly be explained by the fact that women attach greater importance to external appearance than men, and consequently a more negative perception of it affects them more strongly [22]. Testing DIF by age also resulted in three items having a significant impact (Item 7 (“I feel nervous”), Item 12 (“I worry about my future health”), and Item 14 (“I worry about leaving sweat marks on things”)). For all three items, younger patients were more likely to select the higher response options. This is not surprising for item 12 in particular, since the tendency to worry about one’s future health generally decreases with age [23], even though it is often underestimated how much importance young people attribute to their health [24]; this tendency might also be indirectly related to items 7 and 14.

According to Douglas and colleagues [25], there are two different types of DIF. Adverse DIF occurs when the probability of endorsing an item differs between groups because of artifacts in the measurement instrument, such as different understandings of the wording of items. This type of DIF represents a measurement error, since it is a bias in the measurement process. The second form of DIF, however, does not represent a measurement error. Benign DIF occurs when the varying probabilities of endorsing an item are governed by something other than the (dimension of the) construct measured by the instrument, such as belonging to a certain age group. Since this type of DIF reflects real differences in the underlying (dimension of the) construct and not, for example, different understandings of the wording, benign DIF is not harmful to the measurement accuracy of the instrument. As described above, we could explain the DIF between the different groups based on evidence, and we were able to find real differences (e.g. greater importance of external appearance for women). We therefore suppose that the reported DIF can be categorized as benign and does not represent measurement error. Since all the relevant DIF could be explained, it is unlikely to affect the reliability of the PROM.

The results of this study are in line with the findings presented by Kamudoni et al. [9], the initial validation of the HidroQoL, and those of Gabes et al. [10], who conducted further validation and clinical application of the HidroQoL. Gabes et al. [10] hereby used data from a randomized controlled phase III a trial. As the study progressed with the phase III b trial, this dataset was subsequently examined and analyzed in this work. A comparison of the results reported in the different studies on the measurement properties of the HidroQoL can be found in Table 5. It shows that the evidence for the good measurement properties was replicated at least once in different studies and that the findings of the three studies for each measurement property of the HidroQoL are complementary and mutually reinforcing.

Table 5 Overview of the measurement properties assessed for the HidroQoL

Strengths and limitations

With a sample size of n = 529 for the Rasch analysis, the requirements for a very good rating according to the COSMIN guidelines (sample size ≥ 200) were fulfilled [26]. Besides the large sample size, another strength of this study was the almost complete absence of missing data, reflecting a high motivation of the study participants to respond to the HidroQoL and indicating the ease of understanding and feasibility of the questionnaire. One limitation of this study lies in the inclusion criteria: only patients reporting an HDSS score of 3 or 4, indicating severe hyperhidrosis, were included, as in the previous paper on the phase III-a part of the study [10]. Kamudoni and colleagues [8] also included patients with an HDSS score ≥ 2, although eventually the majority of the sample were patients with an HDSS score of 3–4. Furthermore, in this study, DIF by country could not be assessed due to very large differences in the sample sizes of the various countries. Thus, in future studies, DIF by country or language should be investigated in order to broaden the validity evidence of the HidroQoL.

Additionally, significant DIF regarding age and gender was found. This can possibly affect the validity of the HidroQoL, since the response to the items showing DIF is governed by something other than the underlying construct health-related QoL. One common solution is to remove the items showing DIF from the questionnaire in order to preserve its validity. Nevertheless, the HidroQoL is a well-established and much used questionnaire in the clinical assessment of hyperhidrosis. Removal of items always needs to be balanced against maintaining a questionnaire in its original format enabling standardized assessment and comparability. For this reason, we refrained from removing these items right now. If future research also reports DIF in the same items, removal should be considered again since they can detrimentally affect the validity of the HidroQoL.

In this study, we could confirm the unidimensionality of the HidroQoL, as well as an underlying two-factor structure. This might be confusing, since both findings do not seem to align with each other. The HidroQoL as a whole scale is unidimensional (meaning that the HidroQoL has one underlying construct: health-related QoL) allowing the calculation of a sum score (confirmed by Rasch analysis and CFA). At the same time, it has two subscales (daily life activities and psychosocial domain) that are capturing different aspects of health-related QoL (confirmed by CFA). Both approaches aim for a different construct of hyperhidrosis impacts, and are not exclusive. They are based on different levels (two-factor solution: lower level, i.e. daily life activities or psychosocial impact versus unidimensionality: higher level, i.e. health-related QoL). The single factor solution (based on the Rasch analysis and CFA) seemed to be a more robust factor extraction approach than the two-factor solution (based solely on CFA), since in the first approach, the factor solution could be confirmed by CFA and PCA. Additionally, this study and the development study reported a correlation of the two subdomains of 0.651 and 0.645, respectively [8]. These correlations can be seen as an indicator that the unidimensionality might be more robust for this PROM than the two-factor solution. Nevertheless, Kamudoni also confirmed both solutions (one factor and two factors) with Rasch analysis and CFA, respectively [8]. Therefore, our results are in line with the previous research on this topic and both solutions could be confirmed twice. Nonetheless, the fit index we used to assess the CFA (SRMR) may not be the best one for this analysis. Unfortunately, the other fit indices did not pass the proposed thresholds. Thus, since only the SRMR reached the proposed threshold, the model has a reasonable fit to the data on only one aspect of the model’s fit. 
We suggest analyzing this two-factor solution in future studies and including different fit indices, such as CFI or TLI, in order to report more robust results regarding the two-factor structure. Additionally, in retrospect, other estimators than the ML might have been more suitable for ordinal data. Thus, we recommend for future studies calculating the CFA based on another estimator (i.e. weighted least square mean and variance adjusted (WLSMV) or diagonally weighted least squares (DWLS)), since this may lead to better fit indices and a better fit overall.

In summary, our study extends existing evidence on the measurement properties of the HidroQoL regarding structural validity, based on data from a large clinical trial in people with confirmed primary axillary hyperhidrosis. According to the COSMIN methodology, PROMs can be placed in the highest recommendation category A if evidence of sufficient content validity and at least low-quality evidence of sufficient internal consistency is provided. In addition, sufficient internal consistency requires at least low evidence of sufficient structural validity [26]. In this study, sufficient structural validity according to CTT and, for the first time, also according to IRT/Rasch could be confirmed. The criteria of content validity and internal consistency were demonstrated elsewhere [8, 10]. Thus, overall the HidroQoL questionnaire can be recommended for further use in clinical trials.

Conclusion

Strong evidence supporting the conceptual structure and scoring approaches is fundamental to valid use of a PROM. The structural validity of the HidroQoL has been established in prior research. Using CTT and additional IRT/Rasch analyses, this study provided new evidence for the structural validity of the HidroQoL questionnaire using data from a phase III-b trial, thus helping to fill an important evidence gap for the HidroQoL. Overall, our findings support the dual structure of the HidroQoL allowing summation of scores to generate a single score, as well as calculation of separate domain scores for daily activities and psychosocial impacts. The findings are consistent with results of previous validation studies. Significant DIF was found which needs to be evaluated further in future studies.

Availability of data and materials

The data used and analyzed during the current study are available from the corresponding author on reasonable request.

Abbreviations

CFA: Confirmatory factor analysis
CFI: Comparative fit index
CI: Confidence interval
COSMIN: Consensus-based Standards for the selection of health Measurement INstruments
CTT: Classical test theory
DIF: Differential item functioning
DLQI: Dermatology life quality index
EFA: Exploratory factor analysis
HDSS: Hyperhidrosis disease severity scale
HH: Hyperhidrosis
HidroQoL: Hyperhidrosis quality of life index
HQ: Hyperhidrosis questionnaire
ICCs: Item characteristics curves
IRT: Item response theory
MNSQ: Mean squares
PCA: Principal component analysis
PROMs: Patient-reported outcome measures
PSI: Person separation index
QoL: Quality of life
RMSEA: Root mean square error of approximation
SCI: Sweating cognitions inventory
SD: Standard deviation
SRMR: Standardized root mean square residual
TLI: Tucker–Lewis index

References

  1. Stolman LP (1998) Treatment of hyperhidrosis. Dermatol Clin 16(4):863–869. https://doi.org/10.1016/S0733-8635(05)70062-0

  2. Doolittle J, Walker P, Mills T, Thurston J (2016) Hyperhidrosis: an update on prevalence and severity in the United States. Arch Dermatol Res 308(10):743–749. https://doi.org/10.1007/s00403-016-1697-9

  3. Strutton DR, Kowalski JW, Glaser DA, Stang PE (2004) US prevalence of hyperhidrosis and impact on individuals with axillary hyperhidrosis: results from a national survey. J Am Acad Dermatol 51(2):241–248. https://doi.org/10.1016/j.jaad.2003.12.040

  4. Hamm H (2014) Impact of hyperhidrosis on quality of life and its assessment. Dermatol Clin 32(4):467–476. https://doi.org/10.1016/j.det.2014.06.004

  5. Gabes M, Knüttel H, Kann G, Tischer C, Apfelbacher CJ (2021) Measurement properties of patient-reported outcome measures (PROMs) in hyperhidrosis: a systematic review. Qual Life Res. https://doi.org/10.1007/s11136-021-02958-3

  6. Kuo C-H, Yen M, Lin P-C (2004) Developing an instrument to measure quality of life of patients with hyperhidrosis. J Nurs Res. https://doi.org/10.1097/01.jnr.0000387485.78685.1e

  7. Wheaton MG, Braddock AE, Abramowitz JS (2011) The sweating cognitions inventory: a measure of cognitions in hyperhidrosis. J Psychopathol Behav Assess 33(3):393–402. https://doi.org/10.1007/s10862-010-9211-8

  8. Kamudoni P (2014) Development, validation and clinical application of a patient-reported outcome measure in hyperhidrosis: the hyperhidrosis quality of life index (HidroQoL ©). Cardiff University, Cardiff

  9. Kamudoni P, Mueller B, Salek MS (2015) The development and validation of a disease-specific quality of life measure in hyperhidrosis: the hyperhidrosis quality of life index (HidroQOL©). Qual Life Res 24(4):1017–1027. https://doi.org/10.1007/s11136-014-0825-2

  10. Gabes M, Jourdan C, Schramm K, Masur C, Abels C, Kamudoni P, Salek S, Apfelbacher C (2021) Hyperhidrosis quality of life index (HidroQoL©): further validation and clinical application in patients with axillary hyperhidrosis using data from a phase III randomized controlled trial. Br J Dermatol 184(3):473–481. https://doi.org/10.1111/bjd.19300

  11. Masur C, Soeberdt M, Kilic A, Knie U, Abels C (2020) Safety and efficacy of topical formulations containing 0·5, 1 and 2% glycopyrronium bromide in patients with primary axillary hyperhidrosis: a randomized, double-blind, placebo-controlled study. Br J Dermatol 182(1):229–231. https://doi.org/10.1111/bjd.18234

  12. Gagnier JJ, Lai J, Mokkink LB, Terwee CB (2021) COSMIN reporting guideline for studies on measurement properties of patient-reported outcome measures. Qual Life Res 30(8):2197–2218. https://doi.org/10.1007/s11136-021-02822-4

  13. Prinsen CAC, Mokkink LB, Bouter LM, Alonso J, Patrick DL, de Vet HCW, Terwee CB (2018) COSMIN guideline for systematic reviews of patient-reported outcome measures. Qual Life Res 27(5):1147–1157. https://doi.org/10.1007/s11136-018-1798-3

  14. Linacre JM (1999) Investigating rating scale category utility. J Outcome Meas 3(2):103–122

  15. Kamudoni P, Nutjaree J, Salek S (2018) Living with chronic disease: measuring important patient-reported outcomes. Springer, Singapore

  16. Shea TL, Tennant A, Pallant JF (2009) Rasch model analysis of the depression, anxiety and stress scales (DASS). BMC Psychiatr 9(1):21. https://doi.org/10.1186/1471-244X-9-21

  17. Wright BD, Masters GN (1982) Rating scale analysis. MESA press, San Diego

  18. Gorecki C, Lamping DL, Nixon J, Brown JM, Cano S (2012) Applying mixed methods to pretest the pressure ulcer quality of life (PU-QOL) instrument. Qual Life Res 21(3):441–451. https://doi.org/10.1007/s11136-011-9980-x

  19. Bond TG, Fox CM, Lacey H (eds) (2007) Applying the Rasch model: fundamental measurement. Citeseer

  20. Andrich D, Humphry SM, Marais I (2012) Quantifying local, response dependence between two polytomous items using the Rasch model. Appl Psychol Meas 36(4):309–324. https://doi.org/10.1177/0146621612441858

  21. Zwick R, Thayer DT, Lewis C (1999) An empirical Bayes approach to Mantel-Haenszel DIF analysis. J Educ Meas 36(1):1–28. https://doi.org/10.1111/j.1745-3984.1999.tb00543.x

  22. Quittkat HL, Hartmann AS, Düsing R, Buhlmann U, Vocks S (2019) Body dissatisfaction, importance of appearance, and body appreciation in men and women over the lifespan. Front Psych 10:864. https://doi.org/10.3389/fpsyt.2019.00864

  23. Basevitz P, Pushkar D, Chaikelson J, Conway M, Dalton C (2008) Age-related differences in worry and related processes. Int J Aging Human Dev 66(4):283–305. https://doi.org/10.2190/AG.66.4.b

  24. Levenson PM, Morrow JR, Pfefferbaum BJ (1984) Attitudes toward health and illness: a comparison of adolescent, physician, teacher, and school nurse views. J Adolesc Health Care 5(4):254–260. https://doi.org/10.1016/S0197-0070(84)80127-8

  25. Douglas JA, Roussos LA, Stout W (1996) Item-bundle DIF hypothesis testing: identifying suspect bundles and assessing their differential functioning. J Educ Meas 33(4):465–484. https://doi.org/10.1111/j.1745-3984.1996.tb00502.x

  26. Mokkink LB, Prinsen C, Patrick DL, Alonso J, Bouter LM, de Vet HC, Terwee CB, Mokkink L (2018) COSMIN methodology for systematic reviews of patient-reported outcome measures (PROMs). User Manual 78(1):6–3

Acknowledgements

We are grateful to the investigators, the study participants and FGK Clinical Research GmbH (Munich, Germany) for providing the data collected throughout the clinical trial.

Funding

Open Access funding enabled and organized by Projekt DEAL. The study was funded by a grant from Dr. August Wolff GmbH & Co. KG Arzneimittel.

Author information

Contributions

TD: Data analysis and interpretation, drafting the article, final approval of the version to be published. CA: Conception or design of the work, critical revision of the article, final approval of the version to be published. GK: Data analysis and interpretation, drafting the article, final approval of the version to be published. CM: Data collection, final approval of the version to be published. PK: Critical revision of the article, final approval of the version to be published. SS: Critical revision of the article, final approval of the version to be published. CA: Critical revision of the article, final approval of the version to be published. MG: Conception or design of the work, data analysis and interpretation, drafting the article, final approval of the version to be published. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Christian Apfelbacher.

Ethics declarations

Ethics approval and consent to participate

Ethical approval was obtained from the corresponding ethics committees of the different countries.

Consent for publication

Not applicable.

Competing interests

Christian Apfelbacher has received institutional funding from Dr. August Wolff GmbH & Co. KG Arzneimittel, and consultancy fees from Dr. August Wolff GmbH & Co. KG Arzneimittel, LEO Pharma, RHEACELL and Sanofi Genzyme. Clarissa Masur is, and Christoph Abels was, an employee of Dr. August Wolff GmbH & Co. KG Arzneimittel. Paul Kamudoni is a developer of the HidroQoL and an employee of Merck Healthcare KGaA. Sam Salek is a joint developer of the HidroQoL and has received unrestricted educational grants from GlaxoSmithKline, European Hematology Association, Novartis, Bristol Myers Squibb, Sanofi and Celgene, and consultancy fees from Pfizer and Agios.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix 1: COSMIN reporting guideline for studies on measurement properties of PROMs

General reporting recommendations relevant for all studies on measurement properties

| Item number | Item name | Item description | Location(s) reported in the manuscript |
|---|---|---|---|
| Report section: title | | | |
| T1 | Patient-reported outcome measure (PROM) | The name of the PROM instrument(s) (and version if relevant) being studied | Line 1 of the title |
| T2 | Measurement property (MP) | What MPs are being studied or, more generally, that MPs are being studied (if there are many properties being investigated, for example) | Lines 1–2 of the title |
| T3 | Study sample | General description of relevant study sample characteristics (e.g., condition of interest, language) and also any intervention or exposure (e.g., treatments) if applicable | Line 2 of the title |
| Report section: abstract | | | |
| A1 | PROM | The name of the PROM instrument(s) (and version if relevant) being studied (e.g., the SF-36 or SF-12; language version) or if it concerns an item bank (e.g., PROMIS instruments). The type of instrument (e.g., a self-reported questionnaire or interview) | Beginning of the section “Background” in abstract |
| A2 | Measurement property | What MPs are being studied or, more generally, that MPs are being studied (if there are many properties being investigated, for example) | Middle to end of the section “Background” in abstract |
| A3 | Design | The type of study being used to test the properties (e.g., test-retest design, longitudinal study, cohort, cross-sectional, case series, randomized etc.). Other details of the study design if relevant (intervention/exposure, description of comparison instruments, outcomes other than PROMs) | Section “Methods” in abstract |
| A4 | Sample | Inclusion/exclusion criteria. General description of relevant study sample characteristics (e.g., condition of interest, geographic location, language, other relevant demographic and baseline characteristics) | Beginning of the section “Results” in abstract |
| A5 | Methods | A brief description of the methods for investigating each MP including statistical analyses | Section “Methods” in abstract |
| A6 | Results | The main results for all MPs investigated, reporting statistics for each result with measures of precision where appropriate | Section “Results” in abstract |
| A7 | Discussion/conclusions | A brief description of the results in the context of existing evidence, main strengths and drawbacks and the need for future research on the PROM(s) investigated | Section “Conclusion” in abstract |
| Report section: introduction | | | |
| I1 | Name and describe the PROM of interest | Specify the name, type, language, and version of the PROM being investigated and how it was developed. Describe the construct the PROM aims to measure and its subscales; describe the structure of the PROM (e.g., the number of factors, the number of items, scoring algorithm); describe relevant instructions (like time period), and number or type of response categories. State whether the PROM is based on a reflective or formative model. Note: This information may also appear in the methods section in greater detail | Middle to end of the section “Background” |
| I2 | Target population | Describe the specific target population that the PROM was designed for. The authors need to provide the appropriate and necessary characteristics of this population | End of the section “Background” |
| I3 | Citation for the original development of the PROM | The citation for the original development paper(s) should be provided and other highly relevant citations related to the quality of the specific PROM under investigation | Middle of the section “Background” |
| I4 | State of knowledge & rationale | A description of the current scientific knowledge (what is known) regarding the MPs of the PROM under investigation. The authors should provide a literature review or refer to a recent review of all existing evidence of the specific version (e.g., language, short form) of the PROM and explain why the new study is necessary and important. The rationale for the current proposed study should be given | Middle to end of the section “Background” |
| I5 | Definitions | Specialized terms should be defined or explained | Not available |
| I6 | Objectives and hypotheses | State the specific objective(s) of the research and hypotheses related to the specific PROM under investigation | End of the section “Background” |
| Report section: general methods | | | |
| GM1 | Study design | State the key elements of the study design | Beginning of the section “Patients and Methods” |
| GM2 | Participants | State how the participants were chosen; the inclusion and exclusion criteria (e.g., if a PROM for a specific condition, then the eligibility and selection criteria should reflect this) | Beginning of the section “Patients and Methods” |
| GM3 | PROM administration | An explicit description of how and when the PROM(s) were administered (e.g., in what setting) including data collection devices/system used (e.g., paper based, electronic administration/ePRO) should be provided | Beginning of the section “Patients and Methods” |
| GM4 | Data collection procedures | Provide information about other data collection, exposure methods (e.g., allocation to interventions) and time points/follow-up points | Beginning of the section “Patients and Methods” |
| GM5 | Power/sample size calculation | Provide a power calculation for all MP analyses. Alternatively, if a rule of thumb is used, state it and the source/citation | Not available |
| GM6 | Statistical analyses | Statistical analyses and tests corresponding to all hypotheses or objectives for all MPs should be reported. Where appropriate, a cut-off for statistical significance should be reported (e.g., p-value less than 0.05). A description of all statistics to be used to estimate the magnitude and direction of effect should also be reported, together with measures of variability or precision. Report statistical package used | Section “Data analysis” |
| GM7 | Missing data | State approaches or plan for dealing with missing data | Section “Distribution of responses” in “Patients and Methods” |
| GM8 | Post hoc analysis | The report should specify analyses that used data after the data collection period concluded (i.e., if the analyses were post hoc; secondary data analyses) and describe the rationale for any post hoc analyses | Not available |
| Report section: general results | | | |
| GR1 | Missing data | The amount and reasons for missing data should be explained for all analyses for all PROMs (or other outcome measurement instruments) and relevant groups | Section “Distribution of responses” in “Results” |
| GR2 | Participant/patient characteristics | The study patients’ characteristics should be described, including baseline PROM scores | First paragraph in the section “Results” |
| GR3 | Sample size | If one study contained analyses using different sample sizes, the authors should report the sample size for each analysis | First line in the section “Results” |
| Report section: discussion | | | |
| D1 | MP evidence | Per measurement property the authors should compare the result to the criteria for good measurement properties (e.g., COSMIN criteria) [27], and determine if the specific MP is sufficient or not. Note: This information may also appear in the results section in greater detail in a table for example | Last paragraph before the section “Strengths and limitations”, Table 5 |
| D2 | Practical relevance | The authors need to discuss the practical relevance of the findings | Last paragraph of the section “Strengths and limitations” |
| D3 | Strengths and limitations | Strengths and limitations of the study should be discussed. For example, discuss if there were any significant potential biases in the study that could have impacted the results | Section “Strengths and limitations” |
| D4 | Generalizability | Generalizability issues related to the PROM results should be discussed. For example, discuss if the results could be generalized to other populations given the sample studied | Last paragraph before the section “Strengths and limitations”, Table 5 |
| D5 | Instrument changes | Discuss the need for modifications to the existing PROM or new PROM development. If you conclude that one of the measurement properties is insufficient, you could suggest some modification, or if it is really poor, you could suggest stopping use of the PROM (in the specific population or in general) | Not available |
| D6 | Future Research | Report specifically the type of research needed to answer new questions arising out of these findings for the particular MP and PROM investigated | Middle paragraph of the section “Strengths and limitations” |
| Report section: conclusions | | | |
| C1 | Conclusions | State the overall conclusions for each MP and for the use of the PROM investigated | Section “Conclusion” |
| Report section: other information | | | |
| O1 | Conflict of interest | State any relevant conflict of interest related to the PROM under investigation (e.g., an author being the PROM developer, funding body etc.) | See Title Page |

Specific reporting recommendations for studies on structural validity

| Item number | Item name | Item description | Location(s) reported in the manuscript |
|---|---|---|---|
| SV1 | Factor analyses: Classical Test Theory (CTT) PROMs | Report details of the methods and results for any exploratory or confirmatory factor analyses. State the rationale for any exploratory factor analyses (e.g., no clear a priori hypotheses). For CFA, describe and justify the factor structure of tested models. Methods and results for checking of the assumptions should be described, the method of estimation, goodness-of-fit statistics and cut-off points for good model fit, including factor loadings of best-fitting model | Section “Using CTT: Confirmatory factor analysis” in “Patients and Methods” and in “Results” |
| SV2 | Item Response Theory (IRT) analyses | Type of IRT/Rasch model should be reported. Also report the method of estimation, methods and results for checking of the assumptions (unidimensionality (see factor analysis), local dependency (e.g., residual correlations), monotonicity (e.g., Mokken scaling)), goodness-of-fit statistics, cut-off points for goodness of item/model fit, and all item parameters | Section “Using IRT: Analysis of the response categories and performing Rasch analysis” in “Patients and Methods” and in “Results” |

COSMIN: COnsensus-based Standards for the selection of health Measurement INstruments; PROM: patient-reported outcome measure

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Donhauser, T., Apfelbacher, C., Kann, G. et al. Hyperhidrosis quality of life index (HidroQoL): further validation by applying classical test theory and item response theory using data from a phase III clinical trial. J Patient Rep Outcomes 7, 55 (2023). https://doi.org/10.1186/s41687-023-00596-6

  • DOI: https://doi.org/10.1186/s41687-023-00596-6

Keywords