Statistical analysis | PRO scores | Time points | Reference measures | Population | Interpretation |
---|---|---|---|---|---|
Item performance/variability – extent to which each available response option for each item is selected by patients | |||||
Descriptive statistics | • MP • UF-DBD item levela • Monthly and bleeding episode sum scores | • RND • EOT | NA | All patients, by study | • Distributional properties • Floor and ceiling effects |
Reliability/Test-retest reliability – ability to give reproducible, consistent scores over a short time period in stable patients | |||||
1. Descriptive statistics, Wilcoxon signed-rank test 2. Intraclass correlation coefficient (ICC) | • MP • UF-DBD • Monthly and bleeding episode sum scores | • SCR2, RND • T2, EOT | • AHb • PGI-S | Stable patients by study: • AH method: MP and UF-DBD in ASTEROID 1c • PGI-S scores: UF-DBD and MP in ASTEROID 1 and 2 | UF-DBD: • ICC ≥0.50: moderate MP: • ICC <0.40: poor • 0.40 to 0.59: moderate • 0.60 to 0.74: good • 0.75+: excellent |
Construct validity – extent to which a scale measures the intended construct | |||||
Known-groups validity – ability of measure to discriminate between patient groups differing in levels of condition severity | |||||
1. Jonckheere-Terpstra test 2. Kruskal-Wallis test | • MP • UF-DBD • Monthly and bleeding episode sum scores | • RND • EOT | • AHb • PGI-S | All patients, by studyc | Significance of ordered difference between known groups |
Convergent and divergent validity – extent of association between a measure and other measures or variables based on an expected relationship | |||||
1. Spearman rank correlation 2. Scatterplots | • MP • UF-DBD • Monthly and bleeding episode sum scores | • RND • EOT • pooled RND and EOT | • MP • UF-DBD • UF-DSD v3 • UF-IS v3 • UFS-QoL • SF-36 v2® | All patients, by study | Strength of correlations • 0.10 to 0.29: weak • 0.30 to 0.49: moderate • 0.50 to 1.0: strong |
Criterion validity – extent of relation between PRO instrument scores and a known gold standard measure of the same concept | |||||
1. Spearman rank correlation 2. Scatterplots | • MP • UF-DBD • Monthly and bleeding episode sum scores | • RND • EOT | • AHb | All patients with AH measurements | Strength of correlations • 0.10 to 0.29: weak • 0.30 to 0.49: moderate • 0.50 to 1.0: strong |
Responsiveness – ability to detect change when a change in the measured concept has occurred | |||||
1. Spearman rank correlation, scatterplots 2. Kruskal-Wallis test, Jonckheere-Terpstra test | • MP • UF-DBD • Monthly and bleeding episode sum scores | • RND • EOT | Correlation analysis: • MP • UF-DBD • UF-DSD v3d • UF-IS v3d • UFS-QoLd • AHb • PGI-S Definition of change: • AHb • PGI-S | All patients, by studyc | Strength of correlations • 0.10 to 0.29: weak • 0.30 to 0.49: moderate • 0.50 to 1.0: strong Significant difference or significant ordered difference across the groups |
Missing data | |||||
Descriptive statistics; frequencies and percentages of missing data (daily scores over time) | • MP • UF-DBD | • RND • EOT | AH | ASTEROID 1, patients with AH measurements: all patients except for Japanese centerse and only US patients | |
Comparability of methods (AH, MP, UF-DBD) | |||||
1. Cross-tabulation of benchmark scores (HMB eligibility, responder status and amenorrhea); calculation of sensitivity, specificity, PPV and NPV 2. Kaplan–Meier curves; descriptive statistics, histograms of difference | • MP • UF-DBD | • RND • EOT | AH | ASTEROID 1, patients with AH measurements | |
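
The interpretation bands and the cross-tabulation metrics listed in the table can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's analysis code: the function names are hypothetical, the correlation bands are applied to the absolute value of the coefficient, values between printed bands (for example 0.295) are assigned by lower cut-off, and AH is treated as the reference method for the 2×2 cross-tabulation because it is listed as the reference measure in that row.

```python
from typing import Dict


def correlation_strength(rho: float) -> str:
    """Label a correlation with the table's bands.

    Assumption: the bands apply to |rho|; the table gives no label
    below 0.10, so such values are reported as "below weak".
    """
    r = abs(rho)
    if r >= 0.50:
        return "strong"
    if r >= 0.30:
        return "moderate"
    if r >= 0.10:
        return "weak"
    return "below weak"


def icc_label_mp(icc: float) -> str:
    """Label an ICC with the bands the table lists for MP."""
    if icc >= 0.75:
        return "excellent"
    if icc >= 0.60:
        return "good"
    if icc >= 0.40:
        return "moderate"
    return "poor"


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is 0."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV from a 2x2 cross-tabulation.

    Assumption: rows/columns are a PRO-based classification (e.g. HMB
    eligibility by MP or UF-DBD) against the AH-based classification.
    tp/fp/fn/tn are counts of true/false positives/negatives.
    """
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(correlation_strength(-0.42))
    print(icc_label_mp(0.68))
    print(diagnostic_metrics(tp=40, fp=10, fn=5, tn=45))
```

The counts in the usage lines are made up for demonstration; real inputs would come from cross-tabulating each PRO-based benchmark against the AH-based one at RND and EOT.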