Phase 1: qualitative study
A cross-sectional, qualitative study (GSK: 206605; ClinicalTrials.gov: NCT03344406) involving one-on-one, semi-structured concept elicitation and cognitive debriefing telephone interviews was conducted between December 2017 and September 2018 with patients attending five US asthma and allergy clinics. Quorum Review Institutional Review Board (IRB) approval was obtained prior to initiation of the study (Quorum Review File # 32810). Interviews were conducted in English or Spanish, lasted 60–90 min, and were recorded and translated into English (if required). Eligibility criteria are summarized in Supplementary Table 1. Briefly, adults aged ≥18 years with moderate/severe asthma with a history of airflow obstruction and evidence of bronchodilator reversibility were eligible for inclusion. Current smokers and patients with a diagnosis of COPD or other clinically important lung conditions were excluded.
Interviews followed a semi-structured interview guide. All participants provided written informed consent prior to the interview. Interview questions were used to elicit asthma symptoms from the patient perspective first, and subsequently participants completed the E-RS: COPD questions in a paper format. Participants provided further feedback during cognitive debriefing to assess comprehension, relevance, and completeness of the tool.
Interviews were transcribed and anonymized prior to descriptive analysis using ATLAS.ti version 7.0 or higher (ATLAS.ti Scientific Software Development GmbH, Berlin, Germany). A coding dictionary was developed iteratively based on concept themes related to participants’ symptom experience. Words and phrases reported by interview participants were selected using the coding dictionary and grouped into key themes, attributes, concepts, and relationships. Saturation was defined as the point at which no major new themes, descriptions of a concept, or terms were introduced as subsequent interviews were conducted. Concepts were subsequently thematically mapped to assess content coverage. Feedback on the E-RS: COPD was obtained using qualitative codes to capture overall feedback, item comprehension, recall period, and response options. Results are reported in accordance with the Consolidated Criteria for Reporting Qualitative Research (COREQ) checklist.
Phase 2: quantitative psychometric evaluation
Using data from two randomized controlled trials (RCTs) (GSK: 205832; ClinicalTrials.gov: NCT03012061; GSK: 205715; ClinicalTrials.gov: NCT02924688), a quantitative psychometric evaluation was performed, first to evaluate the factor structure of the E-RS: COPD with the intent to use this tool in a moderate/severe asthma population (E-RS: Asthma). Subsequently, the scoring algorithm and psychometric properties, including reliability, construct validity, and responsiveness, were evaluated. IRB approval was obtained prior to initiation of patient recruitment or administration of measures as part of each clinical trial (205832 and 205715). No additional ethics committee or IRB approval was required for this secondary data analysis.
GSK 205832 was a Phase IIb, randomized, double-blind, placebo-controlled, three-arm, parallel-group study conducted between January 2017 and May 2018. The study evaluated the efficacy, safety, and tolerability of two doses of umeclidinium bromide (UMEC) administered once daily (QD) via the ELLIPTA dry powder inhaler (DPI) over 24 weeks in patients with moderate uncontrolled asthma receiving fluticasone furoate (FF) 100 mcg QD. GSK 205715 was a Phase IIIa, randomized, double-blind, active-controlled, six-arm, parallel-group study conducted between December 2016 and August 2018. This study compared the efficacy, safety and tolerability of four fixed-dose triple combinations of FF/UMEC/vilanterol (VI) with two fixed-dose dual combinations of FF/VI, administered QD via the ELLIPTA DPI for 24–52 weeks in patients with moderate/severe asthma uncontrolled on inhaled corticosteroids (ICS) ± a long-acting β2-agonist (LABA) therapy with the primary outcome of the study (clinic trough forced expiratory volume in 1 s [FEV1]) completed at Week 24.
Detailed eligibility criteria for both trials are summarized in Supplementary Tables 2 and 3, and have been presented elsewhere [11, 12]. Briefly, 205832 enrolled a moderate asthma population prescribed ICS ± a LABA or a long-acting muscarinic antagonist (LAMA), whereas 205715 included a broader population of patients with moderate/severe disease uncontrolled on ICS/LABA. There were also differences in asthma control at baseline. Specifically, 205832 required patients to have an Asthma Control Questionnaire (ACQ)-6 score > 0.75 at screening (partially or inadequately controlled asthma), whereas 205715 restricted entry to those with inadequately controlled asthma symptoms (ACQ-6 score ≥ 1.5). Both studies required evidence of reversibility (post-bronchodilator increase in FEV1 of ≥12% and ≥ 200 mL following salbutamol inhalation) and evidence of airflow obstruction at screening, although the obstruction threshold differed between studies (e.g., pre-bronchodilator AM FEV1 < 85% and ≤ 90% predicted in 205715 and 205832, respectively). In 205715, patients were required to meet additional entry criteria at the end of the 3-week run-in period before entering the 2-week stabilization period and at the end of the stabilization period prior to randomization (Supplementary Table 3). In contrast, 205832 had a shorter 2-week run-in phase with patients required to meet additional criteria only prior to randomization (Supplementary Table 2).
Timing of collection of individual patient-reported outcomes (PROs) in both RCTs is summarized in Supplementary Table 4. All PROs used in 205832 and 205715 were administered on an electronic diary, which was also programmed to allow collection of periodic PRO assessments during patient visits to study sites. Electronic administration of PROs, such as the Asthma Quality of Life Questionnaire (AQLQ) and Asthma Control Questionnaire (ACQ), has demonstrated high levels of agreement with paper versions. In addition, the US Food and Drug Administration (FDA) indicates that the St George’s Respiratory Questionnaire (SGRQ) can be administered electronically. To avoid missing data, patients were required to provide a response before they could move to the next question. Once data were submitted, patients were unable to view their previous responses.
The E-RS: COPD consists of 11 items that measure respiratory symptoms recalled by patients over the previous 24 h, rather than a change in symptoms over this timeframe, capturing information related to breathlessness, cough, sputum production, chest congestion, and chest tightness. Daily recording provides an assessment of underlying day-to-day variability of symptoms. Items 1–8 are scored on a 5-point scale of not at all to extreme, and items 9–11 are scored on a 6-point scale of not at all to too breathless to do these. Thus, the RS-Total score has a range of 0–40, comprising three subscales: RS-Breathlessness (sum of 5 items, range 0–17); RS-Cough and Sputum (sum of 3 items, range 0–11); and RS-Chest Symptoms (sum of 3 items, range 0–12) [6, 17]. For both the 205715 and 205832 studies, the E-RS was administered electronically with exactly the same appearance and format as the paper version administered in Phase 1.
Alongside the E-RS: COPD, two supplemental items were included that were rooted in patient feedback from previously reported qualitative work. First, a wheeze item was included, in which participants were asked “Did you wheeze today?” with response options of not at all, rarely, occasionally, frequently, and almost constantly. Second, a SOB with strenuous physical activity item was included with the question “Were you short of breath today when performing strenuous activities such as climbing stairs, running, or participating in sports activity?” with response options of not at all, slightly, moderately, severely, extremely, and too breathless to do these.
Patient Global Impression of Severity (PGI-S) and Patient Global Impression of Change (PGIC)
The PGI-S is a single-item questionnaire to evaluate disease severity; patients rated the asthma symptoms they were currently experiencing at each study visit using a 5-point scale (none, mild, moderate, severe, very severe). The PGIC is a single question used to evaluate response to treatment since the start of the study using a 7-point scale (significantly improved, moderately improved, mildly improved, no change, mildly worse, moderately worse, significantly worse).
St George’s Respiratory Questionnaire (SGRQ)
The SGRQ is a measure of health status in patients with chronic airway obstruction, and includes 50 items addressing three domains: symptoms, activity limitations, and impact. Recall periods in the questionnaire include the past 4 weeks and the current day. A 5-point scale is used for rating symptoms and a true/false binary scale is used for activity limitations. The total score is expressed as a percentage of overall impairment, with 0 and 100 representing the best and the worst possible health status, respectively. A reduction of ≥4 points in SGRQ total score is considered clinically meaningful.
Asthma Quality of Life Questionnaire (AQLQ)
The AQLQ includes 32 items that measure functional impairment related to asthma. The questions are designed to be self-completed by the patient, with a recall period of the past 2 weeks. The response scale ranges from 1 (totally impaired) to 7 (not at all impaired). A change of ≥0.5 is considered clinically important.
Asthma Control Questionnaire (ACQ-5, ACQ-6)
The ACQ measures various attributes of asthma control. The ACQ-5 includes five questions (nocturnal awakening, waking in the morning, activity limitation, SOB, and wheeze) that gauge the frequency and/or severity of symptoms over the previous week. The ACQ-6 includes an additional item relating to rescue medication use. Response options range from 0 (no impairment/limitation) to 6 (total impairment/limitation). Scores < 0.75 indicate well-controlled asthma whereas scores ≥1.5 indicate poorly controlled asthma. A change of ≥0.5 units is considered clinically important.
Quantitative psychometric analyses used a statistical analysis plan informed by previous validation work on the E-RS: COPD in COPD and in asthma–COPD overlap (ACO) syndrome. Data were not pooled due to differences in study populations (Supplementary Table 5). This allowed for evaluation of psychometric properties across patients with moderate (205832) and moderate/severe asthma (205715).
Analyses were conducted on blinded data (205832) and interim blinded data (205715) from a PRO dataset, defined as patients included in the intent-to-treat populations with a minimum of 4 days of data for the week prior to baseline, using SAS statistical software version 9.4 (SAS Institute Inc., Cary, NC, USA) or STATA 15 (StataCorp LLC, College Station, TX, USA). All statistical tests were two-sided and used a significance level of 0.05. No imputation of missing data was performed.
Confirmatory factor analysis (CFA) was conducted using structural equation modeling to evaluate the fit of the factor structure of the E-RS: COPD in patients with moderate/severe asthma. The hypothesis that the E-RS: COPD has three factors and second-order unidimensionality (i.e., that the three factors load onto a single construct) was tested. First, the comparative fit index (CFI) evaluated the proportionate improvement in fit of the hypothesized model over a more restricted baseline model; values ≥0.9 indicate acceptable fit. Second, the standardized root mean square residual (SRMR) measured the mean absolute difference between observed and model-implied correlations; values < 0.1 are considered acceptable. Finally, the root mean square error of approximation (RMSEA) assessed the discrepancy between predicted and observed data per degree of freedom; values < 0.08 are considered acceptable. CFA was estimated at Weeks 0 and 24 in both studies. Post hoc exploratory factor analysis (EFA) was performed at Weeks 0 and 24 in both studies to determine whether there was an optimal factor structure that included the E-RS: COPD items and the two supplemental items in a meaningful way. In EFA, the structure or number of factors was not pre-specified; scree plots and corresponding eigenvalues were examined to determine the number of factors empirically. The psychometric properties of the tool, referred to as the E-RS: Asthma when used in asthma populations, were then assessed using the best-fitting factor structure.
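As an illustration of how the three fit criteria can be checked together, the following Python sketch computes CFI and RMSEA from model and baseline chi-square statistics using commonly cited textbook formulas and applies the cut-offs stated above. The function names, the example inputs, and the use of N − 1 in the RMSEA denominator are assumptions made for illustration only; the analyses reported here were run in SAS or STATA, and SRMR is taken as an input rather than computed.

```python
import math

def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index from model and baseline chi-square values."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, 0.0)
    denom = max(d_model, d_base)
    if denom == 0.0:
        return 1.0  # neither model shows misfit beyond its degrees of freedom
    return 1.0 - d_model / denom

def rmsea(chi2_model, df_model, n):
    """RMSEA using the (N - 1) convention; some software uses N instead."""
    return math.sqrt(max(chi2_model - df_model, 0.0) / (df_model * (n - 1)))

def acceptable_fit(cfi_value, srmr_value, rmsea_value):
    """Apply the cut-offs named in the text: CFI >= 0.9, SRMR < 0.1, RMSEA < 0.08."""
    return {
        "CFI": cfi_value >= 0.9,
        "SRMR": srmr_value < 0.1,
        "RMSEA": rmsea_value < 0.08,
    }

# Hypothetical values, not taken from either trial.
example_cfi = cfi(chi2_model=100.0, df_model=50, chi2_base=1000.0, df_base=60)
example_rmsea = rmsea(chi2_model=100.0, df_model=50, n=501)
example_flags = acceptable_fit(example_cfi, srmr_value=0.05, rmsea_value=example_rmsea)
```

A model would be read as acceptable on a given index only where the corresponding flag is true; the sketch does not combine the three indices into a single decision because the text does not specify such a rule.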
Reliability was assessed using internal consistency reliability and test–retest reliability. Cronbach’s alpha coefficient was estimated for the mean weekly RS-Total and subscale scores at Week −2 and at Weeks 0, 4, 12, and 24 (205832) and at Weeks −5 and −2 and at Weeks 0, 4, 12, 24, 36, and 52 (205715). Reproducibility of total and subscale E-RS: Asthma daily and mean weekly scores was assessed to evaluate test–retest reliability, using intra-class correlation (ICC) coefficients with a two-way random effects regression model based on absolute agreement (ICC2,1) and paired t-tests.
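The two reliability statistics named above can be sketched in a few lines of Python. This is a minimal illustration using the standard mean-squares formulation of ICC(2,1) (two-way random effects, absolute agreement, single measurement) and the usual variance-based formula for Cronbach's alpha; the function names and data layout are assumptions, and the code is not the SAS or STATA procedure used in the analyses.

```python
from statistics import mean, variance

def cronbach_alpha(items):
    """items: one list of scores per item, each of length n (patients)."""
    k = len(items)
    totals = [sum(vals) for vals in zip(*items)]  # per-patient total score
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

def icc_2_1(scores):
    """scores: one row per patient, each row holding k repeated measurements."""
    n, k = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(col) for col in zip(*scores)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-patient mean square
    msc = ss_cols / (k - 1)               # between-occasion mean square
    mse = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical data: 6 patients measured on 4 occasions.
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
example_icc = icc_2_1(ratings)
```

Because ICC(2,1) is based on absolute agreement, a systematic shift between occasions lowers the coefficient even when patients keep the same rank order, which is why it is paired here with t-tests on the mean difference.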
For daily scores, reproducibility (test–retest) was assessed over consecutive days (Days 1–2, 2–3, 3–4, 4–5, 5–6, and 7–8) and over a 7-day interval (Days 1–7) of the screening run-in; patients were considered stable during this period if they were subsequently randomized. For weekly scores, reproducibility of scores from the first to the second week of the screening run-in was assessed. Additionally, reproducibility of scores was assessed for patients with no change in ACQ score from the Week −2 visit (Visit 1 in 205832 and Visit 2 in 205715) to randomization (Visit 2 in 205832 and Visit 3 in 205715).
Construct validity of the E-RS: Asthma total and subscale scores was assessed using convergent and discriminant validity and known-groups validity at Week 24. Convergent validity was considered supported if E-RS: Asthma scores showed moderate correlation (r > 0.40) with conceptually similar measures. A lower correlation (r < 0.40) was required to support discriminant validity. Spearman’s rank correlation coefficients were calculated between the mean weekly E-RS: Asthma score and mean weekly rescue medication use, expected peak expiratory flow (AM), expected FEV1 (AM), nighttime awakenings due to asthma symptoms, and asthma symptom severity scores. In addition, the mean weekly E-RS: Asthma score was compared with FEV1, FEV1% predicted, SGRQ (total and domain scores), AQLQ, and ACQ scores from baseline or final visit.
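The correlation criterion above can be illustrated with a short Python sketch that computes Spearman's rank correlation (Pearson correlation of average ranks, so ties are handled) and compares it with the 0.40 threshold. Applying the threshold to the absolute value of r is an assumption made here so that inverse relationships, such as symptom scores against lung function, are treated by magnitude; the text states the thresholds without specifying sign handling. All names are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; inputs must have non-zero variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))

def validity_reading(r, threshold=0.40):
    """Assumption: the threshold is applied to |r|."""
    return "convergent" if abs(r) > threshold else "discriminant"

# Hypothetical weekly scores, not trial data.
rho = spearman([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

Rank-based correlation is a natural fit here because the questionnaire scores are ordinal and need not be linearly related to the comparison measures.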
Known-groups validity of the total and subscale E-RS: Asthma daily and mean weekly scores was assessed using analysis of variance (ANOVA) by examining score differences during the baseline week in patients grouped according to FEV1% predicted categories, PGI-S, ACQ score, and exacerbation history. Differences were considered significant if the p-values for the F-statistic from the ANOVA and for the t-statistic from the t-tests were both below 0.05.
Analysis of covariance (ANCOVA) models were used to examine differences in change in total and subscale E-RS: Asthma daily scores from baseline week to Weeks 4, 12, and 24, among patients in various responder groups. Baseline scores (mean of Day −7 to −1) were controlled for in the models. Cohen’s effect size was calculated for E-RS: Asthma total and subscale scores among patients defined as “responders” using PGIC, SGRQ, or AQLQ.
Anchor-based and distribution-based methods were used to define a responder threshold for E-RS: Asthma. Using anchor-based methods, mean change in mean weekly E-RS: Asthma total and subscale scores from Weeks 0 to 24 was determined by PGIC level and assessed by plotting cumulative distribution function (CDF) graphs. The minimally important change threshold was the mean change in RS-Total and subscale scores of patients who were mildly improved or mildly worse on the PGIC. Exploratory analyses were conducted using the SGRQ and AQLQ minimally important difference thresholds as anchors (change of 4.0 points and 0.5 points, respectively). Distribution-based methods were conducted as supportive information for development of a responder threshold and included an assessment of the standard error of measurement (SEM) and half standard deviation (SD). SEM was estimated by multiplying the baseline SD of the measure by the square root of one minus its reliability coefficient (ICC from the test–retest assessment) [29, 30]. Half an SD of a measure represents a good approximation of the minimally important difference.
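The two distribution-based quantities described above follow directly from the stated definitions, SEM = SD_baseline × √(1 − ICC) and 0.5 × SD_baseline. The Python sketch below is a minimal illustration; the function names and example scores are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption.

```python
import math
from statistics import stdev

def standard_error_of_measurement(baseline_scores, reliability):
    """SEM = baseline SD * sqrt(1 - reliability), with reliability = test-retest ICC."""
    return stdev(baseline_scores) * math.sqrt(1.0 - reliability)

def half_sd(baseline_scores):
    """Half of the baseline SD (sample SD assumed)."""
    return 0.5 * stdev(baseline_scores)

# Hypothetical baseline scores and ICC, not trial data.
baseline = [2, 4, 4, 4, 5, 5, 7, 9]
sem_value = standard_error_of_measurement(baseline, reliability=0.75)
half_sd_value = half_sd(baseline)
```

With a reliability of 0.75 the two estimates coincide, because √(1 − 0.75) = 0.5; higher reliability gives an SEM below half an SD, and lower reliability gives an SEM above it.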