Evaluation of the psychometric properties and minimally important difference of the MD Anderson Symptom Inventory for malignant pleural mesothelioma (MDASI-MPM)

Background Symptom assessment requires psychometrically validated questionnaires that are easy to use, relevant to the disease, and quick to administer. The MD Anderson Symptom Inventory for malignant pleural mesothelioma (MDASI-MPM) was adapted from the general (core) MDASI to assess the severity of cancer-related and treatment-related symptoms specific to patients with this condition. The MDASI-MPM includes the 13 core MDASI symptoms, which are experienced by most cancer patients, and 6 MPM-specific items developed via qualitative interviewing, a method favored by the US Food and Drug Administration for instrument item generation and development. Qualitative interviewing that summarizes the item generation and development for the MDASI-MPM is detailed in a separate report. The psychometric study reported here was the next step in developing the validation dossier for the MDASI-MPM.
Results In this secondary analysis of data from a Phase II trial, 248 patients provided MDASI-MPM data at multiple timepoints during therapy. Over time, fatigue, pain, shortness of breath, feeling of malaise, and muscle weakness were consistently the worst symptoms reported; symptoms interfered most with work and general activity and least with relations with others. Cronbach coefficient alpha values for all MDASI-MPM subscales were at least 0.88 at baseline and 0.91 during treatment, indicating good internal consistency reliability. Intraclass correlations of at least 0.86 for all MDASI-MPM subscales administered a cycle apart (n = 82) were indicative of good test-retest reliability. Correlations between MDASI-MPM subscales and LCSS-Meso scores were at least 0.70 (P < 0.001 for all comparisons). Patients with good performance status had significantly lower scores than did patients with poor performance status (all P < 0.05), supporting evidence for known-group validity and sensitivity. Effect-size differences were 0.69 and higher, indicating medium-to-large effects. The minimally important difference in the MDASI-MPM subscales ranged from 1.0 to 1.5 points on a 0–10 scale.
Conclusions Symptoms specific to a particular cancer, treatment method, or treatment site can be added to the core MDASI to create a tailored, “fit for purpose” instrument. We found the MDASI-MPM to be a valid, reliable, and responsive (sensitive) instrument for assessing the severity of symptoms of patients with MPM and their interference in patients’ daily functioning.


Background
Data reflecting the patient experience during an oncologic clinical trial are playing an increasing role in how drug-approval agencies, such as the US Food and Drug Administration (FDA) and the European Medicines Agency, consider the overall clinical risks and benefits of new therapeutic agents. This patient experience is best captured by patient-reported outcome (PRO) questionnaires administered repeatedly over the course of a clinical trial.
Patients with malignant pleural mesothelioma (MPM), an aggressive cancer of the lung pleura, report distressing symptoms, impaired functioning, and treatment intolerance. Understanding the symptoms of MPM requires evidence-based documentation of the symptoms that best characterize the disease and how these symptoms change over the course of treatment. Capturing improvement in disease symptoms is one of the 3 core PRO concepts that the FDA is proposing to focus on for labeling considerations [1].
Only three PRO questionnaires have been validated for use in patients with MPM: the European Organization for Research and Treatment of Cancer Core Quality of Life Questionnaire (EORTC QLQ-C30) [25,28], the EORTC Quality of Life 13-item Lung Cancer-specific Questionnaire (EORTC QLQ-LC13) [26,28] and the Lung Cancer Symptom Scale (LCSS) for Mesothelioma (LCSS-Meso) [27]. All three of these instruments measure health-related quality of life (HRQOL). None were originally developed with qualitative input from patients with MPM as required by the FDA, and none measure the symptom burden of MPM and its treatment. Taken together, these deficiencies indicate a significant unmet need.
Building on our group's extensive experience in developing symptom measures, we used a qualitative approach to adapt an existing multisymptom assessment questionnaire, the MD Anderson Symptom Inventory (MDASI), to include additional symptoms specific to the MPM experience. The resulting provisional version of the MDASI-MPM was based on a conceptual model of MPM-related symptom burden and was found to be content valid and amenable for further psychometric testing [2].
We analyzed a dataset from a large Phase II trial in which the provisional MDASI-MPM was administered longitudinally to patients receiving second-line treatment for MPM. We report here the most prevalent symptoms over time and the degree to which symptoms interfered with patient functioning, and we summarize evidence demonstrating the reliability, validity, and sensitivity of the finalized MDASI-MPM. We also propose metrics for minimally important differences in the MDASI-MPM's severity and interference subscales.

Study participants
Data used in this secondary analysis were from a randomized (2:1 ratio), open-label, active-controlled, Phase II study of intravenous anetumab ravtansine (BAY 94-9343) or vinorelbine in 248 patients with advanced or metastatic MPM who were overexpressing mesothelin and who had progressed on first-line treatment (platinum in combination with pemetrexed, with or without bevacizumab) (ClinicalTrials.gov Identifier: NCT02610140). Patients were allowed to be randomized into the trial only if they demonstrated mesothelin overexpression at a moderate or stronger level in at least 30% of tumor cells. The dataset did not include the treatment arm; the analyst was therefore blinded to this grouping variable.
Patients completed the provisional MDASI-MPM and the Lung Cancer Symptom Scale (LCSS)-Mesothelioma at times when symptoms were expected to change (to test the MDASI-MPM's sensitivity) or to be stable (to test the MDASI-MPM's stability). The MDASI-MPM was completed at baseline, on Days 1 and 15 of each cycle up to 3 cycles, and on Day 1 of Cycles 4, 5, and 6. The LCSS-Meso was administered at baseline and on Day 1 of each cycle (except Cycle 1) for up to 6 cycles. Clinicians rated their patients' performance status at baseline and on Day 1 of each cycle for up to 6 cycles. Tests for sensitivity were performed between baseline and safety follow-up; tests for stability were conducted between Cycle 2 Day 1 and Cycle 3 Day 1. Patients also completed the provisional MDASI-MPM at a safety follow-up, when most patients were experiencing disease progression, which allowed for additional sensitivity estimates. See Table 1.

The MD Anderson symptom inventory
The core MDASI asks patients to rate the severity of 13 disease-related and treatment-related symptoms during the past 24 h [3]. Each symptom (pain, fatigue, nausea, disturbed sleep, distress, shortness of breath, trouble remembering, lack of appetite, feeling drowsy, dry mouth, feeling sad, vomiting, and numbness or tingling) is rated on an 11-point scale ranging from 0 (not present) to 10 (as bad as you can imagine). The MDASI-MPM includes the 13 core MDASI symptoms and 6 MPM-specific items (feeling of malaise, coughing, muscle weakness, trouble with balance or falling, chest heaviness or tightness, and eye problems) that were developed using qualitative interviewing, a method favored by the FDA for item generation and development [4]. A summary of the item generation and development of the draft MDASI-MPM items on the basis of qualitative interviewing results is reported elsewhere [2]. For the psychometric analyses described in this paper, we included the additional symptoms identified during the qualitative development and evaluated psychometric evidence for their inclusion in the MDASI-MPM symptom severity subscale.
Patients also rated the degree to which their symptoms interfered with various aspects of life during the past 24 h (the symptom interference items of the MDASI). Each interference item (general activity, mood, normal work [including both work outside the home and housework], relations with other people, walking ability, and enjoyment of life) is rated on an 11-point scale ranging from 0 (did not interfere) to 10 (interfered completely).
In summary, the MDASI-MPM has 19 items that measure symptom severity and 6 items that measure symptom interference. Correspondingly, the ratings in the MDASI-MPM can be averaged into 2 subscale scores: mean severity (the 13 core symptom items plus the 6 MPM-specific items) and mean interference (the 6 interference items only). A composite symptom score that could serve as a basis for developing a responder analysis or as a potential outcome measure in a pivotal trial can also be calculated but is beyond the scope of this paper.
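The subscale scoring described above can be sketched as follows. This is a minimal illustration, not the instrument's official scoring routine: the function name, the made-up ratings, and the requirement that every item be answered are assumptions, because the handling of missing items is not described here.

```python
from statistics import mean

N_SEVERITY_ITEMS = 19      # 13 core symptom items + 6 MPM-specific items
N_INTERFERENCE_ITEMS = 6   # interference items


def score_subscales(severity, interference):
    """Return (mean severity, mean interference) for one completed form.

    Assumption: every item was answered; incomplete forms are rejected.
    """
    if len(severity) != N_SEVERITY_ITEMS or len(interference) != N_INTERFERENCE_ITEMS:
        raise ValueError("expected 19 severity and 6 interference ratings")
    for rating in list(severity) + list(interference):
        if not 0 <= rating <= 10:
            raise ValueError("ratings must lie on the 0-10 scale")
    return mean(severity), mean(interference)


# Made-up ratings for a single hypothetical patient.
severity = [6, 5, 0, 3, 2, 5, 1, 4, 3, 2, 2, 0, 1, 4, 2, 4, 1, 3, 0]
interference = [5, 3, 6, 1, 4, 5]
sev_score, int_score = score_subscales(severity, interference)
print(round(sev_score, 2), int_score)
```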

Lung Cancer symptom scale
To evaluate the validity of the MDASI-MPM in comparison with an established instrument, we used the LCSS-Meso, an 8-item questionnaire. The LCSS-Meso is a valid and reliable QOL measure that was designed for patients with non-small cell lung cancer and that has been modified for use in patients with MPM. A recent paper [5] presents cognitive debriefing of the LCSS-Mesothelioma. One of the 8 items in the LCSS-Meso is a global QOL rating (rated using a visual analogue scale on a 0-100 scale). We also calculated an average of all 8 items [6].

Performance status
Eastern Cooperative Oncology Group performance status (ECOG PS) was used to represent disease severity [7]. ECOG PS is a physician-rated measure of functional ability, ranging from 0 (fully active; able to carry on all predisease performance without restriction) to 4 (completely disabled; cannot perform self-care; totally confined to bed or chair).

Statistical analysis
All statistical analyses were conducted using Statistical Package for the Social Sciences software version 21.0 (SPSS, Inc.; Chicago, IL, USA). Correlations, means, standard deviations (SDs), ranges, and 95% confidence intervals (CIs) were computed for all symptoms and subscales. Proportions of patients reporting moderate to severe symptoms were calculated and tabulated. We defined a moderate-to-severe symptom as one rated ≥5 on the MDASI's 0-10 scale, on the basis of results from previous studies showing that "pain at its worst" is related to greater interference with function when rated ≥5 by cancer patients [8][9][10] and community samples [11]. Severe symptoms were defined as those rated ≥7. These cutpoints have also been applied to other symptoms [12]. Statistical significance was set using a 2-tailed alpha level of 0.05. To address missing data, analyses using complete data and analyses using all available data at each assessment time were compared, as applicable.
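As an illustration of the cutpoints above, the proportion of patients at or above a given rating can be tabulated as in this sketch. The function name and the fatigue ratings are made up for the example; only the cutpoints (≥5 and ≥7) come from the text.

```python
MODERATE_TO_SEVERE = 5  # rating >= 5 on the 0-10 scale
SEVERE = 7              # rating >= 7 on the 0-10 scale


def proportion_at_or_above(ratings, cutpoint):
    """Share of patients whose rating is at or above the cutpoint."""
    return sum(r >= cutpoint for r in ratings) / len(ratings)


# Made-up fatigue ratings for 10 hypothetical patients.
fatigue = [0, 2, 5, 7, 8, 3, 6, 4, 9, 1]
moderate_to_severe = proportion_at_or_above(fatigue, MODERATE_TO_SEVERE)
severe = proportion_at_or_above(fatigue, SEVERE)
print(moderate_to_severe, severe)
```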

Reliability of the MDASI-MPM
Internal consistency reliability Internal consistency reliability refers to the extent to which the items in a scale are measuring the same concept. Cronbach coefficient alphas were computed to estimate the internal consistency reliability of the 2 MDASI-MPM subscales: the severity subscale (13 core plus 6 MPM-specific items) and the interference subscale (6 interference items). The criterion for good internal consistency (reliability) requires a Cronbach alpha value of 0.70 or higher [13].
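For readers who want to see the computation, coefficient alpha can be sketched in a few lines. This is an illustrative reimplementation under the usual formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of the total score), and not the SPSS routine used in the study; the data are invented.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach coefficient alpha.

    rows: one list of item ratings per patient (complete cases only).
    """
    k = len(rows[0])  # number of items in the subscale
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up ratings: 4 hypothetical patients, 3 items.
ratings = [[2, 3, 3], [4, 4, 5], [6, 5, 6], [8, 9, 7]]
alpha = cronbach_alpha(ratings)
print(round(alpha, 3), alpha >= 0.70)  # compare with the 0.70 criterion
```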
Test-retest reliability Test-retest reliability is typically examined using assessments taken 1 day apart. However, for this study, we used data from assessments made at Cycle 2 Day 1 and Cycle 3 Day 1 to evaluate test-retest reliability, calculated using intraclass correlations, for the 2 MDASI-MPM subscales. We hypothesized that patients who reported relative stability over time, based on their responses to the LCSS-Meso global QOL item, should also report stable symptoms; we therefore restricted the analysis to these patients. A change of < 10 points on this QOL item was taken to indicate less-than-meaningful change, on the basis of prior research [14,15].
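The stability filter and the intraclass correlation can be sketched as below. The form of the intraclass correlation is not specified in the text, so this sketch assumes a two-way, consistency-type, single-measurement coefficient, often written ICC(3,1); the record fields and data are hypothetical.

```python
def stable_patients(records, threshold=10):
    """Keep patients whose global QOL rating changed by fewer than `threshold` points."""
    return [r for r in records if abs(r["qol_t2"] - r["qol_t1"]) < threshold]


def icc_consistency(pairs):
    """ICC(3,1): two-way model, consistency definition, single measurement."""
    n = len(pairs)     # patients
    k = len(pairs[0])  # occasions (2 for test-retest)
    grand = sum(sum(p) for p in pairs) / (n * k)
    row_means = [sum(p) / k for p in pairs]
    col_means = [sum(p[j] for p in pairs) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for p in pairs for x in p)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Made-up records: QOL on a 0-100 scale, severity subscale on a 0-10 scale.
records = [
    {"qol_t1": 60, "qol_t2": 65, "sev_t1": 2.0, "sev_t2": 2.5},
    {"qol_t1": 40, "qol_t2": 42, "sev_t1": 4.0, "sev_t2": 4.5},
    {"qol_t1": 80, "qol_t2": 50, "sev_t1": 1.0, "sev_t2": 5.0},  # dropped: QOL changed by 30
    {"qol_t1": 55, "qol_t2": 50, "sev_t1": 1.0, "sev_t2": 1.5},
    {"qol_t1": 30, "qol_t2": 38, "sev_t1": 6.0, "sev_t2": 6.5},
]
stable = stable_patients(records)
icc = icc_consistency([(r["sev_t1"], r["sev_t2"]) for r in stable])
print(len(stable), icc)
```

In this invented example every retained patient shifts by the same amount between occasions, so the consistency-type coefficient equals 1; an absolute-agreement coefficient would be lower because it penalizes the systematic shift.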

Validity of the MDASI-MPM
Criterion (concurrent) validity Criterion validity refers to the extent to which an instrument correlates with another instrument that measures a similar, but not the same, concept [13]. To show concurrent validity, we correlated MDASI subscale scores and items with the LCSS-Meso average aggregate scores of all 8 items.
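A correlation of this kind can be computed as in the following sketch. The type of correlation coefficient is not stated in the text; a Pearson product-moment correlation is assumed here, and the scores are invented.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up scores for 5 hypothetical patients at one assessment.
mdasi_severity = [1.2, 2.5, 3.1, 4.8, 6.0]
lcss_average = [15.0, 30.0, 28.0, 55.0, 62.0]
r = pearson_r(mdasi_severity, lcss_average)
print(round(r, 2))
```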
Known-group (construct) validity Construct validation requires demonstrating that the instrument measures the underlying construct it is intended to measure [13]. Various methods of establishing construct validity can be used, such as differentiation between groups (known-group validity), factor analysis, and multitrait-multimethod matrices. For this report, independent-sample t tests were used to demonstrate known-group validity, which refers to the extent to which an instrument can distinguish between groups known to be clinically different. Effect sizes were calculated to estimate the magnitude of the differences in the 2 MDASI-MPM subscale scores between those with good (0) versus poor (1 and above) ECOG PS [16,17].
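The known-group comparison can be sketched as follows. The effect-size formula is not spelled out in the text; this sketch assumes a standardized mean difference using the pooled SD (Cohen's d) alongside the equal-variance t statistic. The P value, which would come from a t distribution with n1 + n2 − 2 degrees of freedom, is omitted, and the scores are invented.

```python
from math import sqrt
from statistics import mean, variance


def pooled_sd(a, b):
    """Pooled standard deviation of two independent groups."""
    na, nb = len(a), len(b)
    return sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))


def cohens_d(good, poor):
    """Standardized difference in subscale score (poor minus good ECOG PS)."""
    return (mean(poor) - mean(good)) / pooled_sd(good, poor)


def t_statistic(good, poor):
    """Equal-variance independent-sample t statistic."""
    se = pooled_sd(good, poor) * sqrt(1 / len(good) + 1 / len(poor))
    return (mean(poor) - mean(good)) / se


# Made-up severity subscale scores by ECOG PS group.
good_ps = [1.0, 2.0, 3.0]  # ECOG PS = 0
poor_ps = [3.0, 4.0, 5.0]  # ECOG PS >= 1
d = cohens_d(good_ps, poor_ps)
t = t_statistic(good_ps, poor_ps)
print(round(d, 2), round(t, 2))
```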

Sensitivity of the MDASI-MPM
Sensitivity (responsiveness) is defined as the ability of an instrument's subscales or items to detect change in outcomes when such change is expected. We evaluated whether the MDASI-MPM could detect a worsening of symptoms among patients with deteriorating performance status (a clinical estimate of worsening disease status). Specifically, we examined whether the MDASI-MPM could detect whether symptom severity increased for patients whose ECOG PS deteriorated over time.
Note that we used ECOG PS to demonstrate both known-group validity and sensitivity/responsiveness, based on data from patients who had ECOG PS ratings from both baseline and the safety follow-up. The main difference is that to demonstrate sensitivity/responsiveness, we examined change over time in MDASI-MPM subscales for those patients whose performance status deteriorated over time. Change scores and the associated 95% CI for MDASI-MPM subscales were computed, and effect sizes were calculated, to estimate the magnitude of differences in subscale scores and items [16,17].
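The change-score computation can be sketched as below, under stated assumptions: change is coded as follow-up minus baseline (so positive values mean worsening), the 95% CI uses a normal approximation rather than the t distribution, and the effect size is the mean change divided by the baseline SD. None of these choices is confirmed by the text, and the data are invented.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def change_summary(baseline, follow_up, level=0.95):
    """Mean change, approximate CI, and standardized effect size for paired scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    mean_change = mean(changes)
    se = stdev(changes) / sqrt(len(changes))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return {
        "mean_change": mean_change,
        "ci": (mean_change - z * se, mean_change + z * se),
        "effect_size": mean_change / stdev(baseline),
    }


# Made-up severity scores for 4 hypothetical patients whose ECOG PS worsened.
baseline = [2.0, 3.0, 1.0, 4.0]
follow_up = [3.0, 5.0, 2.0, 6.0]
summary = change_summary(baseline, follow_up)
print(summary)
```

With samples as small as this example, a t-based interval would be wider; the normal approximation is used here only to keep the sketch within the standard library.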

Estimation of meaningful change for the MDASI-MPM
Anchor-based approach We used the LCSS-Meso global QOL item as an anchor and as a basis for meaningful change evaluations. MDASI-MPM severity and interference subscale scores between baseline and Cycle 2 Day 1 were calculated. A 10-point change or greater on the global QOL item was used as indicative of a meaningful change, on the basis of prior research [14,15]. Minimally important differences for improvement and worsening were suggested.
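The anchor-based grouping can be illustrated with the following sketch. The field names and data are hypothetical, and two coding choices are assumptions: the anchor change is coded so that positive values mean better global QOL, and the MDASI-MPM change is follow-up minus baseline (negative values mean improvement).

```python
from statistics import mean


def anchor_based_change(records, threshold=10):
    """Mean MDASI-MPM subscale change within each anchor-defined group."""
    groups = {"improved": [], "stable": [], "worsened": []}
    for r in records:
        if r["qol_change"] >= threshold:
            key = "improved"
        elif r["qol_change"] <= -threshold:
            key = "worsened"
        else:
            key = "stable"
        groups[key].append(r["mdasi_change"])
    # None marks a group with no patients.
    return {k: (mean(v) if v else None) for k, v in groups.items()}


# Made-up changes between baseline and Cycle 2 Day 1.
records = [
    {"qol_change": 15, "mdasi_change": -1.2},
    {"qol_change": 20, "mdasi_change": -0.8},
    {"qol_change": 5, "mdasi_change": -0.1},
    {"qol_change": -3, "mdasi_change": 0.1},
    {"qol_change": -12, "mdasi_change": 1.4},
    {"qol_change": -25, "mdasi_change": 1.6},
]
by_group = anchor_based_change(records)
print(by_group)
```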
Distribution-based approach Although meaningful change evaluations were primarily derived using a relevant patient-based anchor, a distribution-based method was also used to complement and support the estimates obtained via the anchor-based approach [18]. Meaningful change was estimated by tabulating one-half SD, one-third SD, and the standard error of measurement for the MDASI-MPM severity and interference subscales at baseline and at Cycle 2 Day 1 [18].
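The three distribution-based quantities can be computed as in this sketch. The standard error of measurement is taken as SD × sqrt(1 − reliability), the conventional formula; which reliability coefficient was used is not stated, and the SD and reliability values below are illustrative rather than study results.

```python
from math import sqrt


def distribution_based_estimates(sd, reliability):
    """One-half SD, one-third SD, and standard error of measurement (SEM)."""
    return {
        "half_sd": sd / 2,
        "third_sd": sd / 3,
        "sem": sd * sqrt(1 - reliability),
    }


# Illustrative inputs: a subscale SD of 2.0 points and a reliability of 0.88.
estimates = distribution_based_estimates(sd=2.0, reliability=0.88)
print({name: round(value, 2) for name, value in estimates.items()})
```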

Demographic and clinical characteristics
For the 248 patients in the dataset, age ranged from 42 to 84 years, with the median at 66 years. Women comprised 26% of the sample (n = 64). Most (94%) were white and 36% (n = 90) were fully active in terms of performance status (ECOG PS = 0).
Of the 248 patients, 239 completed the MDASI-MPM and LCSS-Meso and were graded by clinicians using ECOG PS at baseline. See Table 1 for details on the availability of patients at each assessment time. At baseline, the most severe symptoms were fatigue, shortness of breath, pain, lack of appetite, feeling of malaise, muscle weakness, and disturbed sleep. Overall, the list and rankings of the worst symptoms are similar. At baseline, 47% of patients had moderate to severe fatigue, with 20% reporting fatigue as severe, and 39% reported having moderate to severe shortness of breath; further, at least 25% of patients also reported having moderate to severe pain, distress, muscle weakness, or feeling of malaise. Overall, 6 symptoms (fatigue, shortness of breath, pain, muscle weakness, feeling of malaise, and disturbed sleep) were moderate to severe for at least 25% of this patient sample. Symptom interference increased between baseline and safety follow-up (baseline = 2.8 vs safety follow-up = 4.3, P < 0.001; 95% CI, −2.1 to −1.0; n = 103). Symptoms interfered most with work, followed by general activity, and least with relations with others (data not shown). Figure 1 shows the symptom trajectories for the 7 most-severe MDASI-MPM items (fatigue, shortness of breath, pain, distress, muscle weakness, feeling of malaise, and lack of appetite). Over time, fatigue, pain, shortness of breath, feeling of malaise, and muscle weakness were consistently the worst symptoms. Figure 1 also shows that all symptoms were more severe at the safety follow-up, which occurred 7 cycles or 21 weeks from baseline, corresponding to disease progression; this difference was significant (baseline = 2.1 vs safety follow-up = 2.9, P < 0.001; 95% CI, −1.2 to −0.5; n = 103).

Psychometric properties of the MDASI-MPM
Reliability
Internal consistency reliability The MDASI-MPM subscales showed good internal consistency reliability. Cronbach coefficient alpha values at baseline were at least 0.88 for the severity subscale and 0.90 for the interference subscale. Alpha values for both the severity and interference subscales were 0.91 or higher during treatment. We found no notable differences in Cronbach coefficient alpha values if only data from the 103 patients who completed the MDASI-MPM at the safety follow-up were used to calculate coefficient alpha values at each assessment time point (data not shown).
Test-retest reliability The intraclass correlations of the MDASI-MPM severity and interference subscales administered 1 cycle apart for those patients who reported relative stability based on their QOL ratings (n = 82) were 0.86 and 0.88, respectively. These values were indicative of good test-retest reliability.

Validity
Criterion (concurrent) validity Our analysis of the concurrent validity of MDASI-MPM items showed that the MDASI-MPM subscales were correlated with the LCSS-Meso scores (P < 0.001 for all comparisons) (Table 3). At every assessment, the correlations between the 2 subscales of the MDASI-MPM and the LCSS-Meso score were higher than 0.70. We found no notable differences in the correlations if only data from the 95 patients who completed both the MDASI-MPM and LCSS-Meso at the safety follow-up were used to calculate correlations at each assessment time point.
Known-group (construct) validity Known-group validity comparisons were made for the MDASI-MPM subscales relative to ECOG PS scores at 6 assessment time points. From baseline to Cycle 3 Day 1, the MDASI-MPM discriminated between patients with good versus poor performance status: patients with good ECOG PS had significantly lower scores for both subscales than did patients with poor ECOG PS (all P < 0.05) (Table 4). Similar results were seen for MPM-specific symptoms (all P < 0.05). Effect-size differences were 0.65 and higher, indicating medium-to-large effects [16,17] (data not shown).

Sensitivity (responsiveness)
We assessed whether the MDASI-MPM could detect symptom changes when performance status changed during the course of treatment. In the Phase II trial, 95 of the 117 patients with ECOG PS data also had MDASI-MPM data at the safety follow-up; of these, 53% (50/95) showed a decline in performance status. We found that increases in the 2 MDASI-MPM subscales were correlated with change in ECOG PS (Table 5). Change scores for patients whose ECOG PS worsened over time were statistically significant for both subscales. With Bonferroni correction for multiple comparisons, MDASI-MPM symptoms, such as trouble with balance or falling, eye problems, muscle weakness, numbness, and dry mouth, were also significantly correlated with change in ECOG PS. These differences were clinically meaningful, as reflected by effect sizes of one-half SD and higher.

Estimation of meaningful change for the MDASI-MPM
Table 6 shows estimates of meaningful change in the MDASI-MPM subscales using the QOL item from the LCSS-Meso as the anchor. The meaningful-change estimate for improvement corresponded to an approximately 1-point (on a 0-10 scale) improvement in the MDASI-MPM subscales. Table 6 also shows that our distribution-based estimates of meaningful change approached 1 point, based on tabulated values using one-half SD, one-third SD, and the standard error of measurement for the MDASI-MPM severity and interference subscales at baseline and at Cycle 2 Day 1.
Both of the above approaches are considered group-level because they examine scores calculated for each relevant group. To illustrate how individual patients' symptom severity varied from baseline to the safety follow-up, we present a waterfall plot (Fig. 2).

Discussion
In this study, we tested a provisional MPM-specific version of the MD Anderson Symptom Inventory adapted to encompass major symptoms reported by patients with MPM over the duration of treatment. We adopted all of the MDASI-MPM items for psychometric testing on the basis of qualitative interviewing results that established the importance of each item [2]. This psychometric testing is a necessary step in the validation dossier of the MDASI-MPM. This new MDASI version was administered longitudinally to more than 200 patients in a large Phase II trial, with the analysis being blinded to treatment arm, so as to collect data on its psychometric properties. A limitation of the study is that we were unable to determine if the MDASI-MPM is sensitive to drug-related treatment benefits because we did not have the treatment arm in the analytic dataset.
The results provide psychometric support for the use of the MDASI-MPM in clinical trials in which symptom change is a candidate endpoint. The instrument's severity and interference subscales exhibited high test-retest reliability and acceptable internal consistency reliability. As expected, QOL ratings from the LCSS-Meso were highly correlated with ratings from the MDASI-MPM severity and interference subscales. The 2 MDASI-MPM subscales, symptom severity and interference with functioning, were sensitive to changes in performance status (related to disease), as evidenced by significant correlations between MDASI-MPM ratings and ECOG PS over time. The 2 MDASI-MPM subscales were also correlated with patient global QOL ratings. By rank ordering the severity of symptom items across various cancer types, researchers can identify most of the symptoms that are consistently burdensome for patients with cancer. For example, the 7 most severe symptoms reported by our study participants (fatigue, shortness of breath, pain, distress, muscle weakness, feeling of malaise, and lack of appetite) include both core and disease-specific MDASI items that are also used in MDASI modules for other disease sites [19][20][21][22]. One advantage of adapting the MDASI for MPM is that having symptom data derived from the core items of the MDASI allows for comparison of common symptoms across disease sites. Another advantage of the MDASI-MPM is its use of a numeric rating scale, which has been shown to offer distinct advantages in measuring symptoms, especially pain [23]. The MDASI-MPM was developed according to the general principles expressed in the FDA's guidance on the use of PROs in labeling claims [4], including qualitative generation of items.
While none of the HRQOL measures validated for use in patients with MPM were developed with qualitative input from patients with MPM, a recent paper [5] evaluated the content validity of the LCSS-Meso in qualitative interviews with patients with MPM, although an expert panel evaluation of the relevance of these symptoms was not done. This study  19 and 20, and % severe 8 and 7 respectively) at baseline, making it an important symptom to assess and monitor. In addition, the MDASI-MPM includes symptoms of trouble with balance or falling and muscle weakness, identified by patients in qualitative interviews and recommended by an expert panel for relevance. The current study showed both symptoms to be very sensitive to worsening performance status in patients with MPM. The MDASI-MPM presents a more comprehensive picture of the symptom burden of MPM than the HRQOL measure LCSS-Meso. The general principles expressed in the FDA's guidance on the use of PROs in labeling claims [4] also recommend psychometric evaluation of the items and subscales in an early-phase study [2]. One of the criteria set forth in the FDA guidance is that a PRO instrument must be able to detect change over time. In particular, the regulatory agency is interested in whether changes in the scores are related to changes in a patient's clinical status. We have shown here that the MDASI-MPM is sensitive to changes in performance status (related to disease) and to patients' QOL ratings.
The MDASI-MPM takes less than 5 min to complete and can be easily adapted for clinical settings. This conciseness makes the MDASI-MPM well suited for frequent administration, which provides rich information about the trajectory of symptoms across the course of treatment. Such longitudinal information can be especially informative for stakeholders such as patients, clinicians, and regulators making decisions about evaluating new cancer therapies [24].
The symptom data from this large Phase II trial also indicate that some symptoms, both core MDASI and MPM-specific items, were relatively infrequently endorsed in the psychometric evaluation. Because these symptoms were also less severe, they could be excluded when developing a composite score as a potential outcome for future clinical trials.

Conclusion
This study provides psychometric evidence for the use of the MDASI-MPM in tracking changes in MPM symptoms during treatment. The instrument's severity and interference subscales exhibited high test-retest reliability and acceptable internal consistency reliability. The MDASI-MPM subscales were sensitive to changes in performance status (related to disease), as evidenced by significant correlations between MDASI-MPM ratings and ECOG PS over time and between MDASI-MPM ratings and QOL ratings over time. Our examination of the minimally important differences for the MDASI-MPM subscales should be useful as a guide for those designing clinical trials in which symptom change is a potential endpoint.