  • Research
  • Open Access

The responsiveness of the PROMIS instruments and the qDASH in an upper extremity population

Journal of Patient-Reported Outcomes 2017 1:12

https://doi.org/10.1186/s41687-017-0019-0

  • Received: 10 April 2017
  • Accepted: 13 November 2017
  • Published:

Abstract

Background

This study evaluated the responsiveness of several PROMIS patient-reported outcome measures in patients with hand and upper extremity disorders and provided comparisons with the qDASH instrument.

Methods

The PROMIS Upper Extremity computer adaptive test (UE CAT) v1.2, the PROMIS Physical Function (PF) CAT v1.2, the PROMIS Pain Interference (PI) CAT v1.1 and the qDASH were administered to patients presenting to an orthopaedic hand clinic during the years 2014–2016, along with anchor questions. The responsiveness of these instruments was assessed using anchor-based methods. Changes in functional outcomes were evaluated by paired-sample t-test, effect size, and standardized response mean.

Results

There were a total of 255 patients (131 females and 124 males) with an average age of 50.75 years (SD = 15.84) included in our study. Based on the change and no change scores, there were three instances (PI at 3 months, PI >3 months, and qDASH >3 months follow-ups) where scores differed between those experiencing clinically meaningful change versus no clinically meaningful change. Effect sizes for the responsiveness of all instruments were large and ranged from 0.80–1.48. All four instruments demonstrated high responsiveness, with a standardized response mean ranging from 1.05 to 1.63.

Conclusion

The PROMIS UE CAT, PF CAT, PI CAT, and qDASH are responsive to patient-reported functional change in the hand and upper extremity patient population.

Keywords

  • Responsiveness
  • Patient-reported outcomes
  • PROMIS
  • qDASH
  • Physical function
  • Pain
  • Orthopaedics

Background

There has been an important shift toward the use and development of quality patient-reported outcome (PRO) instruments that minimize responder burden and exhibit sufficient reliability, validity, and clinical relevance. [1] These tools can assist in the accurate measurement of clinical outcomes, which are fundamental for rigorous clinical research as well as in improving the quality of care offered to patients. In order for PRO instruments to have these desired research and clinical benefits, validation studies are critical. Fitting this new emphasis, the Journal of Patient-Reported Outcomes has included rigorous studies on the development and evaluation of PROs in the aims and scope of the journal. [2] Determining whether a PRO instrument is responsive—i.e. able to detect a change in a patient’s reported health status or function—is an important pre-requisite for using such instruments to assess treatment effect.

The Patient-Reported Outcomes Measurement Information System (PROMIS) health measure improvement initiative was funded by the National Institutes of Health with the purpose of improving the quality of PROs. The development took a systematic approach, drawing from instruments already in use. Existing items were categorized, reviewed, and revised, creating a large pool of items that were then validated with item response theory to allow for computerized adaptive testing, making the PROMIS instruments an important contribution to clinical and research practice while minimizing respondent burden. [3–5] The PROMIS Physical Function Computer Adaptive Test (PF CAT) and PROMIS Upper Extremity (UE) CAT instruments can be utilized to measure patients’ self-reported upper extremity health status, and have several advantages over other metrics. [6] The PROMIS UE and PF CAT have both demonstrated favorable performance characteristics and correlate well with the shortened version of the Disabilities of the Arm, Shoulder, and Hand (qDASH) in an orthopaedic upper extremity patient population. [7, 8] The responsiveness of these PROMIS instruments has not yet been evaluated in this same patient population.

Assessing responsiveness requires longitudinal data with repeated measures, where the same individual is assessed with the same instrument on at least two occasions. [9] Responsiveness can be assessed with either internal or external methods. Internal analysis of responsiveness evaluates the level of change based on the size of the differences between scores, and how much scores vary over time. [10] External responsiveness methods use an external anchor to relate the level of change to some other meaningful report of patient change, either a clinical gold-standard assessment or the patient’s own report of change. [11, 12] Detecting change is particularly important for PRO instruments if they are to be used to guide decisions in clinical practice.

The purpose of this study, therefore, is to evaluate the responsiveness of three PROMIS patient-reported outcome measures in patients with hand and upper extremity (non-shoulder) disorders and provide comparisons with the qDASH legacy instrument.

Methods

Patient sample

Institutional Review Board approval was obtained prior to the start of this study and informed consent was obtained from all participants as they sought medical care for orthopaedic conditions. The sample consisted of 255 new patients presenting to an academic upper-extremity (non-shoulder) clinic between the years of 2014 and 2016. All patients were 18 years or older and sought treatment for upper extremity musculoskeletal conditions. At the time of their clinic visits and prior to seeing a physician, patients were administered anchor questions and PROs electronically on a handheld tablet computer. Patients were recruited consecutively and PROs were administered as part of the standard clinic treatment protocol, with 1.5% of patients refusing to participate clinic-wide.

Patients were seen for a variety of upper extremity conditions with treatments including wound and bone care, skin grafts, tendon/ligament repair, incisions, implants, bursas, reconstructions, fractures, transplants, decompression, arthroscopy, endoscopy, nerve blocks, and carpal tunnel surgery. Depending on individual patient circumstance and timing in follow-up care, different patient samples could be included in the different follow-up periods (see Table 1). Also, depending on the diagnostic condition and treatment planning, patients differed in the amount or type of treatment received during the follow-up periods. This variation in treatment and follow-up timing is typical of a standard UE orthopaedic practice. Four patient follow-up periods were examined in this study: (1) 3-month follow-up (i.e., 80 to 100 days after initial assessment); (2) >3-month follow-up (i.e., 90 days or more after initial assessment); (3) 6-month follow-up (i.e., 170 to 190 days after initial assessment); and (4) >6-month follow-up (i.e., 180 days or more after initial assessment). Three and six months are common time periods for follow-up in orthopaedic practice. [13–20] These time-points were included in this analysis to correspond with prior literature and clinical practice.
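
The four follow-up windows described above can be sketched as a small classifier. This is an illustrative sketch only, not the authors' code: the function name `follow_up_periods` and the period labels are hypothetical, and treating the window boundaries as inclusive is an assumption. It makes explicit that the windows overlap, so a single visit can count toward more than one period.

```python
def follow_up_periods(days_since_baseline: int) -> list:
    """Return every follow-up period that a visit falls into.

    Window definitions follow the text; inclusive boundaries are an assumption.
    """
    periods = []
    if 80 <= days_since_baseline <= 100:
        periods.append("3-month")
    if days_since_baseline >= 90:
        periods.append(">3-month")
    if 170 <= days_since_baseline <= 190:
        periods.append("6-month")
    if days_since_baseline >= 180:
        periods.append(">6-month")
    return periods


# A visit 95 days after baseline belongs to both the 3-month and >3-month windows.
print(follow_up_periods(95))
```

Because the ">3-month" and ">6-month" windows are open-ended, a visit at day 185 would fall into three of the four periods under these definitions.
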
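Table 1

Demographics of patients

| Patient characteristics | n | Percent | Mean (SD) | Range |
| --- | --- | --- | --- | --- |
| Age (years) | | | 50.75 (15.84) | 18–90 |
| Gender | | | | |
|   Male | 124 | 48.6 | | |
|   Female | 131 | 51.4 | | |
| Race | | | | |
|   White or Caucasian | 221 | 86.7 | | |
|   Asian | 4 | 1.6 | | |
|   American Indian and Alaska Native | 3 | 1.2 | | |
|   Black or African American | 6 | 2.4 | | |
|   Other | 15 | 5.9 | | |
|   Missing | 6 | 2.4 | | |
| Ethnicity | | | | |
|   Hispanic | 18 | 7.1 | | |
|   Non-Hispanic | 232 | 91.0 | | |
|   Missing | 5 | 1.9 | | |
| Tobacco User | | | | |
|   Yes | 25 | 9.8 | | |
|   No | 211 | 82.7 | | |
|   Missing | 19 | 7.5 | | |
| Procedure Type | | | | |
|   Removal of implant | 4 | 1.6 | | |
|   Excision, repair, surgery on the Humerus | 7 | 2.8 | | |
|   Excision, repair, surgery on the wrist or forearm | 17 | 6.7 | | |
|   Excision, repair, surgery on the hands and fingers | 43 | 16.8 | | |
|   Amputation procedures on the hand | 1 | 0.4 | | |
|   Neuroplasty, neurorrhaphy, arthroscopy, and misc. procedures | 133 | 52.2 | | |
|   Missing | 50 | 19.5 | | |
| Insurance Provider | | | | |
|   Industrial/Workers Compensation | 23 | 9.0 | | |
|   Medicaid | 1 | 0.4 | | |
|   Medicare | 49 | 19.2 | | |
|   No Fault Auto Insurance | 3 | 1.2 | | |
|   Private Insurance | 168 | 66.2 | | |
|   Self-Pay | 6 | 2.4 | | |
|   Tricare | 3 | 1.2 | | |
| Employment Status | | | | |
|   Disabled | 14 | 5.5 | | |
|   Full Time | 121 | 47.5 | | |
|   Part Time | 13 | 5.1 | | |
|   Not employed | 28 | 11.0 | | |
|   Retired | 36 | 14.1 | | |
|   Self Employed | 13 | 5.1 | | |
|   Student | 9 | 3.5 | | |
|   Missing | 21 | 8.2 | | |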

Patient-reported outcome measures

Three PROMIS instruments were administered to the patients: the PROMIS UE CAT v1.2, the PROMIS PF CAT v1.2, and the PROMIS Pain Interference (PI) CAT v1.1. The PROMIS PF CAT v1.2 contains both upper extremity and lower extremity items and draws from a 121-item test bank. The PROMIS UE CAT v1.2 has a 16-item test bank and the PROMIS PI CAT v1.1 has a 40-item test bank. The qDASH was also administered, which is an 11-item, validated, shortened version of the 30-item Disabilities of the Arm, Shoulder, and Hand (DASH) instrument. [21] The PROMIS instruments were made available through the Assessment Center, a secure web-based portal established by PROMIS developers. [22] Each of the four instruments was administered at baseline (i.e., either within seven days prior to the clinic visit of a new upper extremity condition or on the day of the first clinic visit) and at each follow-up visit patients attended.

All PROMIS instruments were calibrated in the general population with a mean of 50 and a standard deviation of 10 on the T-score scale, with patient scores primarily clustering between 20 and 80 points. [23] Higher PROMIS PF or UE scores indicate better function, whereas higher PROMIS PI scores indicate greater pain interference. The qDASH scores ranged from 0 to 100, with higher scores representing lower functioning levels.
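
As a reminder of what the T-score metric implies, the standard linear T-score transformation can be written as below. The symbols $z$, $\mu_{\text{ref}}$, and $\sigma_{\text{ref}}$ are notation introduced here for illustration and do not appear in the original text; $\theta$ stands for the latent trait estimate on the reference-population scale.

```latex
T = 50 + 10\,z, \qquad z = \frac{\theta - \mu_{\text{ref}}}{\sigma_{\text{ref}}}
```

Read in reverse, a T-score of 32.48 (the baseline PROMIS UE mean in Table 2) corresponds to $z = (32.48 - 50)/10 \approx -1.75$, i.e. about 1.75 standard deviations below the reference-population mean.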

Anchor questions

For physical function, patient responses were anchored by the question: ‘Compared to your FIRST EVALUATION at the University Orthopaedic Center: how would you describe your physical function now?’ (much worse, worse, slightly worse, no change, slightly improved, improved, much improved). The idea of anchoring a change score to some other measure of patient outcome is to provide a reference point. When that reference point comes from patient reports of noticeable improvement or decline, it may be considered a meaningful level of change. [24] Patients reporting meaningful change (much worse, worse, improved, much improved) were included in the responsiveness analysis to detect the ability of the PROs to measure meaningful levels of change. [25] When there is symmetry in the data characteristics, the improved and deteriorated change groups can be considered together, creating a distinction between those experiencing change versus those with stable symptomology. [26]

For the PI, the anchor question queried pain (i.e., Compared to your FIRST EVALUATION at the University Orthopaedic Center: how would you describe your episodes of PAIN now?) rather than physical function, and patients reporting pain which was worse, much worse, improved, or much improved since their first clinic visit were included in the responsiveness analyses.
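
The anchor-based grouping described above can be sketched as follows. This is a hypothetical illustration, not the authors' code: the function name `anchor_group` and the group labels are invented here. The text does not state how "slightly worse" and "slightly improved" responses were handled, so keeping them in a separate group is an assumption.

```python
# Response options that the text treats as meaningful change.
MEANINGFUL_CHANGE = {"much worse", "worse", "improved", "much improved"}
# Assumption: "slightly" responses are kept apart; the text does not specify this.
SLIGHT_CHANGE = {"slightly worse", "slightly improved"}


def anchor_group(response: str) -> str:
    """Map a 7-point anchor response to an analysis group."""
    answer = response.strip().lower()
    if answer in MEANINGFUL_CHANGE:
        return "meaningful change"
    if answer == "no change":
        return "no change"
    if answer in SLIGHT_CHANGE:
        return "slight change"
    raise ValueError(f"Unrecognized anchor response: {response!r}")


print(anchor_group("Much improved"))
```

Pooling improved and deteriorated responses into a single "meaningful change" group mirrors the symmetry argument in the text; a directional analysis would keep them apart.
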

Statistical analysis

Patient demographics were examined and changes in their functional and pain outcomes were evaluated at four time points. Baseline scores were compared to the three-month follow-up scores (90 days plus or minus 10 days), six-month follow-up scores (180 days plus or minus 10 days), 90 days and beyond follow-up scores, and 180 days and beyond follow-up scores on all four patient-reported measures.

Change in the PRO metrics was calculated as the absolute value of the difference between the baseline score and the follow-up score for each patient. A paired-sample, two-sided t-test was used to test the hypothesis that there was no difference in the PRO measures between time points on an individual patient level [10], with the significance level set at α = 0.05. An ANOVA was run to test the hypothesis that patients did not differ across levels of change.

A standardized measure of effect size (ES) was calculated using Cohen’s d. Cohen’s d computes the difference in score between the baseline and the follow-up and then divides this difference by the baseline score standard deviation. This method takes into consideration the variability in scores, a step beyond the mean differences considered in the paired sample t-test [10]. In interpreting Cohen’s d, a small, medium, and large ES can be considered as d = 0.20, 0.50, and 0.80 respectively.
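
The effect size calculation described above can be sketched in a few lines. This is a minimal illustration with made-up scores, not the study data; the function name `effect_size` is hypothetical, and using the sample standard deviation (n − 1 denominator) is an assumption.

```python
from statistics import mean, stdev


def effect_size(baseline, follow_up):
    """Mean change (follow-up minus baseline) divided by the SD of baseline scores."""
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow_up must be paired")
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)  # sample SD is an assumption


# Made-up T-scores for three patients, each improving by 8 points.
baseline = [30, 40, 50]
follow_up = [38, 48, 58]
print(effect_size(baseline, follow_up))
```

For instruments where higher scores mean worse status (PROMIS PI, qDASH), improvement yields a negative value under this sign convention, so the magnitude is what is compared against the 0.20, 0.50, and 0.80 thresholds.
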

The standardized response mean (SRM) is another important indicator of ES, similar to the paired t-test, but removing dependence on sample size from the equation. [10] This is computed as the mean difference between baseline and follow-up PRO scores divided by the standard deviation of difference scores, reflecting individual changes in scores. Although there is not perfect consensus, recommended guidelines for interpreting SRM values are similar to interpretation of Cohen’s d. [10] All analyses were performed using either SPSS 23.0 (IBM SPSS Statistics for Windows, Armonk, NY: IBM Corp.) [27] or R 3.30 (R Development Core Team, Vienna, AT: R Foundation for Statistical Computing) [28].
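
The relationship between the SRM and the paired t-test can be made concrete with a short sketch. The scores below are made up, and the function names are hypothetical; the point is only that the paired t statistic equals the SRM multiplied by the square root of the sample size, which is why the SRM does not grow with n.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """Mean of the paired differences divided by the SD of those differences."""
    diffs = [f - b for b, f in zip(baseline, follow_up)]
    return mean(diffs) / stdev(diffs)


def paired_t_statistic(baseline, follow_up):
    """Paired t statistic, written as SRM * sqrt(n) to show the link."""
    return standardized_response_mean(baseline, follow_up) * sqrt(len(baseline))


# Made-up scores for four patients.
baseline = [30, 40, 50, 60]
follow_up = [36, 50, 58, 72]
print(standardized_response_mean(baseline, follow_up))
print(paired_t_statistic(baseline, follow_up))
```

Unlike the effect size, which scales by the spread of baseline scores, the SRM scales by the spread of the individual changes, so the two can differ noticeably when patients change by similar amounts.
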

Results

This study included a total of 131 females and 124 males with ages ranging from 18 years to 90 years (mean age = 50.75, SD = 15.84). For demographic information including gender, race, ethnicity, tobacco use, procedure and insurance type, see Table 1.

Mean, SD, range, and median along with mean differences of scores of the PROMIS UE, PF, PI and qDASH are presented in Table 2. Mean change scores for the PROMIS PI ranged from 4.81–10.68 whereas mean no-change scores ranged from 4.32–6.05. The PROMIS PI at 3-month and >3-month follow-up and the qDASH at >3-month follow-up were the only measures and time-points with confidence intervals (CIs) showing a substantial difference between change groups (see Table 3). The PROMIS PF mean change scores ranged from 8.36–8.91 whereas mean no-change scores ranged from 5.92–9.00. The UE had mean change scores ranging from 7.57–9.51 and mean no-change scores ranging from 6.67–8.21. Lastly, the qDASH showed mean change scores between 18.18 and 24.22 and mean no-change scores between 17.21 and 24.40.
Table 2

Descriptive statistics of PROMIS instruments and qDASH of patients

| Instrument | n | Mean (SD) | Range | Median |
| --- | --- | --- | --- | --- |
| Baseline | | | | |
|   PROMIS PF | 254 | 45.45 (9.53) | 23.47–70.25 | 43.18 |
|   PROMIS PI | 254 | 60.85 (7.34) | 38.67–80.96 | 61.52 |
|   PROMIS UE | 254 | 32.48 (9.28) | 14.66–56.39 | 32.13 |
|   qDASH | 255 | 50.09 (22.53) | 4.54–97.73 | 50.00 |
| 3-month follow-up | | | | |
|   PROMIS PF | 31 | 50.61 (10.73) | 33.10–50.61 | 51.45 |
|   PROMIS PI | 31 | 52.77 (8.62) | 38.67–71.60 | 52.57 |
|   PROMIS UE | 28 | 36.89 (10.14) | 18.34–56.39 | 36.53 |
|   qDASH | 29 | 33.39 (23.74) | 2.27–79.54 | 27.27 |
| >3-month follow-up | | | | |
|   PROMIS PF | 151 | 46.34 (8.76) | 24.07–73.28 | 47.41 |
|   PROMIS PI | 177 | 56.20 (8.47) | 38.67–80.07 | 55.98 |
|   PROMIS UE | 148 | 34.95 (8.30) | 18.34–56.39 | 35.09 |
|   qDASH | 149 | 39.73 (22.76) | 2.27–97.72 | 36.36 |
| 6-month follow-up | | | | |
|   PROMIS PF | 18 | 47.70 (5.59) | 32.60–56.06 | 47.83 |
|   PROMIS PI | 20 | 55.94 (3.46) | 50.12–62.64 | 55.10 |
|   PROMIS UE | 18 | 35.62 (8.13) | 18.35–56.26 | 35.85 |
|   qDASH | 18 | 34.84 (16.16) | 11.36–77.27 | 35.23 |
| >6-month follow-up | | | | |
|   PROMIS PF | 53 | 44.51 (10.23) | 2.84–70.25 | 44.72 |
|   PROMIS PI | 62 | 56.77 (8.87) | 38.67–79.99 | 55.98 |
|   PROMIS UE | 52 | 33.77 (8.66) | 17.74–56.39 | 34.54 |
|   qDASH | 55 | 41.79 (25.67) | 6.82–93.18 | 36.36 |

Table 3

Mean Score Changes for PROMIS Instruments and qDASH

| Instrument | n | No Change (SD) | n | Change (SD) | Mean Difference [95% CI]ᵃ |
| --- | --- | --- | --- | --- | --- |
| 3-month follow-up | | | | | |
|   PROMIS PF | 29 | 9.00 (8.18) | 31 | 8.64 (8.20) | 0.36 [−3.88, 4.59] |
|   PROMIS PI | 25 | 5.95 (7.51) | 31 | 10.68 (6.56) | −1.47 [−8.50, −0.96] |
|   PROMIS UE | 28 | 8.04 (6.19) | 28 | 9.51 (7.54) | 0.18 [−5.17, 2.24] |
|   qDASH | 30 | 24.40 (20.53) | 29 | 24.22 (16.81) | −4.72 [−9.62, 9.98] |
| >3-month follow-up | | | | | |
|   PROMIS PF | 177 | 7.14 (6.85) | 151 | 8.53 (7.31) | −1.39 [−2.93, 0.15] |
|   PROMIS PI | 145 | 6.05 (5.78) | 177 | 7.48 (6.86) | −1.44 [−2.82, −0.05] |
|   PROMIS UE | 173 | 7.44 (6.46) | 148 | 8.54 (6.86) | −1.10 [−2.56, 0.36] |
|   qDASH | 175 | 18.23 (17.10) | 149 | 22.34 (17.75) | −4.10 [−7.93, −0.27] |
| 6-month follow-up | | | | | |
|   PROMIS PF | 11 | 5.92 (6.23) | 18 | 8.91 (7.06) | −2.99 [−8.30, 2.33] |
|   PROMIS PI | 9 | 4.32 (3.70) | 20 | 4.81 (4.16) | −0.48 [−3.80, 2.83] |
|   PROMIS UE | 11 | 8.21 (5.46) | 18 | 7.57 (5.33) | 0.64 [−3.58, 4.87] |
|   qDASH | 11 | 17.77 (14.40) | 18 | 18.18 (13.34) | −0.41 [−11.60, 10.77] |
| >6-month follow-up | | | | | |
|   PROMIS PF | 78 | 6.73 (5.65) | 53 | 8.36 (6.67) | −1.62 [−3.78, 0.53] |
|   PROMIS PI | 69 | 5.97 (5.35) | 62 | 6.71 (5.85) | −0.74 [−2.69, 1.20] |
|   PROMIS UE | 76 | 6.67 (6.50) | 52 | 8.37 (5.84) | −1.70 [−3.92, 0.52] |
|   qDASH | 81 | 17.21 (17.09) | 55 | 21.86 (17.34) | −4.65 [−10.60, 1.29] |

ᵃ This is the mean difference between the no change score and the change score, with its associated 95% confidence interval
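
A mean difference and its 95% CI of the kind reported in Table 3 can be approximated from the group summary statistics alone. The sketch below is an assumption about method, not the authors' procedure: the text does not state how the intervals were computed, and this version uses an unpooled standard error with a normal (large-sample) critical value. The function name `mean_difference_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_difference_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b, level=0.95):
    """Difference in means (a minus b) with a normal-approximation CI."""
    diff = mean_a - mean_b
    # Unpooled standard error of the difference between two independent means.
    se = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * se, diff + z * se


# PROMIS PF at >3-month follow-up, summary values taken from Table 3:
# no change 7.14 (6.85), n = 177; change 8.53 (7.31), n = 151.
diff, low, high = mean_difference_ci(7.14, 6.85, 177, 8.53, 7.31, 151)
print(round(diff, 2), round(low, 2), round(high, 2))
```

For this row the sketch gives approximately −1.39 [−2.93, 0.15], in line with the tabulated values. With the small 6-month groups, a t-based critical value would give wider intervals than this normal approximation.
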

Only 20% of the patient sample had baseline PROMIS PF scores at the average 50th percentile T-score of 50, 5% had PROMIS UE scores over 50, and 5% had an average PROMIS PI pain score of 50, indicating this group had low levels of function and high levels of pain at baseline.

Paired t-test

At the 3-month, >3-month, and 6-month follow-ups, changes from baseline scores were significant for all instruments (p < 0.05). However, score changes for the >6-month time period varied in significance: the UE CAT was the only instrument that did not show a significant change in scores (p = 0.253), whereas the PF CAT, PI CAT, and qDASH showed significant changes (p < 0.05; see Table 4). For all instruments, baseline scores did not differ significantly between patients with missing and non-missing follow-up visit scores at any time point (p > 0.05) (results available upon request).
Table 4

Responsiveness of PROMIS instruments and qDASH of patients from baseline

| Follow-up Period | Instrument | n | SRM [95% CI] | ES [95% CI] | Paired t-test p-value |
| --- | --- | --- | --- | --- | --- |
| 3-month follow-up | PROMIS PF | 31 | 1.05 [0.51, 1.57] | 0.84 [0.31, 1.35] | < 0.001 |
| | PROMIS PI | 31 | 1.63 [1.04, 2.18] | 1.48 [0.90, 2.02] | < 0.001 |
| | PROMIS UE | 28 | 1.26 [0.67, 1.81] | 1.05 [0.48, 1.59] | 0.006 |
| | qDASH | 29 | 1.44 [0.84, 2.00] | 1.12 [0.28, 1.66] | < 0.001 |
| >3-month follow-up | PROMIS PF | 150 | 1.16 [0.91, 1.40] | 0.92 [0.68, 1.16] | < 0.001 |
| | PROMIS PI | 176 | 1.09 [0.86, 1.31] | 0.99 [0.77, 1.21] | < 0.001 |
| | PROMIS UE | 148 | 1.24 [0.99, 1.49] | 0.88 [0.64, 1.12] | 0.001 |
| | qDASH | 149 | 1.26 [1.01, 1.51] | 0.97 [0.73, 1.21] | < 0.001 |
| 6-month follow-up | PROMIS PF | 18 | 1.26 [0.52, 1.94] | 0.83 [0.13, 1.49] | < 0.001 |
| | PROMIS PI | 20 | 1.16 [0.47, 1.80] | 0.79 [0.13, 1.42] | < 0.001 |
| | PROMIS UE | 18 | 1.42 [0.66, 2.12] | 0.85 [0.15, 1.51] | 0.001 |
| | qDASH | 18 | 1.36 [0.61, 2.05] | 0.80 [0.10, 1.46] | < 0.001 |
| >6-month follow-up | PROMIS PF | 52 | 1.25 [0.82, 1.66] | 0.87 [0.46, 1.27] | 0.033 |
| | PROMIS PI | 61 | 1.15 [0.76, 1.53] | 0.96 [0.58, 1.33] | < 0.001 |
| | PROMIS UE | 52 | 1.43 [0.99, 1.85] | 0.85 [0.44, 1.24] | 0.253 |
| | qDASH | 55 | 1.26 [0.84, 1.66] | 0.93 [0.53, 1.32] | 0.006 |

Effect size

All four instruments showed a high degree of responsiveness across all four follow-up periods. For the 3-month follow-up group, all instruments had high responsiveness ranging from 0.84–1.48. The instrument that was the most responsive for the 3-month follow-up was the PI CAT (1.48), whereas the PF CAT was the least responsive (0.84).

The 6-month follow-up also showed high responsiveness ranging from 0.79–0.85. The PI CAT was the least responsive at the 6-month follow-up (0.79) whereas the UE CAT was the most responsive (0.85). When looking at the >3-month follow-up time period of 90 days or more, responsiveness was still high (0.92–0.99). The least responsive measurement for this time period was the UE CAT (0.92) while the PI CAT showed the highest responsiveness (0.99). For the >6-month time period of 180 days or more, all instruments still showed high responsiveness but the PI CAT was the most responsive (0.97) whereas the UE CAT was the least (0.85). Overall, the PI CAT was consistently the most responsive to change when looking at ES (see Table 4). The 95% CIs of the effect sizes demonstrate a meaningful difference in measure responsiveness at each follow-up time-point for each instrument, though the CI range for all measures dipped to include potential for a small effect in the 6-month follow-up period.

Standardized response mean

All instruments had high responsiveness as measured by the SRM (1.05–1.63). The 95% CIs around the SRM were all medium-large, ranging from 0.51–2.18, and reflect the overall larger size of effect as measured by the SRM compared to the ES on every measure at every time-point. In the 3-month follow-up group, the most responsive instrument was the PI CAT (1.63) while the PF CAT was the least responsive instrument (1.05) among the four. The 6-month follow-up showed that the PROMIS UE was the most responsive (1.42) whereas the PI CAT was the least (1.16). In the >3-month follow-up time period of 90 days or more, the PI CAT remained the least responsive instrument (1.09) whereas the qDASH was the most responsive (1.26). However, the UE CAT had the highest SRM (1.43) while the PI CAT had the lowest (1.15) for the >6-month follow-up time period of 180 days or more. In general, the UE CAT was the most responsive to change when applying the SRM (see Table 4).

Discussion

The main finding of this study is that the PROMIS Upper Extremity CAT, Physical Function CAT, and Pain Interference CAT are responsive to patient reported functional change in a hand and upper extremity (non-shoulder) orthopaedic population. In addition, the magnitude of the responsiveness for each instrument was large. The three statistical methods (SRM, ES, and paired t-test) that were utilized provided similar results in most instances. However, the external validity of assessing change was poor in the PROMIS PF and UE as well as some follow-up time points of the PROMIS PI and qDASH when mean scores were compared in the subsamples with no-change in condition versus meaningful change.

We tested a traditional time-frame for three-month and six-month follow-up capturing a window of 10 days on either side of the follow-up cut-off. Strict cut-off limits restrict the inclusion of patient scores for those who did not have follow-up visits that fit within the narrow time-frames. The relevance of the sampling cut-offs to the interpretation can be seen with the small sample size (18–20 participants) in the 6-month follow-up group (170–190 days). This restricted sample was the only time-point that resulted in a 95% CI around the effect size that ranged low enough to include potential for a small effect in the interpretation. In contrast, the larger sample sizes in the other follow-up periods resulted in CIs with medium/large to large effects. We also tested 90 days and beyond and 180 days and beyond as alternative time-frames to test the robustness of these cut-offs to the measure’s responsiveness. Our finding that comparable effect sizes could be seen across the differing follow-up cut-offs, with minimal exceptions, provides cross-validation for the commonly used three- and six-month follow-up cut-off points.

It is interesting to note that the time-period in which change scores were the greatest differed for different instruments. For the PROMIS PF, there was little difference between change scores at 3 and 6 month follow-up. For the PROMIS PI, pain interference change was greater at the earlier follow-up points. The PROMIS UE and qDASH similarly showed more change in function at earlier time points. These differences likely represent the greater heterogeneity in patient condition and treatment factors that occur by later measurement periods, but may also reflect the nature of improvement in upper extremity disorders. It may also reflect the low level of functioning and high levels of pain reported by this sample of upper extremity patients at baseline visits.

Prior work on the measurement characteristics of the PROMIS UE, PF, and PI CAT in a hand and upper extremity patient population has demonstrated the validity of these measures while minimizing respondent burden [8, 29–32]. Whether or not these PROMIS instruments are able to detect patient reported change in health or function, however, has remained an important albeit open question. This study demonstrates the responsiveness of these three PROMIS instruments. Understanding responsiveness to change is essential in translational research to advance clinical trials, comparative effectiveness studies and most importantly, clinicians’ knowledge in interpreting outcome measures enabling more meaningful interactions with patients.

Limitations

All patients visiting the hand and upper extremity orthopaedic clinic were included in the assessment of responsiveness, and we did not characterize our results based on individual diagnosis or treatments. Differing disease conditions and/or treatments may show different responsiveness indices, and therefore the findings of this study should be considered preliminary. Future work may include investigation of the responsiveness of the PROMIS instruments for individual conditions and treatments. The sample size for the 6-month follow-up was small and results from this time-point may not be as reliable as those with larger samples. We are continuing to collect data from patients and will conduct further study with larger samples and different time frames as data become available. Future work should be performed to analyze upper extremity conditions at varying levels of function, not just change, to see if instruments are as responsive to those with high functioning as to those with lower levels of function. It would also be useful to consider the differences by anchor score, of those reporting varying levels of improvement. The PROMIS PF has been shown to have a ceiling effect especially in relation to items that fall in the upper extremity areas of function. [29, 33] In this patient population, functioning levels were low, so the ceiling effect likely did not impact the results. Both the PROMIS PF and PROMIS UE would benefit from this additional analysis of responsiveness at the upper levels of function in future research, potentially using Rasch modeling based on the distribution of scores rather than the external anchoring.

Conclusions

The PROMIS UE CAT, PF CAT, PI CAT, and qDASH were able to effectively detect change in physical function and pain interference in an orthopaedic hand and upper extremity clinic. The responsiveness of the PROMIS instruments demonstrated by this study adds to the prior rigorous psychometric validation of instruments reported in the literature, and should assist clinicians and researchers to make informed decisions regarding instrument selection in assessing patient reported outcomes in the upper extremity [34].

Abbreviations

ES: 

Effect size

PF CAT: 

PROMIS Physical Function computerized adaptive test

PI CAT: 

PROMIS Pain Interference computerized adaptive test

PRO: 

Patient-reported outcome

PROMIS: 

Patient Reported Outcomes Measurement Information System

qDASH: 

Disabilities of the Arm, Shoulder, and Hand; shortened version

SRM: 

Standardized response mean

UE CAT: 

PROMIS Upper Extremity computerized adaptive test

Declarations

Funding

This study was supported by the University of Utah Department of Orthopaedics Quality Outcomes Research and Assessment (http://QualityOutcomesResearch.com) and the National Institute of Arthritis and Musculoskeletal and Skin Diseases of the National Institutes of Health under award number U01AR067138. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Authors’ contributions

MH: study oversight, study design, literature review, data acquisition, data processing, data analysis, data interpretation, manuscript drafting, manuscript revision, final approval, funding support. CS: study design, manuscript revision, final approval, funding support. TG: study design, manuscript revision, final approval, funding support. MWV: literature review, data analysis, manuscript drafting, manuscript revision, final approval. JB: literature review, data analysis, manuscript drafting, manuscript revision, final approval. YG: data processing, data analysis, manuscript revision, final approval. AW: data acquisition, manuscript revision, final approval. DH: data acquisition, manuscript revision, final approval. AT: data acquisition, manuscript revision, final approval.

Ethics approval and consent to participate

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. Institutional Review Board approval was obtained from the University of Utah, approval number 94548.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors’ Affiliations

(1)
Department of Orthopaedic Surgery Operations, University of Utah, School of Medicine, 590 Wakara Way, Salt Lake City, UT 84108, USA
(2)
Division of Public Health, University of Utah, School of Medicine, 375 Chipeta Way Ste. A, Salt Lake City, UT 84108, USA
(3)
Population Health Foundation, University of Utah, 295 Chipeta Way, Williams Building, Room 1C448, Salt Lake City, UT 84132, USA

References

  1. Deutsch, L., Smith, L., Gage, B., Kelleher, C., & Garfinkel, D. (2012). Patient-reported outcomes in performance measurement: Commissioned paper on PRO-based performance measures for healthcare accountable entities. Washington, DC: National Quality Forum.Google Scholar
  2. Revicki, D. F., F. (2016). Editorial: Journal of patient-reported outcomes - aims and scope. Journal of Patient Reported Outcomes.Google Scholar
  3. DeWalt, D. A., Rothrock, N., Yount, S., & Stone, A. A. (2007). Evaluation of item candidates: The PROMIS qualitative item review. Med Care, 45(5 Suppl 1), S12–S21. doi:10.1097/01.mlr.0000254567.79743.e2.View ArticlePubMedPubMed CentralGoogle Scholar
  4. Cella, D., Riley, W., Stone, A., Rothrock, N., Reeve, B., Yount, S., et al. (2010). The patient-reported outcomes measurement information system (PROMIS) developed and tested its first wave of adult self-reported health outcome item banks: 2005–2008. J Clin Epidemiol, 63(11), 1179–1194.View ArticlePubMedPubMed CentralGoogle Scholar
  5. Fries, J. F., Witter, J., Rose, M., Cella, D., Khanna, D., & Morgan-DeWitt, E. (2014). Item response theory, computerized adaptive testing, and PROMIS: Assessment of physical function. J Rheumatol, 41(1), 153–158. doi:10.3899/jrheum.130813.View ArticlePubMedGoogle Scholar
  6. Fries, J., Rose, M., & Krishnan, E. (2011). The PROMIS of better outcome assessment: Responsiveness, floor and ceiling effects, and internet administration. J Rheumatol, 38(8), 1759–1764. doi:10.3899/jrheum.110402.View ArticlePubMedGoogle Scholar
  7. Hays, R. D., Spritzer, K. L., Amtmann, D., Lai, J.-S., DeWitt, E. M., Rothrock, N., et al. (2013). Upper-extremity and mobility subdomains from the patient-reported outcomes measurement information system (PROMIS) adult physical functioning item bank. Arch Phys Med Rehabil, 94(11), 2291–2296.View ArticlePubMedGoogle Scholar
  8. Döring, A.-C., Nota, S. P., Hageman, M. G., & Ring, D. C. (2014). Measurement of upper extremity disability using the patient-reported outcomes measurement information system. J Hand Surg, 39(6), 1160–1165.View ArticleGoogle Scholar
  9. Revicki, D. A., Cella, D., Hays, R. D., Sloan, J. A., Lenderking, W. R., & Aaronson, N. K. (2006). Responsiveness and minimal important differences for patient reported outcomes. Health Qual Life Outcomes, 4, 70. doi:10.1186/1477-7525-4-70. View ArticlePubMedPubMed CentralGoogle Scholar
  10. Husted, J. A., Cook, R. J., Farewell, V. T., & Gladman, D. D. (2000). Methods for assessing responsiveness: A critical review and recommendations. J Clin Epidemiol, 53(5), 459–468.View ArticlePubMedGoogle Scholar
  11. Wyrwich, K., Norquist, J., Lenderking, W., Acaster, S., & Research, I. A. C. o. I. S. f. Q. o. L. (2013). Methods for interpreting change over time in patient-reported outcome measures. Qual Life Res, 22(3), 475–483.View ArticlePubMedGoogle Scholar
  12. Revicki, D., Hays, R. D., Cella, D., & Sloan, J. (2008). Recommended methods for determining responsiveness and minimally important differences for patient-reported outcomes. J Clin Epidemiol, 61(2), 102–109. doi:10.1016/j.jclinepi.2007.03.012.View ArticlePubMedGoogle Scholar
  13. Paatelma, M., Kilpikoski, S., Simonen, R., Heinonen, A., Alen, M., & Videman, T. (2008). Orthopaedic manual therapy, McKenzie method or advice only for low back pain in working adults: A randomized controlled trial with one year follow-up. J Rehabil Med, 40(10), 858–863. https://doi.org/10.2340/16501977-0262.View ArticlePubMedGoogle Scholar
  14. Uchiyama, S., Imaeda, T., Toh, S., Kusunose, K., Sawaizumi, T., Wada, T., et al. (2007). Comparison of responsiveness of the Japanese Society for Surgery of the hand version of the carpal tunnel syndrome instrument to surgical treatment with DASH, SF-36, and physical findings. J Orthop Sci, 12(3), 249–253. https://doi.org/10.1007/s00776-007-1128-z.View ArticlePubMedPubMed CentralGoogle Scholar
  15. Carmont, M. R., Silbernagel, K. G., Nilsson-Helander, K., Mei-Dan, O., Karlsson, J., & Maffulli, N. (2013). Cross cultural adaptation of the Achilles tendon Total rupture score with reliability, validity and responsiveness evaluation. Knee Surg Sports Traumatol Arthrosc, 21(6), 1356–1360. https://doi.org/10.1007/s00167-012-2146-8.View ArticlePubMedGoogle Scholar
  16. Landauer, F., Wimmer, C., & Behensky, H. (2003). Estimating the final outcome of brace treatment for idiopathic thoracic scoliosis at 6-month follow-up. Pediatr Rehabil, 6(3–4), 201–207. https://doi.org/10.1080/13638490310001636817.View ArticlePubMedGoogle Scholar
  17. Little, D. G., & MacDonald, D. (1994). The use of the percentage change in Oswestry disability index score as an outcome measure in lumbar spinal surgery. Spine, 19(19), 2139–2143.View ArticlePubMedGoogle Scholar
  18. Cornell, C. N., Levine, D., O'Doherty, J., & Lyden, J. (1998). Unipolar versus bipolar hemiarthroplasty for the treatment of femoral neck fractures in the elderly. Clin Orthop Relat Res, (348), 67–71.Google Scholar
  19. Kotsis, S. V., & Chung, K. C. (2005). Responsiveness of the Michigan hand outcomes questionnaire and the disabilities of the arm, shoulder and hand questionnaire in carpal tunnel surgery. J Hand Surg, 30(1), 81–86. https://doi.org/10.1016/j.jhsa.2004.10.006.View ArticleGoogle Scholar
  20. MacDermid, J. C., Richards, R. S., Donner, A., Bellamy, N., & Roth, J. H. (2000). Responsiveness of the short form-36, disability of the arm, shoulder, and hand questionnaire, patient-rated wrist evaluation, and physical impairment measurements in evaluating recovery after a distal radius fracture. J Hand Surg, 25(2), 330–340. https://doi.org/10.1053/jhsu.2000.jhsu25a0330.View ArticleGoogle Scholar
  21. Beaton, D. E., Wright, J. G., & Katz, J. N. (2005). Development of the QuickDASH: Comparison of three item-reduction approaches. J Bone Joint Surg (Am Vol), 87(5), 1038–1046.Google Scholar
  22. Gershon, R. C., Rothrock, N., Hanrahan, R., Bass, M., & Cella, D. (2010). The use of PROMIS and assessment center to deliver patient-reported outcome measures in clinical research. J Appl Meas, 11(3), 304–314.PubMedPubMed CentralGoogle Scholar
  23. Rose, M., Bjorner, J. B., Gandek, B., Bruce, B., Fries, J. F., & Ware, J. E. (2014). The PROMIS physical function item Bank was calibrated to a standardized metric and shown to improve measurement efficiency. J Clin Epidemiol, 67(5), 516–526. https://doi.org/10.1016/j.jclinepi.2013.10.024.View ArticlePubMedPubMed CentralGoogle Scholar
  24. Jaeschke, R., Singer, J., & Guyatt, G. H. (1989). Measurement of health status. Ascertaining the minimal clinically important difference. Control Clin Trials, 10(4), 407–415.View ArticlePubMedGoogle Scholar
  25. Gummesson, C., Atroshi, I., & Ekdahl, C. (2003). The disabilities of the arm, shoulder and hand (DASH) outcome questionnaire: Longitudinal construct validity and measuring self-rated health change after surgery. BMC Musculoskelet Disord, 4, 11–11. https://doi.org/10.1186/1471-2474-4-11.View ArticlePubMedPubMed CentralGoogle Scholar
  26. Juniper, E. F., Guyatt, G. H., Feeny, D. H., Ferrie, P. J., Griffith, L. E., & Townsend, M. (1996). Measuring quality of life in children with asthma. Qual Life Res, 5(1), 35–46.View ArticlePubMedGoogle Scholar
  27. Corp, I. B. M. (2015). SPSS statistics for windows. Armonk, NY: IBM Corp.Google Scholar
  28. Development Core Team, R. (2010). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.Google Scholar
  29. Hung, M., Clegg, D. O., Greene, T., & Saltzman, C. L. (2011). Evaluation of the PROMIS physical function item bank in orthopaedic patients. J Orthop Res, 29(6), 947–953.View ArticlePubMedGoogle Scholar
  30. Hung, M., Hon, S. D., Franklin, J. D., Kendall, R. W., Lawrence, B. D., Neese, A., et al. (2014). Psychometric properties of the PROMIS physical function item bank in patients with spinal disorders. Spine, 39(2), 158–163.View ArticlePubMedGoogle Scholar
  31. Hung, M., Stuart, A. R., Higgins, T. F., Saltzman, C. L., & Kubiak, E. N. (2014). Computerized adaptive testing using the PROMIS physical function item Bank reduces test burden with less ceiling effects compared with the short musculoskeletal function assessment in Orthopaedic trauma patients. J Orthop Trauma, 28(8), 439–443.View ArticlePubMedGoogle Scholar
  32. Morgan, J. H., Kallen, M. A., Okike, K., Lee, O. C., & Vrahas, M. S. (2015). PROMIS physical function computer adaptive test compared with other upper extremity outcome measures in the evaluation of proximal Humerus fractures in patients older than 60 years. J Orthop Trauma, 29(6), 257–263.View ArticlePubMedGoogle Scholar
  33. Rose, M., Bjorner, J. B., Becker, J., Fries, J., & Ware, J. (2008). Evaluation of a preliminary physical function item bank supported the expected advantages of the patient-reported outcomes measurement information system (PROMIS). J Clin Epidemiol, 61(1), 17–33.View ArticlePubMedGoogle Scholar
  34. Hung, M., Voss, M. W., Bounsanga, J., Crum, A. B., & Tyser, A. R. (2016). Examination of the PROMIS upper extremity item bank. J Hand Surg, 1–5. https://doi.org/10.1016/j.jht.2016.10.008.

Copyright

© The Author(s) 2017
