Development of a patient-reported outcome measure (PROM) and change measure for use in early recovery following hip or knee replacement

Strickland, Louise H.; Murray, David W.; Pandit, Hemant G.; Jenkinson, Crispin

doi:10.1186/s41687-020-00262-1

Research
Open access
Published: 07 November 2020

Louise H. Strickland ORCID: orcid.org/0000-0003-4486-1868¹,
David W. Murray¹,
Hemant G. Pandit^1,2 &
…
Crispin Jenkinson³

Journal of Patient-Reported Outcomes volume 4, Article number: 91 (2020)

Abstract

Background

Hip and knee replacement are effective procedures for end-stage arthritis that has not responded to medical management. However, until now, there have been no validated, patient-reported tools to measure early recovery in this growing patient population. This paper reports the development and psychometric evaluation of the Oxford Arthroplasty Early Recovery Score (OARS), a 14-item patient-reported outcome measure (PROM) of health status, and the Oxford Arthroplasty Early Change Score (OACS), a 14-item measure of change during the first 6 weeks following surgery.

Patients and methods

A five-phase, best-practice, iterative approach was used. From a literature-based starting point, qualitative interviews with orthopaedic healthcare professionals were performed in a planning phase to ascertain if and how clinicians would use such a PROM and change measure. In phase one, analysis of in-depth patient interviews identified important patient-reported factors in early recovery, which were used to provide questionnaire themes. In phase two, candidate items were generated from the phase one interviews, and pilot questionnaires were developed and tested. In phase three, exploratory factor analysis with item reduction and final testing of the questionnaires was performed. Phase four involved validation testing.

Results

Qualitative interviews (n = 22) with orthopaedic healthcare professionals helped determine the views of potential users and guide structure. In phase one, factors from patient interviews (n = 30) were used to find questionnaire themes and generate items. Pilot questionnaires were developed and tested in phase two, and items were refined in the context of cognitive debrief interviews (n = 34) for potential inclusion in the final tools. Final testing of questionnaire properties with item reduction (n = 168) was carried out in phase three. Validation of the OARS and OACS was performed in phase four, in which both measures were administered to consecutive patients (n = 155) in an independent cohort and validity and reliability were assessed. Psychometric testing showed positive results in terms of internal consistency, sensitivity to change, content validity, and relevance to patients and clinicians. In addition, these measures were found to be acceptable to patients throughout early recovery, with validation across the 6-week period.

Conclusions

These brief, easy-to-use tools could be of great use in assessing recovery pathways and interventions in arthroplasty surgery.

Background

The incidence of arthritis is increasing [1]. The World Health Organisation (WHO) has identified arthritis as one of the top ten disabling conditions. As the number of people experiencing arthritis increases, so does the number of patients requiring surgical intervention. It is estimated that approximately 200,000 hip and knee replacements are performed in the United Kingdom (UK) annually [2], and 1 million in the United States (US) [3, 4]. This number is anticipated to continue to grow significantly over the next 10 years [5]. Despite increases in the frequency of these procedures, the way we measure recovery has not changed in recent years.

Optimising perioperative recovery is critical to enhance patient care, ensure timely discharge from the patient, clinician and hospital perspectives, and improve short- and long-term outcomes after surgery [6]. However, until now, there has been debate about how to measure recovery, and the measures previously used in early recovery were not patient-reported.

Prior to commencing this study, a systematic literature review was performed to evaluate the need for an early recovery PROM or change measure in this patient population [7]. The most important finding from this review was that, whilst 15 instruments were identified to assess postoperative recovery, none fulfilled all quality criteria [8] or was valid for assessing early postoperative recovery in the hip or knee arthroplasty population. Specifically, the review found that previously used measures were inappropriate for accurately evaluating the quality of recovery and lacked precision. Only seven of the 15 instruments included any orthopaedic patients in their development, and within those seven, fewer than 15% of the patients were orthopaedic. This limits the applicability of these instruments, as it is likely that the recovery factors important to patients undergoing orthopaedic surgery are significantly different from those important to patients undergoing other types of surgery. Being able to measure patient-reported outcomes following arthroplasty could be of great benefit in clinical trials involving medication, care pathways and implant selection. It could also potentially help to optimise routine care by allowing provision of appropriate, safe, timely care and interventions.

The development of a PROM and a measure of postoperative change since surgery was therefore begun, with initial qualitative work performed to facilitate concept understanding and item generation. The Food and Drug Administration (FDA) guidelines [9] provide a thorough outline by which new patient-reported outcome measures (PROMs) should be developed. Item generation comes directly from patient statements and from the patient population the tools are being designed to serve. Throughout the entire process, including item generation, selection of candidate items, wording changes and item reduction, a detailed item tracking matrix was maintained. The item tracking matrix makes it easy to identify item modifications and direct patient sources, and it provides a record of item deletions.

Methods and phases

The Oxford Arthroplasty Early Recovery Score (OARS) and the Oxford Arthroplasty Early Change Score (OACS) were developed and tested through a mixed-methods research study carried out across two stages (five phases) in strict accordance with the Food and Drug Administration (FDA) guide [9] for best practice in PROM development (Fig. 1).

Stage one: item generation and initial questionnaire development

Planning phase

The initial planning phase consisted of semi-structured interviews (n = 22) exploring orthopaedic healthcare professionals’ experience and perspective of early recovery for patients undergoing total hip arthroplasty (THA) or total knee arthroplasty (TKA). These were used to guide the structure and layout of the questionnaires.

Design

In the planning phase, semi-structured interviews were used to explore the experience and perspective of the early recovery period among healthcare providers caring for patients undergoing THA or TKA. These interviews were guided by a list of interview prompts, which facilitated further exploration of areas of the topic that needed to be covered by the interviewer [10]. The interview guides were standardised and consisted of open-ended questions and prompts. These were developed by the research team and patient partners.

Analysis

An in-depth pragmatic thematic analysis method [11] was utilised. Thematic analysis facilitates identification of themes or commonalities in interview transcripts. It helps organise and understand data. In this research developing a patient-reported recovery measure, it is important to fully explore and understand the themes that are of importance in postoperative arthroplasty recovery for patients and healthcare providers. These interviews provided background and clinician perspective to the possible new PROM.

Phase one

In phase one of the study, patients undergoing THA or TKA were interviewed (n = 30) during the early perioperative period, between the day of surgery and discharge from the surgeon’s care at between 6 and 8 weeks. A conceptual model was utilised when developing the new tools [12]. In addition, results from the phase one qualitative findings were considered when making decisions about item reduction.

Design

Phase one consisted of semi-structured interviews to explore the experience and perspective of patients undergoing THA or TKA during the early recovery period. These interviews were guided by a list of interview prompts, which facilitated further exploration of areas of the topic that needed to be covered by the interviewer [10]. The interview guides were standardised and consisted of open-ended questions and prompts. These were developed by the research team and patient partners.

Analysis

As in the planning phase, an in-depth pragmatic thematic analysis method [11] was utilised. Thematic analysis facilitates identification of themes or commonalities in interview transcripts. It helps organise and understand data. Coding is the technique by which themes are identified and organised. This method of analysis was chosen for several reasons. It was vital for this research developing a patient-reported recovery measure to fully explore and understand the themes that are of importance in postoperative arthroplasty recovery for patients and healthcare providers. As the tool was being designed for use in both clinical and home settings (the latter after discharge), the tool needed to be meaningful and effective in a real world setting [13]. Following immediate exact word-for-word transcription, interviews were anonymised to remove any participant identifiable data.

These transcribed interviews were then imported into NVivo software (NVivo qualitative data analysis software; QSR International Pty Ltd. Version 11, 2015) and analysed. Themes that were important to patients undergoing hip or knee replacement were recorded. This process is known as coding in qualitative research [14]. Interviews were coded based on the participants’ words and context. Initial coding was performed independently by two reviewers, the researcher and an expert colleague in qualitative research, to ensure thorough coverage of the work. The coding was then discussed, and any unresolved concerns were taken to a third researcher for resolution. The interviews were analysed on an iterative, ongoing basis. This technique is designed to elucidate any new themes that may emerge as the study is being performed and allows the interview guide to be adapted iteratively. If any new areas come to light during the earlier interviews, they can be added into subsequent interviews as interview prompts. This too helps to ensure full coverage of the concept being explored [13]. The sample size for participants was guided by data saturation [15, 16], which is the point at which subsequent interviews no longer produce any new themes.
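As a rough illustration of the saturation rule described above, the sketch below tracks the themes coded in each successive interview and reports the first interview at which a run of interviews has added nothing new. The function name, the example themes and the `run_length` stopping rule are assumptions made for illustration; the study does not state a numeric stopping rule.

```python
def saturation_point(interview_themes, run_length=3):
    """Return the 1-based index of the interview that completes a run of
    `run_length` consecutive interviews adding no new themes, or None if
    that never happens. `run_length` is a hypothetical stopping rule."""
    seen = set()
    interviews_without_new = 0
    for index, themes in enumerate(interview_themes, start=1):
        new_themes = set(themes) - seen
        seen |= new_themes
        # Reset the counter whenever an interview contributes a new theme.
        interviews_without_new = 0 if new_themes else interviews_without_new + 1
        if interviews_without_new >= run_length:
            return index
    return None


# Hypothetical coded themes, one set per interview, in interview order.
coded = [
    {"pain"},
    {"pain", "sleep"},
    {"sleep"},
    {"pain"},
    {"sleep"},
]
print(saturation_point(coded, run_length=3))
```

In practice the judgement is qualitative and made by the coders; a counter like this only makes the bookkeeping explicit.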

Item generation came directly from patient statements and from the patient population the tools are being designed to serve. Following thematic analysis and coding of the qualitative interviews, a list of potential sample items was created for each theme. These statements included items from all patient interviews and all themes from the phase one analyses. These were reviewed by the expert and patient panel, which included two surgeons, two nurses, an anaesthetist, two hip and knee arthroplasty patients, a psychometrician with a particular interest in patient-related outcomes, and one patient caregiver. The purpose of this panel was to review and evaluate potential items. Ideas and suggestions for layout and response options were also discussed.

Phase two

The first iteration of the OARS contained 18 items. The items in this PROM covered all aspects of the early recovery period. The first iteration of the OACS contained 25 items. This change measure included items designed to cover the concepts of early recovery and the change that may occur during the first 6 weeks. The items covered all themes reported by patients.

Design

The candidate questionnaires were tested once during the patients’ hospital stay and cognitive debrief interviews (n = 34) were used to assess items for face and content validity. Validity of an outcome measure is the extent to which it measures what it claims to measure. This is assessed through consideration and evaluation of several different aspects, including content and construct validity [17]. Changes were made accordingly to the questionnaire.

Analysis

Patients were requested to complete the questionnaire in the context of cognitive debriefing. These techniques allow the interviewer to determine the meaning a participant gives to questions and why they selected particular response options [9, 18]. All participants received both the OARS and OACS at the time of testing. These draft questionnaires were administered once to 34 patients in the early postoperative period during hospitalisation (days 0 to 8). The participants were then asked to discuss the items, the reason for their answers and the meaning they attributed to them. During these interviews, participants were asked to discuss how thorough they felt coverage was of the topic of recovery after TKA or THA. In addition, they were asked if the questions were easily understandable and if they were relevant to their particular situation. These interviews were audio recorded and transcribed verbatim. This led to the first version of test questionnaires being developed that were refined in the following phases.

Prior to testing, a translatability assessment (TA) was performed on the two new measures. TA has been recognised as an important part of the questionnaire development process [9]. It provides insight into what extent the items in the questionnaires can be translated into other languages [19]. This is of particular importance for use in cross cultural trials. Changes may be made to the wording of some items as a consequence of this procedure. In addition, a concept elaboration document (CED) was created to fully define and clarify question items and the meanings attributed to them. This was developed in combination with the author [20] and specialist translators, to provide specific detail regarding the explicit line-by-line meaning of items and concepts, providing clarification of each line of the questionnaire [21].

Stage two: item reduction and scale generation. Testing reliability and validity

Phase three

Final questionnaire development and testing was performed in phase three.

Design

Patients (n = 168) were given questionnaires on days 1, 2, 3, 7 and 14 in the early postoperative period and at 6 weeks following either hip or knee replacement surgery. Participants who were discharged prior to day three were provided with questionnaires and stamped self-addressed envelopes. The specific testing time points were chosen to maximise the information acquired from participants and to provide thorough coverage of the early recovery period. Exploratory factor analysis (EFA) was used to explore the dataset and determine what latent underlying constructs were being measured [22]. EFA evaluates the scale properties and aids in removing items that have high levels of non-response or are not internally consistent [23, 24]. All of the items for the OARS and OACS were put into a factor analytic model (Varimax rotation with Kaiser normalisation). Varimax rotation was selected as it facilitates data pattern interpretation. EFA was performed on the most populous time point (day 1 testing). Only factors with an eigenvalue > 1 were retained. The number of factors to retain can be determined by multiple means, including eigenvalues and factor loadings [25].
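The eigenvalue rule described above can be sketched in plain Python. The correlation matrix below is hypothetical (four items forming two correlated pairs), and the Jacobi eigenvalue routine merely stands in for the extraction step that SPSS performs; it is not the authors' code, and no rotation is applied.

```python
import math


def symmetric_eigenvalues(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations,
    returned in descending order."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off_diagonal = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off_diagonal < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for i in range(n):  # update columns p and q (A <- A J)
                    aip, aiq = a[i][p], a[i][q]
                    a[i][p] = c * aip - s * aiq
                    a[i][q] = s * aip + c * aiq
                for j in range(n):  # update rows p and q (A <- J^T A)
                    apj, aqj = a[p][j], a[q][j]
                    a[p][j] = c * apj - s * aqj
                    a[q][j] = s * apj + c * aqj
    return sorted((a[i][i] for i in range(n)), reverse=True)


def retained_factors(eigenvalues, threshold=1.0):
    """Keep only factors whose eigenvalue exceeds the threshold."""
    return [value for value in eigenvalues if value > threshold]


# Hypothetical item correlation matrix: items 1-2 and items 3-4 cluster.
HYPOTHETICAL_CORR = [
    [1.0, 0.7, 0.1, 0.1],
    [0.7, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.6],
    [0.1, 0.1, 0.6, 1.0],
]

eigenvalues = symmetric_eigenvalues(HYPOTHETICAL_CORR)
kept = retained_factors(eigenvalues)
print([round(value, 3) for value in eigenvalues])
print(len(kept), "factor(s) retained")
```

Because the eigenvalues of a correlation matrix sum to the number of items, an eigenvalue above one means the factor accounts for more variance than a single item would on its own, which is the rationale for the cut-off.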

Analysis

In phase three, both exploratory factor analysis and item reduction were performed using SPSS 25 software. The Kaiser-Meyer-Olkin Measure of Sampling Adequacy (KMO) and Bartlett’s Test of Sphericity were used to determine whether the data were appropriate for factor analysis. Descriptive statistics and frequency tables were reported, and the frequency tables were examined for floor and ceiling effects. An initial principal component analysis (PCA) was undertaken to determine whether any of the items were outliers or otherwise unsuitable for analysis. This process sorts variables into factors and indexes the amount of variance explained by each. This number is called the eigenvalue and, as previously mentioned, values above one are conventionally considered meaningful. EFA was carried out on the most populous time point to determine what constructs underlie the data and to identify redundancy in any of the items. PCA with Varimax rotation was performed on days 1, 7, 14 and at 6 weeks. PCA is used to reduce the data into a smaller number of components. Item reduction was reported. Items with stronger factor correlations (above 0.5) were considered to have loaded on those factors [26]. The weaker, or non-loading, items could then be considered for removal. Internal consistency was reported using Cronbach’s alpha, a statistic that indicates the extent to which the items in a scale are answered consistently with one another. It is a commonly used statistical test for this purpose [27], with values above 0.7 considered acceptable.
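For readers unfamiliar with the alpha statistic, a minimal sketch of the standard Cronbach's alpha formula follows: alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The responses are invented for illustration and are not study data; the study itself used SPSS.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's summed scale score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: five respondents, three items on a 1-5 scale.
hypothetical_responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 4],
]

alpha = cronbach_alpha(hypothetical_responses)
print(round(alpha, 3), "acceptable" if alpha > 0.7 else "below 0.7")
```

This version assumes complete data; missing responses would need to be handled before the variances are computed.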

Phase four

The responsiveness and sensitivity to change of the questionnaires were measured in phase four. These are important, closely related qualities in validating outcome measures, particularly if they are potentially being used in clinical decision making and trials [28]. The results from the OARS and OACS were evaluated for responsiveness (a measure’s ability to detect clinically important changes) and sensitivity to change over time (a statistical feature of a measure). The initial testing time point was compared with the means of additional testing points through 6 weeks. Construct validity is the extent to which a questionnaire measures what it claims to measure [29]. To assess construct validity, the new OARS was compared and correlated with the dimensions of the SF-36v2 hypothesised in advance to be related [29].

Design

The two final questionnaire versions were distributed to consecutive patients (n = 155) in a cohort of hip and knee replacement patients. They were again administered on days 1, 2, 3, 7 and 14 and at 6 weeks following surgery. In addition, a widely used, validated, generic health measure, the Short Form-36 version 2 Acute (SF-36v2), United Kingdom (English) [30], was given to participants on days 7 and 14 and at 6 weeks. This self-administered questionnaire covers eight domains of both physical and mental health and has been used during the validation of other disease-specific health measures across a wide range of conditions [31, 32]. The SF-36v2 Acute has a recall period of 1 week, which made it appropriate for use in evaluating the new OARS and OACS. Correlations were hypothesised in advance. The highest correlations were expected between the following OARS domains and SF-36v2 Acute dimensions: OARS pain with SF-36v2 Acute bodily pain; OARS nausea and feeling unwell with SF-36v2 Acute general health; OARS fatigue and sleep with SF-36v2 Acute vitality; and OARS improving function and mobility with SF-36v2 Acute physical functioning.
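The hypothesised pairings above lend themselves to a simple check: for each OARS domain, the SF-36v2 Acute dimension with the strongest correlation should be the one predicted. The sketch below assumes Pearson's r, which the text does not specify, and uses invented scores for six patients; the helper names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Pairings hypothesised in the article (OARS domain -> SF-36v2 Acute dimension).
HYPOTHESISED_PAIRS = {
    "pain": "bodily pain",
    "nausea and feeling unwell": "general health",
    "fatigue and sleep": "vitality",
    "improving function and mobility": "physical functioning",
}


def strongest_partner(oars_scores, sf36_scores_by_dimension):
    """Name of the SF-36 dimension with the largest absolute correlation."""
    return max(
        sf36_scores_by_dimension,
        key=lambda name: abs(pearson_r(oars_scores, sf36_scores_by_dimension[name])),
    )


# Invented scores for six patients, for illustration only.
oars_pain = [20, 35, 50, 65, 80, 90]
sf36 = {
    "bodily pain": [25, 30, 55, 60, 85, 88],
    "vitality": [50, 40, 60, 45, 55, 52],
}

best = strongest_partner(oars_pain, sf36)
print(best, best == HYPOTHESISED_PAIRS["pain"])
```

Absolute values are compared because the direction of the correlation depends on how each scale is scored, which this sketch does not assume.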

Analysis

In phase four, validity and reliability were tested, including scale generation and testing of scale properties, descriptive statistics and frequency tables, internal consistency and construct validity. The SF-36v2, a previously validated generic health measure, was administered alongside the newly developed OARS and OACS to provide comparison and correlation for construct validity of the new measures. In addition, responsiveness and sensitivity to change were reported. The initial testing time point was compared with the means of additional testing points through 6 weeks.
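One common way to summarise change between an initial and a later time point is the standardised response mean (SRM): the mean of the paired changes divided by their standard deviation. The article does not state which responsiveness statistic was used, so the SRM here is an assumption chosen for illustration, and the scores are invented.

```python
from statistics import mean, stdev


def standardised_response_mean(baseline, follow_up):
    """Mean paired change divided by the standard deviation of the changes."""
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need paired scores for at least two patients")
    changes = [later - earlier for earlier, later in zip(baseline, follow_up)]
    spread = stdev(changes)
    if spread == 0:
        raise ValueError("all changes are identical; SRM is undefined")
    return mean(changes) / spread


# Invented scores for four patients at day 1 and at 6 weeks.
day_1 = [30, 40, 35, 50]
week_6 = [50, 55, 60, 65]

print(round(standardised_response_mean(day_1, week_6), 2))
```

Larger absolute values indicate that the measure registers change consistently across patients; the sign simply reflects the direction of scoring.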

SPSS 25 software was used for analyses in phases three and four (IBM Corp. Released 2017. IBM SPSS Statistics for Windows, Version 25.0. Armonk, NY: IBM Corp.). Recommended scoring algorithms were utilised for the SF-36v2 (Quality Metric Health Outcomes™ Scoring Software 5.0; 2016).

Sample size calculations and considerations

In testing of psychometric properties, larger samples are often considered desirable [33]. Sample sizes based on five to ten times the number of respondents as items are often quoted in the literature [34, 35]. This guideline was used in the testing phases for both OARS and OACS.
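The rule of thumb quoted above is simple arithmetic, sketched below for the item counts reported in this article (14 items in each final measure; 18 and 25 items in the first iterations of the OARS and OACS). The function name is hypothetical, and the article does not state which item count the guideline was applied to.

```python
def respondents_needed(n_items, per_item_low=5, per_item_high=10):
    """Range of respondents implied by a 'five to ten per item' guideline."""
    return n_items * per_item_low, n_items * per_item_high


for label, n_items in [
    ("final OARS or OACS (14 items)", 14),
    ("first OARS iteration (18 items)", 18),
    ("first OACS iteration (25 items)", 25),
]:
    low, high = respondents_needed(n_items)
    print(f"{label}: {low} to {high} respondents")
```

For comparison, the testing phases reported here recruited 168 and 155 patients.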

Results

Study samples

Participant demographic and surgical characteristics for each of the study phases are presented in Table 1. The planning phase participants are presented in Table 2. Participants ranged in age from 20 to 92 years. These participants represented a range of ethnicities, durations of disease and lower limb joint replacements. In the planning phase, the participants represented a range of healthcare careers and years of experience caring for orthopaedic patients undergoing lower limb joint replacement.

Table 1 Participant characteristics
