PROMIS® Health Organization (PHO) 2021 Conference Abstracts

© The Author(s) 2021. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Objective: Examine the dimensionality of the Impact Stratification Score (ISS) and support for its single total score, and evaluate the psychometric properties of ISS items. Methods: The sample comprised 1677 chiropractic patients being treated for chronic lower back pain and chronic neck pain; their average age was 49 years, 71% were female, and 90% were White. Study participants completed the PROMIS-29 v2.1 profile survey, which contains the 9 ISS items. The ISS was evaluated using item-rest correlations, Cronbach's alpha, factor analysis (i.e., correlated factors and bifactor models), and item response theory (IRT). Reliability indices and item properties were evaluated from bifactor and IRT models, respectively. Results: Item-rest correlations were high (0.64-0.84), with a Cronbach's alpha of 0.93. Eigenvalues suggested the possibility of two factors corresponding to physical function and pain interference/intensity. Bifactor model results indicated that the data were essentially unidimensional, primarily reflecting one general construct (i.e., impact), and that after accounting for 'impact' very little reliable variance remained in the two group factors. General impact scores were reliable (omegaH = 0.73). IRT models showed that the items were strong indicators of impact, provided information across a wide range of the impact continuum, and offered the possibility of a shorter 8-item ISS. Finally, it appears that different aspects of pain interference occur prior to losses in physical function. Conclusions: This study presents evidence that the ISS is sufficiently unidimensional, covers a range of chronic pain impact, and is a reliable measure. Insights are obtained into the sequence of chronic pain impacts on patients' lives.
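The two classical indices named in the Methods above, Cronbach's alpha and item-rest correlations, can be illustrated with a minimal sketch. The function names and the simulated 9-item data are assumptions made for illustration only; they are not the study's data or code.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_variances.sum() / total_variance)

def item_rest_correlations(items: np.ndarray) -> np.ndarray:
    """Correlate each item with the sum of all the other items."""
    total = items.sum(axis=1)
    return np.array([
        np.corrcoef(items[:, j], total - items[:, j])[0, 1]
        for j in range(items.shape[1])
    ])

# Simulated responses (hypothetical): one common trait plus item-level noise.
rng = np.random.default_rng(0)
trait = rng.normal(size=500)
scores = trait[:, None] + rng.normal(scale=0.8, size=(500, 9))

alpha = cronbach_alpha(scores)
rest = item_rest_correlations(scores)
print(f"alpha = {alpha:.2f}, item-rest range = {rest.min():.2f}-{rest.max():.2f}")
```

Because the item-rest correlation removes the item from the total before correlating, it avoids the inflation that an item-total correlation would show for short scales.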
Objective: Despite conservative management of de Quervain's Tenosynovitis (DQT), many patients continue to suffer from recalcitrant symptoms necessitating surgical intervention. PROMIS scores at the time of diagnosis might provide insights into the success of non-operative management and predict the necessity for surgical release. Methods: Patients presenting to a tertiary academic medical center from 2014-2019 with a sole diagnosis of DQT were identified. Patients younger than 18 years or with other diagnoses were excluded. Patients were separated by treatment: physical therapy (PT), injections, surgery, or combinations thereof. Chi-square analysis was performed to identify confounding variables or demographic factors that affect treatment strategy. A multi-factor ANOVA was performed to identify patterns in presenting PROMIS scores (PPS) and selection of initial treatment. Patient groups were then reorganized by the most invasive treatment pursued, the analysis was repeated, and t-tests confirmed statistical differences. Patients without a PPS were excluded from statistical tests involving PROMIS. Results: Of the 1529 patients who met the inclusion/exclusion criteria, 729 had a PPS. For initial treatment, 119 (7.8%) patients chose PT, 831 (54.3%) chose an injection, 129 (8.4%) chose surgery, and 450 (29.4%) had no intervention. Of the patients who received treatment, 85 (7.9%) had only PT, 695 (64.4%) received at least one injection during treatment, and 299 (27.7%) eventually had surgery. Statistically significant differences in PPS between initial treatment groups did not reach a clinically important difference. However, patients who eventually required surgery had significantly lower PF (p = 3.751e-08), higher PI (p = 1.431e-08) and higher PD (p = 0.0146) than those who only had injections. Conclusions: PROMIS survey results could be used to identify patients who are likely to fail non-operative intervention for DQT.
Survey response rates were much higher from patients choosing more invasive interventions, and older patients tended to choose more invasive treatments as their initial management. While there were no clinically significant differences in PPS between patients choosing PT, injection, or surgery as their initial management, patients that eventually
Objective: Legacy haemophilia outcome measures may be too long and sometimes have floor or ceiling effects and irrelevant questions. Patient Reported Outcomes Measurement Information System (PROMIS) item banks use Computer Adaptive Tests (CAT) to enable more efficient outcome assessment than legacy instruments. The aim of this study was to evaluate the feasibility, measurement properties and relevance of nine PROMIS CATs and short forms (SFs) in persons with haemophilia (PWH). Methods: Dutch adult PWH completed nine PROMIS item banks electronically as CATs or SFs: 'physical function', 'pain interference', 'depression', 'anxiety', 'ability to participate in social roles and activities', 'satisfaction with social roles and activities', 'fatigue', 'self-efficacy for managing medications and treatment' and 'self-efficacy for managing symptoms'. Feasibility was assessed by the number of items answered per CAT and by floor/ceiling effects for all measures. Construct validity was studied by testing hypotheses about the relationship of PROMIS item banks with the legacy instruments Haemophilia Activities List, and expected differences between subgroups (known-group validity). The reliability of the CATs was evaluated by calculating the proportion of T-scores with an SE ≤ 3.2. Relevance of item banks was determined by proportions of limited scores. Results: Overall, 142 of 373 invited PWH (mean age 47 years [range 18-79], 49% severe haemophilia, 46% received prophylaxis) responded.
For the CATs, the mean number of items answered per item bank varied from 5 (range 3-12) to 9 (range 5-12), with floor effects in 'pain interference' (26% lowest scores) and 'depression' (18% lowest scores). Construct validity and reliability in PWH were good for 'physical function', 'pain interference', 'satisfaction with social roles and activities' and 'fatigue'. Limited scores were most prevalent in the CATs 'pain interference' (33%) and 'physical function' (38%). The self-efficacy SFs with 8 items showed ceiling effects (22-28% maximum scores) and showed no relation with the legacy instruments. Conclusions: The PROMIS CATs 'physical function', 'pain interference', 'satisfaction with social roles and activities' and 'fatigue' are feasible and valid tools in PWH and are preferred to the legacy instruments because they have fewer items and fewer floor and ceiling effects.
J Patient Rep Outcomes 2021, 5(Suppl 1):90
Background: Sjogren's Syndrome (SS) is an autoimmune disease affecting the exocrine glands that has considerable impact on health-related quality of life (HRQL). The Patient Reported Outcome Measurement Information System (PROMIS) provides universal HRQL instruments, but has not been previously implemented in SS. Methods: A cross-sectional evaluation was performed on completed questionnaires of consecutive adult patients during visits to a multidisciplinary Sjogren's clinic between March 2018 and February 2020. Questionnaires included PROMIS short forms (depression 4a, anxiety 4a, fatigue 8a, physical function 4a, pain interference 8a (PI), sleep disturbance 4a (sleep), participation in social roles and activities 8a) and the European League Against Rheumatism (EULAR) Sjogren's Syndrome Patient Reported Index (ESSPRI). Patients were either classified as SS by 2016 ACR/EULAR criteria or otherwise labeled as sicca not otherwise specified (NOS) and used as a comparison group.
Descriptive statistics were calculated for disease-related and sociodemographic variables, and Pearson correlation was used to evaluate the relationship between subdomains of the ESSPRI and PROMIS. Uni- and multivariable linear regression (MVR) models were used to evaluate predictors of PROMIS fatigue, PI, and social participation. Results: 227 SS patients and 85 patients with sicca NOS were included. Mean (SD) PROMIS T-scores for PI (56.9 (11.0)), fatigue (57.2 (10.6)), and physical function (44.2 (9.8)) in SS patients were at least ½ SD worse than US population normative values. Among SS patients, PROMIS PI (r = 0.72) and fatigue (r = 0.80) were highly correlated with the respective ESSPRI pain and fatigue sub-domains. Fatigue (β = −0.610, p < 0.001) and PI (β = −0.185, p < 0.001), but not dryness or mood disturbance, were the strongest predictors of social participation in MVR in this SS cohort. Conclusions: In our SS cohort, PROMIS PI and fatigue scores correlated highly with the respective ESSPRI domains. Fatigue, but not dryness, was found to be the strongest predictor of social participation. Given the ability of PROMIS instruments to evaluate physical, mental, and social function that would otherwise not be ascertained through the ESSPRI, these questionnaires should be considered as a supplement in the evaluation of SS.
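The "at least ½ SD worse" statement can be checked with simple arithmetic, since PROMIS T-scores are scaled to a mean of 50 and an SD of 10 in the reference population. The sketch below uses the three mean T-scores reported above; the function name is an assumption made for illustration.

```python
def t_score_to_sd_units(t_score: float, ref_mean: float = 50.0, ref_sd: float = 10.0) -> float:
    """Express a PROMIS T-score as SD units away from the reference mean."""
    return (t_score - ref_mean) / ref_sd

# Mean T-scores reported for the SS patients in the abstract above.
reported_means = {
    "pain interference": 56.9,
    "fatigue": 57.2,
    "physical function": 44.2,
}

for domain, mean_t in reported_means.items():
    print(f"{domain}: {t_score_to_sd_units(mean_t):+.2f} SD")
```

Higher scores mean more of the measured construct, so positive deviations are worse for pain interference and fatigue, while a negative deviation is worse for physical function.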
Objective: Frequently used disease-specific Patient Reported Outcome Measures (PROMs) in pediatric haemophilia are experienced as a burden due to their length and sometimes irrelevant questions. Patient Reported Outcomes Measurement Information System (PROMIS) item banks using short forms (SF) or Computerized Adaptive Testing (CAT) could solve this problem. The objective of this study is to assess the psychometric properties and feasibility of eight PROMIS item banks within a clinical sample of boys with haemophilia. Methods: In this multicenter study, all boys with haemophilia (mild, moderate, severe haemophilia A/B, aged 8-17 years) from six Dutch Haemophilia Treatment Centers will be invited to participate. For assessment of convergent validity, the PROMIS item bank T-scores will be compared to subscales of the Haemophilia Quality of Life Questionnaire for Children (HaemoQoL) and to the Pediatric Haemophilia Activities List (PedHAL) by using Pearson's r with normative data, where r ≥ 0.70 is considered acceptable (Table 1). To ensure a power of > 0.8, a sample of n ≥ 64 is needed. Reliability of the PROMIS item banks will be expressed as the standard error of theta (SE(θ)), where an SE(θ) < 0.32 corresponds to a reliability of 0.90. The proportion of reliable (SE(θ) < 0.32) T-scores within each item bank will be reported. Regarding feasibility, the number of completed items will be reported. Results: Regarding convergent validity, r is hypothesized to lie between 0.5 and 0.9 for all correlations between the domains mentioned in Table 1. The proportion of reliable T-scores is expected to be good for all PROMIS item banks, based on Dutch studies in children from the general population and a clinical sample (Juvenile Idiopathic Arthritis). We expect the PROMIS item banks to be more feasible in terms of number of items completed.
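The link between SE(θ) < 0.32 and a reliability of 0.90 follows from the common approximation reliability = 1 − SE(θ)², which assumes θ is scaled to unit variance in the reference population. The sketch below is an illustration under that assumption; the function names and the example SE values are hypothetical.

```python
def reliability_from_se(se_theta: float) -> float:
    """Approximate score reliability, assuming theta has unit variance."""
    return 1.0 - se_theta ** 2

def proportion_reliable(se_values, threshold: float = 0.32) -> float:
    """Share of scores whose standard error falls below the threshold."""
    se_list = list(se_values)
    return sum(se < threshold for se in se_list) / len(se_list)

print(round(reliability_from_se(0.32), 4))  # 1 - 0.1024

# Hypothetical standard errors for four respondents on one item bank.
example_se = [0.20, 0.30, 0.40, 0.50]
print(proportion_reliable(example_se))
```

On the T-score metric (SD = 10), the same threshold corresponds to a standard error of 3.2.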
Conclusions: When the pediatric PROMIS item banks display good convergent validity with disease-specific legacy instruments, good internal consistency and feasibility in a clinical sample of Dutch boys with haemophilia, PROMIS can be used in research and clinical care with lower questionnaire-related burden.
Objective: One core advantage of PROMIS measures is that each estimate of the latent trait is associated with a standard error, reflecting uncertainty in the measurement. Such uncertainty needs to be acknowledged and quantified, in particular when assessing individual patients over time. In this study, we use plausible values to analyze true scores rather than observed scores. We then analyze the probability of true within-individual change and illustrate the use of plausible values in the analysis of real-world PRO data. Methods: We used a freely available dataset of stable and exacerbated COPD patients (N = 185) [1], which provided individuals' physical function and fatigue PROMIS T-scores over a course of 21 weeks. At each measurement, we imputed 1000 plausible values from a normal approximation to the PROMIS T-scores' posterior distribution. Plausible values were then used to calculate the probability of true change from baseline and from the previous assessment, on the individual and sample level. We also compared 4-item and 8/10-item short forms and computer-adaptive tests in their performance to determine true change with 80%, 90% and 95% certainty across the T-score metric. Results: We observed that at the end of the study, 47.5% of participants in the exacerbated group achieved a certain (T-score difference t1-t2 < 0, p > 95%) improvement in fatigue from baseline, compared to 26.5% in the stable group. Comparison of short forms and CATs of physical function and fatigue suggests that CATs have the most favourable properties, with a constant theta change of approximately 5 points reflecting a 95% probability of true improvement.
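The plausible-values idea described in the Methods can be sketched as follows: draw values from a normal approximation around each observed T-score, then count how often the drawn difference points in one direction. The function name, the example scores, and the assumption that the two measurement errors are independent are illustrative choices, not details taken from the study.

```python
import numpy as np

def prob_true_change(t1, se1, t2, se2, n_draws=1000, seed=0):
    """Return (P(true decrease), P(true increase)) between two assessments.

    Plausible values are drawn from N(T-score, SE) at each time point.
    Assumption: the two measurement errors are independent.
    """
    rng = np.random.default_rng(seed)
    plausible_t1 = rng.normal(t1, se1, n_draws)
    plausible_t2 = rng.normal(t2, se2, n_draws)
    difference = plausible_t2 - plausible_t1
    return float(np.mean(difference < 0)), float(np.mean(difference > 0))

# Hypothetical patient: fatigue T-score falls from 60 to 52, SE of 2 at both visits.
p_decrease, p_increase = prob_true_change(60.0, 2.0, 52.0, 2.0)
print(f"P(true decrease) = {p_decrease:.3f}")
```

A threshold such as 0.95 on this probability then plays the role of the "certain" change category, and larger standard errors require a larger observed difference to reach it.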
For short forms, the theta change associated with 95% certainty of true change can vary considerably depending on the T-score. Conclusions: Plausible values offer a flexible way to include measurement error in the analysis of individuals and groups, and they complement existing distribution-based approaches by providing an assessment of the probability of true change. This method eases interpretation and allows a finer-grained comparison of improvement or decline than analysis of observed scores.
References
[1] Yount, S.E., Atwood, C., Donohue, J., Hays, R.D., Irwin, D., Leidy, N.K., Liu, H., Spritzer, K.L. and DeWalt, D.A. (2019). Responsiveness of PROMIS to change in chronic obstructive pulmonary disease. Journal of Patient-Reported Outcomes, 3(1).
inter-country comparisons, potentially leading to systematically different physical function scores. Individuals with the same 'true' underlying physical ability would score systematically differently due to specific cultural contexts or language differences. Therefore, we investigated these items in three general population samples and assessed the validity of their German and Spanish translations. Methods: We collected physical function data from 3601 persons from the general population in the USA, Argentina, and Germany. Differential item functioning (DIF) was assessed with ordinal logistic regression models, and a change in Nagelkerke's pseudo-R² of > 0.02 was chosen as the critical cutoff value indicating DIF. The impact of DIF on item scores and T-scores was examined by inspecting both the item characteristic curves (ICCs) and the test characteristic curves (TCCs). Results: We included 1001 participants from Argentina (M theta = 0.23, SD theta = 0.79; M age = 35.6; 51% female), 1000 from Germany (M theta = 0.11, SD theta = 1.02; M age = 44.9; 52% female), and 1600 from the U.S. (M theta = 0.00, SD theta = 1.22; M age = 44.3; 58% female). Two (Germany vs. the U.S.) and four (Argentina vs. the U.S.) of the 35 items were flagged for DIF. Most of the items that showed DIF had R² values that only marginally exceeded the critical value of 0.02. All these items showed uniform DIF. The TCCs suggested that the magnitude and impact of DIF on the test scores was negligible for all items. After correcting for potential DIF, both Germany (M theta difference = 0.098, t(2393.4) = 2.57, p = 0.0103) and Argentina (M theta difference = 0.281, t(2598.2) = 7.91, p < 0.001) had slightly higher scores than the U.S. Conclusions: Our study adds to the evidence that PROMIS physical functioning items are universally applicable across general populations from Argentina, Germany, and the U.S.
Objective: The purpose of this study was to translate and linguistically validate four PROMIS Pediatric item banks (Anxiety, Depressive Symptoms, Peer Relationships, Upper Extremity Function), the Pediatric Profile-25, the Pediatric Global Health Scale 7 + 2 and their Parent Proxy counterparts into Norwegian, highlighting linguistic issues encountered during the process. Methods: We translated 108 PROMIS Pediatric items and 109 Parent Proxy items using the FACIT methodology, a standardized iterative process of forward- and back-translation, expert review, harmonization, and cognitive interviewing. The translation team consisted of native Norwegian speakers from Norway. Fifteen Norwegian-speaking parent-child dyads from the general population assessed the relevance, understandability, and appropriateness of the translations. A pragmatic qualitative analysis of cognitive interviews determined the linguistic equivalence of each translation and provided insight into the relevance of the concepts for each population. Results: The study sample consisted of 15 native Norwegian-speaking children (7 girls, 8 boys) with a mean age of 13 (8-17) and 15 native Norwegian-speaking adults (11 women, 4 men) with a mean age of 42 (33-47) in Oslo, Norway.
Revisions to particular concepts were made to Pediatric items where respondent commentary revealed misunderstandings (Pediatric Profile-25: "pay attention", "one block", Pediatric Global Health Scale: "rate", "mood") and to corresponding Parent Proxy items to maintain consistency. One additional revision was required to the Parent Proxy Global Health: "feel sad". Upon completion of the cognitive interview analysis, translations were reviewed by the Norwegian PROMIS National Center and collaborators in Oslo to further refine items' verbiage.

Conclusions: The Norwegian language PROMIS Pediatric and Parent Proxy items are conceptually equivalent to the English source. Concurrent assessment of children's and parents' item interpretation confirmed consistent understanding between pediatric and proxy populations. Inclusion of PROMIS National Centers in the translation process was instrumental in fine-tuning particular nuances for both populations and is recommended in future translation work. These Norwegian Pediatric and Parent Proxy items are acceptable for use in international research, clinical trials and practice.
Objective: The purpose of this study was to translate and linguistically validate five adult PROMIS item banks (Cognitive Function, Cognitive Function - Abilities, Itch - Interference, and two Self-Efficacy for Managing Chronic Conditions item banks: Managing Medications/Treatment and Managing Symptoms) into Dutch-Flemish and to report on challenges and solutions encountered during the process. Methods: We translated 115 adult PROMIS items using the FACIT methodology, which is a standardized iterative process of forward- and back-translation, expert review, harmonization, and cognitive interviewing. The translation team consisted of native Dutch-speaking linguists from Belgium and the Netherlands. As an additional quality measure, prior to cognitive interviews the Dutch-Flemish PROMIS National Center (PNC) reviewed all translations to confirm fluency and harmonization with previous translations, and to offer suggestions relating to the items' usage in clinical settings. Eighteen Dutch-speaking participants from the general population evaluated the relevance, comprehensibility, and appropriateness of the items. We conducted qualitative analysis of cognitive interviews to evaluate the linguistic equivalence of each translated item and provide insight into the relevance of the concepts.
Results: The sample consisted of 18 native Dutch-speaking adults (8 women, 10 men) from Belgium and the Netherlands with a mean age of 49 (18-75) years. During the translation phase, the concepts "I was tired of people asking" (Itch - Interference), "I can manage," and "manage my symptoms" (Self-Efficacy item banks: Managing Medications/Treatment and Managing Symptoms) required adjustments to convey the intended meaning more accurately and to harmonize with existing translations. Cognitive interviews revealed that of the 115 items translated, only one required revision (Cognitive Function: "My thinking has been foggy"). The remaining 114 items required no revisions, and all items were found to be relevant. Conclusions: The Dutch-Flemish PROMIS item banks are considered conceptually equivalent to the English source. Short forms are ready for use in international research, clinical trials, and practice. Full banks will be validated with Dutch/Flemish patients before implementation as CAT. Inclusion of the Dutch-Flemish PNC is recommended for harmonization with existing translations and to maintain the link between linguistic choices and applied usage in clinical settings.

Objective: We sought to investigate the health, socioeconomic, and behavioral impacts of COVID-19 among an ethnically diverse population of COVID-19 survivors in Texas, and to compare effects in Latinos versus non-Latinos. Methods: In December 2020, we surveyed (in English or Spanish) patients who had had COVID-19 infection 3-9 months earlier. Measures included 1) the PROMIS-29+2 health profile, 2) the CAIR Pandemic Impact Questionnaire (C-PIQ), and 3) items addressing social determinants of health. Bivariate analyses included chi-square tests, Wilcoxon rank-sum tests, and t-tests; generalized linear models were used for multivariable analyses. Results: Latinos more commonly reported impacts of COVID-19 on social determinants of health such as finances (53% versus 21%, p = 0.003) and conflict within the home or family (18% versus 7%, p = 0.01). Multivariable regression analyses suggested that ethnic disparities in Depression, Anxiety, Sleep Disturbance, and Cognitive Function were partially attributable to financial concerns. Conversely, Latinos had significantly higher C-PIQ scores (CAIR Growth Score = 8.2 [4.7] versus 5.3 [4.2], p = 0.003), with higher scores on items such as "COVID-19 strengthened your relationships," "increased appreciation of life," and "created spiritual change." Conclusions: COVID-19 has detrimental but differential impacts on patient-reported outcomes and social determinants of health among Latinos compared with non-Latinos, but Latinos may experience more post-traumatic growth. These findings highlight the ongoing need to address health disparities not only in infection, but also in recovery from COVID-19. Furthermore, as Latinos also reported some more positive impacts due to the COVID-19 pandemic, future research should examine whether personality characteristics (e.g. resilience) mediate the impact of COVID-19 on health-related quality of life.
Objective: Myasthenia Gravis (MG) is characterized by generalized weakness, commonly due to autoantibodies blocking acetylcholine receptors (AchR), resulting in symptoms like ptosis, diplopia, dysphagia, and dysarthria. The MG-QoL, specifically designed for MG, has historically been used to measure this population's QoL. To our knowledge, PROMIS has not been evaluated for use in patients with MG. In our evaluation of PROMIS in this population, we expect that PROMIS anxiety, depression, fatigue, social roles, physical function, and cognitive function scores will be strongly correlated with MG-QoL scores in patients with MG. Also, in clinical subgroups with significant differences in MG-QoL scores, strongly correlated PROMIS scores are expected to show significant differences. Methods: Since June 2018, 8 PROMIS domains have been collected as part of routine clinical practice in our neurology clinics. Measures are automatically assigned by the electronic health record and are usually completed by tablet in waiting rooms. Included subjects completed measures before March 2020 and had an MG billing code. Other data elements were extracted from the e-record (i.e., demographics) or by abstraction (i.e., comorbid conditions). Pearson correlations were calculated between scores. Differences in scores between clinical subgroups were evaluated using linear regression. Correlations are interpreted using Cohen's effect sizes, and coefficients are considered statistically significant if p < 0.05. Results: Data collection is complete for 200/360 patients (131/200 have multiple visits) from 12 clinical sites and 37 providers. The average age is 65 and 54% are female. The MG-QoL and PROMIS scores had medium-strength correlations for cognitive function, sleep disturbance, and depression, and strong correlations for anxiety, fatigue, pain interference, social roles, and physical function. The strongest correlations are with social roles and fatigue. All correlations had p < 0.0001.
In relevant clinical subgroups, whenever MG-QoL was statistically significant, PROMIS physical function and social roles were also statistically significant, except for the pyridostigmine treatment category. Conversely, several subgroups had statistically significant PROMIS scores but not MG-QoL scores. Conclusions: PROMIS and MG-QoL measure many of the same constructs, with strong correlations between MG-QoL and many PROMIS domains. PROMIS shows more differences in QoL in clinically important subgroups of interest than MG-QoL in patients with MG.
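The interpretation step in the Methods, labelling Pearson correlations by Cohen's effect-size conventions, can be sketched as below. The cut-offs used are Cohen's widely cited benchmarks for r (0.10 small, 0.30 medium, 0.50 large); the function names and the example scores are assumptions for illustration and are not study data.

```python
import numpy as np

def pearson_r(x, y) -> float:
    """Pearson correlation between two equally long score vectors."""
    return float(np.corrcoef(x, y)[0, 1])

def cohen_label(r: float) -> str:
    """Label a correlation using Cohen's conventional benchmarks for r."""
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"

# Hypothetical paired scores for six patients on two questionnaires.
legacy_scores = [12, 20, 25, 31, 38, 44]
promis_fatigue = [48, 55, 53, 60, 63, 66]
r = pearson_r(legacy_scores, promis_fatigue)
print(f"r = {r:.2f} ({cohen_label(r)})")
```

The sign is ignored when labelling strength, so a domain scored in the opposite direction (for example physical function) is classified on the absolute value of r.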
Objective: To determine if three pediatric Patient-Reported Outcomes Measurement Information System (PROMIS) questionnaires have sufficient measurement property evidence to be recommended for use in routine outcome monitoring as part of clinical care. Methods: We assessed the (1) PROMIS Parent Proxy Short Form v1.0 - Cognitive Function 7a questionnaire for ages 8 to 17 years; (2) PROMIS Parent Proxy Scale v1.0 - Global Health 7 + 2 questionnaire for ages 5 to 17 years; and (3) PROMIS Pediatric Scale - Global Health 7 for ages 8 to 17 years using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) guidelines to (i) identify studies that evaluated the measurement properties of these questionnaires, (ii) evaluate the methodological quality of the included studies using the COSMIN Risk of Bias checklist, (iii) determine the sufficiency of each measurement property using COSMIN criteria for Good Measurement Properties, and (iv) assess the overall quality of evidence for each measurement property using modified GRADE criteria to determine if these outcome measurement instruments (OMIs) have sufficient evidence to be recommended for use. We searched the HealthMeasures website, MEDLINE, Embase, PsycINFO, Web of Science, and Google Scholar to identify eligible studies that assessed any of nine different aspects of reliability, validity, and responsiveness in participants < 18 years or caregivers of this age, as appropriate for each questionnaire. Results: Across the 6 measurement property studies included in this review, there were 4818 children and 5459 parents. The three PROMIS OMIs had "high quality of evidence" for sufficient structural validity and internal consistency but "low quality of evidence" for sufficient content validity, meeting COSMIN's minimum standard for recommending their use.
The quality of evidence was downgraded due to risk of bias from the reporting of methods in the content validity studies. These findings apply to children ages 8 to 17 years, except for the PROMIS Parent Proxy Scale v1.0 - Global Health 7 + 2, which is recommended for ages 5 to 17 years. Conclusions: The PROMIS OMIs assessed in this review measure their intended constructs, but only for their intended age group. We recommend that future research follow the COSMIN measurement properties reporting guide to avoid reporting biases.

Objective: Translation is a complex, multi-stage, coordinated process in which many participants take part following the FACIT methodology. The authors aimed to develop an IT solution that supports the translation process and improves its effectiveness and efficiency. Methods: The authors focused on supporting translation and the creation of the Translation Item History (TIH). The electronic TIH removes the need to send spreadsheet files back and forth and speeds up the team's work. Each Translation Team Member (TTM) is a registered user, since the platform supports team formation and monitoring. The application was built with PostgreSQL, an open-source database engine; Spring, an open-source backend framework for Java; and Angular, an open-source web application framework. Results: The PROMIS translation platform becomes the only place to perform the translation tasks, and at the same time it tracks and organizes every step of the translation process. The platform facilitates supervision over the process. Items are assigned to the appropriate TTMs according to their roles in each process step. The user logs in, selects the correct translation, fills in a simple form, and carries out the task according to the protocol. The platform sends a notification of each TTM response and prompts the next active TTM to take up the following task. TTMs can see all their tasks in the dashboard as a list of issues to be done. This mechanism supports carrying out multiple translations simultaneously and speeds up the process. The system prevents unauthorized access to the translated text and unauthorized changes to it. The application generates and stores the Translation Item History as a spreadsheet, together with other necessary files about the process. Its use allows the final questionnaire to be obtained in a standardized form as a PDF file.
Oncology Study Group-Outcomes Questionnaire) was designed and validated for metastatic spine tumor patients, the use of general symptom-based PROMs, such as PROMIS (Patient-Reported Outcomes Measurement Information System) domains, may reduce both patient and physician burden and improve interdisciplinary care if shown to be concurrently valid. Methods: Metastatic spine tumor patients from 1/2017 to 4/2021 at a single academic medical center were asked to complete PROMIS PF (Physical Function), PI (Pain Interference), and Depression domains and the SOSG-OQ. Only patients who completed both the SOSG-OQ and PROMIS instruments were included in the analysis. Spearman correlation (ρ) coefficients were calculated. Patients missing a single question in the SOSG-OQ were excluded from the correlation analysis of the corresponding section. Results: A total of 87 unique visits, representing 67 patients met our inclusion criteria. A majority were men (50; 57%) and Caucasian (78; 90%), and the average age was 64 years (range: 34-87). There were 12 different types of tumors reported, with multiple myeloma, breast cancer, and prostate cancer representing 24 (28%), 22 (25%), and 11 (13%), respectively. Additional cancers included lung, colon, renal cell, thyroid, esophageal, non-Hodgkin's lymphoma, large B cell lymphoma, plasmacytoma, and metastatic spindle cell sarcoma. SOSG-OQ was strongly correlated with PROMIS PI (ρ = 0.83) and moderately correlated with both PROMIS PF (ρ = 0.75) and PROMIS Depression (ρ = 0.57). Conclusions: PROMIS PF, PI, and Depression appear to capture similar clinical insight as the SOSG-OQ. Spine surgeons can consider using these PROMIS domains in lieu of the SOSG-OQ in metastatic spine tumor patients.

P34 Impact of COVID-19/social distancing on PROMIS pediatric peer relationship T-Scores in children with Legg-Calve-Perthes Disease
Ruchita Iyer 1, Molly F McGuire 1, Angel A Valencia 2, Terri Beckwith 1, Chan-Hee Jo 1
deformity and progression from active to healed stage of the disease. Little is known about how the severity of deformity correlates with patient-reported quality of life measures at the healed stage of LCPD. The purpose of this study was to determine if the severity of femoral head deformity correlates with the Patient-Reported Outcomes Measurement Information System (PROMIS) Physical, Mental, and Social health measures at the healed stage. Methods: We retrospectively analyzed 62 patients (45 male, 17 female) from a single institution who met the following eligibility criteria: unilateral LCPD in the healed stage, age > 11 (i.e. adolescent or older), and completion of 6 PROMIS Pediatric Short Form v2.0 measures: Mobility 8a, Pain Interference 8a, Fatigue 10a, Anxiety 8a, Depressive Symptoms 8a, and Peer Relationships 8a. We excluded patients who had surgery within 2 years of the survey. We used a continuous femoral head deformity score, the Spherical Deviation Score (SDS), to assess the deformity on X-rays. Statistical analyses included Spearman's correlation to assess the relationship between the deformity and the PROMIS measures, and sub-analyses for age, gender, BMI, and history of surgery. The intraclass correlation coefficient (ICC) for intra-rater reliability of SDS measurements was also calculated. Results: The 62 patients had a mean age at time of diagnosis of 7.9 ± 2.7 years (range 2-14.4) and a mean age at the time of survey of 14.4 ± 2.3 years (range 11-21). We observed significant correlations between the deformity (SDS) and patient-reported mobility (r = −0.4, p = 0.002), pain interference (r = 0.3, p = 0.009), fatigue (r = 0.3, p = 0.01), anxiety (r = 0.5, p < 0.001), and depressive symptoms (r = 0.4, p < 0.001). No significant correlation was observed between the deformity and peer relationships (p = 0.3).
SDS measurements showed excellent intra-rater reliability (ICC = 0.92). Conclusions: Femoral head deformity correlated significantly with PROMIS physical (mobility, fatigue, pain interference) and mental health (anxiety, depressive symptoms) measures but not with the social health measure (peer relationships). These findings are clinically relevant as the severity of femoral head deformity is associated with patient-reported anxiety and depressive symptoms in LCPD.

Objective: Many self-report measures including the NIH Toolbox Emotion Battery (NIHTB-EB) and PROMIS seek to maximize precision while balancing burden. Although the current stopping rules for the computer adaptive testing (CAT) administration of these assessments are effective for some test takers, they can be burdensome for high-functioning individuals. Simultaneously, they yield inadequate reliability for some clinical populations. We evaluated four potential CAT stopping rules to increase reliability while minimizing burden. Methods: We conducted simulations for general and clinical pediatric samples using 17 NIHTB-EB item banks, three of which are equivalent to PROMIS banks. The current CAT stopping rules terminate the test once at least four items have been administered and the standard error (SE) of the EAP score estimate is < 0.3, or once a maximum of 12 items has been administered. The simulations considered the addition of a Standard Error (SE)-change rule (SE threshold for interim stopping reduced to 0.224) and a reduction of the maximum number of items, as well as examination of six- and eight-item fixed-length CATs. Simulees were grouped by the number of items administered and reliability achieved by each set of rules [Reliability < 0.85, 0.85 ≤ Reliability < 0.90, 0.90 ≤ Reliability < 0.95, 0.95 ≤ Reliability]. Results: Relative to the current rules, the SE-change rule minimally reduced average response burden (− 0.59 items general, − 1.2 items clinical). 
Although this rule increased the proportion of simulations achieving empirical reliability > 0.95 (+ 8.2% general, + 9.6% clinical), the average percentage of simulations achieving empirical reliability < 0.85 reached an unacceptable level (34.4% general, 20.9% clinical). Similarly, empirical reliability > 0.95 increased for six-item (+ 1.1% general, + 2.5% clinical) CAT; however, the percentage of simulations achieving reliability < 0.85 was excessively high (43.9% general, 31.4% clinical). Conversely, neither the eight-item CAT nor the reduced-maximum rule increased reliability < 0.85 relative to the current rules (eight-item: + 4.2% general, + 3.5% clinical; reduced-maximum: + 4.2% general + 3.6% clinical). The reduced-maximum rule also minimized burden by not always administering eight items (7.22 items general, 6.99 items clinical). Conclusions: Each condition has potential advantages and disadvantages for specific research and clinical uses. We determined that the reduced maximum rule best balanced burden and precision for combined research and/or clinical use.
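The stopping rules compared in this abstract can be illustrated with a short Python sketch. It is not the study's simulation code: the `StoppingRule` class, the `items_administered` helper and the SE sequences are hypothetical, the SE values are assumed to be on the θ metric, the rule is read as "at least four items and SE < 0.3, or 12 items", and the reduced maximum of eight items is an assumption inferred from the reported results.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class StoppingRule:
    """Hypothetical container for a variable-length CAT stopping rule."""
    min_items: int = 4          # never stop before this many items
    se_threshold: float = 0.3   # stop once SE falls below this value
    max_items: int = 12         # always stop at this many items

    def should_stop(self, n_items: int, se: float) -> bool:
        if n_items >= self.max_items:
            return True
        return n_items >= self.min_items and se < self.se_threshold


def items_administered(se_path: Sequence[float], rule: StoppingRule) -> int:
    """Return the test length, where se_path[i] is the SE after i + 1 items."""
    for n_items, se in enumerate(se_path, start=1):
        if rule.should_stop(n_items, se):
            return n_items
    return len(se_path)


current_rule = StoppingRule()             # rule as described in the abstract
reduced_rule = StoppingRule(max_items=8)  # assumed reduced maximum

# Made-up SE trajectories: one simulee who never reaches SE < 0.3,
# and one who is measured precisely from the first item onward.
slow_path = [0.80, 0.60, 0.50, 0.45, 0.41, 0.38, 0.36, 0.34, 0.33, 0.32, 0.31, 0.305]
fast_path = [0.25, 0.24, 0.23, 0.22, 0.21]

print(items_administered(slow_path, current_rule))  # 12
print(items_administered(slow_path, reduced_rule))  # 8
print(items_administered(fast_path, current_rule))  # 4
```

Under these assumptions the reduced maximum only shortens tests that would otherwise run to the 12-item cap, while simulees who reach the SE threshold early are unaffected.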

O38
Optimizing the efficiency of computerized adaptive tests using real data: a machine learning approach
Objective: To reduce administrative burden and increase the efficiency of Computerized Adaptive Tests (CATs), our objective is to develop an additional stopping rule for CATs based on the change in the SE of a person's estimated score (θ) after each administered item. Methods: In April 2020 and November 2020, Patient-Reported Outcomes Measurement Information System (PROMIS) CATs were administered (n-range: 3212-3429). The stopping rules consisted of a standard error of measurement SE(θ) ≤ 0.32 (90% reliability) or a maximum of 12 items administered. Data on item selection, item responses, θ and SE(θ) at each step within the CAT were extracted. The data were split into a training set and a test set. Using a machine-learning procedure, the efficiency, $(1 - \mathrm{SE}(\theta)^2)/n_{\text{items}}$, was maximized against the change in SE(θ) in the training set to determine the optimal change in SE(θ) to be used as a stopping rule. This stopping rule was subsequently applied to the test set (in addition to the standard stopping rules), and the number of participants reliably estimated (SE(θ) ≤ 0.32), the average test length and the relative efficiency were compared with those obtained using only the original stopping rules. We applied this procedure to the Anxiety (low α parameters) and Depressive Symptoms (high α parameters) CATs, as it is likely that the optimal stopping rule for change in SE(θ) is influenced by the discrimination parameters within the item response theory (IRT) model. Results: Preliminary results show that for Depressive Symptoms/Anxiety, respectively, 1193 (35.9%)/1369 (39.7%) participants were administered 12 items, of which 33.9%/18.9% were due to floor effects. For these floor effects, a change in SE of 0.01 would reduce the number of items administered from 12 (SE(θ) = 0.588/0.566) to 4 (SE(θ) = 0.619)/5 (SE(θ) = 0.592), which results in a difference in T-score estimates of 3.3 (32.0 vs 35.3)/3.4 (31.9 vs 34.3). 
Conclusions: Optimizing CAT efficiency by adding an additional stopping rule based on the change in SE(θ), may reduce the burden of PROMIS administration, while retaining precise, reliable measurements. Further optimizing the stopping rule will likely result in better trade-offs with fewer negative consequences to T-score estimates.
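A small Python sketch may help make the efficiency measure and the change-in-SE stopping rule concrete. It is only an illustration under stated assumptions: the function names, the SE trajectory and the fixed threshold of 0.01 are hypothetical, and the sketch does not reproduce the machine-learning procedure that the authors used to choose the threshold. It relies on the standard relation reliability = 1 − SE(θ)², which holds when θ is scaled to unit variance (so SE(θ) = 0.32 corresponds to a reliability of about 0.90).

```python
from typing import Optional, Sequence


def efficiency(se: float, n_items: int) -> float:
    """Efficiency as defined in the abstract: (1 - SE(theta)^2) / number of items."""
    return (1.0 - se ** 2) / n_items


def stop_index(se_path: Sequence[float], min_change: float,
               se_target: float = 0.32, max_items: int = 12) -> int:
    """Test length under the standard rules plus a change-in-SE rule.

    se_path[i] is SE(theta) after i + 1 items. The test stops when the SE
    target or the item cap is reached, or when the drop in SE from the
    previous item is smaller than min_change (hypothetical extra rule).
    """
    previous_se: Optional[float] = None
    for n_items, se in enumerate(se_path, start=1):
        if se <= se_target or n_items >= max_items:
            return n_items
        if previous_se is not None and (previous_se - se) < min_change:
            return n_items
        previous_se = se
    return len(se_path)


# Made-up trajectory of a respondent at the floor: SE stalls near 0.6.
floor_path = [0.90, 0.75, 0.66, 0.62, 0.615, 0.612, 0.610,
              0.608, 0.606, 0.604, 0.602, 0.600]

n_standard = stop_index(floor_path, min_change=0.0)   # standard rules only
n_changed = stop_index(floor_path, min_change=0.01)   # with change-in-SE rule

print(n_standard, efficiency(floor_path[n_standard - 1], n_standard))
print(n_changed, efficiency(floor_path[n_changed - 1], n_changed))
```

In this toy trajectory the extra rule stops the test after five items instead of twelve, at the cost of a slightly larger SE, which is the trade-off the abstract describes for respondents at the floor.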
Objective: The American Psychiatric Association (APA) recently selected new level-2 DSM-V instruments for monitoring with shorter administration times, consisting of several Patient-Reported Outcomes Measurement Information System (PROMIS) measures for adults (aged 18+) and children and adolescents (aged 8-18). To increase the age range of these measures the PROMIS initiative developed Early Childhood measures for measuring mental health in very young children (aged 1-5). These measures were translated to Dutch by forward and backward translations and cognitive debriefing. The objective of this study is to investigate the psychometric properties of four recently (2019) developed PROMIS Early Childhood (PROMIS EC) measures for assessing Anxiety, Depression, Irritability (anger) and Sleep Disturbance in the Dutch general population. Methods: Secondary data analyses will be performed on data collected in 2020-2021 from a study that assessed the consequences of the COVID-19 outbreak on very young children. The Anxiety, Depression, Irritability and Sleep Disturbance complete item banks will be administered to parents of young (aged 1-5) children (n ≈ 1300). To assess structural validity of each item bank a graded response model (GRM) will be fitted to the data after assessing the following assumptions: unidimensionality through CFA (CFI > 0.95, TLI > 0.95, RMSEA < 0.10), local independence by residual correlations (r < 0.20) and monotonicity by Mokken analysis (H > 0.50, H_i > 0.30). Item fit of the GRM models will be inspected with S-X², where p < 0.001 indicates misfit. Additionally, the percentage of participants reliably measured will be assessed using the standard error of measurement (SEM) < 0.32 as a criterion (which equals a reliability of 0.90). If possible, differential item functioning (DIF) analyses will be performed between the Dutch and U.S. model. Results: Translations were successful. Validation results will be presented at the conference. 
Conclusions: After initial validation of these item banks, they can be implemented as CAT within the Netherlands to provide new measures to reliably and validly assess mental health in very young children (aged 1-5). Objective: Patient-reported measures of health-related quality of life (HRQOL) are collected across healthcare systems to track patient conditions, evaluate change over time, and inform health policy. Many systems additionally collect construct-specific patient-reported outcome measures (PROMs). As patients and clinicians are inundated with surveys and data, efforts should be made to tailor survey administration to individual patient needs. Our study evaluated the ability of utilizing items on a measure of HRQOL to identify patients who may require additional screening.

Methods:
A cross-sectional study was conducted of patients who completed PROMIS Global Health (GH) as part of routine care in a large healthcare system from 1/1/2016-12/31/2018. Additional construct-specific surveys were also routinely collected in some clinical centers. Receiver operating characteristic analysis was used to identify optimal thresholds for PROMIS-GH items predicting clinically meaningful thresholds on construct-specific PROMs: PHQ-9 score ≥ 10, Neuro-QoL Cognitive Function, PROMIS Physical Function, and Social Role Satisfaction T-score < 40, PROMIS Anxiety, Fatigue, Sleep Disturbance, and Pain Interference T-score > 60. Results: Patients completed 1,085,599 PROMIS-GH surveys, with between 8,832 (for Neuro-QoL cognitive function) and 182,000 (PHQ-9) additionally completing one of the above construct-specific surveys. Scores ≤ 3 on PROMIS-GH item 10 (emotional problems) had 94.7% sensitivity (area under the curve (AUC) 0.867) for identifying patients with meaningful anxiety on PROMIS Anxiety and 90.0% sensitivity (AUC 0.820) for identifying patients with moderate-severe depressive symptoms on PHQ-9. Similarly high sensitivity and AUC were demonstrated for PROMIS-GH items assessing mental and physical health, ability to carry out social and physical activities, fatigue, and pain to identify poor scores in their corresponding construct-specific PROMs. As expected, the worst performance was seen with the PROMIS-GH fatigue item when used to screen for poor PROMIS Sleep Disturbance scores (sensitivity 83.8%, AUC 0.712).
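The screening logic described above (flag a patient when a single PROMIS-GH item score is at or below a cutoff, then compare against the construct-specific PROM) can be sketched as follows. This is a minimal illustration with made-up data; the function name and the example scores are hypothetical, and the abstract does not state which criterion was used to select the "optimal" threshold.

```python
from typing import Sequence, Tuple


def screen_metrics(item_scores: Sequence[int],
                   reference_positive: Sequence[bool],
                   cutoff: int) -> Tuple[float, float]:
    """Sensitivity and specificity of the rule 'item score <= cutoff'.

    reference_positive[i] is True when patient i meets the clinically
    meaningful threshold on the construct-specific PROM.
    Assumes both positive and negative reference cases are present.
    """
    tp = fp = fn = tn = 0
    for score, positive in zip(item_scores, reference_positive):
        flagged = score <= cutoff
        if flagged and positive:
            tp += 1
        elif flagged and not positive:
            fp += 1
        elif not flagged and positive:
            fn += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Made-up example: ten patients, single-item scores from 1 (worst) to 5 (best).
item_scores = [1, 2, 2, 3, 3, 4, 4, 5, 5, 5]
reference_positive = [True, True, True, True, False,
                      True, False, False, False, False]

# Tabulate the trade-off for every candidate cutoff.
for cutoff in range(1, 6):
    sensitivity, specificity = screen_metrics(item_scores, reference_positive, cutoff)
    print(cutoff, round(sensitivity, 2), round(specificity, 2))
```

Tabulating both metrics across cutoffs traces out the points of a ROC curve; a screening application would typically favor cutoffs with high sensitivity, as in the results reported here.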

Conclusions:
Our study provides preliminary support for the ability of utilizing PROMIS Global Health items as screening tools to identify patients who would most benefit from additional construct-specific PROMs. Through directing PROMs to patients for whom they are most applicable, survey burden is reduced for the majority of patients, allowing a more efficient and targeted use of PROMs to improve healthcare decision-making. Objective: Sustainable health, a comprehensive state of physical, mental and social well-being that is attained and maintained throughout life, is affected by numerous factors. These include diet quality, physical activity, sleep and physical, mental, and social health. Patient-Reported Outcomes Measurement Information System (PROMIS), a platform that includes valid, reliable and standardized questionnaires, may be used to identify and manage several of these factors. The objective of this study was to evaluate 1) the acceptability of using PROMIS measures in a University Wellness Program, and 2) the relationship between PROMIS physical, mental and social measures with lifestyle factors such as diet quality, physical activity and stress in the French-Canadian university community. Methods: All students (n = 2000; native French-speaking) will complete questionnaires on a web-based platform. The psychosocial questionnaires are tailored to each individual through a computerized adaptive testing platform, which uses algorithms to adapt the questions presented to the individual according to the answers provided for each question. During this stage, questionnaires related to sociodemographics, diet quality, physical activity, stress and COVID-19 will also be completed to determine any associations with PROMIS measures. Acceptability will be measured using open-ended questions to students about the value of completing measures. 
Pearson's correlations between the PROMIS measures, Nutrient-Rich Foods (NRF9.3) Index, an indicator of diet quality, physical activity, stress and sleep will be performed.

Results:
The study started later than expected, in April 2021, due to COVID-19, and results will be presented using data collected until October 2021. It is expected that the PROMIS questionnaires will provide good validity (moderate to high correlations; r > 0.5). Furthermore, it is anticipated that several PROMIS measures will be associated with lifestyle factors such as diet, physical activity and sleep. Conclusions: These results will provide evidence for the possible benefits of using PROMIS measures in university students. PROMIS may be used to support the development of interventions in the framework of services provided by the university to help students take charge of their physical, social and mental health and well-being.
Objective: Parkinson's disease is a neurological movement disorder traditionally characterized by motor disturbance (tremor, rigidity, gait disturbance), but it also impacts cognitive function, independence and self-care. Symptom management-based pharmacotherapy is the most common intervention, and medication regimens can be complicated, burdensome, and change with disease progression. Cognitive decline in people with Parkinson's Disease (PwPD) may reduce the capacity for independent medication management. Clinicians' and patients' appreciation of cognitive abilities might be inaccurate. Accurate identification of cognitive function might enhance care and outcomes by employing appropriate medication adherence strategies in PwPD. This study identifies the relationship between patient-reported medication management capabilities and patient-reported metrics of cognition with quantitative measures of cognitive ability. Methods: Retrospective review of data collected through routine care of PwPD who were evaluated by a standardized validated multidimensional computerized cognitive assessment battery (CAB, NeuroTrax) and completed patient reported outcomes ( Results: 90 PwPD, 64% male, average age 73 ± 9 years. Significant correlations were determined by regression analysis with p < 0.05 for the following metrics: MM vs GCS (r² = 0.41), MM vs AC (r² = 0.29). 31% of PwPD sampled had low confidence in managing medications, with 37% of males and 21% of females having low confidence in managing medications. Conclusions: Increasing cognitive impairment in PwPD was associated with less confidence in effective and safe self-management of medications. Quantified measures of cognitive performance were more strongly associated with self-medication management efficacy than were perceptions of cognitive abilities in PwPD. Additional risk factors for impaired medication management in PwPD include low GCS and male gender. 
CAB in conjunction with MM PRO can provide value added information in care of PwPD. Recognizing the need for and incorporating strategies to assure effective adherence to treatment regimens can enhance care in PwPD.

Objective: Multiple Sclerosis (MS) is an autoimmune disease characterized by relapses, progression, physical disability and MRI changes.
Increasing disease impact is associated with physical disability, cognitive impairment, psychological impact and impaired social functioning. The traditional approach to patient care in MS focuses on identifying and treating the physical symptoms of MS with Disease Modifying Therapies (DMT); however, the relationship of this to the overall patient experience remains uncertain. Patient reported outcomes (PROs) evaluating psychological and social functioning may provide value added information to identify critical patient-centric aspects that impact quality of life (QoL) and allow unrecognized opportunities to enhance outcomes and satisfaction. This study explores the impact of psychological, social and physical functioning on meaning and purpose in people with Multiple Sclerosis (PwMS). Methods: Retrospective chart review of data collected through routine care of PwMS who completed PROs including: PROMIS Meaning and Purpose-Short Form 4a (MP), Patient Determined Disease Steps (PDDS), Neuro-QoL Ability to Participate in Social Roles and Activities - Short Form (SR), and Hospital Anxiety and Depression Scale (HADS). Results: 345 PwMS, 73% female, average age 50.6 ± 11.7 years. Significant correlations were determined by regression analysis with p < 0.01: MP&PDDS (r² = 0.07), MP&HADS-A (r² = 0.17), MP&HADS-D (r² = 0.40), and MP&SR (r² = 0.26). Conclusions: Meaning and purpose in PwMS is more closely correlated with psychological or social factors than with physical disability. PRO physical disability (PDDS) demonstrated a subtle negative correlation with MP, but SR and HADS both had a large effect on MP, indicating that social and psychological function in PwMS may be a larger contributor to patient QoL than previously anticipated. Enhanced understanding of such impact, identifying those with such impact and addressing these needs might provide unique opportunities to improve care, outcomes and satisfaction. 
Objective: Multiple Sclerosis (MS) is a chronic disease for which there are multiple disease modifying therapies and symptomatic medications. Impaired self-efficacy of medication management can result in sub-optimal outcomes. Cognitive impairment in people with MS (PwMS) can impact multiple cognitive domains (CD) to varying degrees and combinations. The relationship of cognitive impairment across multiple CD to medication management in PwMS remains uncertain. Impaired medication management might adversely impact well-being, as well as social and physical functioning. Improved awareness of patient centric impaired self-efficacy of medication management might provide proactive opportunities for intervention. This study explores the relationships between self-efficacy of managing medication and cognitive function in PwMS. Methods: Retrospective chart review of PwMS who underwent standardized multi-domain computerized cognitive testing (CAB, Ntrax) and completed patient reported outcomes (PRO) including PROMIS Self-Efficacy for Managing Medication and Treatments (MM-4). CAB includes 7 cognitive domains: memory (Mem), executive function (Exe), attention (Att), information processing speed (Inf), visual spatial (Vis), verbal function (Ver), motor skills (Mot) as well as a global cognitive summary score (GCS). Results: 338 PwMS (74% female, age = 50.6 ± 11.7 years). Regression modeling showed the following relationships with MM-4: GCS (r² = 0.16, p < 0.05), Mem (r² = 0.01, p < 0.05), Exe (r² = 0.32, p < 0.05), Vis (r² = 0.05, p < 0.05), Ver (r² = 0.07, p < 0.05), Att (r² = 0.29, p < 0.05), Inf (r² = 0.21, p < 0.05), Mot (r² = 0.08, p < 0.05). Conclusions: Increasing cognitive impairment is associated with worse self-efficacy for managing medication and treatment. Progressive impairment of specific CDs is associated with progressive impairment of MM-4. 
Executive function shows the strongest relationship with MM-4, followed by attention and information processing. Incorporation of CAB and MM-4 into routine care can provide value added patient centric information that might offer opportunities to enhance care and outcomes in PwMS.

Evaluation. We compared CAT, RP, and CT score (1) ranges; (2) correlations with gold standard; (3) root mean square differences (RMSDs) vs. gold standard; (4) mean SEs, mean reliabilities; (5) clinical vs. non-clinical mean differences, Cohen's D effect sizes; and (6) item exposure, reflecting content coverage. Results: For Anxiety: (1) CAT score ranges (clinical/non-clinical = 38.6-83.7/36.1-82.6) were "better" (greater) than ).

Objective:
The Robert H. Lurie Comprehensive Cancer Center symptom assessment included 5 Patient-Reported Outcome Measurement Information System (PROMIS) domains measured using computer adaptive tests (CATs): Anxiety, Depression, Fatigue, Pain Interference, and Physical Function. When completed longitudinally, these assessments provide personal and group-level symptom trajectories (improving, declining, static) that can inform clinical care and research. We describe symptom status and change in a cohort of oncology outpatients and identify subgroups by their trajectories. Methods: Sample. Following initial implementation of the assessment in routine clinical care, a convenience sample of 141 patients completed baseline (T1) and 3 subsequent assessments (T2-T4), each separated by 30 days or more. Using latent growth curve modeling (LGCM), we estimated symptom trajectories per domain, determining individual patient starting values (intercepts) and change rates (slopes), then summarized them at the group level. With growth mixture modeling (GMM), we investigated intercept/slope variability, identifying whether a single group (class) or multiple classes better accounted for observed variability. When the preferred solution was multi-class, we re-estimated symptom trajectories per class and compared class symptom characteristics. Results: For all symptoms, assuming population homogeneity, we estimated common-class T1 starting values and change rates. With Pain Interference, the T1 start value was T-score = 49.6; change was essentially static (slope = − 0.01). However, with GMM we identified 2 distinct pain-associated patient classes and re-estimated their unique pain intercept/slope values: Class 1 (n = 91) had better (43.8/0.21) vs. Class 2's (n = 50) worse pain status (59.8/− 0.41). At T1, classes differed in pain status by 16.0 T-score points; by T4 this difference decreased but remained considerable (11.4 T-score points). 
We re-estimated class-specific intercept/slope values for other symptoms evaluated: Class 2 reported significantly worse anxiety, depression, fatigue, and physical function status at T1 through T4. Conclusions: These data, collected in routine cancer care, present an exciting opportunity to evaluate longitudinal patient-reported symptoms across a priority set of health domains. LGCM and GMM offer flexible methods for longitudinally characterizing domain status and change. They can be applied to investigate patient classes by clinical factors (e.g., cancer type, time since diagnosis, intervention) and, given available data, classes might be described by demographic and clinical status. The aim of this study is to examine the test-retest reliability, measurement error (Smallest Detectable Change (SDC)), responsiveness, and Minimal Important Change (MIC) of the DF-PROMIS-PF, DF-PROMIS-UE and DF-PROMIS-PI item bank administered as Computerized Adaptive Test (CAT) in patients receiving physical therapy. Methods: Adult (> 18 y) patients with musculoskeletal disorders of the lower back, neck or upper extremity from 8 primary care clinics will be included in the study. At admission (T0), a questionnaire with demographic and clinical characteristics, the PROMIS CATs and standard used legacy questionnaires (Quebec Back Pain Disability Scale (QBPDS) for low back pain, Neck Disability Index (NDI) for neck pain and Disability of the Shoulder Arm or Hand questionnaire (DASH) for upper extremity disorders) will be administered. After 3 to 14 days (T1), the PROMIS CATs and anchor questions that measure change on the construct will be administered. At discharge (T2) the PROMIS CATs, legacy questionnaires and anchor questions will be repeated. Patients will be classified as "unchanged", "deteriorated" or "improved" based on the response on the anchor questions. 
The test-retest reliability of each PROMIS CAT will be determined by calculating the Intraclass Correlation Coefficient (ICC(2,1)) of the PROMIS CAT T-scores for "unchanged" patients between T0 and T1. Standard Error of Measurement (SEM) and SDC will be calculated as parameters of measurement error: $\mathrm{SEM}_{\text{agreement}} = \sqrt{\sigma^2_{\text{measurement}} + \sigma^2_{\text{residual}}}$ and $\mathrm{SDC} = 1.96 \times \sqrt{2} \times \mathrm{SEM}$. Responsiveness will be determined by testing a priori hypotheses of expected correlations between changes in PROMIS CAT scores and changes in legacy PROM scores. Responsiveness will be considered sufficient when at least 75% of the hypotheses are not rejected. The MIC will be calculated using predictive modelling. Results: We aim to include at least 150 participants for each disorder (low back, neck or upper extremity). Initial results will be presented at the conference. Conclusions: This is the first study to examine the test-retest reliability and responsiveness of PROMIS CATs in primary care physical therapy in The Netherlands.

Objective: The Patient-Reported Outcomes Measurement Information System (PROMIS®) has invested much effort in the development of self-report tools to enable the measurement of individuals with high reliability. Measuring an individual's health status and symptoms reliably does not necessarily mean that changes in these symptoms are equally captured with high reliability. High levels of reliability for individual change, however, are essential for detecting individual trends over time, such as symptom recovery following surgery. Using PROMIS pain measures collected in a post-operative period, we examined three aspects that may contribute to reliable change scores: measurement frequency, test length, and static versus adaptive testing. Methods: Over almost 3 weeks following hernia surgery, 98 male patients completed daily diary versions of PROMIS pain interference and pain behavior short-forms. 
Based on these data, post-hoc simulations were conducted with the aim of comparing the ability of different strategies to achieve high reliability (i.e., > 0.9). Our simulations varied a) the number of measurement occasions over the study period (sampling density), b) the number of items (test length), c) and the mode of administration (i.e., static short-form vs. computer-adaptive testing [CAT]). Using a growth-curve modeling approach, observed change scores were compared to the best approximation of "real" (i.e., latent) change. Results: When all pain interference or pain behavior items from all days of the study period were used, observed change scores showed near perfect reliability (i.e., approaching 1.0). The number of items and the number of measurement occasions both contributed to the reliability of observed change scores. In contrast to previous findings, CAT administration was generally superior to short-forms in achieving high reliability. Conclusions: Various factors influence the reliability of change scores, including the sampling density, test lengths, and mode of administration. Further research should aim at identifying items that are best-suited for measuring change and, if required, add these items to existing PROMIS item banks. Objective: Triage treatment by physical therapists is an evolving service to improve diagnosis and outcomes in primary care. A challenge for health systems is to document outcomes of this service across a population. A potential outcome of primary care physical therapy (PC-PT) is to improve physical function across a population. However, current models of utilization focus on diagnosis rather than patient needs, as defined by the PROMIS Physical Function measure. The purpose of this study was to examine the association of recommendations from PC-PT for further physical therapy in primary care patients with musculoskeletal problems. 
Methods: Patient records from Jan 2021 to April 2021 were requested from an evolving database to assess PC-PT in primary care (n = 383). PC-PTs were trained to use the PROMIS PF computer adaptive measure at intake to quickly assess perceptions of physical function. Training included interpreting the PROMIS PF measure in addition to other diagnostic decisions. Initial analysis was univariate (i.e. chi-square), followed by logistic regression; the outcome for both was referral to further outpatient PT. The predictor variables included: PROMIS PF severity (Very Low PF (< 40), Low PF (40.1-50), or Above Average PF (> 50.1)), age, gender, acuity of symptoms (acute, subacute, chronic), and area of injury (spine, extremity, other). Results: Of 383 patients, 301 had complete data on all noted variables. A total of 40.5% (122/301) were recommended for physical therapy by the PC-PT. Chi-square analysis showed no significant associations between recommendations for PT and gender (p = 0.46), acuity categories (p = 0.07), or area of injury (p = 0.09). However, there was a strong association of PT referral with PROMIS PF categories (p < 0.001). The logistic regression analysis showed that age (p = 0.04), acuity (p = 0.07) and PROMIS PF (p < 0.001) categories influenced the recommendation of further physical therapy by the PC-PT. The accuracy when these three variables were included in the model was 67.1%. Conclusions: PC-PT decisions are consistent with patient needs as defined by the PROMIS PF measure severity when recommending further physical therapy services following a primary care visit with the PC-PT. To improve population health outcomes, specialized programs may be needed to address patient needs (i.e. low PF) in addition to specific diagnostic categories.

O53 PROMIS Physical Function severity is associated with physical therapy recommendations in primary care
Objective: Aggregate review of PRO data is necessary for clinical application and research and is only successful if there is access to robust datasets. However, it isn't enough to have the data; it must be put to work. Developing a standardized data request system that is nimble enough to adjust to the changing needs of the requestor, with access to data that was previously stuck in inaccessible silos, takes forethought. Only by planning ahead can systems be designed that are easy to use and manage and will produce data that can inform clinical decision making. Methods: Transitioning from simple email requests to a standardized request process involved the use of a service desk software program. Once the single point of contact system was in place it was easier to collect required information, track requests and support regulatory requirements for research requests. Consolidation was an important component of this project as data is pulled from multiple locations into one warehouse. Decisions about which components to include were based on previous data requests and review of similar systems across the enterprise. Even with standardization, it is often necessary to clarify requests. Having an integrated communication platform allows the analyst to exchange ideas, monitor changes and suggest tactics so the resulting data meets the needs of the requestor. Results: Initial data requests were for administration metrics and patient PROMIS scores. After a year, with the introduction of monthly collection reporting, the majority of requests switched to longitudinal PROMIS and other PRO scores anchored by medical interventions or events. In 2020, 41 requests for data came through the system. 93% were for research or quality improvement initiatives and the rest for a variety of administrative evaluations. In the first quarter of 2021, all requests have been for research. Conclusions: Clinical PRO data is typically not as clean as that collected as part of a research protocol. 
Having a standardized request system that guides the requestor and supports the data analyst is key to producing results that can yield new insight into how to improve clinical outcomes and value in healthcare.

Aligning significant individual change with patient-perceived meaningful change on the PROMIS Physical Function 10a
John Devin Peipert 1 , Ron Hays 2 , David Cella 1

Objective: Patient-reported outcome measures (PROMs) are powerful tools that can facilitate person-centered care by highlighting individuals' experience of illness. Little is known about the utility of implementing PROMs in the clinical care of patients with systemic lupus erythematosus (SLE), a chronic systemic autoimmune condition. This qualitative study aimed to evaluate the benefits and challenges of integrating PROMs into the routine clinical care of SLE from the perspective of patients and physicians participating in a multi-center longitudinal study. Methods: SLE outpatients and treating rheumatologists participating in a longitudinal study of the implementation of PROMIS computerized adaptive tests in clinical care were invited to participate in focus groups and structured interviews. Focus groups of patients were conducted in-person and semi-structured interviews of physicians were conducted via video teleconference. Patients and physicians were queried on the utility, benefits, challenges, and ideal implementation of PROMs in clinical care. All sessions were audio recorded and transcribed verbatim. Transcripts were reviewed to construct and refine a codebook using a comparison and consensus approach and a thematic analysis was performed. Results: Twelve patients and 8 rheumatologists participated in focus groups and interviews. Patients and physicians reflected on the value of PROMs in facilitating communication and strengthening therapeutic relationships by highlighting and validating the patient experience of SLE. Patients found that PROMs enabled self-monitoring, but noted that the surveys were most useful when reviewed and discussed with their rheumatologists. Physicians believed PROMs promoted patient engagement and awareness, and emphasized their role in drawing attention to emotional health issues that might otherwise have been unaddressed. 
Both patients and physicians suggested that ideal clinical implementation of PROMs requires integration with the electronic health record, detailed guidance on score interpretation and population norms, and survey customization options. Conclusions: SLE patients and rheumatologists participating in a longitudinal study of the implementation of PROMs in clinical care found that PROMs enhanced the care of SLE primarily by facilitating patient-physician communication and promoting patient self-reflection and validation. Optimal implementation of PROMs in routine SLE care requires physician engagement, easily interpretable scores, and integration with existing clinical platforms.

Objective: Patient-reported outcome measures (PROMs) are powerful tools that can highlight the patient experience of illness. Although PROMs are standard metrics in SLE clinical research, they are not routinely integrated into the clinical care of this systemic condition. The aim of this study was to assess the feasibility and impact of implementing web-based PROMs in the routine clinical care of outpatients with SLE. Methods: Outpatients fulfilling SLE classification criteria were enrolled in this longitudinal cohort study at two academic medical centers. Subjects completed PROMIS computerized adaptive tests at enrollment and prior to two consecutive routinely scheduled rheumatology visits using the ArthritisPower research registry mobile or web-based application. Score reports were shared with patients and providers before visits. Patients and rheumatologists completed post-visit surveys evaluating the utility of PROMs in the clinical encounters. Results: A total of 105 SLE patients and 17 rheumatologists participated in the study. Subjects completed PROMs in 159 of 184 eligible encounters (86%, 95% CI 81-91) prior to study suspension due to the COVID-19 pandemic. Following baseline surveys, PROMs were completed for 90% (95% CI 82-95) of first visits and 82% (95% CI 72-90) of second visits.
Nearly all PROMs (93%) were completed remotely. Patients and rheumatologists reported that PROMs were useful (91% and 83% of encounters, respectively) and improved communication (86% and 72%). Rheumatologists found that PROMs affected patient management in 51% of visits, primarily by guiding conversations (84%), but also by influencing medication changes (15%) and prompting referrals (10%). There was no statistically significant difference in visit length before (mean = 19.5 min) and after (mean = 20.4 min) implementation of PROMs (p = 0.52). Health-related quality of life and disease activity did not change significantly after implementation of PROMs, but patient activation improved in 14 of 23 (61%) participants with low baseline activation levels. Conclusions: The remote capture and subsequent integration of PROMs into clinical care was feasible in this diverse cohort of SLE outpatients. PROMs were useful to patients and rheumatologists, and promoted patient-centered care primarily by facilitating communication. Further studies are needed to clarify the impact of clinical integration of PROMs on activation and SLE-related outcomes.

J Patient Rep Outcomes 2021, 5(Suppl 1):90

to r = 0.596). A linear regression using the KUS as the independent variable explained a modest proportion of the variance of the PROMIS scores: Anxiety R² = 0.074, Depressive symptoms R² = 0.049, Physical function R² = 0.160, Fatigue R² = 0.144, Peer relationships R² = 0.050, Pain interference R² = 0.105. Conclusions: This study explored the relationship between the paediatric PROMIS-25 and a utility measure. A limitation of the study is the size of the study population; however, the data collected allowed a preliminary investigation. Further exploration of the relationship between PROMIS-25 and utility measurements would benefit from a larger population and the calibration of the utility weights generated by different utility instruments.
References
[1] Chen C., Stevens S., Rowen D., Ratcliffe J. From KIDSCREEN-10 to CHU9D: creating a unique mapping algorithm for application in economic evaluation. Health and Quality of Life Outcomes 2014, 12:134.

Objective: Stroke patients often have "hidden deficits," such as fatigue, cognitive symptoms, and depression, that impair their health-related quality of life (hrQoL) and are best measured by patient self-report. To better understand and optimize outcomes of stroke survivors, the Cleveland Clinic Cerebrovascular Center began collecting patient-reported outcomes (PROs) within the ambulatory clinics in late 2008. We describe our experience with collection and clinical utilization of PROs in the cerebrovascular clinics of a large healthcare system. Methods: Implementation occurred as part of a larger patient-entered data initiative within Cleveland Clinic. PROs were initially collected through an internally developed patient data collection platform and were migrated to Epic tools in November 2019. Patient questionnaires included the PHQ-9 depression screen, a sleep apnea scale, PROMIS Global Health, and computer adaptive testing versions of PROMIS physical function, fatigue, pain interference, sleep disturbance, and satisfaction with social roles, and of NeuroQoL cognitive function. Clinicians also record clinical information regarding patients' cerebrovascular disease in structured fields within the EHR. T-scores can be viewed graphically over time in Epic's Synopsis reports. Score percentiles are automatically inserted into documentation templates. Results: Since data collection started in 2008, PROs have been collected in 39,863 visits, representing 22,542 unique patients. Completion rates have consistently been over 50%. Patients who complete questionnaires are younger (mean age 58.7 [SD 15.8] vs. 62.0 [SD 15.8] years, P < 0.001) and have lower clinician-reported disability scores (mean modified Rankin scale 1.13 [SD 1.13] vs. 1.39 [SD 1.26], P < 0.001).
The majority (58.9%) of patients have at least one score ≥ 1 SD worse than the US population mean, and 38.3% have two or more scores besides PROMIS physical function that are ≥ 1 SD worse than the population mean. There is wide variability in severity of symptoms among patients with similar clinician-reported disability and neurological deficits. We will provide examples along with actions that can be taken based on PROMIS scores. Conclusions: PRO collection in a cerebrovascular clinic is feasible. PROs have dramatically improved our understanding of the health status of our stroke patients and have informed clinical management. Development of evidence-based interventions for PRO scores will further improve their usefulness in ambulatory stroke care.
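The "≥ 1 SD worse than the population mean" criterion in the abstract above can be illustrated with a minimal Python sketch. It assumes the standard PROMIS T-score metric (reference population mean 50, SD 10) and the usual scoring direction, in which higher scores are worse for symptom domains and lower scores are worse for function domains. The domain keys, the helper `flag_domains`, and the example patient are hypothetical and are not taken from the study data or from the clinic's Epic implementation.

```python
# PROMIS T-scores are standardized to mean 50, SD 10 in the reference population.
POPULATION_MEAN = 50.0
POPULATION_SD = 10.0

# Direction of "worse" per domain (assumed here): symptom domains worsen as
# scores rise, function domains worsen as scores fall.
HIGHER_IS_WORSE = {
    "fatigue": True,
    "pain_interference": True,
    "sleep_disturbance": True,
    "physical_function": False,
    "social_roles_satisfaction": False,
    "cognitive_function": False,
}


def flag_domains(t_scores, threshold_sd=1.0):
    """Return domains whose T-score is at least threshold_sd SDs worse than the mean."""
    flagged = []
    for domain, score in t_scores.items():
        deviation = (score - POPULATION_MEAN) / POPULATION_SD
        if not HIGHER_IS_WORSE[domain]:
            deviation = -deviation  # for function domains, low scores are worse
        if deviation >= threshold_sd:
            flagged.append(domain)
    return flagged


# Hypothetical patient, for illustration only (not study data).
example_patient = {
    "fatigue": 63.0,
    "pain_interference": 55.0,
    "sleep_disturbance": 60.0,
    "physical_function": 38.0,
    "social_roles_satisfaction": 47.0,
    "cognitive_function": 41.0,
}

flagged = flag_domains(example_patient)
# Count flagged domains other than physical function, mirroring the
# "two or more scores besides PROMIS physical function" summary.
flagged_besides_pf = [d for d in flagged if d != "physical_function"]
print(flagged, len(flagged_besides_pf))
```

In this sketch a patient counts toward the first summary figure when `flagged` is non-empty, and toward the second when `flagged_besides_pf` has at least two entries.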
Objective: To assess demographic and clinical/surgical characteristics associated with improvement in PROMIS physical function (PF) and pain interference (PI), and to contrast them with predictors of the typical legacy measures (ODI, NDI, and NRS pain intensity) after spine surgery for degenerative conditions. Methods: We analyzed 727 patients who underwent lumbar or cervical spine surgery for degenerative conditions at a single institution and had preoperative and 12-month follow-up PROMIS data. Demographic characteristics (age, gender, race, smoking status, education, insurance, liability claim, employment status), clinical/surgical characteristics (preop opioid use, comorbidities, procedure, revision status), and preoperative outcome scores were entered as predictors of 12-month PROMIS (PF and PI) and legacy measures (ODI/NDI, NRS axial pain, NRS extremity pain). The 4-item PROMIS short forms were used to assess PF and PI. Predictor importance, coefficients, and overall model R² values are presented. Results: As expected, the baseline score associated with each outcome had the highest predictor importance. Other predictors that were significant for both 12-month PF (R² = 0.38) and ODI/NDI (R² = 0.37) included preop opioid use, preop PROMIS depression, employment status, comorbidity count, and education. BMI, smoking status, and age were significant predictors only of PROMIS PF, while race, revision status, and pain intensity were significant only for ODI/NDI. Procedure, liability status, lumbar vs cervical, gender, and insurance status were not significant predictors of either outcome in these models. Significant predictors of PROMIS PI (R² = 0.28) were preop score, preop opioid use, employment, education, PROMIS preop depression, smoking status, comorbidities, and race. Significant predictors of NRS pain scores were similar, with a few differences. Conclusions: Previous research shows PROMIS measures are reliable, valid, and responsive in spine surgery patients.
As PROMIS measures are now used more frequently to evaluate surgical outcomes through their incorporation into standard hospital data collection, registries, and trials, it is important to understand the patient demographic and clinical/surgical characteristics that are associated with PROMIS outcomes after spine surgery. Predictors of PROMIS and legacy measures were similar but not identical. Contrasting the predictors of PROMIS outcomes with those of legacy measures helps surgeons and researchers understand how PROMIS measures are similar to, but distinct from, legacy outcomes.