The Norwegian PROMIS-29: psychometric validation in the general population for Norway



The Patient Reported Outcomes Measurement Information System (PROMIS) profile instruments include “high information” items drawn from large item banks following the application of modern psychometric criteria. The shortest adult profile, PROMIS-29, looks set to replace existing short-form instruments in research and clinical practice. The objective of this study was to undertake the first psychometric evaluation of the Norwegian PROMIS-29, following a postal survey of a random sample of 12,790 Norwegians identified through the National Registry of the Norwegian Tax Administration. Confirmatory factor analysis was used to assess structural validity. Fit to the Rasch partial credit model was assessed, together with differential item functioning (DIF) in relation to age, gender, and education. PROMIS-29 scores were compared with those for the EQ-5D-5L and the Self-administered Comorbidity Questionnaire (SCQ) to assess validity against a priori hypotheses.


There were 3200 (25.9%) respondents with a mean (SD) age of 51 (20.7) years (range 18 to 97), and 55% were female. The PROMIS-29 showed satisfactory structural validity, acceptable fit to the Rasch model including unidimensionality, and measurement invariance across age and education levels. One pain interference item had uniform DIF for gender, but splitting the item by gender gave satisfactory fit. Domain reliability estimates ranged from 0.85 to 0.95. Correlations between PROMIS-29 domain, SCQ and EQ-5D scores were largely as expected, the largest being for scores assessing very similar aspects of health.


The Norwegian version of the PROMIS-29 is a reliable and valid generic self-reported measure of health in the Norwegian general population. The instrument is recommended for further application, but the analysis should be replicated and responsiveness to change assessed in future studies before it can be recommended for clinical and health services evaluation in Norway.


The US National Institutes of Health (NIH) Patient Reported Outcomes Measurement Information System (PROMIS®) is the most important development in the field of health status measurement since the advent of short-form generic instruments over three decades ago [1]. PROMIS unifies measurement through standardized measures with broad applicability across health problems in clinical practice, research, and quality measurement [2]. The system builds on scientific advances including item response theory (IRT) and computer adaptive testing (CAT), which give higher precision and lower respondent burden, respectively. Standardization, based on common metrics, allows for comparisons across domains, across health problems, and with the general population [2]. PROMIS measures are freely available and have widespread application internationally [3, 4].

PROMIS IRT-calibrated item banks assess aspects of physical, mental, and social health and include over 300 measures for adults and children [4]. This approach promotes flexibility in the selection of domains and items of relevance to specific health problems or populations [5]. PROMIS items within an item bank can be administered as fixed short-form questionnaires (4–10 items) or by CAT (4–12 items), with the former contributing to profiles.

The PROMIS-29 adult profile is a brief generic health measure comprising 29 items from the PROMIS domains of anxiety, depression, fatigue, pain (intensity and interference), physical function, sleep disturbance, and satisfaction with participation in social roles (social participation) [2]. The PROMIS-29 has had rapid uptake since it became available in the last decade, including translation into over 40 languages [2], evaluation of measurement properties in different countries and populations [6,7,8], and application in research, including randomized controlled trials [9,10,11]. The instrument has also been used in crosswalks or mapping to other widely used PROMs, including the EuroQol EQ-5D [12]. The inclusion of an extra domain of cognitive function-abilities, or its imputation using PROMIS-29 data, also makes it suitable for economic evaluation through the inclusion of values for health states in the form of PROPr [3, 13].

The present study describes the evaluation of the Norwegian-language version of the PROMIS-29, following a postal survey of the Norwegian general population. The measure was assessed for data quality, structural validity, fit of the seven domains to the IRT partial credit model, differential item functioning (DIF), internal consistency, and convergent validity through comparisons with scores for the EQ-5D and a comorbidity questionnaire.


Methods

Data collection

This study was based on data from a national sample of Norwegians aged 18 years and over. Published Norwegian surveys [14,15,16,17,18] informed the sample size and the quota sampling by sex and seven age groups. A random sample of 12,790 adults was selected from the Norwegian Tax Administration registry (Folkeregisteret). On 15 December 2019, they were sent a postal questionnaire and a reply-paid envelope addressed to the Norwegian Institute of Public Health. An accompanying letter explained the purpose of the study and that respondents would be entered in a lottery of ten prizes, each to the value of 1000 Euros.

The Regional Committee for Medical and Research Ethics stated that the study did not need ethical board approval, and a Data Protection Impact Assessment was approved by the Institute on 16 October 2019.

The questionnaire included the Norwegian version of the PROMIS-29 as distributed by the PROMIS Health Organization [19]. Translations of PROMIS measures follow FACIT universal methodology, an iterative process of forward- and back-translation, expert review, harmonization and cognitive interviewing [1]. Each domain comprises four items with five-point descriptive scales, except for pain intensity, which is a single item with a 0–10 numerical rating scale. The sum of the item responses for each multi-item domain is converted to a T-score, where a score of 50 is the average for the US general population with a standard deviation of 10 [2, 19]. Higher scores represent more of a domain. Therefore, for physical function, higher scores represent better health, whereas for anxiety, higher scores represent poorer health.
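As an illustration of the T-score metric described above, the sketch below expresses a T-score as a distance from the US general population mean in standard deviation units. It assumes only the stated mean of 50 and SD of 10. The function name `t_score_to_sd_units` is hypothetical, and the conversion from raw sum scores to T-scores, which relies on the scoring manual [19], is not reproduced here.

```python
US_MEAN = 50.0  # T-score mean for the US general population
US_SD = 10.0    # T-score standard deviation


def t_score_to_sd_units(t_score):
    """Distance of a T-score from the US mean, in SD units."""
    return (t_score - US_MEAN) / US_SD


print(t_score_to_sd_units(60.0))  # 1.0
print(t_score_to_sd_units(45.0))  # -0.5
```

A value of 1.0 would mean one SD more of the domain than the US average: poorer health for anxiety, but better health for physical function.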

The questionnaire also included the Norwegian EQ-5D-5L, which has five dimensions (mobility, self-care, usual activities, pain/discomfort, and anxiety/depression), each with five levels [20]. Health states are transformed to a single index using a scoring algorithm derived from valuation tasks undertaken with general population samples. An algorithm is not yet available for Norway; hence, recommendations from the Norwegian Medicines Agency [21] were followed, including the use of the UK value set [22] and mapping [23]. Scores for the EQ-5D index range from -0.59 to 1, where 1 is the best possible health state. In addition to the five dimensions, the EQ VAS assesses self-rated health on a vertical visual analogue scale with endpoints labelled “Best imaginable health state” (100) and “Worst imaginable health state” (0).

The presence of health problems was assessed by the Self-administered Comorbidity Questionnaire (SCQ), which lists thirteen medical conditions and up to three other non-specified medical problems [24]. Osteoarthritis and rheumatoid arthritis are listed separately but scored as one condition. Respondents are asked if they have a condition, if they are receiving treatment for it, and if it limits their activities. All items use yes/no responses, with each “yes” scored as one point, giving a score range of 0 to 45; the maximum is equivalent to 15 conditions being present, treated, and limiting activities. The Norwegian version underwent two independent forward-backward translations in accordance with recommendations for PROMs translation [25].

Background questions included age, gender, and education level.
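The SCQ scoring rule described above can be sketched as follows. This is a minimal illustration, not the scoring code used in the study: the function name `scq_score` and the field names `present`, `treated` and `limits` are hypothetical, and the sketch assumes that the two arthritis items have already been merged into one condition.

```python
MAX_CONDITIONS = 15  # 12 scored listed conditions plus up to 3 other problems


def scq_score(conditions):
    """Sum the yes-responses (1 point each) over the scored conditions.

    `conditions` is a list of dicts with boolean values for the keys
    'present', 'treated' and 'limits' (hypothetical field names).
    """
    if len(conditions) > MAX_CONDITIONS:
        raise ValueError("the SCQ scores at most 15 conditions")
    return sum(
        int(c["present"]) + int(c["treated"]) + int(c["limits"])
        for c in conditions
    )


no_problems = [{"present": False, "treated": False, "limits": False}] * 15
all_problems = [{"present": True, "treated": True, "limits": True}] * 15
print(scq_score(no_problems), scq_score(all_problems))  # 0 45
```

The two extreme cases reproduce the stated score range of 0 to 45.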

Statistical analysis

Statistical analysis followed an a priori analysis plan with explicit hypotheses. Missing data and floor and ceiling effects were assessed at the item and domain level. Confirmatory factor analysis (CFA) with robust weighted least squares (WLSMV), which is appropriate for categorical data [26, 27], was used to assess the structural validity of the PROMIS-29, that is, the extent to which the item scores adequately contribute to the seven domains [28]. Model fit was assessed by the Root Mean Square Error of Approximation (RMSEA, acceptable fit if < 0.06), the Comparative Fit Index (CFI, acceptable fit if > 0.95, poor fit if < 0.90, otherwise marginal) and the Tucker Lewis Index (TLI, acceptable fit if > 0.95, poor fit if < 0.90, otherwise marginal) [27, 29].
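The fit criteria above amount to simple cut-off rules, sketched below. The function names are hypothetical, and because the text gives only an acceptance criterion for the RMSEA, the label "not acceptable" for values of 0.06 or more is an assumption of this sketch. The example values are the fit indices reported in the Results.

```python
def classify_rmsea(rmsea):
    """RMSEA: acceptable fit if below 0.06."""
    return "acceptable" if rmsea < 0.06 else "not acceptable"


def classify_incremental(value):
    """CFI and TLI share the same cut-offs."""
    if value > 0.95:
        return "acceptable"
    if value < 0.90:
        return "poor"
    return "marginal"


# Fit indices reported for the seven-factor model
print(classify_rmsea(0.059))        # acceptable
print(classify_incremental(0.987))  # acceptable (CFI)
print(classify_incremental(0.985))  # acceptable (TLI)
```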

The unidimensionality of each domain was tested using the partial credit model, which extends the Rasch model to polytomous items and hence has separable item and person parameters, sufficient statistics, and conjoint additivity, permitting item and person comparisons [30]. Overall and item fit statistics were used to assess whether items within the domains fitted the one-dimensional model. Item fit was assessed with the χ2 statistic, standardized residuals, which should lie within ± 2.5, and item characteristic curves. Local independence, a further assumption of Rasch models, was assessed through examination of the residual correlation matrix, with coefficients of ≥ 0.2 indicating redundancy among items [31, 32].
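To make the partial credit model concrete, the sketch below computes the response category probabilities for a single item, given a person location and the item's threshold parameters (both in logits). It follows the standard formulation of the model [30]; the function name and the threshold values are illustrative assumptions and are not estimates from this study, which used RUMM2020.

```python
import math


def pcm_probabilities(theta, thresholds):
    """Category probabilities for one item under the partial credit model.

    theta: person location in logits.
    thresholds: item threshold parameters; m thresholds give m + 1 categories.
    """
    # Cumulative sums of (theta - delta_k); the empty sum for category 0 is 0.
    cumulative = [0.0]
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    weights = [math.exp(c) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative thresholds for a five-category item (not study estimates)
thresholds = [-1.5, -0.5, 0.5, 1.5]
probs = pcm_probabilities(0.0, thresholds)
print([round(p, 3) for p in probs])
```

With ordered thresholds that are symmetric around the person location, the middle category is the most probable and the probabilities are symmetric around it.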

Domain invariance was assessed through uniform and non-uniform differential item functioning (DIF) for age (6 categories), gender, and education level (3 categories); differences of ≥ 0.5 logits in item difficulties were considered meaningful [33, 34].
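The 0.5 logit criterion can be illustrated with a small check on group-specific item difficulties. This is only a sketch of the size criterion for uniform DIF: it does not reproduce the statistical tests for DIF in RUMM2020, and the function name and difficulty values are hypothetical.

```python
DIF_THRESHOLD_LOGITS = 0.5


def flag_dif(difficulty_by_group, threshold=DIF_THRESHOLD_LOGITS):
    """True if the largest between-group gap in item difficulty
    reaches the threshold (in logits)."""
    values = list(difficulty_by_group.values())
    return max(values) - min(values) >= threshold


# Hypothetical item difficulties in logits
print(flag_dif({"female": 0.10, "male": 0.72}))  # True
print(flag_dif({"female": 0.10, "male": 0.35}))  # False
```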

Internal consistency was assessed by Cronbach’s alpha [35] and the person separation index (PSI) [36]. These are interpreted similarly, but the PSI is based on the logit values (linear person estimates): it is the proportion of error-free variance in the distribution of person estimates relative to the sum of this variance and the error variance in those estimates. Reliability estimates of 0.70 and 0.90 were deemed necessary for group and individual comparisons, respectively [37].
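Both reliability coefficients can be sketched in a few lines. The code below uses the usual formula for Cronbach’s alpha and the definition of the PSI given above; it is an illustration under stated assumptions, not the Stata or RUMM2020 implementation, and the function names and example data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """item_scores: one list of k item responses per respondent."""
    k = len(item_scores[0])
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def person_separation_index(person_estimates, standard_errors):
    """Share of the variance in person logit estimates that is not error."""
    observed_var = variance(person_estimates)
    error_var = sum(se ** 2 for se in standard_errors) / len(standard_errors)
    return (observed_var - error_var) / observed_var


def supported_comparisons(reliability):
    """Apply the 0.70 (group) and 0.90 (individual) criteria."""
    if reliability >= 0.90:
        return "group and individual"
    if reliability >= 0.70:
        return "group only"
    return "neither"


# Hypothetical responses of five people to a four-item domain
responses = [[1, 1, 2, 1], [2, 2, 2, 3], [3, 3, 4, 3], [4, 5, 4, 4], [5, 5, 5, 5]]
alpha = cronbach_alpha(responses)
print(round(alpha, 2), supported_comparisons(alpha))
```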

Hypothesis testing was used to further assess the convergent validity of the PROMIS-29 domain scores through comparisons with those for the EQ-5D and SCQ. Because the comparisons included ordinal EQ-5D item data, Spearman correlation was used. Criteria for expected levels of correlation followed those used in a systematic review of generic PROMs [38]. First, correlations ≥ 0.60 were expected for scores assessing the same construct: anxiety and depression and EQ-5D anxiety/depression; pain interference/intensity and EQ-5D pain/discomfort; physical function and EQ-5D mobility, usual activities; social participation and EQ-5D usual activities. Second, correlations < 0.60 and ≥ 0.30 were expected for scores assessing largely related but dissimilar constructs: fatigue and EQ-5D anxiety/depression; pain interference and EQ-5D mobility, usual activities; physical function and EQ-5D self-care, pain/discomfort; social participation and EQ-5D mobility. This level was also expected for correlations between all PROMIS-29 domain scores and those for the EQ-5D index and EQ VAS. Third, correlations < 0.50 and ≥ 0.20 were expected for scores assessing moderately related but dissimilar constructs: anxiety/depression and EQ-5D usual activities, pain/discomfort; fatigue and remaining EQ-5D scores; sleep disturbance and EQ-5D usual activities, pain/discomfort, anxiety/depression; pain intensity and EQ-5D mobility, usual activities, anxiety/depression; social participation, pain interference and EQ-5D self-care, anxiety/depression; social participation and EQ-5D pain/discomfort. Fourth, correlations < 0.30 were expected for scores assessing weakly related or unrelated constructs: anxiety/depression and EQ-5D mobility, self-care; pain intensity and EQ-5D self-care; physical function and EQ-5D anxiety/depression; sleep disturbance and EQ-5D mobility, self-care.
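The four hypothesised levels can be written as overlapping ranges and checked mechanically, as in the sketch below. The level labels and function names are hypothetical. Comparing the absolute value of the coefficient is an assumption of this sketch, made because the sign of a correlation depends on the scoring direction of each domain.

```python
# (lower bound inclusive, upper bound exclusive) for the absolute correlation
EXPECTED_RANGES = {
    "same construct": (0.60, None),  # no upper limit
    "largely related": (0.30, 0.60),
    "moderately related": (0.20, 0.50),
    "weakly related or unrelated": (0.00, 0.30),
}


def within_expected(rho, level):
    """True if the size of the correlation falls in the hypothesised range."""
    low, high = EXPECTED_RANGES[level]
    size = abs(rho)
    return size >= low and (high is None or size < high)


def share_within(results):
    """Proportion of hypotheses that were met."""
    return sum(results) / len(results)


print(within_expected(0.65, "same construct"))    # True
print(within_expected(-0.45, "largely related"))  # True
print(within_expected(0.62, "largely related"))   # False
```

The proportion returned by `share_within` can then be compared with a criterion for the share of hypotheses met, such as the 75% criterion referred to in the Discussion [28].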

Different studies using a variety of approaches to assessing multimorbidity, including simple counts, have found that higher levels of multimorbidity are associated with poorer health [39]. One third of SCQ scores comprise activity limitations and correlations of up to 0.4 have been found with SF-36 scores [24]. The great majority of SCQ items relate to somatic health problems, and hence, correlations in the range < 0.5 and ≥ 0.20 were expected for PROMIS-29 domains of physical function, social participation, pain interference/intensity. Lower correlations < 0.3 were expected for the remaining domains. EQ-5D domains comprise single items, and hence, compared to the PROMIS-29, lower correlations in the same range were expected with SCQ scores. Slightly higher correlations were expected for the EQ-5D index and EQ VAS scores which assess health more generally.

Statistical analyses were undertaken using RUMM2020 v4.1 (RUMM Laboratory, Perth, Western Australia), Mplus version 7 (Muthén & Muthén, Los Angeles, CA) and Stata version 15.0 (StataCorp LLC, College Station, TX).


Results

Data collection

Of the 12,790 questionnaires mailed, 426 were returned as incorrectly addressed, and one person had died. Of the remainder, 3,200 (25.9%) returned a questionnaire that was at least partly completed. The mean age (SD) was 51 (20.7) years, and ages ranged from 18 to 97 years (Table 1). There were approximately 10% more female than male respondents, and between 247 and 698 respondents in each of the seven age categories; the lowest number of respondents was for 80 years and above and the highest was for those 18–29 years of age. Compared with general population data available from Statistics Norway from the time of the data collection [40], survey respondents were over-represented for the youngest and oldest age groups, the highest education level, and married/domestic partner status (Table 1).
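The response rate follows directly from the figures above, as the short calculation below shows. The variable names are illustrative; the numbers are those reported in the text.

```python
mailed = 12790
undeliverable = 426  # returned as incorrectly addressed
deceased = 1
returned = 3200

eligible = mailed - undeliverable - deceased
response_rate = 100 * returned / eligible
non_respondents = eligible - returned

print(eligible, round(response_rate, 1), non_respondents)  # 12363 25.9 9163
```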

Table 1 Respondent characteristics (n = 3200) compared to the general population

Distribution of scores

Levels of missing data for the PROMIS-29 ranged from 0.3 to 3.4% for items relating to sleep and anxiety, respectively (Table 2). The four anxiety items had the highest levels of missing data for any domain. Floor or ceiling effects indicating the best possible health were apparent, exceeding 70% for ten items. For the PROMIS-29 domains, 71% of respondents had the best possible physical function; the proportions for the other domains ranged from 7.5% for sleep disturbance to 54.2% for depression.

Table 2 Descriptives for PROMIS-29 items and domains, and reliability (Cronbach’s alpha)

Psychometric evaluation

Figure 1 shows the results of the CFA and the fit indices, which indicate that the seven-factor model met criteria for model fit (RMSEA = 0.059 [0.057–0.060], CFI = 0.987, TLI = 0.985). Correlations between the seven domains ranged from 0.36 to 0.89.

Fig. 1 Confirmatory factor analysis

The p values for the chi-square statistics in Table 3 show that the PROMIS-29 items and domains fit the Rasch unidimensional model. Moreover, the results were highly consistent with no disordered thresholds for any item, and correlations between item residuals did not suggest any lack of local independence. Additional file 1 includes the item characteristic curves for these items. There was no evidence of age or education DIF and only the pain interference item, “How much did pain interfere with your household chores?”, was affected by uniform DIF relating to gender (> 0.5 logits), indicating that compared to males, females gave responses showing more severe impact across the scale. This item was split to create gender-specific versions of the same item which gave satisfactory model fit.

Table 3 Rasch analysis for the seven domains of the Norwegian PROMIS-29

The correlations with the EQ-5D were largely consistent with a priori hypotheses (Table 4). Correlations ≥ 0.60 were found for PROMIS domain scores and those for the EQ-5D assessing the same construct, the highest being for those relating to pain. More moderate correlations, in the range 0.47 to 0.55, were found for domain and EQ-5D scores assessing largely related but dissimilar constructs. Correlations with the EQ-5D index scores were considerably higher than the expected upper level of 0.6 for the two PROMIS domains relating to pain interference and pain intensity. They were also slightly higher than this level for physical function and social participation.

Table 4 also shows that PROMIS-29 domain and EQ-5D scores had statistically significant associations with those for the SCQ, the highest being for domains relating most to physical health, which were largely above the expected range of < 0.50 and ≥ 0.20, particularly for the pain domains. Correlations for the EQ-5D item scores were, as expected, slightly lower, except for anxiety/depression. The correlation for the EQ-5D index scores was higher than those for EQ-5D items and PROMIS domains. The EQ VAS correlation was lower than expected, and below that for the PROMIS-29 domains that relate most to physical health. Overall, 53 (83%) of the 64 correlations for the PROMIS-29 were within the hypothesized range.

Table 4 Spearman correlation coefficients between PROMIS-29, EQ-5D-5L and SCQ scores (n = 2936)

Discussion


The PROMIS-29 performed satisfactorily in relation to measurement criteria widely recommended in the evaluation of PROMs, including classical and modern psychometric methods [28]. Levels of missing data were low across the 29 items, but many items showed high ceiling effects denoting the highest possible levels of health, which meant that the scores for all domains except sleep disturbance were highly skewed. This follows previous findings for general populations from France, Germany and the UK [7, 41]. Short-form instruments such as the PROMIS-29 include the most important health domains and items of general relevance across sick and healthy populations, and hence data skewed towards positive health were not unexpected in this population. Highly skewed PROMs data are common for general population samples [14,15,16]. In a comparison of data from Germany, Poland, South Korea, and the USA, the 5L version of the EQ-5D reported here was found to have ceiling effects in the range of 48 to 97% and 35 to 61% for item and index scores, respectively [42]. Skewed data might also be expected in younger age groups with more minor health problems. Given the potential supplementary information that they offer, additional PROMIS short forms, item banks and/or condition-specific instruments should be considered for application alongside short-form generic instruments.

CFA provided good evidence for the structural validity of the Norwegian PROMIS-29, including the presence of the seven domains. Rasch analysis further confirmed the unidimensionality of the seven domains, which had acceptable levels of reliability, with all domains close to or meeting the more stringent criterion of 0.9 [37]. This follows the findings of the developers and similar testing in general populations for other countries [7, 41]. The instrument was not affected by DIF for age and education levels but, as was found previously [41], females and males responded differently to one of the items within the pain interference domain. At more than 0.5 logits, this is considered a large effect [34]. DIF has greater implications for domains that comprise few items, including those within the PROMIS-29. It is recommended that the domain of pain interference is analysed separately by gender [41]. Several of the fit residuals were outside the ± 2.5 range, but fit residuals can be unreliable in a sample as large as this one [43].

The great majority of the correlations for the convergent validity of the PROMIS-29 were as hypothesized and met the criterion of 75% [28]. The remainder were all higher than expected. The EQ-5D is the most widely tested and applied generic PROM suitable for use in economic evaluation [20, 44], and hence comparisons by means of expected correlations with the PROMIS-29 increase our understanding of the latter in terms of its validity as a short-form generic health profile. Given their general focus, criteria for expected levels of correlation followed those used in a systematic review [38] and psychometric testing of generic PROMs [44]. The ranges of expected correlations overlap, which takes account of different approaches to assessing health constructs and their operationalization through items and scaling. For example, the PROMIS-29 uses multi-item scales with several domain scores, whereas the EQ-5D uses single items that form an index based on preferences or values for health states obtained from the general population [20].

Domain scores that assess the same or very similar constructs had correlations exceeding the expected level of 0.6. The levels of correlation were highest for those assessing aspects of pain. The PROMIS-29 domain of pain interference assesses the effect of pain on daily activities, and arguably has the greatest overlap with any of the EQ-5D dimensions. The EQ-5D assesses anxiety and depression through a single item, whereas the PROMIS-29 has two separate domains which are highly correlated but, as this and other studies have found, distinct [7, 41]. Previous studies have also found acceptable levels of correlation between PROMIS-29 scores and those for other legacy instruments including the SF-36 [41, 45]. The consistent association with the SCQ scores provides further empirical support for the convergent validity of the PROMIS-29 [41]. Furthermore, it supports its potential use as a measure of quality of care for people with multimorbidity and for the development of systems for identifying individuals at risk of deterioration [46, 47].

Strengths and limitations

The study was comparable in scope and size to existing European studies that have assessed the measurement properties of the PROMIS-29 in the general population [7, 41]. This secured a more than adequate sample size for the application of CFA and the Rasch partial credit model. The latter has been widely applied in the field of health measurement and, while the graded response model has been more widely used for PROMIS measures [2], the Rasch partial credit model has had considerable application in Europe, including to the PROMIS-29 [41]. It is encouraging that the PROMIS-29 domains demonstrate adequate fit to both models.

Previous studies have included the SF-36, an established generic health profile, to assess the validity of the PROMIS-29 [41, 45]. The current study included the EQ-5D, which is the most widely tested and used PROM suitable for use in economic evaluation [20, 44]. In common with these studies, this was a cross-sectional design, and hence responsiveness to changes in health was not assessed. The survey was conducted three months before the COVID-19 pandemic in Norway, and a one-year follow-up survey that included the PROMIS-29 was implemented to assess the impact of the pandemic on the health of the Norwegian general population. It is anticipated that PROMIS measures, including the PROMIS-29, will have increasing use in Norway. The PROMIS-57 has evidence for measurement properties in a smaller Norwegian general population sample recruited through mainstream and social media [48] and is being used in a long-term follow-up of COVID-19 outpatients [49]. Several item banks and short forms have been translated for children, with national applications including the Norwegian Pandemic Register [50] and Child Hip Register [51].

National data from Statistics Norway show that the sample cannot be considered fully representative of the general population. It is uncertain whether a more representative sample would have influenced the findings of the psychometric analyses, but there was no evidence for DIF across age groups and education levels. The response rate of 26% would probably have been higher had a reminder been sent, but this would have proved costly with over 9,000 non-respondents.


In conclusion, the Norwegian-language PROMIS-29 has evidence for acceptable measurement properties including reliability and validity, in a large sample of the Norwegian general population. Subject to further testing including responsiveness to change, it may be suitable for applications where a short-form profile measure of health is required that offers more detailed information than the EQ-5D. However, this study only assessed a limited range of measurement properties in the general population. Further testing is recommended in patient populations along with an evaluation of responsiveness to changes in health.

Availability of data and materials

The dataset(s) supporting the conclusions of this article will be available for download from the Norwegian Centre for Research Data (



Abbreviations

CFI: Comparative Fit Index
CAT: Computer adaptive testing
DIF: Differential item functioning
IRT: Item response theory
NIH: National Institutes of Health
PSI: Person separation index
PROMIS: Patient Reported Outcomes Measurement Information System
RMSEA: Root mean square error of approximation
SD: Standard deviation
TLI: Tucker Lewis Index
WLSMV: Robust weighted least squares


  1. Alonso J, Bartlett SJ, Rose M, Aaronson NK, Chaplin JE, Efficace F, Leplège A, Lu A, Tulsky DS, Raat H, Ravens-Sieberer U, Revicki D, Terwee CB, Valderas JM, Cella D, Forrest CB, PROMIS International Group (2013) The case for an international patient-reported outcomes measurement information system (PROMIS®) initiative. Health Qual Life Outcomes 11:210.

  2. Cella D, Choi SW, Condon DM, Schalet B, Hays RD, Rothrock NE, Yount S, Cook KF, Gershon RC, Amtmann D, DeWalt DA, Pilkonis PA, Stone AA, Weinfurt K, Reeve BB (2019) PROMIS® adult health profiles: efficient short-form measures of seven health domains. Value Health 22:537–544.

  3. Dewitt B, Jalal H, Hanmer J (2020) Computing PROPr utility scores for PROMIS® profile instruments. Value Health 23:370–378.

  4. HealthMeasures PROMIS Accessed 12 Feb 2021

  5. Hjollund NHI, Valderas JM, Kyte D, Calvert MJ (2019) Health data processes: a framework for analyzing and discussing efficient use and reuse of health data with a focus on patient-reported outcome measures. J Med Internet Res.

  6. Kwakkenbos L, Thombs BD, Khanna D, Carrier ME, Baron M, Furst DE, Gottesman K, van den Hoogen F, Malcarne VL, Mayes MD, Mouthon L, Nielson WR, Poiraudeau S, Riggs R, Sauvé M, Wigley F, Hudson M, Bartlett SJ, SPIN Investigators (2017) Performance of the patient-reported outcomes measurement information system-29 in scleroderma: a scleroderma patient-centered intervention network cohort study. Rheumatology (Oxford) 56:1302–1311.

  7. Fischer F, Gibbons C, Coste J, Valderas JM, Rose M, Leplège A (2018) Measurement invariance and general population reference values of the PROMIS Profile 29 in the UK, France, and Germany. Qual Life Res 27:999–1014.

  8. Khutok K, Janwantanakul P, Jensen MP, Kanlayanaphotporn R (2021) Responsiveness of the PROMIS-29 scales in individuals with chronic low back pain. Spine (Phila Pa 1976) 46:107–113.

  9. Hageman PA, Mroz JE, Yoerger MA, Pullen CH (2019) Weight loss is associated with improved quality of life among rural women completers of a web-based lifestyle intervention. PLoS ONE.

  10. Licciardone JC, Pandya V (2020) Feasibility trial of an eHealth intervention for health-related quality of life: implications for managing patients with chronic pain during the COVID-19 pandemic. Healthcare (Basel) 8:381.

  11. McGregor G, Sandhu H, Bruce J, Sheehan B, McWilliams D, Yeung J, Jones C, Lara B, Smith J, Ji C, Fairbrother E, Ennis S, Heine P, Alleyne S, Guck J, Padfield E, Potter R, Mason J, Lall R, Seers K, Underwood M (2021) Rehabilitation exercise and psycholoGical support after covid-19 InfectioN (REGAIN): a structured summary of a study protocol for a randomised controlled trial. Trials 22:8.

  12. Hartman JD, Craig BM (2018) Comparing and transforming PROMIS utility values to the EQ-5D. Qual Life Res 27:725–733.

  13. Dewitt B, Feeny D, Fischhoff B, Cella D, Hays RD, Hess R, Pilkonis PA, Revicki DA, Roberts MS, Tsevat J, Yu L, Hanmer J (2018) Estimation of a preference-based summary score for the patient-reported outcomes measurement information system: the PROMIS®-preference (PROPr) scoring system. Med Decis Mak 38:683–698.

  14. Stavem K, Augestad LA, Kristiansen IS, Rand K (2018) General population norms for the EQ-5D-3L in Norway: comparison of postal and web surveys. Health Qual Life Outcomes 16:204.

  15. Garratt AM, Stavem K (2017) Measurement properties and normative data for the Norwegian SF-36: results from a general population survey. Health Qual Life Outcomes 15:51.

  16. Jacobsen EL, Bye A, Aass N, Fosså SD, Grotmol KS, Kaasa S, Loge JH, Moum T, Hjermstad MJ (2018) Norwegian reference values for the Short-Form Health Survey 36: development over time. Qual Life Res 27:1201–1212.

  17. Bjertnaes O, Iversen HH, Holmboe O, Danielsen K, Garratt A (2016) The Universal Patient Centeredness Questionnaire: reliability and validity of a one-page questionnaire following surveys in three patient populations. Patient Relat Outcome Meas 7:55–62.

  18. Garratt AM, Bjaertnes ØA, Krogstad U, Gulbrandsen P (2005) The OutPatient Experiences Questionnaire (OPEQ): data quality, reliability, and validity in patients attending 52 Norwegian hospitals. Qual Saf Health Care 14:433–437.

  19. Patient Reported Outcomes Measurement Information System (2020) PROMIS adult profile scoring manual. Accessed 10 Feb 2021

  20. Devlin NJ, Brooks R (2017) EQ-5D and the EuroQol group: past, present and future. Appl Health Econ Health Policy 15:127–137.

  21. Statens Legemiddelverk (2018) Guidelines for the submission of documentation for single technology assessment (STA) of pharmaceuticals. Accessed 20 Nov 2020

  22. Dolan P (1997) Modeling valuations for EuroQol health states. Med Care 35:1095–1108

  23. van Hout B, Janssen MF, Feng YS, Kohlmann T, Busschbach J, Golicki D, Lloyd A, Scalone L, Kind P, Pickard AS (2012) Interim scoring for the EQ-5D-5L: mapping the EQ-5D-5L to EQ-5D-3L value sets. Value Health 15:708–715.

  24. Sangha O, Stucki G, Liang MH, Fossel AH, Katz JN (2003) The Self-Administered Comorbidity Questionnaire: a new method to assess comorbidity for clinical and health services research. Arthritis Rheum 49:156–163.

  25. Wild D, Grove A, Martin M, Eremenco S, McElroy S, Verjee-Lorenz A, Erikson P, ISPOR Task Force for Translation and Cultural Adaptation (2005) Principles of good practice for the translation and cultural adaptation process for patient-reported outcomes (PRO) measures: report of the ISPOR task force for translation and cultural adaptation. Value Health 8:94–104.

  26. Brown TA (2006) Confirmatory factor analysis for applied research. The Guilford Press, New York

  27. Muthén LK, Muthén BO (1998–2015) Mplus User’s Guide, 7th edn. Muthén & Muthén, Los Angeles

  28. Prinsen CAC, Mokkink LB, Bouter LM, Alonso J, Patrick DL, de Vet HCW, Terwee CB (2018) COSMIN guideline for systematic reviews of patient-reported outcome measures. Qual Life Res 27:1147–1157.

  29. Hu L, Bentler PM (1999) Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct Equ Model Multidiscip J 6:1–55

  30. Masters GN (1982) A Rasch model for partial credit scoring. Psychometrika 47:149–174

  31. Baghaei P (2008) Local dependency and Rasch measures. Rasch Meas Trans 21:1105–1106

  32. Christensen KB, Makransky G, Horton M (2017) Critical values for Yen’s Q3: identification of local dependence in the Rasch model using residual correlations. Appl Psychol Meas 41:178–194

  33. Teresi JA, Fleishman JA (2007) Differential item functioning and health assessment. Qual Life Res 16(Suppl 1):33–42.

  34. Rouquette A, Hardouin JB, Vanhaesebrouck A, Sébille V, Coste J (2019) Differential Item Functioning (DIF) in composite health measurement scale: recommendations for characterizing DIF with meaningful consequences within the Rasch model framework. PLoS ONE.

  35. Cronbach L (1951) Coefficient alpha and the internal structure of tests. Psychometrika 16:297–334

  36. Wright BD, Masters GN (1982) Rating scale analysis. MESA Press, Chicago

  37. Nunnally JC, Bernstein ICH (1994) Psychometric theory, 3rd edn. McGraw-Hill, New York

  38. Chiarotto A, Terwee CB, Kamper SJ, Boers M, Ostelo RW (2018) Evidence on the measurement properties of health-related quality of life instruments is largely missing in patients with low back pain: a systematic review. J Clin Epidemiol 102:23–37.

  39. Lee ES, Koh HL, Ho EQ, Teo SH, Wong FY, Ryan BL, Fortin M, Stewart M (2021) Systematic review on the instruments used for measuring the association of the level of multimorbidity and clinically important outcomes. BMJ Open.

  40. Statistics Norway. Accessed 27 May 2021

  41. Coste J, Rouquette A, Valderas JM, Rose M, Leplège A (2019) The French PROMIS-29. Psychometric validation and population reference values. Rev Epidemiol Sante Publique 66:317–324.

  42. Golicki D, Niewada M (2017) EQ-5D-5L Polish population norms. Arch Med Sci 13:191–200.

  43. Müller M (2020) Item fit statistics for Rasch analysis: can we trust them? J Stat Distrib Appl 7:5.

  44. Garratt AM, Furunes H, Hellum C, Solberg T, Brox JI, Storheim K, Johnsen LG (2021) Evaluation of the EQ-5D-3L and 5L versions in low back pain patients. Health Qual Life Outcomes 28:19.

  45. Rawang P, Janwantanakul P, Correia H, Jensen MP, Kanlayanaphotporn R (2020) Cross-cultural adaptation, reliability, and construct validity of the Thai version of the Patient-Reported Outcomes Measurement Information System-29 in individuals with chronic low back pain. Qual Life Res 29:793–803.

  46. Valderas JM, Gangannagaripalli J, Nolte E, Boyd CM, Roland M, Sarria-Santamera A, Jones E, Rijken M (2019) Quality of care assessment for people with multimorbidity. J Intern Med 285:289–300.

  47. Rijken M, Valderas JM, Heins M, Schellevis F, Korevaar J (2020) Identifying high-need patients with multimorbidity from their illness perceptions and personal resources to manage their health and care: a longitudinal study. BMC Fam Pract 21:75.

  48. Rimehaug SA, Kaat AJ, Nordvik JE, Klokkerud M, Robinson HS (2021) Psychometric properties of the PROMIS-57 questionnaire, Norwegian version. Qual Life Res.

  49. Garratt AM, Ghanima W, Einvik G, Stavem K (2021) Quality of life after COVID-19 without hospitalisation: good overall, but reduced in some dimensions. J Infect 82(5):186–230

  50. Buanes EA (2020) Pasientrapporterte data frå pasientar med Covid-19 i Norsk intensiv- og pandemiregister, Norsk intensiv og pandemiregister. Accessed 15 June 2021

  51. Gundersen T, Wiig O, Hunstock S, Pedersen DR, Holen K, Rasmussen H, Fenstad AM, Kroken G (2020) Nasjonalt Barnehofteregister årsrapport for 2019 med plan for forbedringstiltak. Accessed 15 June 2021


Inger Paulsrud and Kjetil Telle helped with the survey administration.


The study was funded by the Norwegian Research Council (Project Number: 262673). José M Valderas was supported by the National Institute for Health Research financed Applied Research Collaboration, South West Peninsula, UK. The views expressed in this publication are those of the authors and not necessarily those of the supporting institutions.

Author information


AMG was responsible for the study design and data collection. The four authors conceived the analysis plan. AMG, JC, and AR undertook the statistical analysis. AMG wrote the first draft and all authors contributed to this and successive drafts. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Andrew M. Garratt.

Ethics declarations

Ethics approval and consent to participate

The Regional Committee for Medical and Research Ethics stated that the study did not require their approval. The Data Protection Impact Assessment was approved by the Norwegian Institute of Public Health on 16 October 2019.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.


The PROMIS Health Organization (PHO) retains copyrights for all PROMIS material. The PHO works with PROMIS National Centers (PNC) to develop, standardize and facilitate access to PROMIS instruments worldwide. Only the PNC in a particular country and the PNC of the US (also called “PHO central office”) have the right to distribute the PROMIS materials in that country. Andrew Garratt is the PNC representative for Norway and should be contacted for access to Norwegian PROMIS materials:

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1. PROMIS-29 item characteristic curves.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Garratt, A.M., Coste, J., Rouquette, A. et al. The Norwegian PROMIS-29: psychometric validation in the general population for Norway. J Patient Rep Outcomes 5, 86 (2021).
