- Short report
- Open Access
Bifactor model of the CASP-12’s general factor for measuring quality of life in older patients
© The Author(s) 2018
- Received: 13 June 2018
- Accepted: 19 October 2018
- Published: 4 December 2018
Patients’ subscores on quality of life (QoL) measures can provide diagnostic information about respondents’ strengths and weaknesses in specific areas. Such diagnostics may help identify potentially at-risk individuals. Subscores may also help with modifying existing care-treatment programs, particularly those targeting patient-preferred functionalities. The Control, Autonomy, Self-realization and Pleasure (CASP) measure is one popular QoL measure with such subscore potential, and it is the focus of the current short report.
The CASP builds on psychological needs-satisfaction models to emphasize wellbeing across its four titled domains. The CASP-12, a shortened version of the original CASP-19 scale, was designed specifically for use in the Survey of Health, Ageing and Retirement in Europe (SHARE) study and represents two combined factors: 1) Control/Autonomy, and 2) Self-realization/Pleasure. Existing psychometric studies of the CASP-12 have been limited by classical measurement approaches. For example, the proposed combination of the CASP’s first two subscales for greater stability contradicts the retention of its other two, shorter subscales, which exhibit higher internal reliabilities. Also, the proposed combining (or parceling) of items for fitting unidimensional prediction models risks further upward bias from subdomain-criterion relations.
The current short report’s primary aim is to psychometrically inspect the CASP-12 using item response theory (IRT), a modern measurement approach. This is important because increasing usage is potentially unproductive while the CASP’s internal psychometric structure, such as general factor strength and substantive multidimensionality, remains incompletely inspected. Among other things, this limits the CASP-12’s equating across studies that use different subsets of items, and it hinders the CASP’s expansion to new items while the CASP-12’s core item pool has not been IRT-calibrated. The current study will identify initial findings from SHARE’s older-adult general population and extend them by examining the CASP-12’s uni-/multidimensionality in a patient-specific sample from the Irish Longitudinal Study on Ageing (TILDA).
Since the early 1990s days of QoL research, investigators have generally agreed that physical, mental, and social health subdomains are inseparable; that is, QoL is a fairly broad construct. As noted in this author’s earlier IRT evaluation of another health measure, “broader constructs are stabilized with broad factors”. Because the CASP’s author reassures researchers that “those who simply require a single index” may sum the CASP-12, it is important to first determine whether unidimensional usage in prediction models is reasonably unbiased when subdomains are ignored. As the CASP’s constructors concluded, “…strength of the inter-domain correlations…. confirm our belief that QoL is a unitary phenomenon which is the product of the interactions between the domains”. This interpretation of general QoL as caused by inter-domain interactions is important because it contradicts the commonly accepted second-order CASP model, which hierarchically represents general QoL as causally preceding variation on its four specific domains (control, autonomy, self-realization, pleasure). If, instead, the CASP’s general QoL factor is correctly interpreted as ‘emerging’ from diverse manifestations represented by subdomains, then within-domain variation may be more accurately viewed as nuisance variation that can and should be statistically treated as such in the measurement of QoL [9, 10]. For example, Sexton and others have suggested covarying residuals for the CASP’s negatively worded items “arising from method effects”. Fitting this alternative view, the bifactor model is a viable competitor to the second-order hierarchical model: it will be empirically compared on model-data fit, and it aligns more closely with the CASP’s theoretical conceptualization as a unitary assessment of QoL.
As the CASP’s original author, Hyde, recently stated, “It has proven to be a…multidimensional instrument”. The primary aim of the current study is to examine the substantiveness of such multidimensionality, which should be readily acknowledged in the context of QoL assessment among older patients. The next section details the samples and analyses used to report findings from the CASP’s psychometric inspection with IRT.
The CASP-12 self-report QoL instrument comprises twelve items. Each item is scored on a 4-point Likert-type scale, with descriptive anchors provided for each response option: 1 (‘Often’), 2 (‘Sometimes’), 3 (‘Not often’), and 4 (‘Never’). Higher CASP total scores, denoted CASPTOT in this short report, are interpreted as better QoL, with a possible range of 12–48. CASP subscales are abbreviated as Control (Con), Autonomy (Aut), Self-Realization (SR), and Pleasure (Pleas). For the two-factor structure of the CASP-12 v.3 model examined here, we denote the combined subscales as CASP(Con/Aut) and CASP(SR/Pleas), respectively.
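The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `casp_total` is hypothetical, and the sketch assumes the twelve item scores have already been keyed so that higher values indicate better QoL (any reverse-scoring step is not specified here and is therefore left out).

```python
def casp_total(item_scores):
    """Sum twelve already-keyed CASP-12 item scores (each 1-4) into a total score.

    Assumption: items are already keyed so that higher = better QoL.
    """
    if len(item_scores) != 12:
        raise ValueError("CASP-12 requires exactly 12 item scores")
    if any(score not in (1, 2, 3, 4) for score in item_scores):
        raise ValueError("each item must be scored 1, 2, 3, or 4")
    return sum(item_scores)


# The possible range of the total score is 12 (all items = 1) to 48 (all items = 4).
lowest = casp_total([1] * 12)
highest = casp_total([4] * 12)
```

The input checks simply enforce the 12-item, 4-point format, which is what bounds the total score to the 12–48 range.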
A retrospective observational study was conducted using archival data from the Survey of Health, Ageing and Retirement in Europe (SHARE), originally collected with interview methodology. The most recent cross-sectional SHARE administration of the CASP (Wave 6 [W6]) was obtained for the current analyses. Sample 1 participants were respondents to the latest cross-section of SHARE’s questionnaire, fielded in 2015. Participants were drawn from a representative sample of community-dwelling adults aged ≥ 50 years residing in Europe (N = 63,669). Sample 2 participants were respondents to the latest cross-section of TILDA’s questionnaire, fielded in 2015. Participants were drawn from a representative sample of community-dwelling adults aged ≥ 50 years residing in Ireland (N = 4993).
Preliminary analyses, including editing, missingness, and summary statistics, were conducted. Latent variable modeling, including item calibration and model comparisons, was conducted in IRTPRO v4.1. Marginal maximum likelihood (MML) estimation with the Bock-Aitkin expectation-maximization (BA-EM) algorithm was employed for all models. Item parameters and standard errors were estimated using the supplemented-EM algorithm. IRTPRO default values for convergence criteria (E-step = 1e-005; M-step = 1e-006; cycles = 500) and quadrature node details (points = 49; θ range = −6 to 6) were used in estimation. As in many IRT-based studies, likelihood-ratio tests were used to test hypotheses.
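As a rough illustration of the likelihood-ratio logic (not the IRTPRO implementation), the difference in −2 log-likelihood between a restricted and a fuller nested model is referred to a chi-square distribution with degrees of freedom equal to the number of additional parameters. The sketch below uses only the Python standard library; the function names and the numeric inputs are hypothetical, and the closed-form tail probability used here is valid only for even degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i   # builds (x/2)^i / i!
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(neg2ll_restricted, neg2ll_full, df_diff):
    """Return (delta chi-square, p-value) for two nested models."""
    delta = neg2ll_restricted - neg2ll_full
    return delta, chi2_sf_even_df(delta, df_diff)


# Hypothetical -2 log-likelihood values, for illustration only.
delta, p_value = likelihood_ratio_test(1009.488, 1000.0, df_diff=4)
```

For very large test statistics the exponential term underflows to zero, so the returned p-value is simply 0.0, which corresponds to a reported p < .001.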
Summary Sample Characteristics
Characteristic | SHARE W6 Sample | TILDA W3 Sample
n | 63,669 | 4993
Age M (SD) | |
Four CASP models were compared on global fit indices: 1) a unidimensional model (1-DIM); 2) the CASP-12 v.3 two-factor model (2-DIM); 3) a bifactor model with the two specific factors specified by the CASP-12 v.3; and 4) a bifactor extension with random intercepts (BiFactorRand-Intcpt). Because the combining of factors was aimed at preserving individual-difference indicators on narrower, specific QoL constructs (CASP subdomains), the fourth model was added to test whether content specificity adequately captures idiosyncratic response biases (e.g., careless responding to reverse-scored items).
Model comparisons began with the unidimensional baseline and the currently used CASP-12 v.3 model; as expected, the latter, more complex model fit better (Δχ2 = 147,456.13, p < .001). In turn, the more complex bifactor model exhibited significantly better fit than the v.3 two-factor structure (Δχ2 = 12,019.17, p < .001).
Summary Item-Factor Loadings and Comparative Global Model-Fit Indices
Having identified the bifactor as the best-fitting model for CASP-12 responses, which suggests retaining the general QoL factor, testing proceeded with inspection of reliability for both CASPTOT and its subscales, to determine whether the subscales can be used. First and foremost, coefficient alpha (α) is not an indicator of unidimensionality and is often a poor indicator of reliability. This is borne out in our current sample by rejection of the tau-equivalence assumption, Δχ2(12) = 3462.08, p < .01. Instead, the CASP’s item-covariance structure supports congeneric reliability (ρ), which protects against coefficient α’s underestimation. Here, CASPTOT was estimated as ρ = .77. Subscale reliabilities were estimated at ρ = .68 (Con/Aut) and ρ = .84 (SR/Pleas).
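To make the contrast between α and ρ concrete, the following sketch computes both from standardized single-factor loadings. It is illustrative only: the loadings are hypothetical values chosen for the example (they are not CASP-12 estimates), and α is computed from the model-implied correlation matrix rather than from raw data.

```python
def congeneric_rho(loadings):
    """Congeneric reliability for standardized single-factor loadings."""
    common = sum(loadings) ** 2
    unique = sum(1.0 - l ** 2 for l in loadings)
    return common / (common + unique)


def model_implied_alpha(loadings):
    """Coefficient alpha from the model-implied correlation matrix."""
    k = len(loadings)
    # Off-diagonal entries are products of loadings; diagonal entries are 1,
    # so the sum of all matrix entries is (sum of loadings)^2 - sum of squares + k.
    total = sum(loadings) ** 2 - sum(l ** 2 for l in loadings) + k
    return (k / (k - 1)) * (1.0 - k / total)


# Hypothetical loadings: unequal (congeneric) versus equal (tau-equivalent).
unequal = [0.8, 0.7, 0.5, 0.3]
equal = [0.6, 0.6, 0.6, 0.6]

rho_unequal, alpha_unequal = congeneric_rho(unequal), model_implied_alpha(unequal)
rho_equal, alpha_equal = congeneric_rho(equal), model_implied_alpha(equal)
```

With equal loadings the two coefficients coincide; with unequal loadings α falls below ρ, which is the underestimation that the rejection of tau-equivalence points to.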
An alternative reliability index when multidimensionality’s impact is uncertain is coefficient omega (ω), which indexes the proportion of variance in CASPTOT scores attributable to all common sources of variance. Here, CASPTOT was estimated as ω = .91. Subscale omegas were estimated at ω = .77(Con/Aut) and ω = .91(SR/Pleas).
We may further index the variance uniquely attributable to the general factor, after factoring out all other sources of systematic variance, using hierarchical omega (ωHier). Here, CASPTOT was estimated as ωHier = .83. We may then subtract ωHier from the previous ω value to obtain an estimate of the reliable variance in CASPTOT scores that is due to the subdomains. That is, ω − ωHier = .92 − .83 = .09, indicating that 9% of the reliable variance in CASPTOT scores is due to the subdomains. Furthermore, the subscales’ ωHier values were estimated at ωHier = .37 (Con/Aut) and ωHier = .04 (SR/Pleas). These substantially lower values after residualizing out CASPTOT imply that much of the ‘precision’ inferred from using CASP subdomains as specific QoL constructs is ‘borrowed’ from the reliability of CASPTOT’s general QoL factor. This finding is supported by further evidence from Haberman’s 4-step procedure for determining the relative improvement from using only subscale items to estimate reliability compared with using all CASP items. In the current data, lower reliabilities were found for subscale-only items, implying a relative decrement (rather than improvement) in subscale reliability if CASP items from other subdomains are ignored. Next, we examine the cross-validation of the CASP’s bifactor representation in an independent sample specific to a patient population, and we compare the CASP’s unidimensionality indices across samples.
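The relation between ω and ωHier can be sketched from standardized bifactor loadings as follows. The function name and the loadings are hypothetical and serve only to show the arithmetic; they are not the CASP-12 estimates reported here. The sketch assumes orthogonal factors, so each item’s unique variance is one minus its squared loadings.

```python
def omega_indices(general, specifics):
    """Return (omega, omega_hier) for standardized, orthogonal bifactor loadings.

    general:   general-factor loading for each item.
    specifics: one list per specific factor, same length as `general`,
               with 0.0 where an item does not load on that factor.
    """
    general_var = sum(general) ** 2
    specific_var = sum(sum(s) ** 2 for s in specifics)
    unique_var = sum(
        1.0 - g ** 2 - sum(s[i] ** 2 for s in specifics)
        for i, g in enumerate(general)
    )
    total_var = general_var + specific_var + unique_var
    omega = (general_var + specific_var) / total_var
    omega_hier = general_var / total_var
    return omega, omega_hier


# Hypothetical six-item example with two specific factors.
general = [0.7, 0.7, 0.6, 0.6, 0.7, 0.7]
specifics = [
    [0.4, 0.4, 0.4, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.3, 0.3, 0.3],
]
omega, omega_hier = omega_indices(general, specifics)
due_to_subdomains = omega - omega_hier  # reliable variance owed to specific factors
```

The difference between the two coefficients is the share of total-score variance that is reliable but attributable to the specific factors rather than to the general factor.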
Summary Unidimensionality Indices for CASP by Study Sample
Index | SHARE W6 Sample | TILDA W3 Sample
ECV / ECVNew | .74 / .74 | .64 / .64
ω / ωHier | .92 / .80 | .93 / .77
IECV (# items > .80) | |
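For readers less familiar with these indices, explained common variance (ECV) and item-level ECV (IECV) can be sketched as below. The loadings are hypothetical, the helper names are illustrative, and the sketch assumes each item loads on the general factor and on exactly one specific factor; it does not cover the ECVNew variant.

```python
def ecv(general, specific):
    """Share of common variance attributable to the general factor."""
    general_ss = sum(g ** 2 for g in general)
    specific_ss = sum(s ** 2 for s in specific)
    return general_ss / (general_ss + specific_ss)


def item_ecv(general, specific):
    """Per-item share of common variance attributable to the general factor."""
    return [g ** 2 / (g ** 2 + s ** 2) for g, s in zip(general, specific)]


# Hypothetical standardized loadings for six items.
general = [0.8, 0.7, 0.6, 0.7, 0.5, 0.6]
specific = [0.3, 0.4, 0.5, 0.2, 0.3, 0.4]

overall = ecv(general, specific)
n_items_above = sum(1 for value in item_ecv(general, specific) if value > 0.80)
```

Items with IECV above .80 reflect mostly the general factor, which is why a count of such items is a natural basis for picking a short, essentially unidimensional subset.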
This study used IRT to examine the widely used CASP-12 QoL measure, evaluating the general factor’s robustness to multidimensionality as well as the usefulness of subdomains as narrower individual-differences indicators.
In this first IRT inspection of the CASP’s psychometric properties, the CASP-12’s general QoL factor was found to be well specified by a bifactor model that treats subdomains/content homogeneity as sources of nuisance variance. Furthermore, the CASP-12’s total score (general factor) exhibited acceptably high reliability in older populations, both among broader community-dwellers and among narrower patient respondents. In contrast, the CASP-12’s specific subfactors were found to exhibit unacceptably low reliability, suggesting that only the CASP-12’s global score is currently appropriate for substantive interpretation and meaningful use. Finally, the CASP’s 12-item measure was found to contain a potentially useful 5-item subset for succinct indexing of unitary QoL scores in future researchers’ structural-estimation models.
Because the study made secondary use of de-identified data, the otherwise necessary ethical review was waived.
No funding support was received in the write-up of the enclosed short report.
The sole author was exclusively responsible for this short report, including data curation, technical analyses, substantive writeup, and submission formatting.
Matthew J. Kerry received his doctorate in Quantitative Psychology from the Georgia Institute of Technology in Spring 2015. Since then, he has served as a post-doctoral research affiliate of the Swiss Federal Institute of Technology (ETH Zürich). His research interests span general quantitative methods and the psychometric validation of patient-/provider-reported outcomes (P-PROs) via item response theory (IRT) modeling.
The author formally declares to have no conflicts of interest or commitment in the preparation or submission of this short report manuscript.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
- Humphrey, L., Willgoss, T., Trigg, A., Meysner, S., Kane, M., Dickinson, S., & Kitchen, H. (2017). A comparison of three methods to generate a conceptual understanding of a disease based on the patients’ perspective. Journal of Patient-Reported Outcomes, 1(1), 9.
- Hyde, M., Wiggins, R. D., Higgs, P., & Blane, D. B. (2003). A measure of quality of life in early old age: The theory, development and properties of a needs satisfaction model (CASP 19). Aging & Mental Health, 7(3), 186–194.
- Maslow, A. H. (1968). Toward a psychology of being. New York: Van Nostrand.
- Börsch-Supan, A., Brugiavini, A., Jürges, H., Kapteyn, A., Mackenbach, J., Siegrist, J., & Weber, G. (2008). First results from the survey of health, ageing, and retirement in Europe (2004–2007): Starting the longitudinal dimension. Mannheim Research Institute for the Economics of Aging. Mannheim: Germany.
- Kim, G. R., Netuveli, G., Blane, D., Peasey, A., Malyutina, S., Simonova, G., & Pikhart, H. (2015). Psychometric properties and confirmatory factor analysis of the CASP-19, a measure of quality of life in early old age: The HAPIEE study. Aging & Mental Health, 19(7), 595–609.
- TILDA. (2018). The Irish Longitudinal study on Ageing (TILDA) Wave 3, 2014–2015. [dataset]. Version 3.1. Irish Social Science Data Archive. SN:0053–04. www.ucd.ie/issda/data/tilda/wave3
- Andersen, R. M., Davidson, P. L., & Ganz, P. A. (1994). Symbiotic relationships of quality of life, health services research and other health research. Quality of Life Research, 3(5), 365–371.
- Kerry, M. J., Wang, R., & Bai, J. (2018). Assessment of the readiness for Interprofessional learning scale (RIPLS): An item response theory analysis. Journal of Interprofessional Care, 32(5), 634–637.
- Hyde, M., Higgs, P., Wiggins, R. D., & Blane, D. (2015). A decade of research using the CASP scale: Key findings and future directions. Aging & Mental Health, 19(7), 571–575.
- Reise, S. P., Morizot, J., & Hays, R. D. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(1), 19–31.
- Sexton, E., King-Kallimanis, B. L., Conroy, R. M., & Hickey, A. (2013). Psychometric evaluation of the CASP-19 quality of life scale in an older Irish cohort. Quality of Life Research, 22(9), 2549–2559.
- Petrillo, J., Cano, S. J., McLeod, L. D., & Coon, C. D. (2015). Using classical test theory, item response theory, and Rasch measurement theory to evaluate patient-reported outcome measures: A comparison of worked examples. Value in Health, 18(1), 25–34.
- Cai, L., Thissen, D., & du Toit, S. (2016). IRTPRO [Computer Software]. Lincolnwood, IL: Scientific Software International.
- Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334.
- Dueber, D. M. (2017). Bifactor Indices Calculator: A Microsoft Excel-based tool to calculate various indices relevant to bifactor CFA models. https://doi.org/10.13023/edp.tool.01.
- Brod, M., Højbjerre, L., Pfeiffer, K. M., Sayner, R., Meincke, H. H., & Patrick, D. L. (2017). Development of the weight-related sign and symptom measure. Journal of Patient-Reported Outcomes, 2(1), 17.
- Terwee, C. B., Bot, S. D., de Boer, M. R., van der Windt, D. A., Knol, D. L., Dekker, J., et al. (2007). Quality criteria were proposed for measurement properties of health status questionnaires. Journal of Clinical Epidemiology, 60(1), 34–42.
- Muthén, B., Kaplan, D., & Hollis, M. (1987). On structural equation modeling with data that are not missing completely at random. Psychometrika, 52(3), 431–462.