A total of 2420 articles were retrieved from bibliographic databases and an additional 45 full-text articles were identified through forward and backward citation searches (see Fig. 2). Inter-rater reliability between the reviewers was κ = 0.83 (abstract) and κ = 0.92 (full text). A total of 33 studies were included in the final review (Table 1). Articles excluded from the final review did not meet the inclusion criteria outlined in the methods section; for example, they did not study variation across time or did not report PROM scores. The quality of the studies varied from 3 to 7 points (maximum) on the adapted CASP tool, with three articles achieving the maximum score of 7.
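As an illustration of how a chance-corrected agreement statistic of this kind is obtained, the following is a minimal sketch of Cohen's κ for two reviewers' screening decisions. It assumes the reported values are Cohen's κ for two raters, which the text does not state explicitly; the function name and the example decisions are hypothetical and are not the review's data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: proportion of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement by chance, from each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical include/exclude decisions for ten abstracts (illustration only).
reviewer_1 = ["include"] * 4 + ["exclude"] * 6
reviewer_2 = ["include"] * 3 + ["exclude"] * 6 + ["include"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
```

In this made-up example the reviewers agree on 8 of 10 abstracts, but because agreement of 0.52 would be expected by chance alone, κ is lower than the raw agreement.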
Study characteristics
The majority of the literature was published from 2000 onwards [22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42], with ten articles published in the last five years [22, 30, 34, 35, 38, 40,41,42,43,44,45]. Seventeen studies were conducted in North America [22, 25, 27, 29, 30, 33,34,35,36,37, 42, 45,46,47,48,49], twelve in Europe [23, 24, 28, 31, 32, 38, 40, 41, 44, 50,51,52], with two studies across both regions [26, 29] and two further studies from Asia [43, 53]. Studies were conducted in patients with five broad disease categories: mental health (n = 8) [28, 31, 35, 40,41,42, 46, 48], musculoskeletal (n = 7) [23, 34, 36, 39, 45, 47, 54], respiratory (n = 5) [26, 27, 32, 33, 51], nervous system (n = 4) [29, 36, 37, 44], and other conditions (n = 8) [22, 25, 30, 38, 43, 49, 50, 53]. Studies sampled mostly adult populations (n = 30), with two studies focusing solely on female adults [36, 49], and the remaining studies focusing on children [34, 39, 47]. Studies recruited participants either in specialist outpatient departments within secondary care (n = 18) [22, 24, 25, 29, 32, 34, 36,37,38,39, 43, 44, 47, 50, 52,53,54,55] or in primary care and the community (n = 11) [23, 26, 27, 30, 31, 40,41,42, 45, 46, 49]. The two systematic reviews did not have any inclusion criteria relevant to specific settings.
Study designs
The included studies collected PROMs primarily to measure symptom severity (such as pain, fatigue, stiffness, shortness of breath and affect (emotions)) and functional status, including disability measures. Quality of life measures were rarely used across the articles. Many of the studies used visual analogue scales (VAS) for pain, fatigue and stiffness [24, 27, 29, 34, 39, 40, 47, 54]. Seven studies used single items on mood, pain and fatigue [23, 30, 36, 38, 39, 44, 45].
The included studies employed only quantitative methodologies and designs, including observational (cross-sectional and cohort) [22,23,24,25,26,27, 29,30,31, 33, 34, 36,37,38,39,40,41,42,43,44,45,46,47,48, 50, 53, 55] and experimental (randomised controlled trials) designs [32, 52, 54]. We did not identify any studies using qualitative or mixed methods, commentaries, or editorials. Many of the observational studies used the Ecological Momentary Assessment (EMA) approach to data collection [22, 23, 25, 30, 31, 36, 37, 40,41,42, 44,45,46, 53, 56, 57]. There were two systematic reviews, which focused on methodological approaches to collecting real-time data in two specific conditions, depression and COPD [28, 51]. The majority of studies used a repeated measures design, collecting data between two and eight times a day [22, 24, 25, 27, 29,30,31, 34, 36,37,38,39, 41, 42, 44, 46, 52,53,54,55,56], with one study collecting data every three to four months over a 27-month period [43]. Three studies were of cross-sectional design, collecting data only once [26, 48, 50].
Conceptual model
Two core constructs were identified in our conceptual model from the literature: variation in health outcomes (PROs), and variation in scores (PROMs) (Fig. 3). In addition, the model considers two determinants (disease-related biorhythms, and timing of biomedical interventions), one key mediator (psychological status), and two main moderators (individual and environmental factors). The determinants directly influence only the variation of outcomes, while the moderators act on both determinants, both core constructs, and the mediator. Psychological health status has a bidirectional relationship with variation in outcomes (an individual’s overall health has an impact on psychological state, which also influences overall health). All these interactions result in possible sources of variation in scores, and determine how scores are to be interpreted.
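The relationships described in this paragraph can be written down as a small directed graph, which makes it easy to check which concepts influence which. The following is a minimal sketch under stated assumptions: the node labels and function names are hypothetical shorthand, not terms from Fig. 3, and only the links stated in the text are encoded (in particular, variation in outcomes reaches variation in scores through the mediator, and no direct link is assumed).

```python
from collections import deque

# Hypothetical shorthand labels for the concepts named in the text.
DETERMINANTS = ("disease_biorhythms", "intervention_timing")
MODERATORS = ("individual_factors", "environmental_factors")
MEDIATOR = "psychological_status"
CORE_CONSTRUCTS = ("outcome_variation", "score_variation")


def build_model():
    """Return the set of directed (source, target) links stated in the text."""
    edges = set()
    # Determinants directly influence only the variation of outcomes.
    for determinant in DETERMINANTS:
        edges.add((determinant, "outcome_variation"))
    # Psychological status and outcome variation influence each other.
    edges.add(("outcome_variation", MEDIATOR))
    edges.add((MEDIATOR, "outcome_variation"))
    # The mediator in turn influences variation in scores.
    edges.add((MEDIATOR, "score_variation"))
    # Moderators act on both determinants, both core constructs and the mediator.
    for moderator in MODERATORS:
        for target in DETERMINANTS + CORE_CONSTRUCTS + (MEDIATOR,):
            edges.add((moderator, target))
    return edges


def influences(edges, source, target):
    """Breadth-first search: is there any directed path from source to target?"""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for start, end in edges:
            if start == node and end not in seen:
                if end == target:
                    return True
                seen.add(end)
                queue.append(end)
    return False


model = build_model()
```

Under this encoding, a determinant such as disease-related biorhythms has no direct link to variation in scores but still reaches it indirectly, via variation in outcomes and psychological status.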
Moderators: individual and environmental factors
One of the fundamental determinants of health is the person’s individual characteristics and behaviour. Individual factors can be defined partly in terms of the demographics (e.g. age, gender) of the population being studied, and partly in terms of their personality, motivation, values and preferences. The impact of motivation and personality is reinforced by research conducted by Hardt et al. [50] and Graham-Engeland et al. [45], which links personality characteristics such as mood-like traits to the experience of pain. An individual’s level of acceptance or determination changes the way they perceive their outcomes (e.g. symptoms, functional status). For example, in patients with spinal cord injury, pain acceptance was seen to buffer the expected increases in pain interference and decreases in physical activity in the context of high pain [30]. Individual thresholds could also determine changes in scores longitudinally, especially in relation to subtle changes in pain that occur for those with high pain thresholds. Multimorbidity adds to the complexity of completion and interpretation of PROMs and was an important concept to consider in the articles. Co-morbid conditions sharing similar symptoms can affect how patients report on one particular condition, with symptoms in one condition (e.g. pain in rheumatoid arthritis) potentially triggering another condition (e.g. depression) [23, 45].
Environmental determinants of health include both the physical and social environment in which individuals live and work. The physical environment includes the natural setting (e.g. weather, bioenvironmental markers, etc.) and the human setting (urban/rural). For example, temperature changes over the year can affect symptom status for COPD sufferers, exacerbating their symptoms in the winter [32] and limiting their participation in activities. Furthermore, cold weather has been associated with a breakthrough of chronic prostatitis/chronic pelvic pain syndrome symptoms in the winter compared to acute symptoms reported in the summer [43]. External rhythms, such as exposure to sunlight or external stimuli, have been linked to variation in outcomes and psychological status, with increased sunlight linked to better outcome scores [28, 48] and long exposure to external stimuli linked to worsening outcomes [38]. Sleep quality was highlighted as a contributing factor to worsening PRO scores: sleep disruption, triggered by variables such as stress [34, 37, 49] or night-time symptoms [51], affected mood upon awakening, fatigue [23, 44, 49], and overall functioning [37].
Determinants
Two main sources of outcome variation are identified: disease-related biorhythms, and timing of health care interventions (including medication). Disease-related biorhythms are the natural cycles of change in the body’s chemistry or function and symptoms [26], related to the health condition, which function in a rhythmic pattern. For example, those with rheumatoid arthritis present a diurnal patterning with regard to their symptoms [23, 54], whilst cortisol levels that affect mood in seasonal affective disorder have a circannual rhythm [48]. These biorhythms govern certain health outcomes such as symptoms and function [26], and ultimately affect health related quality of life.
The timing of medical interventions (such as the dosage and pharmacokinetics of medication) is an important factor to consider, as it has significant consequences for variation in health outcomes, owing to both the intended and the adverse effects of the interventions [22, 26, 33, 53]. Cancer treatments have severe effects on individuals’ symptoms and functional ability. Breast cancer patients present a distinct infradian patterning of fatigue levels following chemotherapy treatments, typically highest within 24 to 48 h following treatment [49]. The type of intervention prescribed (whether pharmacological or not) differs between conditions and has varying levels of impact on an individual’s overall outcome. In some conditions, the time of year at which an intervention such as rehabilitation is administered affects overall health outcomes post-completion. For example, Sewell et al. [32] showed that for COPD patients seasonal variations have an important impact on functional performance after pulmonary rehabilitation.
Variation in health outcomes
Variation in health outcomes depends on health conditions, the type of health outcomes (as outlined in the existing models/classification systems on health outcomes), and time (periods). The studied health conditions show cyclical patterns in their effects on health outcomes such as symptom and functional status, and health related quality of life. Individuals with musculoskeletal and nervous system conditions experience a diurnal patterning of symptoms during the day, with fatigue and pain worsening by the end of the day [23, 24, 29, 34, 36, 37, 39, 41, 44, 45, 47, 53, 54]. However, individuals with respiratory conditions experience a different diurnal patterning of symptoms whereby symptoms are worse in the morning and evening [26, 27, 33, 51]. In addition, respiratory conditions have seasonal patterning with individuals reporting increased symptom severity levels over winter months [32].
Functional status, one’s ability to perform daily tasks, varies with health conditions and time [30, 32, 34, 39, 41, 48]. It is apparent that one’s functional status presents a diurnal and infradian rhythmic patterning depending on the health condition. For example, functional performance for COPD patients worsens in the winter months [32], and arthritis sufferers experience greater functional difficulties in the mornings and on the days following nights of poorer perceived sleep quality [34, 39].
Although health related quality of life (HRQoL) was not extensively researched in the papers, some acknowledged the association between HRQoL and the symptoms and functioning experienced by individuals [24,25,26,27, 29, 30, 39, 40, 43, 51, 56], with fluctuations in symptoms and functioning across conditions being associated with lower HRQoL. It is evident that fluctuating health outcomes have a bidirectional relationship with an individual’s psychological status, in that mood is affected by and affects symptoms, functioning and HRQoL.
Mediator: psychological health status
Although psychological health status is also a health outcome, it has been presented as a mediator in this model. The rationale behind this is that psychological health status strongly impacts on and is impacted by all the other concepts in the model. The mental state an individual is in appears to be determined by the two moderators as well as the other health outcomes. The other concepts within the model influence the (non-observable) mediator concept (psychological health), which in turn influences variation in scores. Psychological health status incorporates mood (e.g. emotions), cognition and general psychological and mental functions. An individual’s psychological health status is determined by both the individual and environmental variables. In our model, psychological health status is a mediator between variation of PROs and variation in the scores. A change in psychological status resulting in worse outcome scores has been observed for patients with MS [44], arthritis [47], or mental health problems. Variations in mood have been linked to fluctuations in pain, stiffness, and fatigue in children with chronic arthritis [34]. As represented in the model, the relationship between psychological status and variation of outcomes is bidirectional. Bulimic patients, for example, tended to engage more in bulimic behaviour on days when negative emotion was high, and vice versa. In addition, mood measured in a previous month predicted pain severity in the next month [45].
Psychological health status also played a role in the prediction of reduced social activities for children with chronic arthritis, demonstrating its link with functional status [47], with lower mood and stiffness being predictors of school attendance. The relationship between psychological status and variation of scores is unidirectional, in that lower mood at the time of completing a PROM affects how an individual remembers their experience of their condition, which in turn affects the scores [45]. Psychological health status also fluctuates over time, with research demonstrating within-person fluctuation over short periods of time [35].
Variation in scores
Variation in scores depends on several internal processes that an individual uses to complete a measurement tool. Completing an outcome measure relies on the ability of individuals to appraise their condition, which is a cognitive process. The internal processes (integration) involved when individuals appraise their condition are influenced by their cognitive processing and their recall. As completion of a PROM requires individuals to reflect on their health, there is a degree of recall involved, which impacts on and is impacted by how individuals integrate their experience. All of these concepts then lead to what is completed on the measurement tool and to the interpretation of outcome scores.
Within-person variance was commonly observed for different mood disorders in daily and weekly scores, including suicidal ideation [42], eating disorders [55], and bipolar and borderline personality disorder [27]. Cognitive decline and an increase in fatigue during the day are observed in MS patients, affecting their ability to perform tasks [30, 35], with substantial moment-to-moment and day-to-day fluctuations in fatigue severity found in relapsing-remitting MS patients [41]. This decline in cognitive function can affect the internal processes involved in responding to an outcome measure, ultimately affecting the PROM score.
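To make the notion of within-person variance concrete, the following is a minimal descriptive sketch that separates repeated scores into a within-person and a between-person component. It is an illustration only, not the analysis used in any of the included studies (which may have used multilevel models); the function name, participant labels and scores are hypothetical.

```python
from statistics import mean, pvariance


def decompose_variance(scores_by_person):
    """Split repeated scores into within- and between-person variance.

    scores_by_person maps a participant label to that person's repeated
    scores (at least one score each). Every person is weighted equally.
    """
    person_means = [mean(scores) for scores in scores_by_person.values()]
    # Within-person: how much each person's scores vary around their own mean.
    within = mean([pvariance(scores) for scores in scores_by_person.values()])
    # Between-person: how much the person-level means differ from one another.
    between = pvariance(person_means)
    return within, between


# Hypothetical repeated fatigue scores for two participants (illustration only).
fatigue_scores = {
    "participant_a": [2, 4],
    "participant_b": [6, 8],
}
within_var, between_var = decompose_variance(fatigue_scores)
```

When every participant contributes the same number of scores, as here, the two components sum to the population variance of all scores pooled together.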
The sensitivity of the measurement to detect changes in outcomes over time, and how clinically important change is defined within studies, were important issues discussed in the articles [35, 39]. Diaries were more sensitive to daily score changes than measures obtained by patient interview, both for pain intensity in cancer patients [48] and for young people with juvenile idiopathic arthritis [22]. The timing of measurements has been shown to be of significant importance, particularly in conditions that affect cognitive performance, such as MS, in which cognitive performance declines as the day progresses [30].
Daily measurements of mood, in one study, impacted the evaluation of health outcomes when measuring efficacy of psychopharmacological or psychological interventions [27]. However, daily measurements can also affect how individuals report their symptoms. For example, in one study pain significantly decreased during the second week of the study; the daily measurement may have acted as an unintentional feedback intervention, resulting in changes in participants’ appraisals or pain management [26]. It is unclear, however, whether the changes also reflected a natural decrease in pain, and thus true variation in outcomes.
Recall bias occurs when patients remember an event or experience incorrectly [57]. Retrospective accounts can lead to misclassification of symptoms [20] and to an over-estimation of symptoms [37, 48]. Psychological health status [40], symptoms at the time of recall [23], length of the recall period, and primacy or recency of information [37] all affect how individuals appraise their condition. A systematic review of studies on major depressive disorders revealed that negative recall bias in these patients exists mostly in the under-reporting of negative affect [34]. Asking patients to summarise their mood over a requested period potentially overlooks clinically meaningful differences in symptom patterns which could be picked up at each moment in time [37]. Although pain scores were higher in the evening and fluctuated across the weeks, pain recall was inaccurate for cancer patients, who over-estimated the pain reported in a previous week [48].
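One simple way to express over- or under-estimation in recall is to compare a single retrospective rating with the average of the momentary ratings collected over the same period. The following is a minimal sketch under that assumption; the function name and all ratings are hypothetical, and the included studies may have quantified recall accuracy differently.

```python
from statistics import mean


def recall_discrepancy(momentary_scores, recalled_score):
    """Recalled score minus the mean of the momentary scores.

    A positive value means the retrospective rating over-estimates the
    average of what was reported in the moment; a negative value means
    it under-estimates it.
    """
    if not momentary_scores:
        raise ValueError("at least one momentary score is required")
    return recalled_score - mean(momentary_scores)


# Hypothetical daily pain ratings (0-10 scale) and a weekly recall rating.
week_of_pain_ratings = [3, 4, 2, 5, 4, 3, 4]
recalled_weekly_pain = 6
discrepancy = recall_discrepancy(week_of_pain_ratings, recalled_weekly_pain)
```

In this made-up example the recalled rating is higher than the daily average, which is the direction of bias described for the cancer patients above.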