
Selection and validation of a classification system for a child-centred preference-based measure of oral health-related quality of life specific to dental caries



Caries Impacts and Experiences Questionnaire for Children (CARIES-QC) is a child-centred caries-specific quality of life measure. This study aimed to select, and validate with children, a classification system for a paediatric condition-specific preference-based measure, based on CARIES-QC.


First, a provisional classification system for a preference-based measure based on CARIES-QC was identified using Rasch analysis, psychometric testing, involvement of children and parents, and the developer of CARIES-QC. Second, qualitative, semi-structured ‘think aloud’ validation interviews were undertaken with a purposive sample of children with dental caries. The interviewer aimed to identify whether items were considered important and easily understood, whether any were overlapping and if any excluded items should be reintroduced. Interview recordings were transcribed verbatim and thematic analysis conducted.


Rasch analysis identified poor item spread for the items ‘cross’ and ‘school’. Items relating to eating were correlated and the better performing items were considered for selection. Children expressed some confusion regarding the items ‘school’ and ‘food stuck’. Parent representatives thought that impacts surrounding toothbrushing (‘brushing’) were encompassed by the item ‘hurt’. Five items were selected from CARIES-QC for inclusion in the provisional classification system; ‘hurt’, ‘annoy’, ‘carefully’, ‘kept awake’ and ‘cried’. Validation interviews were conducted with 20 children aged 5–16 years old. Participants thought the questionnaire was straightforward and covered a range of impacts. Children thought an item about certain foods being ‘hard to eat’ was more relevant than one about having to eat more carefully because of their teeth and so the ‘carefully’ item was replaced with ‘hard to eat’.


Following child-centred modification, the preliminary five-item classification system is considered valid and suitable for use in a valuation survey. The innovative child-centred methods used to both identify and validate the classification system can be applied in the development of other preference-based measures.


Dental caries (tooth decay) is a prevalent oral disease, causing significant negative impacts on the lives of children and young people [1, 2]. Whilst pain is the most common feature of caries, there is a growing body of evidence on the further impacts relating to pain on children’s daily lives [3]. These include time off school, difficulty sleeping, speaking, eating and interference with everyday activities [4, 5]. Furthermore, a number of studies have highlighted links between dental caries and general health, with higher levels of untreated dental caries reported to be associated with reduced weight and poor growth [6, 7].

The wider impacts of caries on society are also substantial. In England, approximately 41,558 children aged up to 16-years were admitted to hospital in 2018–2019 with a diagnosis of dental caries [8]. As a result, dental caries remains the most common reason for children to require hospital admission with an estimated annual cost of £39 million to the National Health Service (NHS) [9].

Dental caries is a largely preventable disease and a range of community-based programmes and clinical strategies have been adopted to reduce the prevalence in children. However, there have been few high quality economic evaluations to determine the cost effectiveness of such programmes [10,11,12]. This creates difficulties for decision-makers and commissioners in determining which interventions to provide within the remit of the NHS [13, 14].

Within child oral health research, this paucity of economic evaluations could be attributed to the lack of a suitable instrument to measure Quality Adjusted Life Years (QALYs) [15]. Presently, only one generic preference-based measure has been used in child oral health research, with limited success; the Child Health Utility 9D (CHU9D) was not found to be sensitive enough to changes in caries status [15, 16]. The lack of use of other measures and the poor psychometric performance of CHU9D suggests that the content of child and adolescent generic preference-based measures may not be appropriate or sensitive for use in oral health research.

There is a clear need for the development of a validated preference-based measure, specifically for children, that is appropriate for measuring treatment benefits for dental caries [15]. This is achievable through the adaptation of a novel child-centred caries-specific oral health-related quality of life (OHRQoL) measure, known as CARIES-QC (Caries Impacts and Experiences Questionnaire for Children) [17]. This measure was developed with involvement of children at every stage, addressing the primary limitation of a number of other measures of OHRQoL [18]. In its current form, CARIES-QC cannot be used to generate QALYs since it is not preference-based. A preference-based measure consists of: a) a classification system that is used to describe the health of all children; and b) a value set used to score all health states described by the classification system.

It is not feasible or practical to gain preference weights from children for all of the twelve items within CARIES-QC [19]. As such, it was necessary to identify a smaller number of items within the measure to form the classification system. Furthermore, the preliminary classification system would require validation with children prior to use in a valuation survey.

This study aimed to identify a classification system for a child-centred preference-based measure using a combination of statistical methodologies and involvement of stakeholders including children, young people and parents. Furthermore, this study sought to validate, and refine where necessary, the preliminary classification system with a sample of children who had experience of dental caries using a qualitative approach. The valuation of the classification system to generate a preference-based measure will be reported elsewhere.


Ethical approval for this study was obtained from Yorkshire and the Humber Research Ethics Committee (Reference: 18/YH/0148).

Identification of the classification system

Several condition-specific preference-based measures have been developed using a staged approach that selects the classification system using a combination of Rasch Analysis, classical psychometric analysis and developer input [20,21,22,23]. The present study builds upon this approach by also incorporating child and parent views. The following stages were used to identify the most appropriate items for a classification system:

  1. Rasch Analysis
  2. Classical psychometric analysis
  3. Patient and Public Involvement (PPI)
  4. Developer input

The study team discussed the findings of each approach, particularly where stakeholder views were found to conflict with the results of statistical analyses. Where this occurred, agreement was sought by consensus on which items should be selected for inclusion in the preliminary classification system. The final part of this study involved the validation of the preliminary classification system using a qualitative approach. Revisions to the classification system were undertaken accordingly.


The CARIES-QC is a 12-item measure (Table 1) that seeks children’s assessment of the severity of their caries-related impacts, and has been deemed appropriate for use with 5–16 year-olds. The response format of this measure differs from other measures of OHRQoL in that the three levels (‘not at all’, ‘a bit’ and ‘a lot’) relate to the severity, rather than the frequency, of impacts [18]. It has a simple summative scoring system, whereby the difference between each level is assumed to be equal for each item; a response of ‘a bit’ would be assigned one point, ‘a lot’ would score two points, whilst ‘not at all’ suggests the impact has not been experienced and hence no points are assigned. This instrument has been reported to have “acceptable validity, reliability and responsiveness” [17, 24]. Furthermore, the involvement of children at every stage during the development of CARIES-QC addresses an acknowledged need to view children as active participants within research [25].
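As a minimal sketch of the summative scoring described above, the following Python snippet maps each response level to its points and sums them. The item labels and the dictionary layout are illustrative assumptions; only the 0/1/2 scoring rule comes from the text.

```python
# Points per response level, as described in the text.
LEVEL_POINTS = {"not at all": 0, "a bit": 1, "a lot": 2}


def caries_qc_total(responses: dict[str, str]) -> int:
    """Sum the points across all answered items (higher = greater impact)."""
    return sum(LEVEL_POINTS[answer] for answer in responses.values())


# Hypothetical responses for three items only, for illustration.
example = {"hurt": "a lot", "annoy": "a bit", "cried": "not at all"}
total = caries_qc_total(example)  # 2 + 1 + 0
```

With all twelve items answered, the total under this rule would range from 0 to 24.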

Table 1 The questions within CARIES-QC (excluding the global question), the related items and severity levels

Data set

The data set for this study came from the original validation study for the CARIES-QC measure, which has been published elsewhere [17]. The data were from a sample of 200 children aged 5 to 16 years who had a diagnosis of active dental caries. Children were asked to complete the CARIES-QC measure at three different timepoints: baseline (T0), prior to the start of treatment (T1) and following a course of dental treatment to manage the caries (T2). Whilst all timepoints were used in the original validation of CARIES-QC, the present study used data from timepoint T0 on which to conduct psychometric and Rasch analyses, as this had the highest number of observations with no attrition. A range of clinical data were also collected to establish the number of teeth affected by caries, whether children reported symptoms, and the pattern of caries (i.e. whether it affected the front teeth) [17].

Rasch analysis

Rasch Analysis was used to convert each participant response onto a latent continuous scale representing the severity of impacts relating to OHRQoL, and to assess the spread of responses across the three response levels for each item [20]. A greater spread indicates that respondents are able to distinguish between the item levels, so items with a greater spread are stronger candidates for inclusion in the classification system.

In this study, the Rasch Analysis focussed on the spread of items across the three levels (response categories) at logit 0, whereby a greater spread indicated the respondent was able to distinguish between the item levels. Item (χ2) goodness-of-fit statistics were also calculated, with the items having the best fit to the underlying model being the best candidates for inclusion in the classification system. Item fit residual scores were also applied in the same way, with those closest to 0 indicating a better fit to the model.
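The Results state that the partial credit model was used. As a hedged illustration of how that model relates a respondent's location on the latent scale to the three response categories, the sketch below computes category probabilities for one item. The threshold values are invented for illustration, and this is not a reproduction of how RUMM2030 estimates or reports item spread.

```python
import math


def pcm_probabilities(theta: float, thresholds: list[float]) -> list[float]:
    """Partial credit model category probabilities for a single item.

    theta      -- person location on the latent scale (logits)
    thresholds -- one threshold per step between adjacent categories
                  (two thresholds for a three-level item)
    """
    # Category k has numerator exp(sum over j <= k of (theta - threshold_j));
    # category 0 corresponds to the empty sum, i.e. exp(0).
    cumulative = [0.0]
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    weights = [math.exp(c) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical thresholds at -1 and +1 logits, person located at logit 0.
probs = pcm_probabilities(0.0, [-1.0, 1.0])
```

With ordered thresholds that are further apart, the middle category ('a bit') is the most likely response over a wider range of the latent scale, which is one way to picture respondents distinguishing between the levels.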

Differential Item Functioning (DIF) was also assessed to determine whether each item was working the same across respondents of different ages, genders, ethnicities and levels of deprivation according to Index of Multiple Deprivation (IMD) scores [26]. Threshold analyses and assessment of local dependencies were also conducted.

Rasch Analysis was conducted using RUMM2030™ software Version 5.3 (©Rumm Laboratory Pty Ltd., Perth, Australia).

Classical psychometric testing

Classical psychometric analyses were carried out using SPSS® software (IBM Corporation., New York, United States, Version 24) [20]. Exploratory Factor Analysis was carried out to establish the dimensional structure of CARIES-QC. This was followed by four classical analyses in line with other studies of this type [20, 27].

Firstly, analyses to determine the rate of missing data were undertaken to evaluate item feasibility. Items with more than 5% missing data were considered to be poor candidates for inclusion within the classification system [28].
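The feasibility check can be sketched as follows, assuming (hypothetically) that each item's responses are held in a list with `None` marking a missing answer. The 5% cut-off is from the text; the data are invented.

```python
def missing_rate(responses: list) -> float:
    """Proportion of responses that are missing (None)."""
    return sum(r is None for r in responses) / len(responses)


def poor_feasibility_items(data: dict[str, list], cutoff: float = 0.05) -> list[str]:
    """Items whose missing-data rate exceeds the cut-off."""
    return [item for item, responses in data.items()
            if missing_rate(responses) > cutoff]


# Invented data: 40 respondents per item.
data = {
    "hurt": ["a bit"] * 39 + [None],        # 2.5% missing
    "school": ["not at all"] * 36 + [None] * 4,  # 10% missing
}
flagged = poor_feasibility_items(data)
```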

Internal consistency would usually be determined by comparison of the item with its respective domain score, though in the absence of established domains, correlations between each item with the global question and total score were determined using Spearman’s correlation coefficient. Furthermore, correlations between items were assessed to identify items that were capturing the same aspect of quality of life, where one of the items may be selected in the classification system to reflect the wider set of items.
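For readers unfamiliar with Spearman's coefficient, a minimal standard-library sketch is given below: responses are converted to ranks (ties share their average rank, which matters with only three response levels) and Pearson's correlation is computed on the ranks. The study itself used SPSS; the example data are invented.

```python
from statistics import mean


def average_ranks(values: list[float]) -> list[float]:
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x: list[float], y: list[float]) -> float:
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented item scores (0/1/2) and total scores for five respondents.
item_scores = [0, 1, 2, 1, 0]
total_scores = [3, 10, 20, 9, 2]
rho = spearman(item_scores, total_scores)
```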

The distribution of responses was also analysed. Floor and ceiling effects were deemed to be present if more than 15% of participants chose the best (‘not at all’) or worst (‘a lot’) responses [29]. It was acknowledged, however, that in a measure with only three response options, most of the items would have some degree of a floor or ceiling effect, or both. Items with strong floor effects were considered to be poor candidates for the classification system given that they would not be able to capture a deterioration in health. Conversely, items with strong ceiling effects were considered for selection as this suggested an ability to capture the impacts of higher disease severity.
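The 15% rule can be sketched as below. The labelling follows the usage in the Results of this paper, where a floor effect refers to clustering at the worst level ('a lot') and a ceiling effect to clustering at the best level ('not at all'); the response counts are invented.

```python
def floor_ceiling(responses: list[str], cutoff: float = 0.15) -> dict:
    """Flag floor/ceiling effects for one item using the 15% rule."""
    n = len(responses)
    share_worst = responses.count("a lot") / n
    share_best = responses.count("not at all") / n
    return {
        "share_worst": share_worst,
        "share_best": share_best,
        "floor": share_worst > cutoff,     # clustering at 'a lot'
        "ceiling": share_best > cutoff,    # clustering at 'not at all'
    }


# Invented distribution for one item across 100 respondents.
responses = ["not at all"] * 82 + ["a bit"] * 10 + ["a lot"] * 8
result = floor_ceiling(responses)
```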

The responsiveness of each item was estimated using the Standardised Response Mean (SRM) in line with similar studies [20, 30]. This was determined to be the most appropriate indicator of effect size given the presence of a correlation greater than 0.5 (Pearson correlation coefficient = 0.529) between baseline (T0) and follow-up (T2) scores [31]. The SRM (also known as Cohen’s d) was calculated by dividing the mean score change (follow-up score (T2) minus the baseline score (T0)) by the standard deviation of the change [31]. The SRMs were interpreted using Cohen’s criteria, whereby < 0.2 is deemed inconsequential, 0.2–0.5 is considered small, 0.5–0.8 is considered moderate and above 0.8 is considered large [32, 33]. A higher SRM indicated greater sensitivity to change.
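The calculation described above can be sketched as follows. The scores are invented. Because higher scores indicate greater impact, improvement after treatment gives a negative mean change, so the interpretation here is applied to the absolute value; that choice, and assigning the boundary values 0.2, 0.5 and 0.8 to the higher band, are assumptions made for this illustration.

```python
from statistics import mean, stdev


def standardised_response_mean(baseline: list[float], follow_up: list[float]) -> float:
    """Mean of (follow-up minus baseline) divided by the SD of that change."""
    changes = [t2 - t0 for t0, t2 in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


def interpret_srm(srm: float) -> str:
    """Label the magnitude using the thresholds quoted in the text."""
    size = abs(srm)
    if size < 0.2:
        return "inconsequential"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "moderate"
    return "large"


# Invented item scores for four children at T0 and T2.
t0_scores = [2, 2, 1, 2]
t2_scores = [0, 1, 1, 0]
srm = standardised_response_mean(t0_scores, t2_scores)
label = interpret_srm(srm)
```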

Views of patient and public involvement (PPI) representatives

A panel of children and young people, including personal contacts, local schoolchildren and patients from a paediatric dental clinic, was invited to give views on the items within CARIES-QC at one of two informal meetings held in May and July 2017. The panel comprised children of a range of ages, genders and ethnicities, with differing experiences of dental caries. This panel was also involved as a steering group for the overall study. These discussions focussed on how important each item was felt to be, and whether any items were considered to overlap or to be too similar.

Two parent representatives (one mother, one father) were also involved in these discussions, to provide their thoughts on the items within CARIES-QC from their perspectives. The parents were both personal contacts of members of the research team, though had no clinical background. Each parent had two children, one of whom had experience of dental caries. The parent representatives continued to be involved throughout the duration of the study.

CARIES-QC development insights

The fourth stage of this process centred on informal discussions with researchers involved in the development of CARIES-QC. It was important to acknowledge any issues or concerns identified by the research team during the development of this instrument, particularly since children were involved at every stage. Furthermore, it was essential that any difficulties surrounding the use of the instrument in different settings and languages were considered.

The findings from these four steps were discussed by the research team, which involved clinicians, a senior health economist, and the researchers who led the development of CARIES-QC. Each approach was weighted equally (i.e. no single approach provided results that were valued more highly than another). The outcome from this meeting was an agreed preliminary classification system.

Child-centred validation of the preliminary classification system

Validation of the preliminary classification system was undertaken with children and young people who had a diagnosis of dental caries. Potential participants were identified via referral letters from general dental practitioners received at the Paediatric Dental Clinic at the Charles Clifford Dental Hospital, Sheffield. Patients with known diagnoses of dental caries were approached following their initial examination at the dental hospital. A maximum variation purposive sampling approach was used to ensure the inclusion of participants of different ages, genders, ethnicities and levels of deprivation. Participants were not eligible for inclusion if they were outside the 5- to 16-year-old age range for which CARIES-QC was developed. Furthermore, children and parents who were unable to understand spoken and written English were excluded. A similar approach was used in both formulating the descriptive system and testing the content validity of CARIES-QC [24]. Based on this previous research, it was expected that approximately 20 interviews would be required to reach data saturation.

Parents and children were invited to consent and assent to participate respectively. Qualitative semi-structured interviews were conducted by an experienced qualitative researcher (HJR). A topic guide (see Supplement 1) was used to inform the interviews, which were recorded and transcribed verbatim. Children were asked to ‘think aloud’ whilst completing questions from CARIES-QC within the preliminary classification system whilst the interviewer aimed to determine whether items were considered important, easily understood, and whether any were overlapping [34]. Children were then shown items that were excluded from the preliminary classification system and questioned further to determine whether any should be reintroduced.

Further sociodemographic data, including participant age and ethnicity were also collected. Postcodes were documented to facilitate calculation of the Index of Multiple Deprivation for each participant, given the well-acknowledged relationship between caries experience and socioeconomic deprivation [26, 35]. Clinical caries experience was recorded for each participant, collating the number of decayed, missing and filled primary and permanent teeth, in the form of the dmft and DMFT indices respectively [36].
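As a simple illustration of the dmft and DMFT indices, each index is the count of decayed, missing and filled teeth in the primary and permanent dentition respectively. The status labels and tooth lists below are invented, and the sketch assumes that 'missing' means missing due to caries.

```python
# Statuses that contribute to the index (labels are illustrative).
COUNTED = {"decayed", "missing", "filled"}


def dmf_count(tooth_statuses: list[str]) -> int:
    """Number of teeth recorded as decayed, missing or filled."""
    return sum(status in COUNTED for status in tooth_statuses)


# Invented chart for one child: primary and permanent teeth kept separate.
primary_teeth = ["sound", "decayed", "filled", "sound"]
permanent_teeth = ["sound", "missing", "sound"]

dmft = dmf_count(primary_teeth)    # primary dentition index
DMFT = dmf_count(permanent_teeth)  # permanent dentition index
```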

Simple descriptive statistics were undertaken on the quantitative data. Qualitative data were analysed by two researchers independently (HJR and ZM) using the framework method to inform validation of the classification system, using NVivo 12 (©QSR International Pty Ltd) software for data management. This latter analysis focussed on identifying children’s level of understanding for each item, the amount of importance participants placed upon each item and whether they considered any as redundant or overlapping. PPI representatives for the study were involved in confirming that the interpretation of quotes from children and young people was correct. The study team discussed the qualitative findings, which were used to inform modification of the preliminary classification system as required.


Identification of the classification system

Rasch analysis

The 200 participants from the aforementioned CARIES-QC validation dataset were included in the Rasch analysis, which used the partial credit model. The sociodemographic characteristics and caries experience of the participants in this dataset are provided in Table 2. Overall, the CARIES-QC data were found to have a good item (mean 0.385 ± 0.902) and person fit (mean 0.254 ± 0.999) to the Rasch model.

Table 2 Sociodemographic characteristics and caries experience of participants from the original CARIES-QC validation study (dataset used to undertake Rasch analysis and classical psychometric testing in the present study) and the qualitative validation of the preliminary classification system derived from CARIES-QC

Regarding the individual items, none were found to have disordered thresholds (Fig. 1), and none were subject to local dependency (all residual correlations were less than 0.2 above the average correlation) [37].
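The local dependency criterion quoted above can be sketched as a comparison of each pairwise residual correlation with the average of all such correlations. The item names and values below are invented, and the input format is an assumption; the study obtained these statistics from RUMM2030.

```python
from statistics import mean


def locally_dependent_pairs(
    residual_corr: dict[tuple[str, str], float], margin: float = 0.2
) -> list[tuple[str, str]]:
    """Item pairs whose residual correlation exceeds the average by more than the margin."""
    average = mean(residual_corr.values())
    return [pair for pair, r in residual_corr.items() if r > average + margin]


# Invented residual correlations between three hypothetical items.
residual_corr = {
    ("item_a", "item_b"): 0.05,
    ("item_a", "item_c"): -0.10,
    ("item_b", "item_c"): 0.45,
}
dependent = locally_dependent_pairs(residual_corr)
```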

Fig. 1

Threshold map for the items within CARIES-QC

Table 3 reports the results of the Rasch analysis. The items with the highest spread across the three levels at logit 0 were ‘food stuck’ (1.632), ‘hurt’ (1.605), ‘hard to eat’ (1.585) and ‘cried’ (1.466) respectively. Those with the lowest item spread, and hence candidates for exclusion from the classification system, were ‘cross’ (0.705), ‘one side’ (0.858), ‘school’ (0.894) and ‘brushing’ (0.913) respectively.

Table 3 Summary of key results from Rasch Analysis, classical psychometric testing, involvement of PPI representatives and discussions with the developers of CARIES-QC

Regarding goodness-of-fit, the items ‘food stuck’ and ‘annoy’ did not fit the Rasch model at the 5% significance level (p = 0.036 and p = 0.013 respectively). Conversely, the best-fitting items were ‘hurt’ (χ2 = 5.142), ‘carefully’ (χ2 = 4.367) and ‘cried’ (χ2 = 4.237).

The items ‘annoy’ and ‘carefully’ were found to have high negative item fit residuals (− 1.802 and − 1.801 respectively) and the item ‘cried’ was found to have a high positive fit residual (1.112). Whilst these are notable, and could potentially indicate item redundancy (associated with Item-Total Correlation), a level of +/− 2.5 should normally be reached for this to cause concern.

The items ‘hard to eat’ (0.031) and ‘cross’ (0.021) were found to have uniform differential item functioning (DIF) with regard to age at the 5% level. ‘Hard to eat’ also showed non-uniform DIF (0.014) at this level, as did ‘one side’ (0.049). The item ‘food stuck’ appeared to be working differently for variations in age groups (F = -0.293) and genders (F = -0.126).

Classical psychometric testing

Classical psychometric tests were undertaken on the same dataset used for the Rasch analysis (Table 2).

Principal component factor analysis identified only one factor to be present. This factor accounted for 45.54% of the total variance. The high Kaiser-Meyer-Olkin measure of sampling adequacy result of 0.914 determined that the sample was suitable for factor analysis. The statistically significant Bartlett’s Test of Sphericity provided confirmation that the variables were correlated; a degree of correlation is necessary for factor analysis. A Scree plot and further results from the factor analysis can be seen in Supplement 2.

No items were found to have missing values greater than 5% suggesting there were no issues surrounding feasibility [28].

There were moderate levels of correlation (between 0.3 and 0.5) between most items within CARIES-QC (see Supplement 3). Strong correlations (between 0.5 and 0.9) were found between the item ‘annoy’ and five other items, namely ‘hurt’ (r = 0.59), ‘one side’ (r = 0.58), ‘kept awake’ (r = 0.52), ‘carefully’ (r = 0.55), and ‘cross’ (r = 0.51). Similarly the item ‘carefully’ had strong correlations with four other items, namely ‘hard to eat’ (r = 0.51), ‘one side’ (r = 0.63), ‘annoy’ (r = 0.55), and ‘slowly’ (r = 0.60). This suggests that a smaller number of items within the classification system could reflect what is captured by the wider measure. As the factor analysis did not identify multiple domains within CARIES-QC, correlations were undertaken between each item and the global question and total score at baseline (T0). All items had positive correlations with both the global question and the total score.

Regarding the distribution of responses, ‘food stuck’ was the only item to have a floor effect (32% responded ‘a lot’) without also having a ceiling effect. High ceiling effects were noted for ‘kept awake’ and ‘cross’, with 67% and 59% of respondents reporting no experience of these impacts. A particularly high ceiling effect (82%) was observed in the item ‘school’, suggesting it was possibly misinterpreted by participants.

Data were available for 38 participants at follow-up (timepoint T2) after receipt of treatment. These data were used to calculate the SRM. The SRM for each item can be seen in Table 3. A strong SRM (> 0.8) was found for ‘annoy’ (0.93), followed by moderate effect sizes for ‘food stuck’ (0.68) and ‘hurt’ (0.61). Trivial effect sizes were observed for ‘school’ (0.09) and ‘slowly’ (0.16).

Views of patient and public involvement (PPI) representatives

Children and young people noted that there were multiple items within CARIES-QC relating to eating, and many participants suggested that one item alone could encompass the others on this topic. Children thought the items ‘carefully’ and ‘hard to eat’ had the broadest remit, and that one of these could be considered in place of the rest.

Children expressed some uncertainty about whether the item ‘food stuck’ related to getting food stuck in their teeth in general, or getting food stuck in the holes in their teeth.

Children felt the term ‘annoy’ was too similar to ‘cross’. Older children in particular thought they would be less likely to use the word ‘cross’, and hence would prefer the item ‘annoy’.

Older children thought that their peers would not be likely to admit to crying about their teeth.

Child and parent representatives expressed some confusion about how schoolwork could be affected by teeth. They reasoned that if dental pain was causing the impacts on schoolwork, this may be captured elsewhere under the category of ‘hurt’.

Parent representatives thought that pain related to toothbrushing could also come under the umbrella term ‘hurt’. They also considered whether ‘hurt’ and ‘annoy’ might mean the same thing, though children and young people disagreed.

CARIES-QC development insights

Concerns had been raised about the translatability of the item ‘food stuck’ into other languages. Anecdotal evidence suggests that children may have a varied understanding of the schoolwork item. These two items could be excluded from the classification system on this basis.

Children and young people of different ages viewed the concepts of ‘hurt’ and ‘annoy’ to be different during development of CARIES-QC, although both terms were used to describe the physical sensations that they felt. This suggests it may be important to retain both of these items within the preliminary classification system. In the qualitative research undertaken during the development of the CARIES-QC, older children had admitted to crying about their teeth, in contrast to the suggestion made by the PPI representatives.

Discussion of preliminary classification system

The findings from all four steps outlined above were discussed between all members of the study team, and the preliminary classification system was agreed by consensus. A summary of the key discussion points is provided below, based upon the results seen in Table 3.

The items ‘food stuck’ and ‘school’ had issues noted in each of the four steps detailed above, and hence were excluded from the preliminary classification system. As the PPI representatives expressed a need for only one item relating to eating within the classification system, it was felt that eat more ‘carefully’ would encompass this best. This was in part due to its strong correlations with other items regarding impacts and experiences from eating, and its relatively good fit with the Rasch model. Similarly, the item ‘annoy’ was considered important to retain, given its strong correlations with clinical findings. Although parents expressed concerns that ‘annoy’ could be too similar to ‘hurt’, these items appeared to be independent of each other when analysing the data, and in previous qualitative research children considered them to be separate concepts during the development of the measure [38]. The items ‘cried’ and ‘kept awake’ were considered to be key components of the preliminary classification system, in order to represent the worst states.

Table 4 shows the five items that were selected to form the preliminary classification system, and the broad domains represented by each. The preliminary five-item classification system was then ready for validation with children and young people.

Table 4 The preliminary classification system and final validated classification system, with proposed domains

Validation of the preliminary classification system

‘Think aloud’ interviews were conducted with 20 participants, of whom 6 were male and 14 female, before data saturation was reached. Two potential participants declined to take part; one parent felt their child was too shy to participate, whilst the other reported a lack of time.

The sociodemographic characteristics and caries experience of participants are shown in Table 2. The majority of participants (n = 14) were White British, whilst the rest (n = 6) were a variety of different ethnicities. The age of participants ranged from 6 to 15 years with a mean of 10 years. Half of the participants (n = 10) were found to reside in the most deprived areas of England. All children had active dental caries. The mean dmft was 2.85 (SD 3.05; range 0–12) and DMFT was 1.7 (SD 2.88; range 0–11). The mean length of interview was 8 min and 10 s, though this ranged from below 5 min to upwards of 16 min, with the shortest interviews involving younger children.

The qualitative findings arising from the validation of the classification system are described below, with quotes provided to illustrate each aspect, using participant pseudonyms.


Children found the questions relating to the preliminary classification system straightforward to complete and did not appear to experience much difficulty in choosing an answer for each question. Furthermore, they believed the questions covered a range of impacts.

“They’re kind of easy … but they mean a lot” Jenny, 11 years old.

Children were unsure whether their school friends would be able to answer some of the questions that had been removed from the classification system.

On questioning, younger children struggled to make decisions between items and found it difficult to communicate a clear preference for items capturing similar aspects of health:

“Both … I like them both” Lucy, 6 years old.

Overlapping items

During the development of the preliminary classification system, parent representatives for the study had raised some concern that the items ‘hurt’ and ‘annoy’ were too similar and potentially overlapping. Nonetheless, these interviews suggest the contrary, as children felt ‘hurt’ and ‘annoy’ described different things, and considered them both to have value.

“I think they’re very different because annoying and hurt are two different meanings” Ali, 13 years old.

Importance of items

Children had conflicting views on the item ‘cried’ relating to the question ‘have you ever cried because of your teeth?’ Those who had experienced this impact placed greater importance on this item:

“‘Cause sometimes if they really hurt, I do cry … I actually think that is important” Lucy, 6 years old.

However, those who had never experienced this impact expressed confusion:

“I don’t really know why people would cry about their teeth” Lily, 14 years old.

Appropriateness of items

Children thought the question ‘do you have to eat more carefully because of your teeth?’ did not adequately describe the dietary restrictions resulting from caries. They displayed a clear preference for one of the questions that had been removed from the classification system, which asked whether their teeth made it hard to eat some foods.

“If you eat more carefully you can still eat but if you find it hard to eat you can’t really eat much” Leon, 9 years old.

“Because if you have to eat more carefully it’s like how you eat whereas “Does your teeth make it hard to eat some foods?” would like eliminate foods out.” Lily, 14 years old.

Child-centred modification of preliminary classification system

The findings from the qualitative interviews were then used to inform modifications to the preliminary classification system accordingly.

During the validation interviews, children raised some important issues with the item regarding eating more ‘carefully’, particularly that it failed to encompass their dietary limitations due to caries. They expressed a clear preference for the item ‘hard to eat’, and thought this item should be reinserted in the place of the problematic item. The rest of the items within the preliminary classification system were easily understood and considered to be both important and appropriate. Furthermore, children believed the items to be independent of each other, and not overlapping. The final validated classification system can be seen in Table 4.


This paper describes a novel approach to identify a classification system for a paediatric condition-specific preference-based measure from a condition-specific patient-reported outcome measure. It builds on the approach previously used to select items for many condition-specific preference-based measures by adding validation of the classification system through qualitative research with children. Furthermore, the methods used to validate the classification system engaged children both as active participants and as experts in their own health.

Children and young people felt that ‘hard to eat’ was a preferable candidate for the classification system compared to ‘carefully’, as it covered the wider impacts of caries on eating. The decision to replace ‘carefully’ with ‘hard to eat’ within the final classification system was well justified, given that ‘hard to eat’ had actually outperformed ‘carefully’ in a number of tests conducted in the Rasch analysis. Whilst it lacked the strong correlations with so many other items, its relevance and importance to children and young people were prioritised.

Interestingly, children who had not experienced dental pain severe enough to cause them to cry were unable to understand the relevance of this impact. The range of responses surrounding this item, from a sample who all have diagnosed dental caries, confirms previous research highlighting the variation in impacts that children can experience and how many suffer no symptoms at all [39]. Furthermore, the association between the number of carious teeth and the impacts experienced is often not as linear as one might expect [39]. Nonetheless, it is important for a preference-based measure to contain an item such as ‘cried’, since this is an impact that is only experienced by those with the greatest severity of the condition. This item, alongside the item ‘kept awake’, will play an important role in the formation of the worst health state possible within the valuation survey [40, 41].

The systematic and varied approaches used to identify and validate the classification system can be considered one of the strengths of this study. This level of involvement of children and young people is rarely employed in the development of classification systems for paediatric preference-based measures, such as the generic EQ-5D-Y (Euroqol-5 Dimension Youth) and HUI2 (Health Utilities Index 2), or condition-specific measures such as those for atopic dermatitis and asthma [42,43,44,45]. Furthermore, whilst qualitative approaches have been used to identify items for the classification systems of preference-based measures, particularly for older and younger populations, they have not been used to validate classification systems [46,47,48]. Qualitative validation offers benefits over a purely quantitative approach by ensuring that the items within the classification system are considered important by the relevant population. The active involvement of children and young people and the use of a qualitative validation approach could be applied to the future development of paediatric preference-based measures.

Many participants within the validation study lived in areas that were amongst the most deprived in England, which reflects the association between caries prevalence and deprivation [49, 50]. One potential limitation of this study is that it included disproportionately more female participants than males. This does not reflect the wider population, where there is a trend for boys to have a slightly higher prevalence of caries than girls [35]. Similarly, the clinical caries experience (dmft/DMFT) of participants in this study was much higher than the national average of 0.9 [35]. The prevalence of caries in 5-year-old children in Yorkshire and the Humber is known to be greater than the national average (28.5% compared to 24.7% respectively), though this discrepancy is more likely to be explained by the recruitment of participants from a tertiary referral centre. These participants are likely to have been referred to the dental hospital due to the extent of their disease, and resulting symptoms. Whilst this could be considered a limitation of the study due to the lack of representativeness of the sample, it could be argued that those experiencing the impacts described in CARIES-QC would be the most appropriate sample to validate the classification system. Furthermore, this approach ensured that those experiencing the most severe, and perhaps less frequently encountered impacts (e.g. crying) were involved.

The Rasch analysis and psychometric tests in this study were conducted on a relatively small dataset compared with those used to develop other HRQoL instruments and PBMs, where samples of around 400 to 700 participants have been used successfully [20, 51]. Nonetheless, Rasch analysis is known to be sensitive to larger sample sizes, which increase the frequency of statistically significant findings and so complicate item reduction [51, 52]. Importantly, the sample size in the present study was deemed sufficient for factor analysis.
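The sensitivity of significance tests to sample size can be illustrated with a generic Pearson chi-square test. This sketch is not the Rasch fit statistic used in the study, and the proportions and sample sizes are hypothetical values chosen for illustration. It holds a small discrepancy between observed and expected proportions fixed and shows how the test statistic and p-value change as the sample grows.

```python
import math


def chi_square_two_category(observed_prop, expected_prop, n):
    """Pearson chi-square statistic for a two-category outcome with n respondents."""
    observed = (observed_prop * n, (1 - observed_prop) * n)
    expected = (expected_prop * n, (1 - expected_prop) * n)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_1df(chi_square):
    """Upper-tail p-value for a chi-square statistic with one degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2))


# Hypothetical scenario: 55% observed versus 50% expected, at three sample sizes.
for n in (100, 400, 700):
    statistic = chi_square_two_category(0.55, 0.50, n)
    print(n, round(statistic, 2), round(p_value_1df(statistic), 4))
```

Because the statistic scales linearly with the number of respondents for a fixed discrepancy, the same modest misfit that is non-significant in a small sample can become significant in a large one.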

A range of viewpoints from an interdisciplinary panel was included in the discussions to identify both the preliminary and final classification systems, which could be considered a strength of this study. Nonetheless, the reproducibility of this approach is clearly limited, and a different group of researchers may well have selected different items for inclusion in the classification system.

In conclusion, following child-centred modification as detailed above, the preliminary classification system can now be considered valid, since it has been derived taking into account Rasch analyses, classical psychometric tests, PPI and developer input, clinical input, as well as involvement of children with dental caries. The five-item classification system is now suitable for use in a valuation survey with children and young people. This will facilitate generation of QALYs for children with caries, to better inform decision-makers and commissioners regarding the cost-utility of interventions to improve children’s oral health. Furthermore, the innovative methodology used to develop and validate this classification system can be used in the development of other preference-based measures.
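As a reminder of how utility values feed into QALYs, the sketch below applies the standard calculation of utility multiplied by time spent in each health state. The utility values and durations are entirely hypothetical and are not results from this study; no value set for this classification system is reported here.

```python
def qalys(periods):
    """Sum utility x duration (in years) over consecutive periods.

    `periods` is an iterable of (utility, years) pairs.
    """
    total = 0.0
    for utility, years in periods:
        if years < 0:
            raise ValueError("duration must be non-negative")
        total += utility * years
    return total


# Hypothetical one-year comparison; all numbers are made up for illustration.
untreated = [(0.80, 1.0)]               # one year in an impaired state
treated = [(0.80, 0.25), (0.95, 0.75)]  # three months impaired, then improved

qaly_gain = qalys(treated) - qalys(untreated)
print(round(qaly_gain, 4))
```

In a cost-utility analysis, a QALY gain of this kind would be set against the difference in costs between the interventions being compared.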

Availability of data and materials

The datasets used and analysed during the current study are available from the corresponding author on reasonable request.



Abbreviations

CARIES-QC: Caries Impacts and Experiences Questionnaire for Children
CHU9D: Child Health Utility 9 Dimension
DIF: Differential Item Functioning
dmft: Decayed, missing and filled primary teeth
DMFT: Decayed, missing and filled permanent teeth
EQ-5D-Y: Euroqol 5 Dimension - Youth
HUI2: Health Utilities Index 2
IMD: Index of Multiple Deprivation
NHS: National Health Service (UK)
PPI: Patient and Public Involvement
QALY: Quality Adjusted Life Year
SRM: Standardised Response Mean


References

  1. Benjamin, R. M. (2010). Oral health: The silent epidemic. Public Health Reports, 125(2), 158–159.
  2. Majewski, R. F., Snyder, C. W., & Bernat, J. E. (1988). Dental emergencies presenting to a children's hospital. ASDC Journal of Dentistry for Children, 55(5), 339–342.
  3. Schuch, H. S., Dos Santos Costa, F., Torriani, D. D., Demarco, F. F., & Goettems, M. L. (2015). Oral health-related quality of life of schoolchildren: Impact of clinical and psychosocial variables. International Journal of Paediatric Dentistry, 25(5), 358–365.
  4. Pau, A., Baxevanos, K., & Croucher, R. (2007). Family structure is associated with oral pain in 12-year-old Greek schoolchildren. International Journal of Paediatric Dentistry, 17(5), 345–351.
  5. Krisdapong, S., Sheiham, A., & Tsakos, G. (2009). Oral health-related quality of life of 12- and 15-year-old Thai children: Findings from a national survey. Community Dentistry and Oral Epidemiology, 37(6), 509–517.
  6. Miller, J., Vaughan-Williams, E., Furlong, R., & Harrison, L. (1982). Dental caries and children's weights. Journal of Epidemiology and Community Health, 36(1), 49.
  7. Acs, G., Shulman, R., Ng, M. W., & Chussid, S. (1999). The effect of dental rehabilitation on the body weight of children with early childhood caries. Pediatric Dentistry, 21(2), 109–113.
  8. NHS Digital (2019). Hospital Admitted Patient Care and Adult Critical Care Activity. NHS Digital. Accessed 11 Nov 2019.
  9. NHS Digital (2016). Hospital Admitted Patient Care Activity, 2015–16. NHS Digital. Accessed 5 Sept 2018.
  10. Hettiarachchi, R. M., Kularatna, S., Downes, M. J., Byrnes, J., Kroon, J., Lalloo, R., et al. (2017). The cost-effectiveness of oral health interventions: A systematic review of cost-utility analyses. Community Dentistry and Oral Epidemiology.
  11. Tonmukayakul, U., Calache, H., Clark, R., Wasiak, J., & Faggion, C. M. (2015). Systematic review and quality appraisal of economic evaluation publications in dentistry. Journal of Dental Research, 94(10), 1348–1354.
  12. Rogers, H. J., Rodd, H. D., Vermaire, J. H., Stevens, K., Knapp, R., El Yousfi, S., et al. (2019). A systematic review of the quality and scope of economic evaluations in child oral health research. BMC Oral Health, 19(1), 132.
  13. Lord, J., Longworth, L., Singh, J., Onyimadu, O., Fricke, J., Bayliss, S., et al. (2015). Oral health guidance – Economic analysis of oral health promotion approaches for dental teams. Birmingham & Brunel Consortium: External Assessment Centre.
  14. NICE (2013). Guide to the methods of technology appraisal. London: National Institute for Health and Care Excellence.
  15. Foster Page, L. A., Beckett, D. M., Cameron, C. M., & Thomson, W. M. (2015). Can the child health utility 9D measure be useful in oral health research? International Journal of Paediatric Dentistry, 25(5), 349–357.
  16. Chestnutt, I. G., Hutchings, S., Playle, R., Trimmer, S. M., Fitzsimmons, D., Aawar, N., et al. (2017). Seal or varnish? A randomised controlled trial to determine the relative cost and effectiveness of pit and fissure sealant and fluoride varnish in preventing dental decay. Health Technology Assessment, 21(21).
  17. Gilchrist, F., Rodd, H. D., Deery, C., & Marshman, Z. (2018). Development and evaluation of CARIES-QC: A caries-specific measure of quality of life for children. BMC Oral Health, 18(1), 202.
  18. Gilchrist, F., Rodd, H., Deery, C., & Marshman, Z. (2014). Assessment of the quality of measures of child oral health-related quality of life. BMC Oral Health, 14(1), 40.
  19. McCabe, C., Stevens, K., Roberts, J., & Brazier, J. (2005). Health state values for the HUI 2 descriptive system: Results from a UK survey. Health Economics, 14(3), 231–244.
  20. Young, T. A., Yang, Y., Brazier, J. E., & Tsuchiya, A. (2011). The use of Rasch analysis in reducing a large condition-specific instrument for preference valuation: The case of moving from AQLQ to AQL-5D. Medical Decision Making, 31(1), 195–210.
  21. Mulhern, B., Rowen, D., Brazier, J., Smith, S., Romeo, R., Tait, R., et al. (2013). Development of DEMQOL-U and DEMQOL-PROXY-U: Generation of preference-based indices from DEMQOL and DEMQOL-PROXY for use in economic evaluation. Health Technology Assessment, 17(5), v–xv, 1–140.
  22. King, M. T., Costa, D. S., Aaronson, N. K., Brazier, J. E., Cella, D. F., Fayers, P. M., et al. (2016). QLU-C10D: A health state classification system for a multi-attribute utility measure based on the EORTC QLQ-C30. Quality of Life Research, 25(3), 625–636.
  23. Mulhern, B., Labeit, A., Rowen, D., Knowles, E., Meadows, K., Elliott, J., et al. (2017). Developing preference-based measures for diabetes: DHP-3D and DHP-5D. Diabetic Medicine, 34(9), 1264–1275.
  24. Gilchrist, F. (2015). Development of a child-centred, caries-specific measure of oral health-related quality of life. University of Sheffield.
  25. Marshman, Z., Gupta, E., Baker, S. R., Robinson, P. G., Owens, J., Rodd, H. D., et al. (2015). Seen and heard: Towards child participation in dental research. International Journal of Paediatric Dentistry, 25(5), 375–382.
  26. Index of Multiple Deprivation (2015). Index of Multiple Deprivation. Accessed 5 Sept 2018.
  27. Young, T., Yang, Y., Brazier, J., Tsuchiya, A., & Coyne, K. (2009). The first stage of developing preference-based measures: Constructing a health-state classification using Rasch analysis. Quality of Life Research, 18(2), 253–265.
  28. Schafer, J. L. (1999). Multiple imputation: A primer. Statistical Methods in Medical Research, 8(1), 3–15.
  29. Terwee, C. B., Bot, S. D., de Boer, M. R., van der Windt, D. A., Knol, D. L., Dekker, J., et al. (2007). Quality criteria were proposed for measurement properties of health status questionnaires. Journal of Clinical Epidemiology, 60(1), 34–42.
  30. Angst, F., Verra, M. L., Lehmann, S., & Aeschlimann, A. (2008). Responsiveness of five condition-specific and generic outcome assessment instruments for chronic pain. BMC Medical Research Methodology, 8(1), 26.
  31. Norman, G. R. (2014). Biostatistics: The Bare Essentials (4th ed.). Shelton, Connecticut: People's Medical Publishing House-USA.
  32. Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155–159.
  33. Durlak, J. A. (2009). How to select, calculate, and interpret effect sizes. Journal of Pediatric Psychology, 34(9), 917–928.
  34. Willis, G. B. (2005). Cognitive interviewing: A tool for improving questionnaire design. Thousand Oaks: SAGE Publications.
  35. Pitts, N., Chadwick, B., & Anderson, T. (2015). Children’s Dental Health Survey 2013. Report 2: Dental disease and damage in children, England, Wales and Northern Ireland. NHS Digital. Accessed 11 Nov 2019.
  36. Klein, H., & Palmer, C. E. (1940). Community economic status and the dental problem of school children. Public Health Reports, 55(5), 187–205.
  37. Christensen, K., Makransky, G., & Horton, M. (2016). Critical values for Yen's Q3: Identification of local dependence in the Rasch model using residual correlations. Applied Psychological Measurement, 41.
  38. Gilchrist, F., Marshman, Z., Deery, C., & Rodd, H. D. (2015). The impact of dental caries on children and young people: What they have to say? International Journal of Paediatric Dentistry, 25(5), 327–338.
  39. Tickle, M., Milsom, K., King, D., Kearney-Mitchell, P., & Blinkhorn, A. (2002). The fate of the carious primary teeth of children who regularly attend the general dental service. British Dental Journal, 192(4), 219.
  40. Ratcliffe, J., Chen, G., Stevens, K., Bradley, S., Couzner, L., Brazier, J., et al. (2015). Valuing child health utility 9D health states with young adults: Insights from a time trade off study. Applied Health Economics and Health Policy, 13(5), 485–492.
  41. Ratcliffe, J., Couzner, L., Flynn, T., Sawyer, M., Stevens, K., Brazier, J., et al. (2011). Valuing child health utility 9D health states with a young adolescent sample. Applied Health Economics and Health Policy, 9(1), 15–27.
  42. Torrance, G. W., Feeny, D. H., Furlong, W. J., Barr, R. D., Zhang, Y., & Wang, Q. (1996). Multiattribute utility function for a comprehensive health status classification system: Health utilities index mark 2. Medical Care, 34(7), 702–722.
  43. Wille, N., Badia, X., Bonsel, G., Burström, K., Cavrini, G., Devlin, N., et al. (2010). Development of the EQ-5D-Y: A child-friendly version of the EQ-5D. Quality of Life Research, 19(6), 875–886.
  44. Stevens, K. J., Brazier, J. E., McKenna, S. P., Doward, L. C., & Cork, M. J. (2005). The development of a preference-based measure of health in children with atopic dermatitis. The British Journal of Dermatology, 153(2), 372–377.
  45. Chiou, C.-F., Weaver, M. R., Bell, M. A., Lee, T. A., & Krieger, J. W. (2005). Development of the multi-attribute pediatric asthma health outcome measure (PAHOM). International Journal for Quality in Health Care, 17(1), 23–30.
  46. Stevens, K. (2009). Developing a descriptive system for a new preference based measure of health related quality of life for children. Quality of Life Research, 18(8), 1105–1113.
  47. Sutton, E., & Coast, J. (2014). Development of a supportive care measure for economic evaluation of end-of-life care using qualitative methods. Palliative Medicine, 28(2), 151–157.
  48. Canaway, A., Al-Janabi, H., Kinghorn, P., Bailey, C., & Coast, J. (2017). Development of a measure (ICECAP-close person measure) through qualitative methods to capture the benefits of end-of-life care to those close to the dying for use in economic evaluation. Palliative Medicine, 31(1), 53–62.
  49. Schwendicke, F., Dörfer, C. E., Schlattmann, P., Page, L. F., Thomson, W. M., & Paris, S. (2015). Socioeconomic inequality and caries: A systematic review and meta-analysis. Journal of Dental Research, 94(1), 10–18.
  50. Slade, G. D., & Sanders, A. E. (2017). Two decades of persisting income-disparities in dental caries among U.S. children and adolescents. Journal of Public Health Dentistry.
  51. McTaggart-Cowan, H. M., Brazier, J. E., & Tsuchiya, A. (2010). Clustering Rasch results: A novel method for developing rheumatoid arthritis states for use in valuation studies. Value in Health, 13(6), 787–795.
  52. Tesio, L. (2003). Measuring behaviours and perceptions: Rasch analysis as a tool for rehabilitation research. Journal of Rehabilitation Medicine, 35(3), 105–115.



Not applicable.


Helen Rogers is funded by a Doctoral Research Fellowship from the National Institute for Health Research (NIHR). This article presents independent research funded by the NIHR. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care.

Author information




HJR conducted psychometric analyses, Rasch analysis, involvement of PPI representatives, discussions with developer team, qualitative interviews with children and analysis of these. FG provided the data collected using CARIES-QC and both FG and DR assisted with the Rasch analysis. All authors were involved in interpreting the results and determining the preliminary classification system. HJR prepared the manuscript, though all authors contributed to its development. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Helen J. Rogers.

Ethics declarations

Ethics approval and consent to participate

Ethical approval for this study was obtained from Yorkshire and the Humber Research Ethics Committee (Reference: 18/YH/0148).

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Supplement 1.

Topic guide used during validation interviews with children and young people.

Additional file 2: Supplement 2.

Results of Exploratory Factor Analysis.

Additional file 3: Supplement 3.

Table of correlations between items within CARIES-QC.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit


About this article


Cite this article

Rogers, H.J., Gilchrist, F., Marshman, Z. et al. Selection and validation of a classification system for a child-centred preference-based measure of oral health-related quality of life specific to dental caries. J Patient Rep Outcomes 4, 105 (2020).



  • Caries
  • Children
  • Oral health-related quality of life
  • Utility