Psychometric validation of three new condition-specific questionnaires to assess quality of life, symptoms and treatment satisfaction of patients with aortic aneurysm

Background To evaluate the psychometric properties of three new condition-specific questionnaires designed to assess outcomes amongst patients under pre-operative surveillance for a small abdominal aortic aneurysm (AAA) or who have undergone aneurysm repair. These tools are the Aneurysm-Dependent Quality of Life measure (AneurysmDQoL), the Aneurysm Symptom Rating Questionnaire (AneurysmSRQ) and the Aneurysm Treatment Satisfaction Questionnaire (AneurysmTSQ). Results The questionnaires were sent to 297 patients with abdominal aortic aneurysm (AAA) or who had undergone AAA repair (using open or endovascular technique) sampled from five UK NHS Trusts. Exploratory Factor Analysis was used to examine factor structure together with reliability analysis. A subset of 65 patients completed the questionnaires a second time four months later. One hundred and ninety-seven patients (178 men; 18 women) provided data for analysis (69% response rate): mean age was 75 years (range 60–95). Nineteen were under pre-operative surveillance for AAA and 178 had undergone AAA repair (70 open repair; 104 endovascular repair; 4 uncertain). Exploratory Factor Analysis of the AneurysmDQoL and the AneurysmTSQ each demonstrated a one-factor structure. The AneurysmSRQ demonstrated a six-factor structure (emotional, weight loss, lower limb, cognitive, general malaise and gastrointestinal symptoms) and a one-factor composite symptom scale. All scales have clean factor structures: item loadings above 0.40, no cross-loadings, and no factors with fewer than three items. Internal consistency reliability was excellent (α = 0.869–0.959) and test-retest reliability good (Intraclass correlation coefficient = 0.70–0.88). Conclusions The three new questionnaires have a clear structure and strong reliability and are now ready for use in clinical trials and routine practice, which will allow evaluation of responsiveness to change.


Introduction
An abdominal aortic aneurysm (AAA) is a localised dilation of the lower part of the aorta which, if ruptured, is likely to be fatal [1]. Abdominal aortic aneurysms become more common with age and are more common in men [2]. Over time AAAs tend to expand and as they do so, the risk of rupture increases. Small AAAs (3-5 cm) are monitored using periodic ultrasound surveillance and AAA repair is generally only recommended once the aneurysm reaches 5.5 cm -the point where the risk of rupture outweighs the risk associated with elective repair. In 2009, the phased implementation of a national screening program (National Abdominal Aortic Aneurysm Screening Program NAAASP) began in the UK [3]. The NAAASP currently invites all men for ultrasound AAA screening on reaching 65 years of age. Those found to have an AAA are either enrolled into ongoing surveillance or put forward for repair if already at threshold size [4].
Techniques of AAA repair have evolved significantly in recent years with large numbers now treated using endovascular stent-grafts that allow minimally invasive repair. As a result, surgical mortality has fallen dramatically [5] and markers of surgical quality expanded to include patient reported outcomes (PROs), including symptoms, treatment satisfaction and quality of life (QoL). The impact of being made aware of the condition, the need for ongoing surveillance and the need to take new medications (e.g. statins) each have the potential to impact on QoL. Identifying strengths and deficiencies in care from the patients' perspective can help clinicians strive for even higher quality care rather than simply avoiding morbidity and mortality. The UK Department of Health has, over recent years, undertaken a nationwide initiative to encourage the use of PRO measures (PROMs), both in the surgical specialties in general and in aortic aneurysm surgery specifically. However, until now, no validated aneurysm-specific PROMs exist.
In the absence of a validated aneurysm-specific QoL measure, all previous studies purporting to measure QoL in patients with AAA have used generic tools such as the Medical Outcomes Study Short-Form 36  or the EuroQol-5D (EQ-5D) [6][7][8]. Although often presented as measures of QoL or 'health-related quality of life' (HRQoL), these tools are actually measures of health status (i.e. physical and mental function) rather than true QoL. Quality of life is a much broader concept that incorporates (but is not limited to) how dysfunctional physical and/or mental status and other demands of a condition and its treatment may impact upon patients' lives. To assess the impact of AAA and its treatment on QoL, three new condition-specific questionnaires have been developed using an iterative process of focus groups and in-depth interviews involving patients with AAA [9]. The newly designed questionnaires are: The Aneurysm-Dependent Quality of Life Questionnaire (AneurysmDQoL), the Aneurysm Symptom Rating Questionnaire (AneurysmSRQ) and the Aneurysm Treatment Satisfaction Questionnaire (AneurysmTSQ).

The AneurysmDQoL
The design of the AneurysmDQoL was based upon the Audit of Diabetes-Dependent Quality of Life (ADDQoL)a widely used questionnaire designed for use by people with diabetes [10,11]. The ADDQoL has also been adapted for use by people with many other conditions including renal disease (RDQoL), macular disease (MacDQoL), growth hormone deficiency (HDQoL), hypothyroidism (ThyDQoL) and diabetic retinopathy (RetDQoL) [12][13][14][15][16]. Influenced by the way QoL was conceptualized in the design of the SEI-QoL (Schedule for the Evaluation of Individual Quality of Life) interview methodology [17], the AneurysmDQoL recognizes individual differences in the experience of quality of life. Unlike the majority of QoL and health status tools, respondents indicate if an aspect of life (e.g. work) is not relevant to them and, for those aspects that are of relevance, they are asked to rate not only how much each aspect of their life has been affected by their condition (impact), but also how important they consider this aspect of their life to be for their QoL (importance) (Fig. 1).
Part (a) of each item is scored from − 3 (greatest negative impact) to + 1 (positive impact) and part (b) is scored from 3 to 0 (very important to not at all important). The 'Weighted Impact' (WI) of AAA on that particular aspect of life is then calculated by multiplying the impact score by the importance score (scores range from − 9 to + 3). In this way the AneurysmDQoL is sensitive to the fact that any given aspect of life may have different significance to different individuals and therefore is likely to have varying impact on QoL, and that the importance of a particular aspect of life may change over time even for the same individual. An Average Weighted Impact score (AWI), can be obtained by summing the WI scores and dividing by the number of applicable domains (scores range from − 9 to + 3). Thus, scoring ignores non-applicable domains and gives greater emphasis to domains of greater importance to the individual, providing a highly personalised assessment of the impact of AAA on an individual's QoL.
The AneurysmDQoL includes 24 questions in total: two initial overview items -current QoL and how QoL would be different if respondents had not had an aneurysm, and 22 domain-specific items (e.g. family, household tasks) [9].

The AneurysmSRQ
The AneurysmSRQ is a measure of symptoms associated with AAA and its treatment which assesses the degree to which patients are bothered by applicable symptoms. The format of the questionnaire was developed in earlier work with patients who had hypothyroidism [18] and has since been adapted for other chronic conditions including hypoglycaemia [19]. The AneurysmSRQ comprises 44 items, each divided into two parts. Part (a) asks respondents to indicate if they have experienced the symptom in recent weeks, regardless of the cause. Part (b) asks respondents to indicate how much the symptom bothers them. If the answer to part (a) is 'No' , respondents are asked to go straight to the next symptom. Applicable symptoms are scored on a scale from 1 (not at all) to 4 (a lot).

The AneurysmTSQ
The Aneurysm Treatment Satisfaction Questionnaire (AneurysmTSQ) is designed to measure treatment satisfaction in people living with AAA. Based on the format of the Diabetes Treatment Satisfaction Questionnaire [20] and other -TSQ measures [21][22][23] the AneurysmTSQ explores multiple aspects of treatment satisfaction (e.g. side-effects, convenience). The AneurysmTSQ includes 11 Likert scale items which are rated from 6 (very satisfied) to 0 (very dissatisfied). Items 1 to 7 are designed for patients undergoing surveillance of small AAAs as well as those who have had aneurysm repair (e.g. information, staff support). Items 8 to 11 are designed to assess post-operative aspects of the treatment process (e.g. side-effects, follow-up) and are therefore only suitable for patients following aneurysm repair. Items can be used individually or they can be summed to give a total treatment satisfaction score.
The present paper focuses on the psychometric validation of the new tools, involving examination of their overall structure (including any subscales), internal consistency and test-retest reliability. It reports optimal scoring methods following use in clinical practice or research.

Design
A cross-sectional survey design was used, with patients at various points in the treatment pathway asked to complete the three new questionnaires. At the time of data collection, the national screening program had yet to be fully rolled out, however in anticipation of this, all individuals diagnosed with an AAA were invited to take part. The time-points chosen for completion were: pre-operative intervention (surveillance of AAAs < 5.5 cm) and 6 weeks, 3 months, 6 months, 12 months and > 12 months post-repair.

Participants
Questionnaires were completed by 297 patients diagnosed with AAA purposively sampled from five UK NHS Trusts. Respondents included those under surveillance for a small AAA and those who had undergone an AAA repair. Target sample size was based on a participant to variable ratio of 4:1 (minimum 100). Previous research using -DQoL and -TSQ measures for other conditions have demonstrated strong reliabilities, high factor loadings and a clean structure with such sample sizes e.g. [22][23][24]. Data from 197 patients were available for analysis (69% response rate).
The majority of participants were male. The mean age for women was slightly higher than for men and a higher number of men than women currently, or previously, smoked: men = 72%, women = 45% ( Table 1). The majority of both men and women reported elective EVAR as their AAA treatment (elective: men = 68%, women = 72%; EVAR: men = 51%, women = 78%). Mean time between operation and completing the questionnaire was slightly higher for men than women ( Table 2).
Ethnicity of participants was predominantly White (men = 96%, women = 100%). A subset of patients (n = 65) were asked to complete the questionnaire pack  , these centres were purposefully selected for this study as they carry out significant numbers of both OR and EVAR procedures. In each centre, members of the healthcare team contacted all patients who had undergone aneurysm repair within the preceding 12 months and invited them to participate in the study. Southampton also invited patients who had undergone aneurysm repair between 12 and 24 months prior to the study. Two centres identified a number of patients enrolled in preoperative surveillance of small AAAs. Patients who expressed interest were then sent the questionnaire pack (stamped addressed envelope included for return). The pack included an information sheet, consent form, detailed instructions and the questionnaires.

Analytic approach
Prior to conducting the analyses, the data were screened and relevant assumptions assessed. The correlation matrix was inspected for poorly correlated items (r < 0.3) and MSA (Measure of Sampling Adequacy) coefficients for each variable checked (< 0.6 problematic). The Kaiser-Meyer-Olkin value was checked to ensure it exceeded the recommended value of 0.6 [25,26] and Bartlett's Test of Sphericity [27] examined for statistical significance. Exploratory Factor Analysis (EFA) was conducted in two stages. Initial analyses were carried out using a principal components analysis (PCA: data reduction retaining as much information as possible). Components were considered for retention using three decision rules: Kaiser's criterion (eigenvalues > 1), inspection of the Scree plot and Horn's parallel analysis [28]. Once the components had been identified, a fixed factor EFA was run (examining data structure and underlying construct(s) using shared variance). All initial analyses were run using an oblique (Direct Oblimin) rotation, to allow for potential inter-correlations between components. Factors were considered to be related if they had a correlation coefficient > 0.32. Due to the non-normal distribution of the data, Principal Axis Factoring (PAF) was the chosen method of extraction [29,30]. Examination of the factor loadings focused on identifying the 'cleanest' factor structure (item loadings above 0.40 [31], no or few item cross-loadings [> 0.32], and no factors with fewer than three items) [27]. Factor loading strength was guided by: fair (0.45), good (0.55), very good (0.63), excellent (0.71) [32]. Poorly performing items were dropped from the analysis one at a time and the analysis rerun. All analyses used listwise deletion of cases.
Internal consistency reliability analysis was run to determine how many missing responses can be tolerated when calculating a total scale score [33]. The strongest contributing scale item was dropped and the analysis rerun until alpha < 0.7 or 50% of items had been dropped.

Exploratory factor analysis of the AneurysmDQoL
A complete set of responses from 157 participants (16 under surveillance; 141 post-repair) were included in the analysis of the AneurysmDQoL. Examination of data suitability led to the removal of Item 2: work.
Reasons for this included an MSA value of 0.349 and 160 out of 197 participants reporting work as non-applicable). Principal components analysis suggested a one-factor solution. The forced one-factor EFA found that the 22 items in the AneurysmDQoL accounted for 51.99% of the total variance within the data. Analysis of the factor matrix however revealed Item 15 (finance) loaded at 0.350. After removal of Item 15 (finance) the lowest loading item was Item 23 (value each day). The relatively weak loading (0.472) combined with problems highlighted during the design phase (meaning unclear) led to the removal of Item 23. The rerun forced one-factor 20-item EFA now explained 55.54% of the variance in the data. All items now loaded > 0.5. Cronbach's-α coefficient of internal consistency was very high and test-retest reliability was good. Internal consistency reliability run to determine how many missing responses can be tolerated when calculating a total scale score revealed that the AneurysmDQoL remains reliable (α > 0.85) even if patients omit responses for up to eight core items (items with no non-applicable option). Table 3 demonstrates factor loadings, communalities and reliability coefficients.
To ensure the tool would be suitable for use by preand post-repair patients, analyses were rerun with post-repair patients only (n = 141). A 20-item single-factor solution again revealed the cleanest structure, explaining 55.38% of the variance. All items loaded > 0.48 and Cronbach's-α coefficient of internal consistency was again high (0.959) and test-retest reliability good (0.66). The results of these analyses support the use of a single-scale, 20-item AneurysmDQoL as a measure of the impact of AAA on quality of life for patients under surveillance for a small AAA and for patients following AAA repair.  Table 4.
To assess the importance of each item and clarify whether items not included in the six subscales should be removed from the questionnaire, the frequency and bother ratings of each item were examined (Table 5 and 6). This process demonstrated that while similar patterns of response were not found among these items (therefore not included in the subscales), many patients experienced these symptoms and/or described them as causing moderate or severe bother. It was therefore decided that these items would remain in the questionnaire as stand-alone items. Frequency responses also demonstrate no floor or ceiling effects (> 15% having highest / lowest score) [34].
Though six clear subscales had been identified within the AneurysmSRQ, the potential pragmatic benefits of having a single summable symptom scale were also recognised. In order to identify the maximum number of items that could be combined into a single scale a forced one-factor EFA was run. Examination of the factor loadings led to the removal of 17 low-loading items. The final one-factor solution comprised 24 items. All items loaded > 0.4, and explained 30.15% of variance in the data. This factor was titled the 'Composite Symptom Scale' and provides the broadest single overall score for bother from symptoms. Internal consistency reliability was very high and ICC excellent.
Internal consistency reliability analysis run to determine how many missing responses can be tolerated when calculating the AneurysmSRQ Composite Symptom Scale revealed that the scale remains reliable even if patients omit responses for up to 12 items. Table 7 demonstrates factor loadings, communalities and reliability coefficients.

Exploratory factor analysis of the AneurysmTSQ
The AneurysmTSQ includes 11 items, all of which are suitable for patients who have undergone AAA repair. Items one to seven are also suitable for patients undergoing AAA pre-repair surveillance.

Post-repair patients
Treatment satisfaction data were obtained from 154 post-repair patients. Data suitability checks all proved satisfactory. A PCA of the 11 AneurysmTSQ items suggested a one component solution. The forced-one-factor EFA explained a total of 49.69% of the total variance within the data. Examination of the factor loadings revealed all items loaded > 0.476. Cronbach's alpha was strong and ICC demonstrated excellent test-retest reliability. Details of the factor loadings, communalities and reliability coefficients are presented in Table 8. These results support the use of the 11-item AneurysmTSQ as a measure of total treatment satisfaction for patients who have received post-operative care for an aortic aneurysm.

Pre and post-repair patients
Treatment satisfaction data were obtained from 182 participants. Consistent with the 11-item EFA, PCA using items 1-7 revealed a one-factor solution. The seven-item one-factor solution accounted for 52.83% of the variance within the data. Cronbach's alpha coefficient was strong and test retest reliability was good. Factor loadings, communalities and reliability coefficients are demonstrated in Table 9.
These results demonstrate support for a separate 7-item subscale suitable for use by patients pre-repair surveillance in addition to the full 11-item measure of treatment satisfaction for patients who have received post-operative care for AAA.

Discussion
The psychometric analyses presented here provide detailed information on the structure of the AneurysmD-QoL, AneurysmSRQ and AneurysmTSQ and strongly support their validity for use by patients with AAA.
The content validity of the three new tools was established through an iterative design process involving patients at every stage to ensure all included items were relevant to this patient group and no potentially important items were missing [9]. Strong evidence of the construct validity of each tool has been demonstrated here through psychometric analysis. All three tools have a clear structure, strong internal consistency and good test-retest reliability.
During the validation process, the value of single factor solutions was recognised as they have pragmatic advantages for clinical use. The data supported single factor solutions for both the AneurysmDQoL and the AneurysmTSQ. Analysis of the AneurysmDQoL confirmed that 20 of the initial 23 items could be combined into a single scale with the item 'value each day' requiring removal and the work and finance items being retained in the questionnaire but not included in the scale. It is perhaps not surprising that these particular items stood apart, since the majority of patients undergoing AAA repair have retired from work. Nonetheless, for those who are still in employment, the impact on work and finances could potentially be profound and the decision was therefore taken to retain these two items in the questionnaire as stand-alone items. In the UK, AAA medical care is provided free at the point of use. In countries where this is not the case, finances may be more impacted by AAA. Given that finance loaded only marginally lower than the 0.4 cut off and made a negligible difference to the internal consistency of the scale this item could, if needed, be included in computing AWI in countries where medical care is not provided free of charge or in multinational clinical trials where finances may be more relevant in some countries.
The poor performance of the 'value each day' item was not unexpected as several patients had found the item difficult to understand during pilot testing. This item was therefore completely removed from the questionnaire, resulting in the final 22-item version of the Aneur-ysmDQoL with 20 items contributing to the scale score.
No changes were made to the AneurysmTSQ following psychometric analysis. All items loaded onto a single factor with good reliability. The data also suggest that the first seven items can be used as a separate subscale for patients who are currently under surveillance following diagnosis of a small AAA but have not undergone repair. Due to the small numbers of patients in the dataset currently under surveillance, analysis of this subscale was conducted using data from both pre and post-repair individuals who had undergone surveillance. Although the patient-centred design of the measure provides strong content validity [9] and the free text box provided with each questionnaire (allowing respondents to report for example, any aspects of treatment satisfaction not covered in the questionnaire) suggested no further additions were needed, further work with a larger pre-repair surveillance dataset is required for confirmation of the psychometric properties of the questionnaire when used only with patients undergoing surveillance for small aneurysm. In contrast to the other tools, analyses demonstrated that the 44 items in the AneurysmSRQ could not be combined into a single scale that included all items. However, stepwise removal of items not contributing to a single scale demonstrated that it was possible to combine a total of 24 items into a single 'Composite Symptom Score'. Though this may not be a truly comprehensive score, it does provide the broadest possible single indicator of symptom burden for these patients. Psychometric analysis also showed that six subscales could be identified with the AneurysmSRQ: emotion; appetite; lower-limb symptoms; cognitive function; general malaise; and gastrointestinal symptoms.
These subscales comprised a total of 24 items (not identical to composite symptom score items). Importantly, although 20 items had to be excluded from the subscales on the basis of the psychometric analysis, data from focus groups and patient interviews had suggested that a number of these excluded items were important to patients. Indeed, some items may have failed to load onto subscales not because they are irrelevant or anomalous, but because they are more general in nature. For example, tiredness and back pain were each experienced by more than 40% of respondents (Table 6). Furthermore, some of the non-scale items with very low frequencies and bother ratings (e.g. wound infection; bruising) are most likely to be experienced by patients in the early post-repair groupa group relatively under-represented in this study. The decision was therefore taken to retain all non-scale items in the questionnaire until further data are gathered to justify more fully their retention or exclusion.
In addition to consistency and reliability, the criterion validity of new questionnaires can also be evaluated. Criterion validity is an assessment of the performance of a new tool relative to a known 'gold-standard' measure.
In developing new QoL measures, however, this can be problematic since there is often no existing gold standard. This is the case for patients with AAA. The systematic literature review performed prior to questionnaire development demonstrated that almost all previous attempts to describe QoL amongst patients with AAA have actually used health status measures (such as the SF-36 and EQ-5D) and have yielded an array of conflicting results [7]. The disagreement in previous results suggested that those tools may not have been suitable for purpose and prompted the development of the new questionnaires. It therefore follows that those tools cannot be used as the standard against which the new questionnaires are measured. Though comparison to a 'gold standard' was not possible, the need for such external validation was minimised by the fact that the format of the -DQoL, −TSQ and -SRQ measures have been tried and tested for many other patient groups and many of the items in the new questionnaires were derived from item libraries created in the course of designing and developing the measures for other patient population tools [12, 13, 18-20, 22, 24, 35-40].
The data collected in this study were sufficient for overall psychometric analysis (based on participant number, clean structure and number of high loading items) and indicate that the new tools are valid for both pre-repair surveillance and post-repair patients. Nonetheless, a larger dataset that included a greater number of pre-repair surveillance patients would have allowed for direct confirmation of the psychometric properties for pre-repair patients as a separate group. It is also noted that the male: female distribution in our study population was 9:1 whilst national data on the prevalence of the condition shows the male female ratio to be 6:1 [41]. It is unclear whether this ratio is truly representative of patients undergoing aneurysm repair or whether women were less often invited or less inclined to take part in the study. However, a similar or greater proportion of women patients was also observed in some of the largest studies of AAA repair to date, including the EVAR-1 (9%), EVAR-2 (15%) and UKSAT (18%) trials [42][43][44]. In addition, whilst the initial intention had been to include a longitudinal cohort of patients, this was ultimately not possible and for logistical reasons we conducted a cross-sectional study of patients at different stages pre-and post-aneurysm repair. Thus we have been able to examine subgroup differences [45] but have yet to assess the tools' responsiveness to change. This will need to be examined in future research.

Conclusions
The increasing number of patients now attending screening and undergoing EVAR repair, with its associated follow-up and increased risk of further intervention, suggests an urgent need for condition-specific AAA PROs measures. The AneurysmDQoL, AneurysmSRQ   and AneurysmTSQ are three new condition-specific questionnaires that will allow clinicians to assess quality of life, impact of symptoms and treatment satisfaction of patients with AAA before or after repair. We have published elsewhere evidence of their acceptability to patients [9] and evidence of between group differences [45] and here reported clear structure, good internal consistency, and test-retest reliability. Responsiveness to change now needs to be assessed.