- Short report
- Open Access
Validity and analysis of the Diabetes Injection Device Preference Questionnaire (DID-PQ)
Journal of Patient-Reported Outcomes volume 4, Article number: 104 (2020)
The Diabetes Injection Device Preference Questionnaire (DID-PQ) was designed to assess patient preference between two non-insulin injection devices. In a recent crossover study, people with type 2 diabetes (T2D) completed the DID-PQ after performing mock injections with two non-insulin injection devices. The purpose of the current analysis was to use these data to assess construct validity of the DID-PQ and demonstrate one way to test whether there is a significant preference for one injection device over another.
Data were from an open-label, multicenter, randomized, crossover study assessing preference between the dulaglutide and semaglutide injection pens. In addition to the 10-item DID-PQ, people with T2D completed a global item assessing overall preference. DID-PQ responses were compared to the global preference item (percent agreement, Gwet’s AC1, prevalence-adjusted and bias-adjusted Kappa [PABAK]). For each item of the DID-PQ, a two-sided binomial test assessed whether the difference in preference was statistically significant.
The sample included 310 participants (48.4% female; mean age = 60.0). The DID-PQ had minimal missing data. There was strong concordance (percent agreement > 78%) between the global preference item and all DID-PQ items except item 6, which assesses preference related to needle size (59.7%). The Gwet AC1 and PABAK statistics also indicated strong agreement between the global preference item and all DID-PQ items except item 6. There was a statistically significant difference (p < 0.0001) in preference on every DID-PQ item, with more participants preferring the dulaglutide device.
Patient preference has been recommended as a “major factor driving the choice of medication” in a consensus report by the American Diabetes Association and the European Association for the Study of Diabetes. Current findings suggest that the DID-PQ may be a useful tool for providing insight into preferences of people with T2D using non-insulin injectable medication.
Glucagon-like peptide-1 receptor agonists (GLP-1 RAs) are often recommended for treatment of type 2 diabetes (T2D). Medications in this class have demonstrated efficacy for glycemic control, along with a low risk of hypoglycemia and the potential benefit of weight loss [2,3,4,5]. The injectable GLP-1 RAs vary in terms of injection devices and treatment administration procedures, which could have an impact on patient preference.
Therefore, two patient-reported outcome (PRO) measures have been developed to assess patient perceptions of injection devices used to administer these non-insulin injectable medications: the Diabetes Injection Device Experience Questionnaire (DID-EQ) and the Diabetes Injection Device Preference Questionnaire (DID-PQ). The DID-EQ was designed to assess perceptions of a single injection device, and it has demonstrated reliability and validity in patients treated with GLP-1 RAs. The DID-PQ was designed to assess preference between two non-insulin injection devices. This questionnaire has been used in two previous studies [7, 8]. In both studies, however, it was completed by a relatively small subset of patients who had used two non-insulin injection devices (n = 27 and n = 58). Therefore, it was not possible to draw conclusions about construct validity of the DID-PQ from these previous datasets.
In a recent crossover study with a larger sample, people with T2D performed mock injections with two non-insulin injection devices, and all participants completed the DID-PQ to report preferences between the devices. Data from this study provide the first opportunity to examine performance of the DID-PQ in a larger sample. The purpose of the current analysis was to assess construct validity of the DID-PQ and demonstrate one way to test whether there is a significant preference for one injection device over another.
Data were from an open-label, multicenter, randomized, crossover study (ClinicalTrials.gov identifier: NCT03724981) [9, 10] assessing patient preference for the dulaglutide single-use pen and the semaglutide single-patient-use pen among injection-naïve patients with T2D. The devices used in the study were those commercially available in the United States. The study design is illustrated in Fig. 1. Study participants were recruited at 13 clinical sites across the US, including nine general practice clinics and four endocrinology clinics. After providing consent to participate in the study, participants were randomly assigned to one of the two device orders (i.e., either dulaglutide or semaglutide first, followed by the other device). After being trained to use each device based on device instructions for use (IFU), participants performed all steps of injection preparation and administered mock injections into an injection pad. Further details of the study design, inclusion/exclusion criteria, and methods have been published previously.
After completing training and performing mock injections with both devices, participants completed the measures described below. Both questionnaires were administered on paper forms and used the brand names (Trulicity for dulaglutide; Ozempic for semaglutide). The questionnaires included color images of the injection devices at the top of the page to avoid any confusion regarding which device corresponded to each question and response option.
Global preference item
The global preference item evaluated patient preference between the devices. The item asked “Overall, which device do you prefer?” Response options were Ozempic, Trulicity, or No Preference. All participants completed the global preference item before completing the DID-PQ.
Diabetes Injection Device Preference Questionnaire (DID-PQ)
The DID-PQ was designed to assess patient preferences between two non-insulin injection devices [6, 7]. The 10 questionnaire items were developed based on qualitative research with patients. Items 1 to 7 focus on preference related to specific characteristics of injection delivery systems. Items 8 to 10 are global items assessing preference based on overall satisfaction, ease of use, and convenience of the injection devices. Each item is rated on a five-point scale allowing respondents to indicate whether they prefer or strongly prefer one of the devices over the other. For each item, participants could also select the “no preference” response. As the five response options are categorical, mean scores are not calculated.
Analyses were performed using data from participants who had (1) been randomized to a device order, (2) been exposed to both devices regardless of whether they successfully completed the mock injection, and (3) completed the global preference item. No imputations were performed for missing data. All statistical tests were two-sided with a significance level of 5%. Descriptive statistics (mean, standard deviation, range, and frequency) were used to summarize demographic and clinical characteristics, as well as responses to questionnaires.
The categorical response options of the DID-PQ cannot be treated as continuous scores. Therefore, correlations with a criterion measure that would typically be conducted to examine construct validity of PRO instruments cannot be used. Instead, the 10 DID-PQ items were compared to the global preference item using categorical analyses so that concordance between the two instruments could be assessed. For these analyses, the five DID-PQ response options were collapsed into three categories by combining the “prefer” and “strongly prefer” response options. Thus, the DID-PQ and global preference items had the same three levels of response: prefer dulaglutide device, prefer semaglutide device, and no preference between devices.
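The collapsing step described above can be sketched in a few lines of Python. This is only an illustration: the response labels are hypothetical placeholders rather than the wording printed on the questionnaire, and the example responses are made up, not study data.

```python
from collections import Counter

# Hypothetical labels for the five DID-PQ response options (assumed names;
# the exact questionnaire wording is not reproduced here).
COLLAPSE = {
    "strongly_prefer_dulaglutide": "dulaglutide",
    "prefer_dulaglutide": "dulaglutide",
    "no_preference": "none",
    "prefer_semaglutide": "semaglutide",
    "strongly_prefer_semaglutide": "semaglutide",
}

def collapse_responses(responses):
    """Map five-level responses to three levels; missing (None) stays None."""
    return [COLLAPSE[r] if r is not None else None for r in responses]

# Made-up responses for illustration only.
example = [
    "strongly_prefer_dulaglutide",
    "prefer_semaglutide",
    "no_preference",
    None,  # missing response, left as-is because no imputation is performed
    "prefer_dulaglutide",
]
collapsed = collapse_responses(example)
counts = Counter(r for r in collapsed if r is not None)
```

After collapsing, each DID-PQ item uses the same three categories as the global preference item, so the two can be cross-tabulated directly.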
These three-level responses were compared to responses on the global preference item in three ways: (1) percent agreement, (2) Gwet’s AC1 statistic [13, 14], and (3) the prevalence-adjusted and bias-adjusted Kappa (PABAK) statistic. Gwet’s AC1 and PABAK statistics were used to assess concordance instead of the traditional Kappa statistic because Kappa is sensitive to uneven data distributions. For example, when there is high agreement in situations with an uneven distribution of responses across the possible response options (e.g., high prevalence observed for one response option), Kappa may not accurately represent concordance. Gwet’s AC1 is similar to Kappa, but it uses a different definition of chance agreement with a more realistic assumption that only a portion of the observed ratings will potentially lead to agreement by chance. Thus, it is more robust to an uneven distribution of data. The PABAK statistic defines and incorporates both a bias index and prevalence index into its calculation of the estimate of chance agreement, therefore mitigating potential effects of rater bias and overall prevalence. The Gwet AC1 and PABAK statistics were interpreted using benchmarks commonly used to interpret agreement statistics. For example, values over 0.80 are thought to indicate “almost perfect” agreement or “very good” agreement [17, 18].
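As a rough sketch of how these three statistics can be computed from a cross-tabulation of a DID-PQ item against the global preference item, the Python function below uses commonly cited formulas. Two assumptions are made here that the text does not confirm: AC1 uses the usual chance-agreement term built from averaged marginal proportions, and PABAK uses the common k-category generalization (k·p_o − 1)/(k − 1). The counts in the example table are invented for illustration and are not study data.

```python
def agreement_stats(table):
    """Return (percent agreement, Gwet's AC1, PABAK) for a square k x k table.

    table[i][j] = number of participants giving category i on the DID-PQ item
    and category j on the global preference item.
    """
    k = len(table)
    n = sum(sum(row) for row in table)

    # Observed agreement: proportion of participants on the diagonal.
    p_o = sum(table[i][i] for i in range(k)) / n

    # Gwet's AC1: chance agreement from the average of the two marginals.
    row_tot = [sum(table[i]) for i in range(k)]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    pi = [(row_tot[q] + col_tot[q]) / (2 * n) for q in range(k)]
    p_e = sum(p * (1 - p) for p in pi) / (k - 1)
    ac1 = (p_o - p_e) / (1 - p_e)

    # PABAK (assumed k-category form): chance agreement fixed at 1/k.
    pabak = (k * p_o - 1) / (k - 1)
    return p_o, ac1, pabak

# Made-up 3 x 3 table: rows and columns ordered as
# prefer dulaglutide, prefer semaglutide, no preference.
example_table = [
    [80, 5, 3],
    [4, 6, 1],
    [0, 0, 1],
]
p_o, ac1, pabak = agreement_stats(example_table)
```

With this invented table, observed agreement is 0.87 even though most responses fall in a single category, which is the kind of uneven distribution where the traditional Kappa can be misleading.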
To determine whether significantly more participants preferred one device over the other with regard to each item of the DID-PQ, comparisons between devices were performed according to the following steps: (1) participants who provided a neutral response for an item were dropped from analysis of that item; (2) for each item, responses were grouped into two categories (prefer dulaglutide device or prefer semaglutide device); and (3) a two-sided binomial test was performed to determine whether the difference in preference between the devices was statistically significant. This test assessed whether the proportion indicating preference for one of the two devices differed from 0.5. For each DID-PQ item, the null hypothesis was that the probability of preferring one of the devices was 0.5, which would indicate that an equal number of respondents preferred each device. If the binomial test yielded a significant p-value, then the null hypothesis could be rejected, which would mean that significantly more participants preferred one device over the other.
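The three steps above can be illustrated with a short, dependency-free Python sketch. The function names and response labels are hypothetical, the counts are invented rather than taken from the study, and the exact p-value is computed directly from the binomial distribution instead of calling the statistical software used in the original analysis.

```python
from math import comb

def two_sided_binomial_p(k, n):
    """Exact two-sided binomial test of H0: p = 0.5.

    k = number preferring one device, n = number expressing any preference.
    Under p = 0.5 the distribution is symmetric, so the two-sided p-value
    is twice the smaller tail probability, capped at 1.
    """
    m = min(k, n - k)
    tail = sum(comb(n, i) for i in range(m + 1)) / 2 ** n
    return min(1.0, 2 * tail)

def preference_test(responses):
    """Steps 1-3: drop neutral responses, group into two categories, test."""
    dula = sum(r == "dulaglutide" for r in responses)
    sema = sum(r == "semaglutide" for r in responses)
    return dula, sema, two_sided_binomial_p(dula, dula + sema)

# Made-up three-level responses for one item (not study data).
responses = ["dulaglutide"] * 8 + ["semaglutide"] * 2 + ["none"] * 3
dula, sema, p_value = preference_test(responses)
```

In this invented example, 8 of the 10 participants with a preference chose one device, which is not enough to reject the null hypothesis at the 5% level (p ≈ 0.11); an even split would give a p-value of 1.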
A total of 310 participants were included in the sample, with half (n = 155) randomized to each group (i.e., either dulaglutide or semaglutide device first). Detailed demographic and clinical information has been previously published for this sample, and a selection of participant characteristics is presented in Table 1.
Validity of the DID-PQ
There were minimal missing data on the DID-PQ, as shown in Table 2. There was strong concordance (percent agreement > 78%) between the global preference item and nine of the 10 DID-PQ items (Table 2). Percent agreement was particularly high (> 91%) for the three DID-PQ global items assessing preference related to overall satisfaction, ease of use, and convenience (items 8, 9, and 10). The only DID-PQ item that did not have strong concordance with the global preference item was item 6, which asks about preference related to needle size (percent agreement = 59.7%). The Gwet AC1 and PABAK statistics were consistent with percent agreement, with results indicating strong agreement between the global preference item and all DID-PQ items except item 6 (Table 2).
Significance testing of preferences between devices
For each item of the DID-PQ, a two-sided binomial test was performed to determine whether significantly more participants preferred one device over the other (Table 3). There was a statistically significant difference (p < 0.0001) in preference on every item of the DID-PQ with significantly more participants reporting a preference for the dulaglutide injection device.
Patient preference has been recommended as a “major factor driving the choice of medication” in a consensus report by the American Diabetes Association and the European Association for the Study of Diabetes. To collect and interpret patient preference data, well-designed and valid measurement tools are needed. Current findings suggest that the DID-PQ may be a useful tool for providing insight into preferences of people with T2D using GLP-1 receptor agonists. While a single global item can be used to assess injection device preference, the DID-PQ can provide a more detailed assessment of factors contributing to this preference, including ease of use, convenience, overall satisfaction, and details of the injection experience.
Concordance with the global preference item supports the construct validity of the DID-PQ. Item 6 of the DID-PQ, which assesses preference related to needle size, had the lowest concordance (59.7% agreement). Although needle size is an important factor for some patients, this item may not have yielded consistent data because participants were injecting into an injection pad rather than injecting themselves. Therefore, they did not personally experience the feeling of injecting with either needle, and the factors that participants considered when responding to this question are unclear and may have varied widely. Future research involving actual injections rather than mock injections may be necessary to assess validity of DID-PQ item 6.
In addition to examining validity, the study provides a parsimonious and easily interpretable method for examining whether preference for one device over another is statistically significant (Table 3). This analysis approach excludes neutral (i.e., no preference) responses. For situations when it may be important to consider the number of neutral responses (which were relatively rare in the current study; Table 3), the Prescott test can be used to determine whether there was a statistically significant difference in preference while accounting for the frequency of respondents with no preference [19, 20]. The Prescott test was used in the original analysis of data from the current study, with similar statistically significant results favoring the dulaglutide device on all 10 items of the DID-PQ.
The structure of the DID-PQ does not allow for typical psychometric analyses, and the resulting limitations need to be considered. Unlike a PRO measure of symptoms or health-related quality of life, the items of the DID-PQ do not have ordinal response options ranging from lowest to highest on a particular construct, and item scores cannot be aggregated into subscales for analysis of continuous data. Instead, DID-PQ items yield categorical data representing preference. Therefore, it is not possible to assess internal consistency reliability with Cronbach’s alpha, test-retest reliability with intra-class correlations, or convergent validity with Spearman correlations. Furthermore, there are no generic instruments or validated gold standard criterion measures that could be used to assess construct validity. While the current categorical analyses support construct validity of the DID-PQ via comparisons to a single item assessing global preference, it is not possible to thoroughly investigate reliability or validity of the instrument using common psychometric methods. Future research with the DID-PQ may provide further confidence in its validity.
There may also be limitations associated with the mock injection procedures. Participants were trained on each device prior to injecting each pen into an injection pad. Participants did not inject themselves with medication. Some aspects of the injection experience, such as comfort related to needle size or liquid volume, were not apparent during these procedures. It is possible that some DID-PQ responses could have been different if participants had injected themselves instead of the injection pad. Still, participants were thoroughly trained on both injection devices, and they performed all parts of the injection process. Therefore, their DID-PQ responses were likely based on a good understanding of both devices.
Despite these limitations, the DID-PQ represents a step forward for assessment of patient preference between injection devices. For preference to inform clinical decisions, measurement tools focusing on comparisons between treatments will be necessary. Since the DID-PQ has been useful in several studies, perhaps it could be a model for development of questionnaires designed to assess preference among other treatments across a range of medical conditions.
Availability of data and materials
The study presented in this manuscript has been posted to ClinicalTrials.gov (NCT03724981), and data are available upon request.
DID-EQ: Diabetes Injection Device Experience Questionnaire
DID-PQ: Diabetes Injection Device Preference Questionnaire
IFU: Instructions for use
PABAK: Prevalence-adjusted and bias-adjusted Kappa
T2D: Type 2 diabetes
1. Davies, M. J., D'Alessio, D. A., Fradkin, J., et al. (2018). Management of hyperglycaemia in type 2 diabetes, 2018. A consensus report by the American Diabetes Association (ADA) and the European Association for the Study of Diabetes (EASD). Diabetologia, 61(12), 2461–2498.
2. Aroda, V. R., Henry, R. R., Han, J., et al. (2012). Efficacy of GLP-1 receptor agonists and DPP-4 inhibitors: Meta-analysis and systematic review. Clinical Therapeutics, 34(6), 1247–1258 e1222.
3. Htike, Z. Z., Zaccardi, F., Papamargaritis, D., Webb, D. R., Khunti, K., & Davies, M. J. (2017). Efficacy and safety of glucagon-like peptide-1 receptor agonists in type 2 diabetes: A systematic review and mixed-treatment comparison analysis. Diabetes, Obesity and Metabolism, 19(4), 524–536.
4. Tofe, S., Arguelles, I., Mena, E., et al. (2019). Real-world GLP-1 RA therapy in type 2 diabetes: A long-term effectiveness observational study. Endocrinology, Diabetes & Metabolism, 2(1), e00051.
5. Trujillo, J. M., Nuffer, W., & Ellis, S. L. (2015). GLP-1 receptor agonists: A review of head-to-head clinical studies. Therapeutic Advances in Endocrinology and Metabolism, 6(1), 19–28.
6. Matza, L. S., Boye, K. S., Stewart, K. D., Paczkowski, R., Jordan, J., & Murray, L. T. (2018b). Development of the Diabetes Injection Device Experience Questionnaire (DID-EQ) and Diabetes Injection Device Preference Questionnaire (DID-PQ). Journal of Patient-Reported Outcomes, 2, 43.
7. Matza, L. S., Stewart, K. D., Paczkowski, R., Coyne, K. S., Currie, B., & Boye, K. S. (2018c). Psychometric evaluation of the Diabetes Injection Device Experience Questionnaire (DID-EQ) and Diabetes Injection Device Preference Questionnaire (DID-PQ). Journal of Patient-Reported Outcomes, 2, 44.
8. Matza, L. S., Boye, K. S., Currie, B. M., et al. (2018a). Patient perceptions of injection devices used with dulaglutide and liraglutide for treatment of type 2 diabetes. Current Medical Research and Opinion, 34(8), 1457–1464.
9. Matza, L. S., Boye, K. S., Stewart, K. D., et al. (2020). Assessing patient PREFERence between the dulaglutide pen and the semaglutide pen: A crossover study (PREFER). Diabetes, Obesity and Metabolism, 22(3), 355–364.
10. ClinicalTrials.gov, Eli Lilly and Company. A Study Comparing the Dulaglutide Pen and the Semaglutide Pen (NCT03724981). 2019; https://clinicaltrials.gov/ct2/show/NCT03724981. Accessed 19 Dec 2019.
11. Eli Lilly and Company. Instructions for Use: TRULICITY® (Trū-li-si-tee) (dulaglutide) injection, for subcutaneous use – 0.75 mg/0.5 mL Single-Dose Pen once weekly. 2017; http://pi.lilly.com/us/trulicity-lowdose-ai-ifu.pdf.
12. Novo Nordisk. Instructions for Use: OZEMPIC® (semaglutide) injection, for subcutaneous use – 0.5 mg/1 mg. 2017; https://www.novo-pi.com/ozempic.pdf.
13. Gwet, K. L. (2008). Computing inter-rater reliability and its variance in the presence of high agreement. The British Journal of Mathematical and Statistical Psychology, 61(Pt 1), 29–48.
14. Gwet, K. L. (2014). Handbook of inter-rater reliability: The definitive guide to measuring the extent of agreement among raters (4th ed.). Gaithersburg: Advanced Analytics, LLC.
15. Byrt, T., Bishop, J., & Carlin, J. B. (1993). Bias, prevalence and kappa. Journal of Clinical Epidemiology, 46(5), 423–429.
16. Feinstein, A. R., & Cicchetti, D. V. (1990). High agreement but low kappa: I. The problems of two paradoxes. Journal of Clinical Epidemiology, 43(6), 543–549.
17. Altman, D. G. (1991). Practical statistics for medical research. London: Chapman & Hall/CRC.
18. Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159–174.
19. Pictor, A. (2003). Analysing Binary Outcome Data from a Crossover Design Study using the SAS® System, (vol. 2019) (8 July). https://www.lexjansen.com/views/2003/statpharm/st02.pdf.
20. Prescott, R. (1981). The comparison of success rates in cross-over trials in the presence of an order effect. Journal of the Royal Statistical Society: Series C: Applied Statistics, 30(1), 9–15.
The authors thank Adebimpe Atanda, Gordon Parola, Haylee Andrews, Jessica Jordan, Katelyn Cutts, Katie Stewart, Melissa Garcia, Natalie Taylor, Peter Chongpinitchai, and Timothy Howell for assistance with data collection; K. Jack Ishak for statistical consultation; Karen Malley for statistical programming; Sandra Macker for data management; and Amara Tiebout for editorial assistance.
Funding for this study was provided by Eli Lilly and Company. The authors had independence in the study design, data collection approach, analysis, interpretation of data, and writing the manuscript.
Participants were required to provide written informed consent before completing study procedures, and all procedures and materials were approved by an independent institutional review board (Ethical & Independent Review Services; Study Number 18128–01). This study was conducted in accordance with the Declaration of Helsinki.
Louis Matza, Karin Coyne, and Brooke Currie are employees of Evidera, a company that received funding from Eli Lilly for time spent conducting this research. Kristina Boye is an employee of and owns stock in Eli Lilly and Company.
Cite this article
Boye, K.S., Matza, L.S., Currie, B.M. et al. Validity and analysis of the Diabetes Injection Device Preference Questionnaire (DID-PQ). J Patient Rep Outcomes 4, 104 (2020). https://doi.org/10.1186/s41687-020-00266-x
- Injection devices
- Type 2 diabetes
- Patient-reported outcome measures
- Crossover study