Psychometric evaluation of the DAILY EATS questionnaire in individuals living with obesity

Background Physiological and behavioral factors including hunger, satiety, food intake, and cravings are health determinants contributing to obesity. Patient-reported outcome (PRO) measures focused on eating-related factors provide insight into the relationships between food choice and quantity, weight change, and weight-loss treatment for individuals living with obesity. The DAILY EATS is a novel 5-item, patient-reported measure evaluating key eating-related factors (Worst and Average Hunger, Appetite, Cravings, and Satiety).
Methods Psychometric analyses, consistent with regulatory standards, were conducted to evaluate the DAILY EATS using data from two randomized trials that included individuals with severe obesity without diabetes (NCT03486392) and with severe obesity and type 2 diabetes (NCT03586830). Additional measures included Patient Global Impression of Status (PGIS) and Patient Global Impression of Change items, Impact of Weight on Quality of Life-Lite, Ease of Weight Management, and Patient-Reported Outcomes Measurement Information System Physical Function Short Form 8b and 10a. The reliability, validity, and responsiveness of the DAILY EATS were assessed, and a scoring algorithm and thresholds to interpret meaningful score changes were developed.
Results Item-level analyses of the DAILY EATS supported computation of an Eating Drivers Index (EDI), comprising the related items Worst Hunger, Appetite, and Cravings. Internal consistency (Cronbach’s coefficient alphas ≥0.80) and test-retest reliability (coefficients > 0.7) of the EDI were robust. Construct validity correlation patterns with other PRO measures were as hypothesized, with moderate to strong significant correlations between the EDI and PGIS-Hunger (0.30 ≤ r ≤ 0.68), PGIS-Cravings (0.33 ≤ r ≤ 0.77) and PGIS-Appetite (0.52 ≤ r ≤ 0.77). Anchor- and distribution-based analyses support reductions ranging from 1.6 to 2.1 as responder thresholds for the EDI, representing meaningful within-person improvement.
Conclusions The DAILY EATS individual items and the composite EDI are reliable, sensitive, and valid in evaluating the concepts of hunger, appetite, and cravings for use in individuals with severe obesity with or without type 2 diabetes.


Background
A recent initiative to develop a patient-centered disease-illness model for obesity identified physiological and behavioral factors, including hunger, satiety, food intake, and cravings, as health determinants contributing to obesity [1]. Specifically, potential eating-related barriers to weight loss included difficulties in controlling hunger and appetite and the lack of the sensation of fullness after eating a meal. Individuals living with obesity may be able to lose weight or maintain a healthier weight if they have more control over these eating-related factors, thus allowing for more appropriate meal portion-sizes and fewer cravings, particularly for foods high in calories. For individuals living with obesity, with or without concomitant type 2 diabetes mellitus (T2DM), weight loss may relieve physical, social, emotional, and functional impacts associated with obesity [2,3].
While select weight-loss medications target hormones that control hunger and satiety, patients' interpretations of such concepts and their role in chronic weight management are not well understood. Patient-reported outcome (PRO) measures focused on factors related to eating may help facilitate better understanding of the relationships between eating-related factors, weight change, and weight-loss treatment.
The DAILY EATS: Measuring Daily Eating Factors questionnaire is a novel 5-item, patient-reported measure developed to provide information about the key factors associated with eating and includes assessments of hunger (2 items), appetite, cravings for unhealthy food, and satiety after meals. An 11-point numerical rating scale (0-10) is used for each item, with a higher value indicating more hunger, bigger appetite, stronger cravings, or greater satiety. Selection of potential concepts for the DAILY EATS was informed by the results of previously conducted obesity-related research, input of clinical and PRO experts, and qualitative research that included concept elicitation interviews with 35 overweight or obese individuals, either with or without T2DM [4]. The results of the qualitative research informed the development of a conceptual model of the hypothesized relationships among the eating-related factors, as well as the impacts of these factors on food quantity and choice, identified as important to patients during the concept elicitation interviews (see Fig. 2 in Appendix).
The pilot version of the DAILY EATS, initially referred to as the Eating-Related Concepts Questionnaire, was subsequently debriefed in three rounds of interviews and refined between each round, as needed, based on participant feedback. The DAILY EATS is designed to be completed as a daily diary (24-h recall period) at the same time each day, preferably in the evening. The daily responses are used to compute item-level weekly averages.
The objectives of this research were to conduct a psychometric evaluation of the DAILY EATS, assessing its reliability, validity, and responsiveness, as well as assess structure to develop optimal composite scores and an interpretation guideline. The psychometric evaluation was conducted by using data from two phase 2 clinical trials of a novel weight-loss medication: one conducted in individuals with severe obesity (body mass index [BMI], 35-50 kg/m2) and without diabetes (Study 1; NCT03486392) and a separate study conducted in individuals with severe obesity and T2DM (Study 2; NCT03586830). Development and psychometric evaluation of the DAILY EATS were conducted in a manner consistent with the review criteria described in the United States Food and Drug Administration's Patient-Reported Outcome Guidance [5]. Additional details about the psychometric evaluation are summarized in the Appendix.

Study measures
Instruments used in the psychometric analysis included the DAILY EATS diary; Patient Global Impression of Severity (PGIS) items related to hunger, appetite, cravings, satiety, and physical functioning; Patient Global Impression of Change (PGIC) items related to hunger, cravings, and physical functioning; the Impact of Weight on Quality of Life-Lite (IWQOL-Lite) measure; the single-item Ease of Weight Management (EWM) measure; and the Patient-Reported Outcomes Measurement Information System Physical Function Short Form (PROMIS PF SF) 8b and 10a measures. All instruments were administered on paper in Study 1 and Study 2. Higher scores for the DAILY EATS, PGIS items, and PGIC items are indicative of higher levels of hunger, appetite, cravings, and other eating-related behaviors, while higher scores for the IWQOL-Lite, EWM and PROMIS PF SF indicate better health-related quality of life, greater ease of weight loss, and physical functioning, respectively. Details related to the recall period and time points for the key measures, including the DAILY EATS, used in the psychometric evaluation are provided in Table 7 in Appendix.

Study design and population
Data from the two studies were used separately to evaluate the psychometric properties of the DAILY EATS in individuals living with severe obesity. Study 1 was a randomized, phase 2b, double-blind, placebo-controlled and open-label active-controlled, parallel-group, multicenter, dose-ranging study to evaluate the safety and efficacy of a novel weight-loss medication in individuals with severe obesity without diabetes across 26 weeks of treatment. Study 2 was a randomized, phase 2b, double-blind, placebo-controlled, parallel-group, multicenter, dose-ranging study to evaluate the safety and efficacy of the same novel weight-loss medication in individuals with severe obesity with T2DM across 12 weeks of treatment. All psychometric analyses were conducted without reference to treatment group (i.e., data were pooled across treatment arms into a study-related analysis population). For each study, analyses were conducted using all patients in the modified intent-to-treat clinical analysis data set who completed at least one DAILY EATS item at least 1 day at baseline and also at least 1 day in a follow-up week. Both studies complied with the Declaration of Helsinki and were approved by the relevant investigational review boards or ethics committees for the respective study sites.

Descriptive statistics, missing data, and DAILY EATS structure
The study populations and the supporting measures were summarized with descriptive statistics. Standard descriptive statistics were reported for the weekly average and change scores. Floor or ceiling effects for DAILY EATS items were defined as more than 18% of patients (approximately twice the expected probability for each of the 11 categories in a uniform distribution) selecting an extreme response category (e.g., 0 [Not hungry at all], 10 [Extremely hungry]).
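The 18% criterion follows from the 0-10 scale: each of the 11 categories has an expected share of 1/11 (about 9.1%) under a uniform distribution, and twice that is about 18.2%. The following minimal sketch illustrates the check; the function names and the sample responses are hypothetical and are not taken from the study data.

```python
N_CATEGORIES = 11                  # numerical rating scale, 0 through 10
UNIFORM_SHARE = 1 / N_CATEGORIES   # about 0.091 per category
THRESHOLD = 0.18                   # roughly 2 * UNIFORM_SHARE, as stated in the text


def extreme_share(responses, extreme):
    """Proportion of patients whose response equals the extreme category."""
    return sum(1 for r in responses if r == extreme) / len(responses)


def has_floor_or_ceiling(responses, threshold=THRESHOLD):
    """Return (floor_effect, ceiling_effect) flags for one item."""
    floor = extreme_share(responses, 0) > threshold
    ceiling = extreme_share(responses, 10) > threshold
    return floor, ceiling


# Hypothetical responses from 10 patients to one item
sample = [10, 10, 10, 7, 6, 5, 8, 9, 4, 3]
print(has_floor_or_ceiling(sample))
```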
The impact of missing DAILY EATS data was evaluated at the daily level of baseline to inform scoring rules for weekly averages using a missing data simulation: different subsets of daily responses to each item were deleted to assess the stability of the resulting distribution.
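A missing-data simulation of this kind can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the diaries are randomly generated, the same number of days is deleted for every patient, and a simple percentile interval over repeated deletions stands in for the confidence interval of the SD.

```python
import random
import statistics


def sd_of_weekly_means(diaries):
    """SD across patients of each patient's mean over the available days."""
    return statistics.stdev([statistics.mean(days) for days in diaries])


def delete_days(diaries, n_delete, rng):
    """Randomly drop n_delete daily responses from each patient's week."""
    reduced = []
    for days in diaries:
        kept = rng.sample(range(len(days)), len(days) - n_delete)
        reduced.append([days[i] for i in kept])
    return reduced


def simulate(diaries, n_delete, n_reps=200, seed=0):
    """Approximate 95% percentile interval of the SD after random deletion."""
    rng = random.Random(seed)
    sds = sorted(
        sd_of_weekly_means(delete_days(diaries, n_delete, rng))
        for _ in range(n_reps)
    )
    return sds[int(0.025 * n_reps)], sds[int(0.975 * n_reps) - 1]


# Hypothetical data: 50 patients, 7 daily responses each on the 0-10 scale
data_rng = random.Random(1)
diaries = [[data_rng.randint(0, 10) for _ in range(7)] for _ in range(50)]

print("complete-data SD:", round(sd_of_weekly_means(diaries), 3))
for n_delete in (1, 3, 6):
    low, high = simulate(diaries, n_delete)
    print(n_delete, "days deleted:", round(low, 3), round(high, 3))
```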
The pattern of inter-item correlations was evaluated at baseline and end of treatment (EOT: Week 26 in Study 1, Week 12 in Study 2) to inform potential DAILY EATS composite scores, such that moderate correlations (r ≥ 0.30) supported composite formation and strong correlations (r > 0.80) indicated potential redundancy. Exploratory factor analysis (EFA) was conducted using an inter-item Pearson correlation matrix based on the weekly scores at baseline and maximum likelihood estimation with robust standard errors. The size of the eigenvalues [6] and the scree plot [7] guided the decision regarding dimensionality.
Each psychometric property was evaluated for the DAILY EATS weekly items and, after reviewing the item-level and the scoring analyses, a single composite of three DAILY EATS items, the Eating Drivers Index (EDI), was developed for scoring purposes. The EDI is scored as the average of the weekly scores for Worst Hunger (Item 2), Appetite (Item 3), and Cravings (Item 4). Early qualitative research [4] and the item-level quantitative analyses support the relevance of all five DAILY EATS items. Future studies may consider reporting both the EDI and the five individual items scores. For the purposes of this manuscript, the remaining properties focus primarily on the EDI composite and the three component DAILY EATS items of Worst Hunger, Appetite, and Cravings.
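The scoring described above can be illustrated with a short sketch. The minimum of 4 days of data per week is the missing-data rule reported in the results; treating the EDI as missing when any component weekly score is missing is an assumption made here for illustration, and the function names and diary values are hypothetical.

```python
from statistics import mean

MIN_DAYS = 4  # weekly score requires at least 4 days of data (rule from the results)


def weekly_score(daily, min_days=MIN_DAYS):
    """Average of the available daily responses, or None if too few days."""
    observed = [x for x in daily if x is not None]
    if len(observed) < min_days:
        return None
    return mean(observed)


def edi(worst_hunger, appetite, cravings):
    """Eating Drivers Index: mean of the three weekly item scores."""
    scores = [weekly_score(worst_hunger), weekly_score(appetite), weekly_score(cravings)]
    if any(s is None for s in scores):
        return None  # assumption: no EDI when a component score is missing
    return mean(scores)


# Hypothetical week of diary entries (None marks a missed day)
worst_hunger = [7, 8, None, 6, 7, None, 7]
appetite = [6, 6, 6, 6, 6, 6, 6]
cravings = [5, None, None, None, 5, 5, 5]
print(edi(worst_hunger, appetite, cravings))
```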

Reliability
Internal consistency reliability analyses evaluated the degree to which items were associated with one another. Cronbach's coefficient alpha [8] was computed at baseline and EOT. The approximate range of optimal alphas suggested by Streiner and Norman [9] is between 0.70 and 0.90, indicating a set of items that is strongly related and capable of supporting a unidimensional scoring structure but not redundant.
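For reference, Cronbach's coefficient alpha for k items is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. A minimal sketch, using hypothetical weekly scores rather than study data:

```python
from statistics import variance


def cronbach_alpha(items):
    """items: list of k lists, each holding one score per patient."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-patient total score
    summed_item_variance = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - summed_item_variance / variance(totals))


# Hypothetical weekly scores for three items across six patients
worst_hunger = [7.0, 5.5, 8.0, 4.0, 6.5, 3.0]
appetite = [6.5, 5.0, 7.5, 4.5, 6.0, 3.5]
cravings = [7.5, 4.5, 8.5, 3.5, 6.0, 4.0]
print(round(cronbach_alpha([worst_hunger, appetite, cravings]), 3))
```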
The test-retest reliability of the DAILY EATS weekly item scores was assessed by computing intraclass correlation coefficients (ICCs) among patients considered to be stable based on an external criterion over the test-retest period. A two-way mixed-effects analysis of variance (ANOVA) with absolute agreement for single measures was used to compute test-retest reliability ICCs [10,11]. Study 1 data used Week 15 (test) and Week 26/EOT (retest) for a subgroup with no corresponding PGIS change. Study 2 data used baseline (test) and Week 12/EOT (retest) for a subgroup with no corresponding PGIS change.
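A two-way, absolute-agreement, single-measures ICC can be computed from the ANOVA mean squares as (MSR − MSE) / (MSR + (k − 1)·MSE + (k/n)·(MSC − MSE)), where MSR, MSC, and MSE are the mean squares for patients, occasions, and error. The sketch below assumes this standard formulation and uses hypothetical scores; it is not taken from the authors' analysis code.

```python
def icc_absolute_single(data):
    """data: list of n rows (patients), each with k scores (occasions)."""
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between patients
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between occasions
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k / n * (msc - mse))


# Hypothetical test and retest weekly scores for five stable patients
scores = [[6.0, 6.5], [4.0, 4.5], [8.0, 7.5], [5.0, 5.5], [7.0, 6.0]]
print(round(icc_absolute_single(scores), 3))
```

Because absolute agreement is required, a systematic shift between test and retest lowers this ICC even when patients keep the same rank order.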

Construct validity
Construct validity describes the relationships among multiple indicators of a construct and the degree to which they follow predictable patterns. Cross-sectional correlations were computed between weekly DAILY EATS item and EDI composite scores and supporting measures (i.e., PGIS item scores, EWM, BMI, IWQOL-Lite domain and total scores, and PROMIS PF SF 8b and 10a total scores) at baseline and EOT. The magnitude and direction of the resulting correlation coefficients were compared with respect to specific a priori hypotheses and to Cohen's guideline [12] for interpreting correlation coefficients: absolute values of correlations of 0.50 or greater are considered strong, correlations that fall between 0.30 and 0.49 are moderate, and those that fall between 0.10 and 0.29 are small. Moderate to strong correlations were hypothesized for the weekly DAILY EATS item scores and the EDI composite with corresponding PGIS items (e.g., between DAILY EATS Worst Hunger items and PGIS-Hunger item), whereas smaller correlations were hypothesized between DAILY EATS items and the EDI composite with the PGIS-Physical Functioning (PGIS-PF) item. Trivial (|r| < 0.1) to small correlations were hypothesized for DAILY EATS item scores and the EDI composite with physical function scores based on the PROMIS PF SF 8b and SF 10a.
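The interpretation bands above can be applied mechanically. The sketch below computes a Pearson correlation and labels its magnitude using the cut points quoted in the text; the function names and the example scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x)
    sy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(sx * sy)


def cohen_label(r):
    """Label |r| using the bands quoted in the text."""
    magnitude = abs(r)
    if magnitude >= 0.50:
        return "strong"
    if magnitude >= 0.30:
        return "moderate"
    if magnitude >= 0.10:
        return "small"
    return "trivial"


# Hypothetical EDI weekly scores and PGIS-Hunger levels for six patients
edi_scores = [6.5, 4.0, 7.5, 3.0, 5.5, 8.0]
pgis_hunger = [3, 2, 3, 1, 2, 4]
r = pearson_r(edi_scores, pgis_hunger)
print(round(r, 2), cohen_label(r))
```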

Known-groups validity
Known-groups analyses comparing subgroups of interest were conducted to evaluate the discriminating ability of the DAILY EATS weekly item and EDI composite scores at baseline and EOT. Analyses of variance, with the use of overall F test and pairwise comparisons based on a priori hypotheses, were conducted to examine mean differences in weekly DAILY EATS item and EDI composite scores between patients classified into subgroups based on the corresponding PGIS items. It was hypothesized that individual DAILY EATS item and EDI composite scores would differentiate between patients who report low levels of eating-related issues versus those who report higher levels on the corresponding PGIS items. It also was hypothesized that patients who reported little to no difficulty with their weight management on the EWM would have lower DAILY EATS item and EDI composite scores, on average, than those patients who report higher levels of difficulty in managing weight.
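The overall F test in a known-groups analysis compares between-group and within-group variability. A minimal sketch of the F statistic for a one-way layout follows; the group data are hypothetical, and the P value (which would require the F distribution, for example from a statistics package) is deliberately left out.

```python
def one_way_f(groups):
    """F statistic for a one-way ANOVA; groups is a list of lists of scores."""
    all_values = [x for g in groups for x in g]
    grand = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical EDI weekly scores grouped by PGIS-Hunger response
none_or_mild = [2.5, 3.0, 3.5, 4.0]
moderate = [5.0, 5.5, 6.0, 6.5]
severe = [7.5, 8.0, 8.5, 9.0]
print(round(one_way_f([none_or_mild, moderate, severe]), 2))
```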

Responsiveness
The responsiveness of the DAILY EATS (its ability to detect change when change is expected) was evaluated using multiple methods: by computing correlations of change from baseline to EOT in the weekly DAILY EATS item and EDI composite scores and the supporting outcome measures, ANOVA, and effect-size estimates of change. Specifically, longitudinal correlations were computed between changes in weekly DAILY EATS item and EDI composite scores and changes in the supporting measures (i.e., corresponding PGIC items, weight change percentage, and changes in corresponding PGIS items, EWM, BMI, IWQOL-Lite, and PROMIS PF SF 8b and SF 10a) at the EOT. For the ANOVAs (using overall F test, pairwise comparisons, and effect sizes), it was hypothesized that patients who had improved scores on the corresponding PGIS (or PGIC) would have larger changes indicative of improvement than would patients who remained the same or worsened on these assessments. For correlational (Pearson) analyses, the following correlations were hypothesized: (1) Moderate to strong correlations between changes in weekly DAILY EATS item and EDI composite scores and changes in the corresponding PGIS items; smaller correlations between changes in weekly DAILY EATS item and EDI composite scores and the change in PGIS-PF; (2) Moderate to strong correlations between changes in DAILY EATS Worst Hunger and Appetite items and PGIC-Hunger; moderate to strong correlations between the change in the weekly DAILY EATS Cravings item and PGIC-Cravings; smaller correlations between the changes in weekly DAILY EATS item and EDI composite scores and PGIC-Physical Functioning (PGIC-PF); (3) Small correlations between changes in weekly DAILY EATS item and EDI composite scores and changes in the physical function scores from the IWQOL-Lite and the PROMIS PF SF 8b and 10a; and (4) Small to moderate correlations between changes in weekly DAILY EATS item and EDI composite scores and the weight change percentage.
Effect sizes of approximately 0.20 were interpreted to represent small effects, those of approximately 0.50 represented moderate effects, and those greater than approximately 0.80 represented large effects [13].
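Two common effect-size estimates of change divide the mean change by either the baseline SD or the SD of the change scores. The sketch below assumes these two definitions (the text does not spell out the formulas) and uses hypothetical scores; the labeling bands are a simple reading of the approximate cut points above.

```python
from statistics import mean, stdev


def effect_size_baseline_sd(baseline, followup):
    """Mean change divided by the SD of the baseline scores."""
    change = [f - b for b, f in zip(baseline, followup)]
    return mean(change) / stdev(baseline)


def standardized_response_mean(baseline, followup):
    """Mean change divided by the SD of the change scores."""
    change = [f - b for b, f in zip(baseline, followup)]
    return mean(change) / stdev(change)


def effect_label(es):
    """Assumed bands built on the approximate 0.20 / 0.50 / 0.80 cut points."""
    magnitude = abs(es)
    if magnitude >= 0.80:
        return "large"
    if magnitude >= 0.50:
        return "moderate"
    if magnitude >= 0.20:
        return "small"
    return "negligible"


# Hypothetical EDI weekly scores at baseline and EOT for five patients
baseline = [6, 7, 8, 5, 9]
eot = [5, 5, 7, 4, 6]
es = effect_size_baseline_sd(baseline, eot)
srm = standardized_response_mean(baseline, eot)
print(round(es, 2), effect_label(es), round(srm, 2), effect_label(srm))
```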

Interpretation of change
To identify patients who experienced a meaningful change, a threshold or responder definition was estimated for three weekly DAILY EATS items (Worst Hunger, Appetite and Cravings) and the EDI composite score. Both anchor-based and distribution-based methods were used to estimate thresholds defining meaningful within-person change, or responder definitions, of the weekly DAILY EATS item and EDI composite scores in individuals with severe obesity without diabetes (Study 1) and with T2DM (Study 2). An anchor-based approach is the primary method recommended in the PRO guidance [5] to define this threshold. Prior to applying anchor-based methods, the appropriateness of the anchor measures was assessed by reviewing responsiveness correlations. A commonly applied criterion for identifying an appropriate anchor measure was used: the magnitude of the correlation of change was required to be at least 0.371, based on achieving a large effect size using Cohen's rule of thumb [14-17]. In addition, the size and direction of the mean and median change in the weekly DAILY EATS item and EDI composite scores by the change in the corresponding anchor measures were reviewed to confirm that greater improvement or worsening in the weekly DAILY EATS item and EDI composite scores was achieved by patients who showed greater levels of improvement or worsening on the change in the anchor measures.
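The value 0.371 is consistent with converting a large standardized effect (d = 0.8) to a correlation using the standard conversion r = d / sqrt(d² + 4), which assumes two groups of equal size. The text does not state the derivation, so the sketch below is offered only as a plausible reading of the criterion.

```python
from math import sqrt


def d_to_r(d):
    """Convert a standardized mean difference to a correlation (equal group sizes)."""
    return d / sqrt(d ** 2 + 4)


LARGE_EFFECT_D = 0.8  # Cohen's rule of thumb for a large effect
print(round(d_to_r(LARGE_EFFECT_D), 3))


def is_adequate_anchor(correlation_of_change, minimum=0.371):
    """True when the anchor's correlation with the change score meets the criterion."""
    return abs(correlation_of_change) >= minimum
```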
A 1-point improvement on the related PGIS was selected a priori as the primary anchor. Distribution-based estimates were also computed to provide additional information and to serve as secondary threshold estimates. Finally, to support the anchor-based methods, cumulative distribution function (CDF) and probability density function (PDF) plots were developed.

Sample characteristics
Table 1 presents key baseline characteristics of the 99 patients from Study 1 (individuals with severe obesity without diabetes) and the 146 patients from Study 2 (individuals with severe obesity with T2DM) in the psychometric analysis sample. Patients without diabetes and those with diabetes had an average BMI of 40.9 and 40.3, respectively, and were aged, on average, 48.2 years and 56.4 years at time of study entry. Both samples contained a higher proportion of female patients (71.7%, 58.9%) than male patients (28.3%, 41.1%), and patients were predominantly white (79.8%, 69.2%) and of non-Hispanic or Latino ethnicity (79.8%, 74.4%).
Descriptive statistics of the supporting PRO measures used in the psychometric evaluation were reviewed (data not shown). The dominant baseline responses were "Moderate" on the PGIS-Hunger, the PGIS-Cravings, and the PGIS-Appetite; this suggests that patients acknowledged concerns in the key eating behavior concepts assessed by the DAILY EATS.
Notably, most patients reported being "Completely satisfied" on the PGIS-Satiety in both studies, suggesting that patients were eating until they were comfortably full. The baseline scores of the PRO measures addressing physical functioning and health-related quality of life tended to correspond to a better status in the sample without diabetes (Study 1) than in the sample with T2DM (Study 2).
The trends in the responses of PGIS-Hunger, PGIS-Cravings, and PGIS-Appetite showed improvement from "Moderate" to "Mild" by EOT. In addition, by EOT patients on average showed some overall improvement on all the supporting measures in both studies.

Descriptive statistics, missing data, and DAILY EATS structure
An examination of the item response distributions during the baseline weeks in Study 1 and Study 2 indicated little evidence of ceiling effects and no evidence of floor effects. Over the baseline week, the highest percentage of patients who reported a daily score of 10 on any day was from DAILY EATS Satiety (Item 5) in both studies (15.3% in Study 1; 19.9% in Study 2) (data not shown). The baseline weekly averages were indicative of moderate severities on eating-related concepts and ranged from 5.9 (Average […] Table 8 in Appendix). The average weekly change from baseline to EOT was an improvement (a decline for Items 1-4 and an increase for Item 5) of approximately −1.1 points across the items in both studies; the change in Cravings (Item 4) was the largest at −1.6 (Study 1), and the change in Satiety (Item 5) was the smallest at 0.3 and 0.0 points (Table 8 in Appendix).
Across evaluated time points and studies, more than 98% of patients completed all five items of the DAILY EATS for at least 6 days, indicating very good assessment compliance. No problematic completion differences were observed across the items. Missing-data simulation analyses in both studies showed that the 95% CIs of the SD of each item-level weekly score from partially complete data were still within the ±0.5 limits of the SD from complete data, despite the random loss of up to 6 daily responses. These results support the proposed missing-data rule for weekly scoring (requirement of at least 4 days of data per week).
Satiety (Item 5) scores performed differently than the other DAILY EATS items when evaluating the DAILY EATS structure (i.e., low EFA loadings and weak inter-item correlations; Table 9 in Appendix and Table 2). Further, because Average and Worst Hunger item scores were found to be potentially redundant (i.e., a high degree of collinearity due to overlapping content area), Worst Hunger (Item 2) was retained for further consideration instead of Average Hunger (Item 1). Subsequently, analyses of the DAILY EATS structure supported the computation of a three-item DAILY EATS composite, the EDI, as the average of the weekly scores for Worst Hunger (Item 2), Appetite (Item 3), and Cravings (Item 4) for both populations. It is recommended that Average Hunger (Item 1) and Satiety (Item 5) be reported separately.

Reliability
Item-level test-retest reliability coefficients were above 0.7, except for Appetite (Item 3) in Study 2 and Satiety (Item 5) in both studies (Table 2). The smaller magnitude of the ICC for Satiety (Item 5) was expected since responses in both samples were high throughout the treatment period, reducing the scores' variability across participants (hence the ICC) at each time point. Internal consistency reliability for the EDI was strong across studies and time points (all Cronbach's coefficient alpha ≥0.80), providing evidence to support the relationships among the items to justify reporting a composite score. Test-retest reliability coefficients were greater than the 0.7 threshold for both studies for the EDI, indicating stability in the EDI scores.

Construct validity
Correlations were computed between the weekly DAILY EATS items and EDI composite scores and supportive measures at EOT (Table 3). Correlation patterns observed were generally as hypothesized. Specifically, strong positive correlations were observed between the DAILY EATS items for Worst Hunger, Appetite, and Cravings and their corresponding PGIS items. Correlations between pairs of DAILY EATS items and PGIS items referring to similar content were typically the largest observed. Also as expected, the correlations between the three DAILY EATS item scores and PGIS-PF were considerably lower than the correlations between the item scores and the eating-related PGIS items. […] trivial (near 0), and the correlations with BMI in Study 2 were small.

Known-groups validity
Known-groups ANOVAs were conducted to evaluate the discriminating ability of the three weekly DAILY EATS item and EDI composite scores at baseline and Week 26/EOT or Week 12/EOT. As hypothesized and shown in Table 4 for EOT, patients reporting "No …" or "Mild …" (e.g., hunger, appetite, cravings) on the corresponding PGIS (i.e., PGIS-Hunger, PGIS-Appetite, PGIS-Cravings) had lower (less severe) weekly DAILY EATS item and EDI composite scores on average than those with "Moderate …" or "Severe …" responses (P < 0.0001). As expected, the mean weekly DAILY EATS item and EDI composite scores increased (increased hunger, appetite, or cravings) as the PGIS level increased (increased hunger, appetite, or cravings), providing strong support for the discriminating ability of the individual DAILY EATS items and the EDI composite. Results were strongest for Cravings (Item 4) and the EDI composite, with the overall and all pairwise comparisons statistically significant at the P < 0.05 level.

Responsiveness
The correlation coefficients for change from baseline to EOT scores between the three weekly DAILY EATS items and the EDI composite with a subset of the supporting measures are shown in Table 5. […] (Table 11 in Appendix).

Interpretation of change
Anchor-based and distribution-based methods were used to estimate thresholds defining meaningful within-person change, or responder definitions, of the three DAILY EATS items and the EDI composite in individuals with severe obesity without diabetes (Study 1) and with T2DM (Study 2) after confirming the appropriateness of the candidate anchor measures. Table 6 displays the responder definition estimates characterizing improvement based on change in the corresponding PGIS and PGIC items, as well as the half-standard deviation and standard error of the measurement (SEM) estimates. Due to the small sample sizes in the 1-point deterioration PGIS subgroups and the "Moderately worse/Moderately hungrier/Moderately stronger cravings" PGIC subgroups, the estimation of thresholds identifying deterioration is not recommended using the current data. A larger sample is recommended to further investigate deterioration. Tables 12 and 13 in Appendix show the complete set of results. The range of responder definitions based on a 1-point improvement in the PGIS, the primary anchor, was higher than the range of estimates based on the PGIC and the distribution-based methods. The thresholds estimated using anchor-based methods with Study 1 data tended to be slightly larger than the anchor-based thresholds estimated with Study 2 data. However, the SEM-based estimates were larger in Study 2 due to the lower ICCs (resulting from the test-retest evaluation timespan). Furthermore, all estimates were closer in magnitude across studies than within study using different methods (e.g., PGIS based, PGIC based, distribution based).
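The estimates described above can be sketched as follows. The anchor-based estimate is taken as the mean or median change among patients with a 1-point PGIS improvement; the distribution-based estimates use the usual definitions of half the baseline SD and SEM = SD × sqrt(1 − reliability), which also shows why a lower ICC yields a larger SEM. Coding a 1-point improvement as a PGIS change of −1 is an assumption made here, and all names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, median, stdev


def anchor_threshold(changes, pgis_changes, anchor_level=-1):
    """Mean and median score change among patients at the anchor level.

    Assumption: a 1-point PGIS improvement is coded as a change of -1.
    """
    group = [c for c, a in zip(changes, pgis_changes) if a == anchor_level]
    return mean(group), median(group)


def half_sd(baseline):
    """Distribution-based estimate: half the baseline standard deviation."""
    return 0.5 * stdev(baseline)


def sem(baseline, reliability):
    """Standard error of measurement from the baseline SD and an ICC."""
    return stdev(baseline) * sqrt(1 - reliability)


# Hypothetical EDI data for five patients
baseline = [6, 7, 8, 5, 9]
edi_change = [-2.0, -1.5, 0.0, -0.5, -2.5]
pgis_change = [-1, -1, 0, 0, -1]

print(anchor_threshold(edi_change, pgis_change))
print(round(half_sd(baseline), 2), round(sem(baseline, 0.80), 2), round(sem(baseline, 0.60), 2))
```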
The CDF and PDF plots were reviewed to provide visual support of the primary anchor measures. For example, a greater proportion of patients with improvement in PGIS-Hunger also achieved improvement in the EDI composite from baseline to EOT across a range of possible response thresholds, as shown in the CDF curves for Studies 1 and 2 (Fig. 1a-b). The 1-point improvement (cyan blue) curve is clearly distinct from the no change (green) curve in each plot, providing support for the use of the 1-point improvement in PGIS as the primary anchor. In addition, PGIS-Hunger was adequately associated with the EDI composite change scores within each level of change in PGIS, as shown in the PDF plots for Studies 1 and 2 (Fig. 1c-d).
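The quantity plotted in a CDF curve of this kind is, for each candidate threshold, the proportion of patients in an anchor group whose change score is at or below that threshold. A minimal sketch with hypothetical change scores (not study data) shows how the improved and no-change groups separate:

```python
def ecdf(values, threshold):
    """Share of patients whose change score is at or below the threshold."""
    return sum(1 for v in values if v <= threshold) / len(values)


# Hypothetical EDI change scores grouped by PGIS-Hunger change category
improved = [-3.0, -2.5, -2.0, -1.5, -0.5]
no_change = [-1.0, -0.5, 0.0, 0.0, 0.5]

for threshold in (-2.0, -1.0, 0.0):
    print(threshold, ecdf(improved, threshold), ecdf(no_change, threshold))
```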

Discussion
The purpose of this analysis was to evaluate the DAILY EATS measurement properties using data from two studies in severely obese adult patients with and without T2DM.
Descriptive statistics for the DAILY EATS item scores suggested adequate item performance with no limiting distributional anomalies or response biases in the daily and weekly average scores at baseline or EOT. Furthermore, the change in scores across time points was indicative of improvement during the study period in Average Hunger, Worst Hunger, Appetite, and Cravings items. In comparison, patients reported fairly high Satiety scores at baseline, suggesting they experienced a great deal of satisfaction with being "comfortably full" prior to treatment, and the scores over time provided evidence of maintenance of satisfaction.
A review of the structure of the DAILY EATS informed the preliminary scoring decisions. Average Hunger, Worst Hunger, Appetite, and Cravings items were strongly correlated, and the results suggest that these items can support the formation of a composite score. Scores for Average and Worst Hunger exhibited a high degree of collinearity, which may be viewed as redundant (overlapping content area); thus, the Worst Hunger item was retained in the composite instead of Average Hunger. The resulting composite, the EDI, is an average of the three item weekly scores. The item-level results also indicated that Satiety scores performed differently than the other DAILY EATS items (i.e., low loadings and weak inter-item correlation). These results corroborate the findings from the qualitative work with obese patients that the concept of satiety is distinct from the other eating-related factors. Given the importance of the concept of satiety, it is recommended that the DAILY EATS questionnaire retain the Satiety item and report it in addition to the other item and EDI composite scores.
Due to the small sample sizes in the PGIS and PGIC subgroups for satiety, the estimation of thresholds identifying deterioration could not be evaluated using the current data. A larger sample is recommended to further investigate deterioration.
Overall, Average Hunger, Worst Hunger, Appetite, Cravings, and the EDI composite weekly and change scores demonstrated acceptable measurement properties. Internal consistency evidence was strong and supported the EDI composite. Test-retest reliability estimates were well above the recommended 0.70 threshold when using Study 1 data; ICCs based on Study 2 were not as strong, potentially owing to differences in the studies' respective test-retest evaluation time points. Study 1 used a span of 9 weeks, and both time points were within the treatment period; Study 2 used a span of 14 weeks, in which the first time period occurred within the pretreatment phase and the second time period occurred within the treatment phase. The remaining properties focused on the EDI composite.
For construct validity, the patterns of correlations with other PRO measures were as hypothesized and consistent across the two studies, thus supporting the weekly DAILY EATS item scores and EDI composite scores and the constructs measured. Mean weekly item and composite scores also differed as anticipated and significantly across known groups based on the PGIS, providing evidence for the scores discriminating between meaningful groups. Lastly, the weekly item and composite scores demonstrated responsiveness based on the moderate to strong correlations of change observed with the related PGIS and PGIC measures, the moderate to large effect-size estimates of change, and the moderate to large magnitudes of change observed across levels of change in the PGIS and between PGIC improvement classification groups.
Finally, results of the anchor-based analyses using the PGIS provided evidence that changes ranging from −1.5 (mean) to −2.1 (mean or median) for the EDI composite were appropriate for identifying meaningful within-person improvement. Estimates based on the PGIC, a supportive anchor, tended to be lower in magnitude than the PGIS-based estimates for the items and EDI composite, and the distribution-based estimates were lower than the anchor-based values.
Along with the existing qualitative evidence supporting the measure's content validity in these patient populations [4], the quantitative results provide further evidence that the DAILY EATS item and EDI composite scores are well-defined, reliable, sensitive, and valid for use in individuals with severe obesity with or without T2DM.

Conclusions
The five-item DAILY EATS and its EDI composite exhibit content validity and good psychometric properties for assessing key factors related to eating. The DAILY EATS item and EDI composite scores show similar performance among individuals with severe obesity alone and individuals with severe obesity and T2DM, providing a fit-for-purpose measure of eating-related behaviors. The proposed scoring algorithm and thresholds for meaningful change are recommended for both populations.

Longitudinal Responsiveness
Patterns of mean change in the weekly DAILY EATS item scores across the levels of the change in the corresponding PGIS and PGIC provide supportive evidence for the responsiveness of the EDI composite score. As an item-level example, the largest average weekly Worst Hunger (Item 2) score change reflecting improvement (negative change) was in the improved PGIS-Hunger subgroup (least square mean, −2.1 and −1.8), and this value was significantly larger than the mean change in the stable subgroup (least square mean, −0.7 and −0.4; P = 0.0072). In addition to the mean differences, the effect-size estimates of change for the EDI composite based on the SD at baseline, the SD of change, and between the PGIS subgroups are moderate to large (at least 0.50).