To our knowledge, this study represents the first review of the most widely used SLE-PRO measures to assess how well they align with the recommendations of the FDA 2009 PRO guidance. Our results contradict our hypothesis that PRO measures developed after the release of the FDA 2009 PRO guidance would be more adherent to the FDA recommendations than those developed prior to 2009. In fact, our review found mixed results regarding alignment with the guidance recommendations on target population, concepts measured, testing of other psychometric properties, and documentation for all the measures examined. Some or much of this misalignment may be due to a lack of the detailed development documentation needed to assess whether the FDA guidance was followed.
The LupusQoL and LupusPRO SLE-PRO measures have been used for many years and have led to many advancements in capturing what is most important to patients with SLE. For the original SLE instruments, the evaluation of concepts measured involved patient-engagement interviews, with concepts elicited until saturation. Moreover, cognitive testing allowed patients to provide input on draft versions of the measures. Documentation of the development and validation process was enhanced with figures depicting that process, as well as the identified domain structures. Despite these strengths, our assessment identified important limitations, often due to the absence of information or lack of sufficient detail in the documentation identified.
To date, the extent of awareness of PRO guidance recommendations in research settings outside of the pharmaceutical industry (e.g., clinical trials vs clinical care) is unknown. We postulate that some PRO measures may not align with FDA guidance because of a lack of knowledge about the guidance in some sectors, with possible reliance on checklists [32, 33], and a lack of understanding of how to execute and evaluate the processes described in the guidance. This might explain why some developers cite the FDA 2009 guidance but do not align with its recommendations. As an example, the Engelberg Center for Health Care Reform at the Brookings Institution published a report discussing opportunities and challenges in the development and use of PROs [34]. The report summarized experiences gathered from an expert workshop across five sessions discussing challenges with the FDA PRO guidance.
While the LupusQoL and LupusPRO measures were not developed in the context of clinical-trial use for product approval and labeling claims, the developers acknowledged the FDA guidance [17, 31]. Yet not all processes and/or levels of documentation align with the FDA guidance recommendations. For all measures (original and revised), the target population is unclear, as study population characteristics varied, were not consistently reported, or were not considered across the item-generation, cognitive-testing, and other psychometric-testing phases. The information available on development is limited and lacks detail on the qualitative processes. For example, the original measure-development work engaged patients in the development process, but documentation of content validity was not detailed enough to determine if or how it aligned with the guidance. It is unclear whether a wide range of patients representing the target population were interviewed and whether the concepts were experienced by the majority of the sample population. Additionally, there was no documentation indicating that items were developed using patients' own words from the interviews, nor documentation from the testing of item wording. Similarly, documentation supporting the item response options, the recall period of the measure, etc., was not available. These findings are similar to other reports of PRO labelling claims rejected by the FDA for lack of content validity, as well as a systematic review evaluating qualitative methods used to generate instruments [10, 11, 35]. Developers may also have learned of the FDA guidance only after development. For example, McElhone et al. developed the LupusQoL prior to release of the FDA guidance and published an analysis of its ability to detect change in 2016, citing the FDA guidance in the evaluation [31].
Another issue may be unclear terminology in the identified reports. For example, content validity typically encompasses both item generation and cognitive interviewing; however, the original measures appear to have had content validity assessed through cognitive interviewing only. Similarly, the terms face validity and content validity were used interchangeably. Face validity is evaluated after an instrument has been developed, whereas content validity is embedded in the development process [36]. Documentation was also lacking to determine whether saturation of concepts was reached or deemed comprehensive, and whether the potential for bias in interviewing for concept elicitation or cognitive debriefing was mitigated. For example, interviewing should be conducted using open-ended questions rather than directed questions that can be answered simply with a yes/no response.
Documentation of instrument origination may enhance understanding of the rationale behind decisions made during the development process. Documentation provides transparency and evidence in support of preliminary instrument development, content validity, measure development, and interpretation, as well as any changes made to the measure. Otherwise, decisions may not be clear to potential users seeking permission to use PRO instruments. An example is highlighted by Mathias et al., who argued in their 2018 study that the recall period of existing instruments did not accurately capture fluctuations in SLE symptoms and the impacts of the disease [29]. They concluded that a 24-hour (h) recall period, in contrast to the conventional 4 weeks, would be more appropriate for all symptoms except hair loss. The suggested 24-h recall period was confirmed by patients, who reported daily fluctuations [29]. Documentation allows reviewers to understand the methodology and evaluate whether the data-generation processes were suitable and complete for the target population (e.g., the identification and inclusion of concepts that matter most to patients). The documentation process applies to disease-specific and disease-agnostic measures, including legacy measures. Others can then contribute to the literature by expanding upon and carrying the documented instrument forward while minimizing redundancy. Documentation is important not only in the development process but also when making modifications to existing instruments. Existing instruments may be modified when administered in RCTs; however, the modifications are often neither transparent nor tested [37]. To assist with the incorporation and qualification of PRO measures in RCTs, Coles et al. [38] proposed the development of a publicly available repository of "validity arguments" as a mechanism to collect evidence supporting the validity of PRO measures in their respective contexts of use.
The FDA PFDD guidance series is underway to provide more detail on the development of COAs for use in regulatory approval of medical products. While the draft and final releases of the FDA PFDD guidance series remain pending, the 2009 guidance stays in effect. Appropriate use of these documents will improve transparency of the development process, consistency in selection of the study population across development and/or testing phases, and appropriate patient engagement, both when adapting existing PRO measures and when developing new ones. Effective use of more detailed PRO guidance may improve standardization of the process and documentation, thereby increasing uptake of PRO measures through comparability and enhanced understanding in the interpretation of results. Adherence to FDA guidance will increase the chances of the FDA accepting COA tools as fit-for-purpose (e.g., FDA Drug Development Tools COA Qualification Program: https://www.fda.gov/drugs/clinical-outcome-assessment-coa-qualification-program/clinical-outcome-assessments-coa-qualification-program-resources). This is imperative, as PROs can provide a comprehensive view of the patient experience in patient-focused drug development and related research. As previously mentioned, LupusPRO has not been used in RCTs, while the LupusQoL was used in three randomized controlled trials with scores serving as exploratory endpoints [19,20,21]. Of note, the 2018 review by Izadi et al. highlighted that LupusQoL had been used in one RCT; however, data were not provided [19]. This may be the reason the RCT was excluded from newer reviews and, therefore, was not included among the RCTs mentioned above [20, 21]. If PRO data are not deemed fit as a primary endpoint due to the nature of the study, PRO data serving as secondary endpoints can support primary-endpoint interpretation.
In the 2018 review by Mercieca-Bebber et al., there are several examples of how PRO data used as primary or secondary endpoints contributed to approval of treatments [5].
It is recognized that the reviewed documents may provide limited insight into the development and validation processes. Developers may have followed FDA guidance for PRO development and validation but did not document or describe the processes in sufficient detail to demonstrate evidence of alignment. Under these circumstances, our review is limited, as we were only able to evaluate documents that are publicly available and accessible. Furthermore, developers’ perceptions and interpretations of the FDA guidance may differ from those of others. Based on our review, developers should ensure: patient involvement in the process; study population characteristics that are similar across all phases of measure development; and clear, publicly available documentation of all methods. The FDA advocates for documentation of the development process to be made publicly available and accessible, including—but not limited to—cognitive interview summaries or transcripts, the source of items, and an item-tracking matrix. “Without adequate documentation of patient input, a PRO instrument’s content validity is likely to be questioned” [6]. Publication length limits mean that authors should consider using an appendix or supplementary materials to make these details available. Alternatively, authors can make the information available in an accessible user manual [6].