Study participants
Details of the phase 2b study have been previously published [9]. Briefly, participants in the study were premenopausal (according to the Stages of Reproductive Aging Workshop [STRAW] criteria), nonpregnant women ≥ 21 years old with HSDD, female sexual arousal disorder (FSAD), or a combination of these disorders for ≥ 6 months prior to the start of the study. Participants were required to have been diagnosed by a qualified clinician using a diagnostic interview and to have a total score > 18 on the FSDS-DAO and a total score < 26.5 on the Female Sexual Function Index (FSFI) [10]. Participants were also required to have had previous sexual functionality for at least 2 years, be currently in a monogamous relationship of ≥ 6 months’ duration, and be willing to be sexually active with this partner ≥ 1 time/month during the study. Women were excluded if they presented with unstable or uncontrolled medical conditions, had a history of unresolved sexual trauma or abuse, had been treated for depression or psychosis within the preceding 6 months, had used antidepressants or antipsychotics within the preceding 3 months, or were currently undergoing psychotherapy for FSD. Also excluded were women with lifelong anorgasmia, vaginismus, sexual pain disorder, sexual aversion disorder, or persistent sexual arousal disorder. This analysis included data from participants who had FSDS-DAO scores at baseline and at 1 or more follow-up visits. These 325 participants comprised the evaluable modified intent-to-treat (MITT) population.
Study design
This multicenter, randomized, placebo-controlled, dose-finding study was conducted at 68 sites in the United States and Canada. All participants underwent a 4-week, no-treatment screening/qualification period, followed by a 4-week, single-blind self-dosing (placebo-only) period to establish baseline, and were then randomized to self-administer placebo or 1 of 3 doses of bremelanotide (0.75, 1.25, or 1.75 mg) as desired over 12 weeks [9]. In the phase 2b study, the primary efficacy endpoint was the change from baseline to the end of the study in the number of satisfying sexual encounters (SSEs) as assessed by the Female Sexual Encounter Profile-Revised questionnaire (FSEP-R Q10) [9]. Other PRO measures were the Female Sexual Distress Scale-Desire/Arousal/Orgasm (FSDS-DAO), Female Sexual Function Index (FSFI), Sexual Interest and Desire Inventory (SIDI-F), General Assessment Questionnaire (GAQ), and Women’s Inventory of Treatment Satisfaction (WITS-9).
PRO outcomes were assessed at various time points throughout the trial to observe changes over time, including baseline, early in the trial, and at the trial endpoint. Time points varied by instrument because not all PRO instruments were administered at each visit: the FSEP-R was completed after each sexual encounter, while the other PRO instruments were administered at Weeks 0 and 4 (Visits 1 and 2), with the exception of the GAQ and WITS-9, and at Weeks 10, 16, 20, and 23 (Visits 5, 10, 11, and 12, respectively). All PRO instruments were completed by participants using an electronic handheld device (eDiary). The SIDI-F was also completed via interview by clinical research staff. In this analysis, data from Weeks 0, 4, 10, and 23 were used for psychometric evaluation.
PRO measures
Female Sexual Distress Scale-Desire/Arousal/Orgasm (FSDS-DAO)
The 15-item FSDS-DAO retains the 13 items from the Likert-type FSDS-R scale, which has evidence supporting reliability and validity [6, 7]. The FSDS-DAO includes 2 new items that ask women to rate their level of distress related to arousal and orgasm. As with previous versions of the FSDS, participant responses to “How often did you feel concerned with difficulties with sexual arousal?” and “How often did you feel frustrated by problems with orgasm?” are provided using a polytomous response scale ranging from 0 (never) to 4 (always). Subjects who met eligibility criteria completed the FSDS-DAO with a 30-day recall at baseline and at Visits 2, 5, 10, 11, and 12. The total score is calculated as the sum of the responses and ranges from 0 to 60, with higher scores indicating a greater level of distress. The total score on the FSDS-R can range from 0 to 52 [11]. Because of the 30-day recall period, we present FSDS-DAO data for Visit 2 (baseline), Visit 5, and Visit 12 only. Visits 10 and 11 and Visits 11 and 12 are only 28 and 21 days apart, respectively. Thus, data from Visits 10 and 11 were excluded from the analyses to reduce overlap in the assessments.
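The total-score rule described above can be illustrated with a minimal Python sketch. It is not the study's scoring program (analyses were run in SAS). Two things are assumptions made only for illustration: that the 2 new arousal and orgasm items are stored after the 13 FSDS-R items, and that a form with any missing item yields no total score. The example responses are hypothetical.

```python
from typing import Optional, Sequence

N_ITEMS_DAO = 15  # 13 FSDS-R items plus the arousal and orgasm items
N_ITEMS_R = 13    # FSDS-R items only


def fsds_total(responses: Sequence[Optional[int]], n_items: int) -> Optional[int]:
    """Sum the first n_items responses (each scored 0-4).

    Returns None when any of those items is missing (assumed rule,
    chosen to avoid imputing values).
    """
    items = list(responses[:n_items])
    if len(items) < n_items or any(r is None for r in items):
        return None
    if any(not 0 <= r <= 4 for r in items):
        raise ValueError("each response must be between 0 (never) and 4 (always)")
    return sum(items)


# Hypothetical respondent; the last 2 values are the new arousal/orgasm items.
answers = [2, 3, 1, 4, 0, 2, 2, 3, 1, 1, 4, 2, 3, 3, 2]
dao_total = fsds_total(answers, N_ITEMS_DAO)  # possible range 0-60
r_total = fsds_total(answers, N_ITEMS_R)      # possible range 0-52
```

Scoring both totals from the same response vector mirrors the comparison between the FSDS-DAO and the FSDS-R made in this analysis.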
Psychometric evaluation of the FSDS-DAO was undertaken against the PRO instruments described below. In addition, the analysis was repeated using the FSDS-R, which does not include the arousal and orgasm items, to provide a comparison with the FSDS-DAO.
Female Sexual Function Index (FSFI)
The FSFI is a 19-item measure of female sexual function consisting of 6 domains: desire, arousal, lubrication, orgasm, satisfaction, and pain [10, 12]. Scores for the arousal, lubrication, orgasm, and pain domains range from 0 to 6 using Likert-type scales. Scores for desire range from 1.2 to 6.0, and those for satisfaction range from 0.8 to 6.0. The total score is the sum of the domain scores and ranges from 2 to 36, and the recall period is the past 4 weeks. Higher scores indicate a better level of sexual function.
Female Sexual Encounter Profile-Revised (FSEP-R)
The FSEP-R is a 10-item instrument that is designed to assess sexual encounters, including initiation, level of desire, satisfaction with arousal, lubrication, arousal, ability to achieve orgasm, and satisfaction with the sexual encounter [13]. Participants completed the FSEP-R within 24 h of a sexual encounter. A “sexual encounter” is defined as any act involving sexual contact with genitalia and/or oral mucosa, and includes intercourse, oral sex, and masturbation by self or a partner. Q10 reads “Did you consider this sexual encounter satisfactory for you?” and answers were yes or no.
General Assessment Questionnaire (GAQ)
The GAQ consists of 4 items: satisfaction with arousal, desire, degree of benefit while on study drug, and impact of taking study drug on relationship with partner. Responses are selected on a 7-point numeric rating scale from 1 (very much worse) to 7 (very much better). A score ≥ 5 indicates benefit.
Women’s Inventory of Treatment Satisfaction (WITS-9)
The validated WITS-9 questionnaire assesses satisfaction with treatment and sexual relations over the past 4 weeks [14]. Participants answer the 9 items on a 7-point numeric rating scale from − 3 (very unsatisfied or very likely not to continue) to 3 (very satisfied or very likely to continue). The total score is calculated as the average of the scores from the 9 questions and ranges from − 3.0 to 3.0. A higher score on the WITS-9 indicates a higher level of satisfaction with treatment.
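As a small illustration of the scoring rule above, the following Python sketch averages the 9 item responses. The function name and the example responses are hypothetical, and the handling of incomplete forms (rejecting them) is an assumption rather than a rule stated for the instrument.

```python
def wits9_total(responses):
    """Average of the 9 WITS-9 item scores, each on the -3 to 3 scale."""
    if len(responses) != 9:
        raise ValueError("WITS-9 requires exactly 9 item responses")
    if any(not -3 <= r <= 3 for r in responses):
        raise ValueError("each response must be between -3 and 3")
    return sum(responses) / len(responses)


# Hypothetical respondent.
example_score = wits9_total([3, 2, 2, 1, 0, -1, 2, 3, 1])
```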
Statistical analysis
Specific statistical tests are described below for each endpoint. All analyses were performed using SAS version 9.2 or later. All statistical tests were conducted with conservative decision-making criteria established a priori according to published guidance [15]. Missing data were treated as missing, and no data imputations were performed. All statistical tests were 2-tailed and were conducted with type I error probability fixed at 0.05. For continuous variables, the mean and standard deviation were reported; for categorical variables, the percent distribution by category was reported.
FSDS-DAO psychometric evaluation
Instrument descriptive characteristics
Individual item performance and frequency of responses on the FSDS-DAO and FSDS-R items and total scores, including rates of missing data, were examined at Visit 1 (Week 0), Visit 2 (Week 4), Visit 5 (Week 10), Visit 11 (Week 19), and Visit 12 (Week 23). Individual item performance and frequency of responses on the FSEP-R item scores, including rates of missing data, were examined at Visit 2 (Week 4), Visit 5 (Week 10), Visit 11 (Week 19), and Visit 12 (Week 23). Distributional characteristics of the FSFI were examined at Visit 1 (Week 0), Visit 2 (Week 4), Visit 5 (Week 10), Visit 11 (Week 19), and Visit 12 (Week 23). The GAQ and WITS-9 were examined at Visit 5 (Week 10), Visit 11 (Week 19), and Visit 12 (Week 23).
Confirmatory factor analysis (CFA)
A CFA was performed using EQS version 6.1 to determine whether a total score was justified or whether multiple subscales were appropriate with the addition of the new items. CFAs were performed with the data from Visit 1 (Week 0) and from Visit 12 (Week 23). Model fit was assessed using Bentler’s comparative fit index (CFI), with a CFI ≥ 0.90 indicating an acceptable model fit. The root mean square error of approximation (RMSEA) and weighted root mean square residual (WRMR) were evaluated as additional indices of model fit.
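For readers less familiar with these fit indices, the sketch below shows the commonly used textbook formulas for the CFI and RMSEA computed from model and baseline chi-square statistics. It is illustrative only: the study's values came from EQS, whose estimator-specific computations may differ (for example, some programs use N rather than N − 1 in the RMSEA denominator). The chi-square values are hypothetical, and the WRMR is not sketched here.

```python
from math import sqrt


def comparative_fit_index(chi2_model, df_model, chi2_baseline, df_baseline):
    """CFI = 1 - max(chi2_M - df_M, 0) / max(chi2_M - df_M, chi2_B - df_B, 0)."""
    excess_model = max(chi2_model - df_model, 0.0)
    excess_baseline = max(chi2_baseline - df_baseline, 0.0)
    denominator = max(excess_model, excess_baseline)
    if denominator == 0:
        return 1.0  # neither model shows misfit beyond its degrees of freedom
    return 1.0 - excess_model / denominator


def rmsea(chi2_model, df_model, n):
    """RMSEA = sqrt(max(chi2_M - df_M, 0) / (df_M * (n - 1)))."""
    return sqrt(max(chi2_model - df_model, 0.0) / (df_model * (n - 1)))


# Hypothetical one-factor model for 15 items (90 df) with n = 325.
cfi_value = comparative_fit_index(180.0, 90, 2500.0, 105)
rmsea_value = rmsea(180.0, 90, 325)
acceptable_fit = cfi_value >= 0.90  # threshold stated in the text
```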
Reliability
Internal consistency reliability
Internal consistency reliability (Cronbach’s α) addressed the extent to which individual items within an instrument were related to one another [16]. Cronbach’s α was calculated for the FSDS-DAO and FSDS-R at Visit 1, Visit 2, Visit 5, and Visit 12. No tests of statistical significance were performed for these estimates; values of α > 0.70 were generally considered acceptable for group-level data [17].
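Cronbach’s α can be computed from the item variances and the variance of the total score, α = k/(k − 1) × (1 − Σ item variances / total-score variance), where k is the number of items. The Python sketch below applies this standard formula to a small hypothetical data set (5 respondents, 4 items); it is an illustration, not the SAS code used in the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete data.

    rows: one list of item scores per respondent, all of equal length.
    """
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses on a 0-4 scale.
responses = [
    [3, 4, 3, 4],
    [1, 1, 2, 1],
    [2, 2, 2, 3],
    [4, 4, 3, 4],
    [0, 1, 1, 0],
]
alpha = cronbach_alpha(responses)
acceptable = alpha > 0.70  # group-level criterion stated in the text
```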
Test–retest reliability
Test–retest reliability was examined using intra-class correlations (ICCs), Spearman’s correlations, and paired t-tests of FSDS-DAO and FSDS-R scores from Visit 1 to Visit 2. ICC values > 0.70 are generally considered acceptable for establishing test–retest reliability [18].
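The text does not state which form of the ICC was used. As an illustration only, the Python sketch below computes one common choice, the two-way random-effects, absolute-agreement, single-measure ICC, often written ICC(2,1), from the usual ANOVA mean squares. The choice of this form and the example scores are assumptions.

```python
from statistics import mean


def icc_2_1(scores):
    """ICC(2,1) from a participants-by-occasions table of scores.

    scores: one row per participant, each holding k repeated measurements
    (k = 2 for a Visit 1 / Visit 2 retest). Complete data are assumed.
    """
    n, k = len(scores), len(scores[0])
    grand = mean([x for row in scores for x in row])
    row_means = [mean(row) for row in scores]
    col_means = [mean([row[j] for row in scores]) for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical total scores at Visit 1 and Visit 2 for 4 participants.
visit_pairs = [[10, 12], [20, 22], [30, 32], [40, 42]]
retest_icc = icc_2_1(visit_pairs)
acceptable = retest_icc > 0.70  # criterion stated in the text
```

Because this form measures absolute agreement, a uniform shift between visits lowers the ICC slightly even when the rank order of participants is unchanged.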
Validity
Convergent validity
To examine convergent validity, the pattern and magnitude of the relationships of the FSDS-DAO and FSDS-R total scores with the FSEP-R, FSFI subscales and total score, GAQ item scores, WITS-9 total score, and number of satisfying sexual encounters (SSEs) were examined at Visit 5 and Visit 12 using Spearman’s rank correlation coefficients. Convergent validity was supported by correlations > 0.40 with questionnaires measuring similar concepts. These measures were expected to be moderately correlated, indicating that they measured related constructs, but not correlated above 0.80, which would indicate that they measured the same construct. Measures more directly related to sexual arousal and level of desire were expected to have higher correlations, while scales related to pain were expected to have lower correlations with the FSDS-DAO and FSDS-R scores and potentially demonstrate divergent validity.
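The correlation criteria above can be sketched in Python as follows. Spearman’s coefficient is computed as the Pearson correlation of average ranks. Comparing the absolute value of the coefficient with the 0.40 and 0.80 thresholds is an assumption made here, because a distress score and a function score would be expected to correlate negatively. The category labels and example scores are hypothetical.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the ranked data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def interpret(rho):
    """Apply the 0.40 and 0.80 thresholds to |rho| (assumed convention)."""
    size = abs(rho)
    if size > 0.80:
        return "may measure the same construct"
    if size > 0.40:
        return "supports convergent validity"
    return "below convergent threshold"


# Hypothetical distress totals and function totals for 6 participants.
distress = [40, 35, 28, 22, 30, 15]
function = [14.0, 21.5, 19.0, 20.5, 16.0, 27.0]
rho = spearman(distress, function)
label = interpret(rho)
```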
Known-groups validity
The ability of the FSDS-DAO and the FSDS-R to differentiate among groups of participants according to known indicators such as treatment group or disease severity/clinical status at baseline (FSFI total score; FSFI arousal, desire, and satisfaction subscale scores; number of SSEs; and GAQ Items 1 and 2) was assessed at Visit 12. Paired t-tests and general linear models (PROC GLM) with Scheffé’s post hoc comparisons were used to evaluate mean differences among participant subgroups.
Responsiveness
Several analytic approaches were taken to evaluate the responsiveness of the FSDS-DAO and FSDS-R. Changes in the total FSDS-DAO score were calculated from baseline (Visit 1; Week 0) to Visit 12 (Week 23) for the overall sample. The effect size [19] and the responsiveness statistic were also calculated. Effect size was interpreted as small (0.20), moderate (0.50), or large (0.80) using Cohen’s convention [20]. The responsiveness statistic was computed by subtracting the placebo change score from the treatment change score and dividing by the standard deviation (SD) of the placebo change score: (treatment change score − placebo change score)/SD of placebo change score. The responsiveness statistic quantified the magnitude of the difference in change between treatment groups.
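The two responsiveness measures can be sketched in Python as follows. The responsiveness statistic follows the formula given above, applied to group mean change scores. The effect size formula is not spelled out in the text; the sketch assumes the common definition of mean change divided by the SD of baseline scores, and that choice should be read as an assumption. All scores are hypothetical.

```python
from statistics import mean, stdev


def effect_size(baseline, follow_up):
    """Mean change divided by the SD of baseline scores (assumed definition)."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)


def responsiveness_statistic(treatment_changes, placebo_changes):
    """(mean treatment change - mean placebo change) / SD of placebo change."""
    difference = mean(treatment_changes) - mean(placebo_changes)
    return difference / stdev(placebo_changes)


# Hypothetical total scores (lower = less distress) for 5 participants.
baseline_scores = [30, 35, 40, 45, 50]
visit12_scores = [20, 28, 30, 33, 39]
es = effect_size(baseline_scores, visit12_scores)

# Hypothetical change scores (Visit 12 minus baseline) by group.
treatment_changes = [-12, -8, -10, -14, -6]
placebo_changes = [-4, -2, -6, -8, 0]
rs = responsiveness_statistic(treatment_changes, placebo_changes)
```

Negative values here reflect a reduction in distress; the small, moderate, and large benchmarks would be applied to the magnitude of the statistic.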
Ethical conduct
The study was conducted in accordance with Good Clinical Practice requirements, as described in guidelines of the International Conference on Harmonisation of Technical Requirements of Pharmaceuticals for Human Use (ICH) and in the Declaration of Helsinki. Each study site was reviewed by a central or local institutional review board (IRB) or ethics committee. The IRB approval numbers were Compass, 00519; WIRB, 20111036. Before any study procedures were initiated, written informed consent was obtained from each subject.