
Combining patient reported outcomes and EHR data to understand population level treatment needs: correcting for selection bias in the migraine signature study



Electronic health records (EHR) data can be used to understand population level quality of care especially when supplemented with patient reported data. However, survey non-response can result in biased population estimates. As a case study, we demonstrate that EHR and survey data can be combined to estimate primary care population prescription treatment status for migraine stratified by migraine disability, without and with adjustment for survey non-response bias. We selected disability as it is associated with survey participation and patterns of prescribing for migraine.


A stratified random sample of Sutter Health adult primary care (PC) patients completed a digital survey about headache, migraine, and migraine-related disability. The survey data from respondents with migraine were combined with their EHR data to estimate the proportion who had prescription orders for acute or preventive migraine treatments. Separate proportions were also estimated for those with mild disability (denoted “mild migraine”) versus moderate to severe disability (denoted “mod-severe migraine”), without and with correction for non-response bias using the inverse propensity weighting method. We hypothesized that correction for non-response bias would result in smaller differences, by migraine disability status, in the proportions who had a treatment order.


The response rate among 28,268 patients was 8.2%. Among survey respondents, 37.2% had an acute treatment order and 16.8% had a preventive treatment order. The response bias corrected proportions were 26.2% and 11.6%, respectively, and these estimates did not differ from the total source population estimates (i.e., 26.4% for acute treatments, 12.0% for preventive treatments), validating the correction method. Acute treatment order proportions were 32.3% for mild migraine versus 37.3% for mod-severe migraine, and preventive treatment order proportions were 12.0% for mild migraine versus 17.7% for mod-severe migraine. The response bias corrected proportions for acute treatments were 24.8% for mild migraine and 26.6% for mod-severe migraine, and the corrected proportions for preventive treatments were 8.1% for mild migraine and 12.0% for mod-severe migraine.


In this study, we combined survey data with EHR data to better understand treatment needs among patients diagnosed with migraine. Migraine-related disability is directly related to preventive treatment orders but less so for acute treatments. Estimates of treatment status by self-reported disability status were substantially over-estimated among those with moderate to severe migraine-related disability without correction for non-response bias.


Quality of care is often assessed using electronic health record (EHR) data or survey data. For underdiagnosed conditions, EHR data do not capture the undiagnosed cases and do not provide a means to consistently assess symptom severity or functional impact. Survey data with a diagnostic screener can capture undiagnosed cases and offer a direct means of documenting patient symptoms and functional status. Survey data are often limited, however, by modest participation rates and the potential for response bias. Healthcare systems are increasingly combining EHR and survey data to better evaluate population level gaps in treatment, but without recognizing how response bias can influence results [1, 2].

Herein, we present migraine as a use-case to demonstrate that the combined use of EHR data and survey data facilitates a better understanding of population health needs and overcomes the response bias common to traditional population-based surveys. Migraine is a prevalent, often disabling chronic disease that exemplifies other symptomatic and burdensome diseases where people may not seek care and those who do seek care may not be diagnosed or receive an appropriate treatment [3,4,5,6,7,8,9,10,11,12,13]. Survey data indicate that variation in use of acute and preventive medications is directly linked to migraine-related disability and to associated comorbidities [5,6,7, 12, 14]. However, survey data for migraine are also prone to non-response and reporting biases in ways that directly influence estimates of migraine severity and prescription medication use, whether a survey is done within a healthcare system or in the general population [15,16,17,18,19,20]. Moreover, response probability is associated with the severity of the disease being studied, and tends to be lower among people with lower education and socio-economic status (SES), people of non-Caucasian race, younger people, and men [15,16,17,18]. Finally, the validity of self-report also varies by some of these same factors, as education and SES levels influence the ability to interpret questions and response options [19,20,21].

Studies of migraine prescription drug use that rely on EHR data or medical claims do not have the concerns of non-response and recall bias, but have other limitations. Medication claims and EHR data are limited to patients who sought care for migraine, underestimating the size of the population with migraine, and most EHR data document medication orders, not whether the patient actually obtained the medication. Evidence suggests that approximately 20% of first prescriptions are not adjudicated [22,23,24,25]. Neither medication claims nor EHR data capture information on those with undiagnosed migraine. EHR data generally do not systematically include information on migraine patient reported outcomes (PROs), including pain and symptom intensity, days with headache, or associated disability, precluding population health assessment of need for care. More generally, few studies combine EHR data with patient reported outcomes to gain a more comprehensive understanding of patient needs versus the care they receive [26].

The objective of this analysis of data from the Migraine Signature Study (MSS) was to demonstrate a use-case that combines the strengths of EHR and survey data to accurately understand population health level needs and, in particular, the relation of migraine related disability and prescription medication care. We leveraged the availability of EHR data on all patients to both quantify population level prescription care and to adjust for non-response bias from those invited to participate in the complementary MSS survey on migraine diagnostic questions and migraine disability status.


Longitudinal EHR data were obtained on all adult primary care (PC) patients from the Sutter Health System. Survey data on headache and migraine experience were obtained on a stratified random sample of patients. We used EHR data to specifically estimate the proportion of survey respondents with migraine who were prescribed acute or preventive migraine treatments without and with correction for non-response bias. The Sutter Health Institutional Review Board approved the study.

Sources of data

The study population comprised adult PC patients who sought care from Sutter Health, a large, not-for-profit integrated healthcare network serving 22 counties in northern California. The Sutter Health Medical Network includes 1200 primary care providers, 126 neurologists, and a diversity of other ambulatory and inpatient care services. Sutter Health uses a single instance of the EpicCare (Epic) EHR.

EHR data

EHR data are organized around encounters and activities that include ambulatory, inpatient, emergency department, telephone, and video, among others. For this study, an individual was defined as an adult PC patient if they had at least one office visit to a PC department during the 5-year study period from 1/1/2013 to 12/31/2017 and were 18–75 years of age sometime during this time period. EHR data were extracted on eligible PC patients for all encounters occurring during the study period and included encounter type, date, and diagnosis (i.e., ICD-9 and ICD-10 codes for primary and secondary diagnosis), and, separately, medication ordered and diagnostic indication for the order.

Migraine Probability Algorithm (MPA) scores were calculated from EHR data on all patients to estimate the probability of having clinically diagnosed migraine [27]. The MPA score was validated in an independent health system, and, for a cut-point of MPA > 10, sensitivity was determined to be 85.0% and positive predictive value was 74.3%. The MPA is based on the number of encounter diagnoses with migraine, prescription orders for migraine, and whether specialty care for migraine was sought. MPA scores greater than 10 indicate a high probability of having migraine. MPA scores were calculated based on 5-years of longitudinal EHR data (MPA5Y) to identify PC patients with a history of care for migraine over the past 5 years and, separately, using the most recent 2-years of longitudinal EHR data (MPA2Y), to identify patients with recent care for migraine (Fig. 1).

Fig. 1 Flowchart of population selection and sampling based on EHR data. *MPA5Y > 10: Patient is likely to have had migraine at some point in the past 5 years. **MPA2Y > 10: Patient is likely to have used care for migraine in the past 2 years. ***Headache Care NOS5Y: Care for Headache Not Otherwise Specified during the study period

Patient survey and survey data

A stratified random sample of eligible PC patients was invited to complete a survey about headache and migraine history, symptoms, treatment, and comorbidities, among other data. See Additional file 1: Table S1 for survey details. The sampling strata were defined by probability of having migraine, whether care for migraine was recent (i.e., previous 2 years), and whether, in addition to PC visits, the patients sought care for migraine from a neurologist (Fig. 1). The five sampling strata, denoted A through E, were defined as follows (Table 1) for the PC population who had at least 1 clinical encounter in the 12 months before the survey:

  • Sampling Group A Recent Neurology Care for Migraine: Had care for migraine in the past 5 years (MPA5Y > 10) and in the past 24 months (MPA2Y > 10) and had care from a neurologist during the 5-year MPA period.

  • Sampling Group B Recent Non-neurologic care for Migraine: Similar to “A”, but never had care from a neurologist.

  • Sampling Group C Remote Migraine Care: Had care for migraine in the past 5 years (MPA5Y > 10) but not in the past 24 months (MPA2Y ≤ 10).

  • Sampling Group D Recent Care for Headache NOS: Had care for headache NOS in the past 5 years, but not for migraine (MPA5Y ≤ 10).

  • Sampling Group E No Care for Headache: Did not have care for either migraine or any type of headache in the past 5 years.
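The stratum definitions above can be read as a simple decision rule. The following is a minimal sketch of that rule, not the study's actual code: the `PatientSummary` fields are hypothetical names for per-patient EHR summaries, and the handling of a score of exactly 10 assumes that only scores greater than 10 count as migraine care.

```python
from dataclasses import dataclass

MPA_CUTOFF = 10  # scores above this indicate a high probability of migraine


@dataclass
class PatientSummary:
    """Hypothetical per-patient EHR summary; field names are illustrative."""
    mpa5y: float              # MPA score from 5 years of EHR data
    mpa2y: float              # MPA score from the most recent 2 years
    saw_neurologist_5y: bool  # neurology care for migraine in the 5-year window
    headache_nos_5y: bool     # care for headache NOS in the 5-year window


def assign_stratum(p: PatientSummary) -> str:
    """Return the sampling group (A-E) following the definitions above."""
    if p.mpa5y > MPA_CUTOFF:
        if p.mpa2y > MPA_CUTOFF:
            # Recent migraine care: split by neurology care.
            return "A" if p.saw_neurologist_5y else "B"
        return "C"  # migraine care in the 5-year window, but not recent
    if p.headache_nos_5y:
        return "D"  # headache NOS care without likely migraine care
    return "E"      # no care for migraine or headache
```

For example, `assign_stratum(PatientSummary(15, 12, False, False))` returns `"B"` under these assumptions.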

Table 1 Patient strata for the selection of a stratified random sample of patients, Migraine Signature Survey

To be eligible for analysis of the survey data, patients had to have at least 1 encounter of any type and for any condition in the 12-month period before September 13, 2018 when the first email was sent inviting participation in a web-based questionnaire (Additional file 1: Table S1). The questionnaire asked about headache and migraine frequency and symptoms, comorbidities, and patient reported outcomes, including the Migraine Disability Assessment Scale (MIDAS), as detailed below [28, 29].

The survey also included the American Migraine Study/American Migraine Prevalence and Prevention Study (AMS/AMPP) migraine diagnostic screener used to identify survey respondents meeting International Classification for Headache Disorders (ICHD) criteria for migraine [30]. Patients were invited to participate in the survey if they had an email address in their EHR. Eligible patients from each stratum who were invited to participate could access a link within the email to the consent form and the questionnaire. The last invitation email was sent on December 8, 2018. Strata A, B, and C were intentionally over-sampled to ensure that enough patients with migraine participated in the survey. Moreover, stratum-specific response rates were monitored and email invitations were sent to additional patients to ensure that stratum-specific quotas were met.

Definition of migraine diagnosis, treatment status, and migraine-related disability

EHR data

EHR data were used to define past (MPA5Y) and recent (MPA2Y) status on use of care for migraine and probability of having migraine and to specifically identify orders for acute and preventive migraine treatments.

Clinically diagnosed migraine

EHR documentation of use of care for migraine was used to identify those with clinically diagnosed migraine. An MPA5Y score greater than 10 was used to identify PC patients who had migraine care and an MPA2Y greater than 10 was used to identify patients with more recent migraine care.

Migraine treatment order and adjudication status

EHR data from the 12-month period before a completed survey was returned were used to identify migraine indicated acute and preventive treatment orders for each respondent. A randomly assigned 12-month period was used to extract the same data on non-respondents, where the distribution of the 12-month time periods was the same as that for respondents. Longitudinal EHR data for the 12-month period were specifically used to determine if a patient had been prescribed at least one acute treatment for migraine, at least one preventive treatment for migraine, the total count of prescription acute and preventive treatment orders, and the specific class of medications prescribed. We identified prescriptions that were ordered as a result of encounters with a headache diagnosis. Acute treatments were categorized as non-narcotic analgesics, narcotic analgesics, triptans, and other migraine specific treatments, and preventive treatments were categorized as beta-blockers, calcium channel blockers, antidepressants, anticonvulsants, and onabotulinumtoxinA. Most patients who were prescribed an acute or preventive treatment for migraine received only one or two prescriptions, and the overall distribution was highly right skewed. As such, for analysis, prescription order status was defined using a binary variable that distinguished 1–2 orders from 3 or more orders.

Migraine related disability

Survey data were used to separately identify respondents with active migraine and to assess disability impact of migraine.

Active migraine status was defined from survey responses by applying ICHD-3 criteria for migraine with or without aura to the AMS/AMPP migraine diagnostic questionnaire data [30, 31]. The screener has been previously validated and captures data relevant to the International Classification of Headache Disorders- 3rd edition (ICHD-3) criteria for migraine including headache pain characteristics, exacerbation by routine activity, and associated symptoms [30, 31].

Migraine related disability was assessed with the Migraine Disability Assessment (MIDAS) scale, a 5-item scale assessing missed and reduced productive days at work, school, or home as well as social and leisure activities during the previous 3 months due to headache [28, 29]. Responses were summed and grouped to identify disability by 4 grades: little or none (score of 0–5, Grade I), mild (score of 6–10, Grade II), moderate (score of 11–20, Grade III), and severe (score of ≥ 21, Grade IV) [28, 29]. MIDAS Grade is often used in clinical trials and specialty care practices as a measure of impact of migraine on functioning and as an indication of treatment need, where a higher MIDAS Grade indicates a greater need for acute treatments and, in particular, for preventive treatments. Because the distribution of scores was skewed, we dichotomized MIDAS into a low disability group (Scores 0–10, Grades I–II) and a moderate-high disability group (Score 11+, Grades III–IV).
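The grading and dichotomization described above amount to a pair of threshold lookups. A minimal sketch follows; the function names are illustrative and not taken from the study, and the input is assumed to be the already-summed MIDAS score.

```python
def midas_grade(score: int) -> int:
    """Map a summed MIDAS score to Grade I-IV (returned as 1-4)."""
    if score < 0:
        raise ValueError("MIDAS score cannot be negative")
    if score <= 5:
        return 1  # Grade I: little or no disability
    if score <= 10:
        return 2  # Grade II: mild disability
    if score <= 20:
        return 3  # Grade III: moderate disability
    return 4      # Grade IV: severe disability


def disability_group(score: int) -> str:
    """Dichotomize as in the analysis: Grades I-II vs Grades III-IV."""
    return "low" if midas_grade(score) <= 2 else "moderate-high"
```

The cut between the two analysis groups therefore sits between scores of 10 and 11.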

Statistical methods

Analyses were completed to determine the relation between current migraine-related disability status (low vs. moderate-high disability), as documented from survey responses, and EHR-based estimates of the proportion of patients prescribed acute and preventive migraine treatments. The relations were estimated without any corrections, with correction for sampling weights, and then with correction for both sampling weight and non-response bias, using the methods described below. Each individual who completed a survey was assigned a sampling weight that was derived for each stratum as the inverse of the sampling fraction for that stratum, that is, 1.0 divided by the ratio of the number of respondents in a stratum to the number of individuals from the source population in that stratum (Table 1). Equivalently, the sampling weight is the size of the source population in a stratum divided by the number of respondents within that stratum, and it is influenced by both the proportion of individuals in a stratum who were sent surveys and the proportion that completed surveys.
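As a concrete illustration of the sampling weight defined above, the sketch below computes one weight per stratum from stratum counts. The counts in the usage note are made up for illustration and are not the Table 1 values; the function name is hypothetical.

```python
def sampling_weights(source_counts: dict, respondent_counts: dict) -> dict:
    """Per-stratum weight: source population size divided by respondents.

    This equals the inverse of the sampling fraction (respondents / source).
    """
    weights = {}
    for stratum, n_respondents in respondent_counts.items():
        if n_respondents <= 0:
            raise ValueError(f"stratum {stratum} has no respondents")
        weights[stratum] = source_counts[stratum] / n_respondents
    return weights
```

With illustrative counts, `sampling_weights({"A": 1000, "E": 50000}, {"A": 100, "E": 250})` gives a weight of 10.0 for stratum A and 200.0 for stratum E, so each respondent in the under-sampled stratum stands in for many more source patients.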

To correct for non-response bias we estimated response propensity scores using standard logistic regression models, where the dependent variable was response status (i.e. response = 1, non-response = 0) to the survey [32,33,34,35]. Independent variables were derived from EHR data in the 12-months before the patient response on demographics, migraine comorbidities, and migraine related variables specific to diagnoses, use of care, and medication orders. All analyses were stratified by sampling strata. The response propensity for each individual was estimated from the final model, along with the prediction error. The final weight (i.e. fully adjusted) that was assigned to each respondent was the product of the inverse of the strata-specific sampling fraction and the inverse of the individual predicted response propensity. The non-response bias corrected measures were estimated using weighted outcomes among respondents, and standard errors (SE) were estimated from bootstrapping with 1000 iterations [36].
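The fully adjusted weight and the weighted estimate described above can be sketched as follows. This is a simplified illustration rather than the authors' SAS implementation: response propensities are taken as given (for example, from a logistic model fitted elsewhere), and the bootstrap resamples respondents while holding their weights fixed, which is an assumption because the text does not state how the resampling was structured.

```python
import random


def final_weight(sampling_weight: float, propensity: float) -> float:
    """Inverse sampling fraction times inverse response propensity."""
    if not 0.0 < propensity <= 1.0:
        raise ValueError("propensity must be in (0, 1]")
    return sampling_weight / propensity


def weighted_proportion(outcomes, weights) -> float:
    """Weighted share of respondents with outcome == 1 (e.g., has an order)."""
    total = sum(weights)
    return sum(w for y, w in zip(outcomes, weights) if y) / total


def bootstrap_se(outcomes, weights, iterations=1000, seed=0) -> float:
    """Standard error of the weighted proportion from respondent resampling."""
    rng = random.Random(seed)
    n = len(outcomes)
    estimates = []
    for _ in range(iterations):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        estimates.append(
            weighted_proportion([outcomes[i] for i in idx],
                                [weights[i] for i in idx])
        )
    mean = sum(estimates) / iterations
    variance = sum((e - mean) ** 2 for e in estimates) / (iterations - 1)
    return variance ** 0.5
```

A respondent with a low predicted response propensity receives a larger final weight, which is how under-represented patient profiles are scaled back up toward their share of the source population.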

Analyses were completed in three steps. First, we describe demographics and co-morbidities of the total source population along with that of survey respondents and non-respondents. Second, using EHR data from respondents only, we derived sampling weighted estimates for six demographics and comorbidities. We then corrected for non-response bias. These analyses were completed to validate that the adjusted estimates for all survey participants were similar to proportions in the total source population (Table 2). The proportion difference between respondents and non-respondents was assessed using the Chi-square test, and the proportion difference of estimates in the last two columns in Table 2 (i.e., estimated proportions corrected for sampling weights and sampling weights + response bias) versus the source population were tested using a proportion test for partially overlapped samples [37]. The same analyses were then completed for respondents who had EHR diagnosis of migraine from strata A–C to estimate the proportion who were prescribed an acute or preventive medication (Table 3).

Table 2 Demographic features and clinical diagnoses percentages by survey response status and without and with corrections for Sampling Weight and for Sampling Weight and Response Bias
Table 3 Diagnosed migraine patients with a migraine specific prescription order 12-months before their completed survey

Respondents from strata D and E were excluded from the latter analysis because the focus was on patients with EHR documentation of care for migraine. Treatment status was estimated as the proportion of respondents prescribed acute and preventive treatments in the 12-month period before the web-survey was completed. A final analysis was then completed using data from respondents in strata A–D who met migraine criteria to determine the relation of MIDAS Grade and migraine treatment status. Estimates were stratified by those who had MIDAS Grades of I–II versus III–IV (Table 4), where a binomial test was applied for each row in Table 4 to compare the corrected proportions with the uncorrected proportions in the last two columns. The null hypothesis for each test was that the corrected proportion would not differ from the uncorrected proportion (taken as the expected value) in Table 4. The binomial tests in these comparisons may over-estimate the significance of differences, given that the assumption of independence might not hold after inverse propensity correction and that the variance of the corrected estimate is likely to be underestimated. In addition, we accounted for multiple comparisons for each acute and preventive medication using the Bonferroni correction (i.e. alpha = 0.05/4) (last two columns in Table 4).
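The binomial comparison with a Bonferroni-adjusted threshold described above can be sketched as follows. This is an illustrative exact two-sided test, not the procedure run in SAS: the text does not say how weighted estimates were converted to counts, so the sketch simply takes an integer count `k` out of `n` and an expected proportion `p0` as inputs, and all function names are hypothetical.

```python
from math import exp, lgamma, log


def binom_pmf(k: int, n: int, p: float) -> float:
    """Binomial probability of k successes in n trials (log-space for large n)."""
    if p == 0.0:
        return 1.0 if k == 0 else 0.0
    if p == 1.0:
        return 1.0 if k == n else 0.0
    log_coef = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coef + k * log(p) + (n - k) * log(1.0 - p))


def binom_test_two_sided(k: int, n: int, p0: float) -> float:
    """Exact two-sided p-value: total probability of outcomes no more
    likely than the observed count under the expected proportion p0."""
    cutoff = binom_pmf(k, n, p0) * (1.0 + 1e-7)  # tolerance for float ties
    total = 0.0
    for i in range(n + 1):
        pm = binom_pmf(i, n, p0)
        if pm <= cutoff:
            total += pm
    return min(1.0, total)


def significant_after_bonferroni(p_value: float, n_tests: int = 4,
                                 alpha: float = 0.05) -> bool:
    """Compare a p-value with the Bonferroni threshold alpha / n_tests."""
    return p_value < alpha / n_tests
```

With the default of four tests the threshold is 0.0125, matching the alpha = 0.05/4 stated above. As the text cautions, a test of this kind treats observations as independent, which may not hold for inverse-propensity-weighted estimates.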

Table 4 Survey diagnosed migraine patients by MIDAS grade with prescription orders 12-months before their completed survey

Analyses were performed in SAS (v9.4, SAS Institute Inc., Cary, NC). All statistical tests were two-sided, with a p-value of less than 0.05 considered the cut-off for statistical significance. We used the SAS PROC SURVEYFREQ procedure to account for the weights.


Figure 1 and Table 1 describe the source population of eligible primary care patients and the number and percent of patients assigned to the five strata or sampling groups. A total of 972,535 patients met eligibility criteria, having at least one episode of care for any reason in the 12 months before September 13, 2018 (Table 1); 28,268 PC patients were randomly selected from the five strata and invited to participate in the web-survey, of whom 2305 (8.2%) responded. Response rates varied from 17.6% for stratum A to 5.2% for stratum D (Table 1). Results are first summarized for the total population of adult PC patients on demographics and diagnoses (Table 2) and then for patients with diagnosed migraine on the proportion with orders for acute and preventive treatments (Table 3). Tables 2 and 3 summarize estimates for the total relevant source population with estimates corrected for sampling weight and non-response bias. Table 4 describes the relation between MIDAS score derived from the survey and migraine prescription order status in the previous year without and with corrections for sampling weight and non-response bias.

Demographic and selected diagnostic features

The stratified random sample of survey respondents differed from the total source population on all demographic variables and all diagnostic variables summarized in Table 2. Separately, compared to non-respondents, respondents differed significantly on almost all of the demographic and diagnostic variables.

Corrected estimates for demographic and EHR diagnosis

Correction for sampling weight (Table 2) reduced differences in estimates for respondent EHR variables compared to the source population for all demographic and EHR diagnostic variables, but most of these variables were still significantly different from the source population except depression, autoimmune, neurologic and cerebrovascular conditions. After correcting for both sampling weight and non-response bias (Table 2, the rightmost column), estimates of distributions by demographic factors and clinical diagnoses were very similar and none of the comparisons to the total source population were significantly different except for one level in marital status (i.e. other/unknown).

Corrected estimates for acute and preventive treatments orders

We estimated the proportion of patients who were prescribed an acute treatment for migraine and, separately, the proportion prescribed a preventive treatment for migraine (Table 3). Though parallel to the analysis in Table 2, Table 3 includes data on patients from strata A, B, and C specific to migraine prescription orders. Patients from strata D and E were excluded because they did not have EHR documentation of migraine or a prescription treatment order for migraine. Compared to non-respondents, respondents were considerably more likely to have a prescription order in their EHR in the year before the survey for either an acute (33.3% vs. 20.4%, p value < 0.001) or a preventive medication (17.1% vs. 11.2%, p value < 0.001).

When corrected for sampling weights, the estimated proportion of survey respondents with an acute prescription treatment order was substantially greater than the source population estimate (37.2% vs. 26.4%) and greater than the uncorrected estimate (33.3%). The sampling weight corrected proportion with a preventive treatment order was also significantly greater than the source population proportion estimate (16.8% vs. 12.0%).

When we added a correction for non-response bias (Table 3, rightmost column), estimates compared to the source population dramatically improved (Table 3, leftmost column) for acute (26.2% vs. 26.4%) and preventive treatments (11.6% vs. 12.0%). None of the non-response bias corrected estimates for the overall acute and preventive medication orders or for medication specific orders were significantly different from those of the source population.

Corrected estimates for acute and preventive treatment orders by MIDAS grade

Survey data from respondents in strata A-D were used to understand the relation of MIDAS Grade and prescription medication orders. Among the 1719 survey respondents that met ICHD criteria for migraine, 1520 (88%) completed the MIDAS questionnaire (Additional file 1: Table S2). Completion rates for the MIDAS questionnaires varied by strata from 92.7% for stratum B to 81.0% for stratum D (Additional file 1: Table S2).

Patients with MIDAS Grades III–IV had more prescription orders for both acute and preventive treatments overall and for specific treatment classes than those with MIDAS Grades I–II (Table 4). Among those in MIDAS Grade I–II, the adjustment for sampling weights increased the estimated proportion that were prescribed an acute medication but had little or no effect on the estimated proportion with a preventive treatment order. Among those with MIDAS Grades III–IV, the sampling fraction corrected estimate was unchanged for the proportion prescribed an acute treatment and was lower and improved for the proportion prescribed a preventive treatment (Table 4).

When correcting for both sampling weight and response bias the estimated proportions with an acute or a preventive treatment were significantly lower than the uncorrected estimates for both the MIDAS Grades I–II and III–IV groups (Table 4, last two columns). The differences were larger, however, for patients in the MIDAS Grade III–IV group where the corrected estimates were substantially lower than the uncorrected estimates. MIDAS Grade III–IV patients were significantly more likely than MIDAS Grade I–II patients to have orders for acute treatment, for anticonvulsants (i.e., both 1–2 and 3+ orders), and for 3+ orders of anti-depressant preventive treatments. Multiple comparison adjusted testing revealed that each acute treatment was more likely to be prescribed to MIDAS Grade III–IV patients than to MIDAS Grade I–II patients, but no differences were observed by MIDAS Grade for preventive medication orders.


Assessing the value of care for many diseases can be challenging if patient reported information on disease onset, progression, severity, and other factors is essential to evaluating quality of care. This is especially true for conditions like migraine and the diversity of other chronic diseases with episodic manifestations for which there are no objective clinical or laboratory measures of disease status, severity or control [38]. PROs and, more generally, self-reported experience is central to evaluating population level care gaps. We consider the limitations of using administrative claims or EHR data (e.g., medication prescriptions) versus self-reported data (e.g., MIDAS) to understand quality of care for migraine and the unique advantages that come from combining these two data sources.

Population-based surveys are often used to understand the epidemiology of disease and related use of care. These types of surveys usually include clinically validated questionnaires to standardize detection of active disease and the measurement of disease severity, whether or not an individual sought care for the specific disease, and whether or not it was diagnosed. These approaches have been particularly useful for migraine as a substantial minority of people with migraine do not seek medical care and may not receive a medical diagnosis [6, 39]. Quality of care gaps can also be quantified with self-reported information on the experience of care. But survey data of health conditions are often inherently limited because only a minority of those invited will participate. Moreover, the likelihood of participation is usually related to having the disease of interest, to the severity of disease, and to the use of care [15,16,17]. Self-report is also prone to selective recall and other types of biases that may yield a distorted understanding of the relation between disease severity and utilization of care. The results of this study reveal response biases (Table 2, respondents vs non-respondents) that are consistent with previous surveys where females, non-Hispanics, Whites, and those with a greater disease burden are more likely to participate [15,16,17,18].

EHR or medical claims and pharmacy data reveal utilization of those who seek care for a specific disease. But ascertainment is often incomplete because many diseases are difficult to detect using diagnostic codes. Migraine is often assigned a non-specific diagnostic code (e.g., Headache NOS) [40]. In our study, the source population headache NOS group accounted for 48% (i.e., 69,704/144,201, Table 1) of primary care patients with a primary headache diagnosis [41]. The survey data in this report indicated that a substantial proportion of those with headache NOS have moderate to severe migraine, confirming prior work [41]. In addition, for migraine, in particular, survey data indicate that a substantial minority report never having sought care for migraine and, accordingly, would never be identified from EHR data [4, 6]. Finally, EHR data lack information on disease onset, severity, progression, and other meaningful outcomes for conditions like migraine deemed essential to identifying care gaps [42].

The complementary strengths and weaknesses of survey and EHR data offer a synergistic and powerful means of gaining a comprehensive and accurate assessment of disease burden and patterns of care within a health system. The synergy comes from the way in which EHR data, available on all patients, can be used to eliminate problems with recall bias and, importantly, overcome selection bias challenges from survey non-response. We specifically focused on migraine for this study because it is representative of many other symptomatic and burdensome diseases common to adolescents and working age populations where people often do not seek care, do not receive a diagnosis when they do seek care, or are under-treated [3,4,5,6,7,8].

We validated the method of adjustment for non-response bias using known source population data that included demographics, diagnosed comorbid diseases, and prescription orders for migraine or headache. Even with the relatively low response rate in each stratum, the corrected estimates using data from survey respondents were similar to the source population estimates (Tables 2, 3). Patients were more likely to respond if they had at least one acute or preventive treatment order, both overall and for each of the medication specific classes. The response bias differences are in the expected direction given what is known about selective participation of those with more severe disease and when the purpose of the survey is known in advance.

Direct comparisons to previous studies on the use of prescription medications are difficult. Studies differ substantially in the source of data and in the factors that directly influence the estimated proportion of patients using a prescription medication [24, 25, 43, 44]. Studies also differ in the selection criteria used for study participation. Some focus on newly diagnosed patients while others enrolled patients from particular care settings such as the emergency department. Look back periods for assessing use of prescription medications also widely varied as did the means of assessing treatments used (e.g., pharmacy claims data or survey self-report).

The American Migraine Prevalence and Prevention (AMPP) Study offers the most relevant comparative survey data on prescription medications used by those with migraine [5]. AMPP had a higher response rate than more recent web surveys conducted in the US. The AMPP Study is a longitudinal national survey that used the same validated diagnostic screener as our study to identify adults meeting ICHD diagnostic criteria for migraine. Among all AMPP survey respondents who met active ICHD migraine criteria, 20.1% reported current use of an acute prescription treatment and 13% reported current use of a preventive medication for migraine. The percentage for current use of preventive treatment excludes coincidental use for other health problems, as in our study. However, our analysis of prescription medication for migraine was confined to primary care patients with a physician diagnosis of migraine, whereas the AMPP study included participants whether or not they reported a medical diagnosis and whether or not they had a recent episode of medical care for migraine [42]. Comparable numbers for the AMPP Study sample can be derived by limiting the denominator to the 56.2% of those meeting ICHD criteria for migraine who self-reported having received a medical diagnosis of migraine. Of these, 35.9% were using an acute prescription treatment and 23.0% were using a preventive treatment. By comparison, among those with a medical diagnosis of migraine in our study (Table 3), the corrected comparable estimates are considerably lower for acute prescription medications (26.2% vs. 35.9% in AMPP) and preventive medications (11.6% vs. 23.0% in AMPP). The substantially lower estimates from the Sutter Health MSS data may indicate that the actual use of prescription treatments by people with migraine is overestimated in population surveys. This could be explained by selective participation of those with more frequent, severe, and disabling migraine who have been prescribed a treatment.
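The denominator restriction for the AMPP figures can be checked with a short calculation. This is a sketch under an assumption that is not stated in the text: that essentially all respondents using a prescription treatment are among those who reported a medical diagnosis. The function name `restrict_denominator` is hypothetical.

```python
def restrict_denominator(pct_of_all, subgroup_share_pct):
    """Re-express a percentage of the whole sample as a percentage of a subgroup.

    Assumes every person counted in the numerator belongs to the subgroup.
    """
    return 100.0 * pct_of_all / subgroup_share_pct


# Share of AMPP respondents meeting ICHD criteria who reported a medical diagnosis (%).
DIAGNOSED_SHARE = 56.2

acute = restrict_denominator(20.1, DIAGNOSED_SHARE)
preventive = restrict_denominator(13.0, DIAGNOSED_SHARE)
print(f"acute: {acute:.1f}%, preventive: {preventive:.1f}%")
# prints "acute: 35.8%, preventive: 23.1%"
```

Under that assumption the rescaled values are within about 0.1 percentage points of the 35.9% and 23.0% quoted above. The small gaps may reflect rounding of the published inputs or a derivation from underlying counts.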

Medication orders for acute and preventive migraine treatments were more common among those with MIDAS Grade III–IV than among those with MIDAS Grade I–II, particularly for preventive treatment (Table 4). This trend was expected and is consistent with a previous study [44]. The differences between the sample-weighting-corrected estimates and the fully corrected estimates are striking. The fully corrected estimates for MIDAS Grade III–IV, in particular, are substantially lower than either the uncorrected or the sample-weighting-corrected estimates. Moreover, after accounting for multiple comparisons, no differences were observed by MIDAS Grade for preventive medication orders. This finding suggests that individuals with MIDAS Grade III and IV migraine are more likely to respond to surveys than those with Grade I and II migraine. Although the statistical test (i.e., the binomial test in Tables 3, 4) may overestimate the statistical significance of the difference between the fully adjusted and unadjusted proportions, the magnitude of the difference for each treatment suggests that previous population surveys may substantially overestimate the use of acute and preventive treatments, especially for patients with MIDAS Grade III and IV migraine [44].
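To make the idea of comparing proportions across disability groups with a multiple-comparison adjustment concrete, here is a minimal sketch. It swaps in a pooled two-proportion z-test and a Bonferroni adjustment, which are assumptions for illustration and not necessarily the tests used in Tables 3 and 4. The counts are hypothetical and are not study data.

```python
import math


def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical (orders, patients) counts for two disability groups; not study data.
comparisons = {
    "acute": (40, 100, 60, 100),
    "preventive": (45, 100, 50, 100),
}
raw = [two_proportion_p_value(*counts) for counts in comparisons.values()]
for name, p_raw, p_adj in zip(comparisons, raw, bonferroni(raw)):
    print(f"{name}: raw p = {p_raw:.4f}, Bonferroni-adjusted p = {p_adj:.4f}")
```

This unweighted test treats each observation as independent and equally weighted. Proportions estimated with inverse propensity weights generally need a variance estimate that reflects the weights, for example one obtained by bootstrap resampling, so the sketch is best read as an illustration of the adjustment step only.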


Combining survey and EHR data has many potential applications to evaluating quality of care, even when survey response rates are low. Because EHR data are available on all individuals whether or not they respond to a survey, statistical methods can adjust for non-response bias in ways that traditional approaches to motivating participation (e.g., gift cards or other incentives) cannot [15, 16]. While combining survey and EHR data opens many possibilities to gain a richer population level understanding of the quality of care that patients receive, the same approach may offer a more accurate means of understanding the epidemiology of a diverse range of health problems. Additional research is required to develop methods for routine internal and external validation and to better understand the conditions under which substantial non-response bias may persist even after adjusting for propensity to respond.

Finally, we note that patient surveys and EHR analyses are often used as alternative methods to study disease burden and health care delivery. By collecting surveys in patient samples derived from an integrated delivery system, these two approaches can complement each other in many ways. While technically feasible and promising, linking survey data to EHR data raises issues of patient privacy, informed consent, and the development of strategies for optimizing survey participation and representativeness. This paper illustrates the promise of the method and one approach to addressing selection bias.

Availability of data and materials

Analysis code can be requested and shared upon approval by the Sutter IRB and Information Security Office. Due to HIPAA, patient EHR data cannot be shared with researchers outside of Sutter.



Abbreviations

Electronic health records

Migraine Disability Assessment Scale

Socio-economic status

Primary care

Epic Care

International classification of diseases

Migraine probability algorithm

MPA based on five years of longitudinal EHR data

MPA based on most recent two years of longitudinal EHR data

Not otherwise specified

American migraine study

American migraine prevalence and prevention

International classification of headache disorders

Allodynia symptom checklist

Over the counter

Migraine treatment optimization questionnaire

Generalized anxiety disorder questionnaire

Patient health questionnaire

Post-traumatic stress disorder

Migraine signature study

References

  1. Donelan K, Barreto EA, Sossong S et al (2019) Patient and clinician experiences with telehealth for patient follow-up care. Am J Manag Care 25(1):40–44

  2. Brizuela V, Leslie HH, Sharma J, Langer A, Tunçalp Ö (2019) Measuring quality of care for all women and newborns: how do we know if we are doing it right? A review of facility assessment tools. Lancet Glob Health 7(5):e624–e632.

  3. Bigal ME, Kolodner KB, Lafata JE, Leotta C, Lipton RB (2006) Patterns of medical diagnosis and treatment of migraine and probable migraine in a health plan. Cephalalgia 26(1):43–49.

  4. Lipton RB, Scher AI, Kolodner K, Liberman J, Steiner TJ, Stewart WF (2002) Migraine in the United States: epidemiology and patterns of health care use. Neurology 58(6):885–894.

  5. Diamond S, Bigal ME, Silberstein S, Loder E, Reed M, Lipton RB (2007) Patterns of diagnosis and acute and preventive treatment for migraine in the United States: results from the American Migraine Prevalence and Prevention study. Headache 47(3):355–363.

  6. Lipton RB, Scher AI, Steiner TJ et al (2003) Patterns of health care utilization for migraine in England and in the United States. Neurology 60(3):441–448.

  7. Lipton RB, Munjal S, Buse DC et al (2019) Unmet acute treatment needs from the 2017 migraine in America symptoms and treatment study. Headache 59(8):1310–1323.

  8. Stewart WF, Ricci JA, Chee E, Morganstein D, Lipton R (2003) Lost productive time and cost due to common pain conditions in the US workforce. JAMA 290(18):2443–2454.

  9. Stewart WF, Lipton RB, Celentano DD, Reed ML (1992) Prevalence of migraine headache in the United States. Relation to age, income, race, and other sociodemographic factors. JAMA 267(1):64–69

  10. Hu XH, Markson LE, Lipton RB, Stewart WF, Berger ML (1999) Burden of migraine in the United States: disability and economic costs. Arch Intern Med 159(8):813–818.

  11. Stewart WF, Lipton RB, Simon D (1996) Work-related disability: results from the American migraine study. Cephalalgia 16(4):231–8.

  12. Lipton RB, Bigal ME, Diamond M et al (2007) Migraine prevalence, disease burden, and the need for preventive therapy. Neurology 68(5):343–349.

  13. Smitherman TA, Burch R, Sheikh H, Loder E (2013) The prevalence, impact, and treatment of migraine and severe headaches in the United States: a review of statistics from national surveillance studies. Headache 53(3):427–436.

  14. Woolley JM, Bonafede MM, Maiese BA, Lenz RA (2017) Migraine prophylaxis and acute treatment patterns among commercially insured patients in the United States. Headache 57(9):1399–1408.

  15. Groves RM (2006) Nonresponse rates and nonresponse bias in household surveys. Public Opin Q 70(5):645–675.

  16. Groves R, Peytcheva E (2008) The impact of nonresponse rates on nonresponse bias: a meta-analysis. Public Opin Q 72:167–189.

  17. Brick J, Williams D (2013) Explaining rising nonresponse rates in cross-sectional surveys. Ann Am Acad Political Soc Sci 645:36–59.

  18. Sahlqvist S, Song Y, Bull F et al (2011) Effect of questionnaire length, personalisation and reminder type on response rate to a complex postal survey: randomised controlled trial. BMC Med Res Methodol 11:62.

  19. Stull DE, Leidy NK, Parasuraman B, Chassany O (2009) Optimal recall periods for patient-reported outcomes: challenges and potential solutions. Curr Med Res Opin 25(4):929–942.

  20. Bradburn NM, Rips LJ, Shevell SK (1987) Answering autobiographical questions: the impact of memory and inference on surveys. Science 236(4798):157–161.

  21. Schmier JK, Halpern MT (2004) Patient recall and recall bias of health state and health status. Expert Rev Pharmacoecon Outcomes Res 4(2):159–163.

  22. Shah NR, Hirsch AG, Zacker C, Taylor S, Wood GC, Stewart WF (2009) Factors associated with first-fill adherence rates for diabetic medications: a cohort study. J Gen Intern Med 24(2):233–237.

  23. Shah NR, Hirsch AG, Zacker C et al (2009) Predictors of first-fill adherence for patients with hypertension. Am J Hypertens 22(4):392–396.

  24. Polson M, Williams TD, Speicher LC, Mwamburi M, Staats PS, Tenaglia AT (2020) Concomitant medical conditions and total cost of care in patients with migraine: a real-world claims analysis. Am J Manag Care 26(1 Suppl):S3–S7.

  25. Bonafede M, McMorrow D, Noxon V, Desai P, Sapra S, Silberstein S (2020) Care among migraine patients in a commercially insured population. Neurol Ther 9(1):93–103.

  26. Lafata JE, Tunceli O, Cerghet M, Sharma KP, Lipton RB (2010) The use of migraine preventive medications among patients with and without migraine headaches. Cephalalgia 30(1):97–104.

  27. Pressman A, Jacobson A, Eguilos R et al (2016) Prevalence of migraine in a diverse community–electronic methods for migraine ascertainment in a large integrated health plan. Cephalalgia 36(4):325–334.

  28. Stewart WF, Lipton RB, Kolodner KB, Sawyer J, Lee C, Liberman JN (2000) Validity of the Migraine Disability Assessment (MIDAS) score in comparison to a diary-based measure in a population sample of migraine sufferers. Pain 88(1):41–52.

  29. Stewart WF, Lipton RB, Kolodner K, Liberman J, Sawyer J (1999) Reliability of the migraine disability assessment score in a population-based sample of headache sufferers. Cephalalgia 19(2):107–14.

  30. Headache Classification Committee of the International Headache Society (IHS) (2018) The international classification of headache disorders. Cephalalgia 38(1):1–211.

  31. Stewart WF, Lipton RB, Liberman J (1996) Variation in migraine prevalence by race. Neurology 47(1):52–59.

  32. Robins JM, Rotnitzky A, Zhao LP (1994) Estimation of regression coefficients when some regressors are not always observed. J Am Stat Assoc 89(427):846–866.

  33. Hernán MA, Robins JM (2006) Estimating causal effects from epidemiological data. J Epidemiol Community Health 60(7):578–586.

  34. Austin PC, Stuart EA (2015) Moving towards best practice when using inverse probability of treatment weighting (IPTW) using the propensity score to estimate causal treatment effects in observational studies. Stat Med 34(28):3661–3679.

  35. Rosenbaum PR, Rubin DB (1983) The central role of the propensity score in observational studies for causal effects. Biometrika 70(1):41–55.

  36. Efron B, Tibshirani R (1986) Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy. Stat Sci 1(1):54–75

  37. Choi SC, Stablein DM (1982) Practical tests for comparing two proportions with incomplete data. J R Stat Soc Ser C (Appl Stat) 31(3):256–262

  38. Haut SR, Bigal ME, Lipton RB (2006) Chronic disorders with episodic manifestations: focus on epilepsy and migraine. Lancet Neurol 5(2):148–157.

  39. Lipton RB, Munjal S, Alam A et al (2018) Migraine in America Symptoms and Treatment (MAST) study: baseline study methods, treatment patterns, and gender differences. Headache 58(9):1408–1426.

  40. Elston Lafata J, Moon C, Leotta C, Kolodner K, Poisson L, Lipton RB (2004) The medical care utilization and costs associated with migraine headache. J Gen Intern Med 19(10):1005–1012.

  41. Kolodner K, Lipton RB, Lafata JE et al (2004) Pharmacy and medical claims data identified migraine sufferers with high specificity but modest sensitivity. J Clin Epidemiol 57(9):962–972.

  42. Lipton RB, Serrano D, Holland S, Fanning KM, Reed ML, Buse DC (2013) Barriers to the diagnosis and treatment of migraine: effects of sex, income, and headache features. Headache 53(1):81–92.

  43. Young NP, Philpot LM, Vierkant RA et al (2019) Episodic and chronic migraine in primary care. Headache 59(7):1042–1051.

  44. Silberstein SD, Lee L, Gandhi K, Fitzgerald T, Bell J, Cohen JM (2018) Health care resource utilization and migraine disability along the migraine continuum among patients treated for migraine. Headache 58(10):1579–1592.



Michelle Goodreau, Alexandra Restall, Zijun Shen, Alex Scott.


Financial support for this manuscript was provided by Amgen Inc.

Author information

Contributions



WFS developed the concept for the manuscript, the study design, and the analysis plan and was a major contributor to the writing of the manuscript. XY performed all analyses and co-led the reframing of the study design, the writing of the statistical methods, and the editing of the manuscript. AP contributed as a study investigator and worked on the study conception, study design, data interpretation, manuscript review, and revisions. SV and AJ created an analytical dataset from Sutter Health EHR and survey data to define the cohort, treatment patterns, MPA, and MIDAS score. SV and AJ also reviewed the manuscript and provided details of the analytical design for its writing. VC contributed to the conception and design of the study and to the analysis and interpretation of data. DCB contributed to the conception and design and to revising the manuscript for intellectual content. RBL contributed as a study investigator and worked on the study design, data interpretation, manuscript review, and revisions. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Alice Pressman.

Ethics declarations

Ethics approval and consent to participate

This research was approved by the Sutter Health Institutional Review Board (SHIRB). Consent from participants was received prior to their participation in the research. A waiver of HIPAA authorization was obtained for recruitment purposes.

Consent for publication

Not applicable.

Competing interests

The Migraine Signature Study was funded by a research grant to Sutter Health and the Albert Einstein College of Medicine from Amgen. Walter F. Stewart has served as a consultant to Promius/Dr. Reddy and Allergan. Dawn C. Buse has served as a consultant to Amgen/Novartis, Allergan, Biohaven, Eli Lilly, Promius/Dr. Reddy’s, and Teva Pharmaceuticals. She is on the editorial board of Current Pain and Headache Reports. Richard B. Lipton serves on the editorial board of Neurology, as senior advisor to Headache, and as associate editor of Cephalalgia; he holds stock options in Biohaven Holdings and CtrlM Health. He receives research support from the NIH and FDA. He serves as a consultant or advisory board member for, or has received honoraria or research support from, Abbvie (Allergan), Amgen, Biohaven, Dr. Reddy’s (Promius), Electrocore, Eli Lilly, eNeura, Equinox, GlaxoSmithKline, Grifols, Lundbeck (Alder), Merck, Pernix, and Teva. He receives royalties from Wolff’s Headache 7th and 8th Edition, Oxford University Press, 2009, Wiley and Informa. Xiaowei Yan has no conflict of interest to claim. Alice Pressman has no conflict of interest to claim. Alice Jacobson has no conflict of interest to claim. Shruti Vaidya has no conflict of interest to claim. Victoria Chia is an employee of, and shareholder in, Amgen Inc.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Table S1. Survey data collected from patients in Strata A-D and Stratum E*. Table S2. MIDAS Grade distribution by sampling strata.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Stewart, W.F., Yan, X., Pressman, A. et al. Combining patient reported outcomes and EHR data to understand population level treatment needs: correcting for selection bias in the migraine signature study. J Patient Rep Outcomes 5, 132 (2021).
