We assessed utilities for IFN-α 2b toxicities with questionnaires instead of computerised interviews, and we used a one-item utility assessment instead of iterative probability variation. Nevertheless, patients seemed to manage the difficult SG task well: the proportion of patients who misordered scenarios was very similar to that in the reference study (11.3% vs. 11.2%), although this does not imply that the remaining patients in either study necessarily understood the SG task correctly. The mean utilities were also quite similar. This convergence was all the more surprising given further differences between the two studies: the reference study was conducted in the U.S. rather than Germany and more than a decade earlier. However, median utility values were less similar, and in our study a higher proportion of patients felt upset by the study and a lower proportion considered their responses informative and well thought through. The latter may indicate that patients indeed feel more comfortable with the SG when it is conducted in a face-to-face interview. Alternatively, the difference could be due to social desirability bias, with people being more hesitant to criticise the study in a face-to-face interview than in a questionnaire. The high proportion of participants in both studies who stated that they felt upset by the questions may point to a general problem with the SG approach, which asks respondents to hypothetically trade the risk of instant death against impaired health. Another reason may be that participants were confronted with information on side effects of melanoma treatment and the possibility of melanoma recurrence; this may have been perceived as threatening, even though our participants had experienced low-risk melanoma only.
In this study, 47% of patients were female, which reflects the gender distribution among patients with melanoma in Germany [10]. However, while the median age at melanoma diagnosis is 64 years, our sample was younger, with a median age of 53 years [10].
As a limitation, we could only compare our results with the in-person computerised SG interview used in the reference study, but not with other assessment methods such as computerised self-completion SG without interviewers or non-computerised in-person interviews. We also do not know whether participants in either our study or the reference study understood the SG task correctly, even if they had no missing values and did not misorder scenarios.
It should also be noted that in this study, a utility of 1 does not correspond to full health but to the health state without treatment side effects or recurrence of melanoma as described in the scenarios. This is because the second option (the pill) is described as preventing the respective scenario, but not as preventing any other health impairment. Utilities found in this study are therefore not comparable with utilities ranging from “dead” to “perfect health” without adjustment [11]. In addition, we did not allow scenarios to be rated as worse than death, which would have led to negative utility values. As both points also apply to the reference study, they do not impair the comparison between the two approaches that this manuscript targets, but they should be considered in future uses of the paper-based one-item approach.
In the reference study, patients were presented with both the chance of survival (p) and the risk of death (1−p) of the treatment. In our study, we had to decide whether to ask for p or 1−p because patients were asked to provide a specific number rather than to choose between two options. We chose to ask for 1−p (risk of death) for two reasons. First, this means that both options are framed in the same negative direction (inconvenience, side effects, symptoms vs. risk of death). Second, we felt that this allowed for a more comprehensible SG question. However, had we asked for the minimum chance of survival instead, patients might have been willing to accept a riskier treatment, as positive framing is associated with the treatment being perceived as less harmful [12]; this would have resulted in lower utility values.
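The relation between the elicited risk of death and the resulting utility can be illustrated with a minimal sketch. It assumes the standard SG indifference relation, in which the utility of a scenario equals the chance of survival p at which the respondent is indifferent, so that a stated maximum acceptable risk of death of 1−p yields a utility of p. The function name and the percent-based input are hypothetical choices made for illustration and are not taken from the study materials.

```python
def utility_from_death_risk(max_death_risk_percent: float) -> float:
    """Convert a stated maximum acceptable risk of death (in percent)
    into a standard gamble utility, assuming utility = p = 1 - (1 - p).

    Hypothetical helper for illustration only.
    """
    # Responses outside 0..100% cannot be interpreted as a probability.
    if not 0.0 <= max_death_risk_percent <= 100.0:
        raise ValueError("risk of death must lie between 0 and 100 percent")
    # Accepting a higher risk of death implies a lower utility of the scenario.
    return 1.0 - max_death_risk_percent / 100.0


if __name__ == "__main__":
    # Illustrative inputs, not study data.
    for risk in (0.0, 5.0, 50.0, 100.0):
        print(risk, utility_from_death_risk(risk))
```

Under this assumption, utilities are bounded between 0 and 1, which matches the restriction that scenarios could not be rated as worse than death. A respondent who accepts a riskier treatment is assigned a lower utility for the scenario, which is the direction of the framing effect discussed above.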
In conclusion, the paper questionnaire-based one-item utility elicitation used in this study resulted in mean (but not median) utility values that were very similar to those from the computerised face-to-face interviews in the reference study [7]. It may be a feasible and cost-saving alternative in situations where interviews and/or computerised SG are difficult, for example if patients are to be reached by mail over a large geographic area or if patients are not computer-literate or do not have access to a computer. Thus, further research on the reliability and validity of this approach is warranted.