Paper review: Mental Health Outcomes in Transgender and Nonbinary Youths Receiving Gender-Affirming Care

When I express my disappointment on Reddit with the state of research on “gender-affirming” treatments for adolescents with gender dysphoria, I am at times given long lists of studies that purport to demonstrate the effectiveness of puberty blockers (PBs) and “gender-affirming” hormones (GAHs). These lists are so long that reviewing every study would amount to the literature review for a PhD. I don’t have that kind of time, but I want to share my assessment of a few of these articles as I take them on. This represents my current thinking and, as always, is subject to change.

One such list of studies recently referenced “Mental Health Outcomes in Transgender and Nonbinary Youths Receiving Gender-Affirming Care” by Tordoff, Wanta, Collin, et al., JAMA Network Open, Vol. 5, No. 2, 15 Feb 2022. This post is a review of that article, offered as an illustration of the weaknesses I have seen in this literature so far.

Findings  In this prospective cohort of 104 TNB youths aged 13 to 20 years, receipt of gender-affirming care, including puberty blockers and gender-affirming hormones, was associated with 60% lower odds of moderate or severe depression and 73% lower odds of suicidality over a 12-month follow-up.

The word “associated” is important. This indicates a statistical correlation, not a causal finding. Looking at the paper, we see that it is a “prospective” study in which outcomes of depression, anxiety, and suicidality were followed over time relative to the beginning of treatment at a gender clinic. The treatment was “receipt of gender-affirming care, including puberty blockers and gender-affirming hormones”. The key result:

By the end of the study, 69 youths (66.3%) had received PBs, GAHs, or both interventions, while 35 youths had not received either intervention (33.7%). After adjustment for temporal trends and potential confounders, we observed 60% lower odds of depression and 73% lower odds of suicidality among youths who had initiated PBs or GAHs compared with youths who had not.

Because the treatment group was not randomized, those who received puberty blockers and hormones may differ systematically from those who did not, and that difference could itself explain the gap in outcomes. The authors attempt to compensate for this through statistical adjustment. Income, race, sex, gender identity, and similar variables are all candidates for inclusion in the models.

A few issues arise on close inspection.

First, compared with an actual experiment, observational studies of this sort are vulnerable to confounding variables that were not thought of at the time of analysis, or on which no data was collected when the study was run. One commenter suggests physical activity, BMI, and similar as likely confounders; the paper itself suggests psychotropic medications as a potential confounder. None of these are included in the analysis, nor are any others that someone else might think up. In a randomized experiment, such variables, known or unknown, are balanced across the groups on average, so they cannot systematically bias the comparison.
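To make the confounding concern concrete, here is a minimal simulation sketch. Everything in it is an assumption of mine for illustration: the confounder (`support`, standing in for something like family support), the probabilities, and the sample size are made up and are not taken from the paper. The treatment is given no effect at all, yet the observational comparison still shows lower odds of depression among the treated.

```python
import random

def odds_ratio(rows):
    """Odds of depression among treated divided by odds among untreated."""
    a = sum(1 for t, y in rows if t and y)          # treated, depressed
    b = sum(1 for t, y in rows if t and not y)      # treated, not depressed
    c = sum(1 for t, y in rows if not t and y)      # untreated, depressed
    d = sum(1 for t, y in rows if not t and not y)  # untreated, not depressed
    return (a / b) / (c / d)

def simulate(n, randomized, seed=0):
    """Return the treated-vs-untreated odds ratio for a treatment with NO effect."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Hypothetical unmeasured confounder; never seen by the analyst.
        support = rng.random() < 0.5
        if randomized:
            treated = rng.random() < 0.5  # coin flip, unrelated to support
        else:
            # Assumed: supported youths are more likely to get treatment.
            treated = rng.random() < (0.8 if support else 0.3)
        # Outcome depends only on support, not on treatment.
        depressed = rng.random() < (0.3 if support else 0.6)
        rows.append((treated, depressed))
    return odds_ratio(rows)

observational_or = simulate(100_000, randomized=False)
randomized_or = simulate(100_000, randomized=True)
print(f"observational odds ratio: {observational_or:.2f}")
print(f"randomized odds ratio:    {randomized_or:.2f}")
```

Under these invented numbers the observational odds ratio comes out well below 1 even though the treatment does nothing, while the randomized one sits close to 1. This does not show that the paper's result is spurious; it only shows that a result of this shape can arise from an unmeasured confounder alone.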

Second, I will highlight the “adjustment for temporal trends” mentioned in the key result quoted above. At first I mistook this for adjusting for seasonal trends, such as a tendency for people to be more depressed in the winter. However, this is not the case. Temporal trends refer to the differences in outcome at the different follow-up times (initial visit, 3 months after, 6 months after, 12 months after). How the “adjustment” is done is not detailed in the article. This is of interest because the key findings only arise after this “adjustment”; there would be no paper if not for the difference it makes. The lack of explicit detail on the procedure is concerning. The names of the two primary modes of analysis (“Model 1” and “Model 2”) seem symptomatic of a search for significant results by modifying the analysis, rather than the more robust approach of pre-registering an analysis and sticking to it.

Other issues are given in the “Limitations” section of the paper:

Our findings should be interpreted in light of the following limitations. This was a clinical sample of TNB youths, and there was likely selection bias toward youths with supportive caregivers who had resources to access a gender-affirming care clinic. Family support and access to care are associated with protection against poor mental health outcomes, and thus actual rates of depression, anxiety, and suicidality in nonclinical samples of TNB youths may differ. Youths who are unable to access gender-affirming care owing to a lack of family support or resources require particular emphasis in future research and advocacy. Our sample also primarily included White and transmasculine youths, limiting the generalizability of our findings. In addition, the need to reapproach participants for consent and assent for the 12-month survey likely contributed to attrition at this time point. There may also be residual confounding because we were unable to include a variable reflecting receipt of psychotropic medications that could be associated with depression, anxiety, and self-harm and suicidal thought outcomes. Additionally, we used symptom-based measures of depression, anxiety, and suicidality; further studies should include diagnostic evaluations by mental health practitioners to track depression, anxiety, gender dysphoria, suicidal ideation, and suicide attempts during gender care.

The obvious bias in the selection of people first into the clinic and then into the treatments is the greatest weakness of this study, for reasons the researchers themselves describe.

I highlight the word “advocacy” because it indicates that the researchers are not disinterested observers, but rather already believe “gender-affirming” treatment of dysphoric youth is a righteous cause. Researcher bias is a significant concern; see Ioannidis’ seminal paper, “Why Most Published Research Findings Are False”. Teams that take a stance of advocacy rather than objectivity are more likely to choose methods that favor their preferred outcome.

They also indicate that there was attrition at the 12-month follow-up, meaning participants stopped responding to surveys. It is possible that those whose treatment did not lead them to feel happier declined to participate for fear of disappointing the (surely very friendly and helpful) clinic staff, which would make the remaining responses look better than those of the full group.
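The attrition worry can be sketched the same way. In this toy model, which is entirely my own assumption and not the study's data, the change in symptom score averages zero across everyone, but participants who got worse are assumed to be half as likely to answer the final survey. The response probabilities and score scale are hypothetical.

```python
import random
import statistics

def simulate_attrition(n=50_000, seed=1):
    """Return (mean change for everyone, mean change among survey responders)."""
    rng = random.Random(seed)
    all_changes, observed = [], []
    for _ in range(n):
        # Change in symptom score; negative means improvement.
        # Mean 0: on average nobody gets better or worse.
        change = rng.gauss(0, 5)
        all_changes.append(change)
        # Assumed response model: those who worsened respond less often.
        p_respond = 0.4 if change > 0 else 0.8
        if rng.random() < p_respond:
            observed.append(change)
    return statistics.mean(all_changes), statistics.mean(observed)

true_mean, observed_mean = simulate_attrition()
print(f"mean change, everyone:        {true_mean:+.2f}")
print(f"mean change, responders only: {observed_mean:+.2f}")
```

The responders-only average shows an apparent improvement that does not exist in the full group. Whether anything like this happened in the study is unknown; the point is only that differential dropout can produce it.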

Finally, as mentioned above, the study authors point out that use of psychiatric medications was not a variable they analyzed. Any benefit from antidepressants, for example (whether through placebo or some other effect), is therefore not accounted for.

The placebo effect hangs over this study in general. The placebo effect is the tendency for people to improve simply because they believe the treatment they receive is effective. This study’s results could be explained purely by placebo effect. The belief that “the gender clinic will help you” and that suppressing puberty and taking opposite-sex hormones will help is widespread among those seeking treatment for gender dysphoria at gender clinics—the self-selected population from which this study’s participants are drawn. In a sense, a placebo effect would mean that the treatment really is effective; but if it’s due to placebo, then the long-term consequences of suppressing puberty and taking opposite-sex hormones are hard to justify.

Another possibility unaccounted for by this study is simply regression to the mean. The study begins with a significant portion of participants experiencing “severe” depression, and the greatest effects are seen in those with the most severe self-reported depression. When a group starts out at one extreme of a distribution, the most likely thing to happen is for its scores to move toward the mean over time, and that is exactly what is described in the study, though it is couched as being very significant rather than completely expected. A proper accounting for this would require comparison to the baseline progression of depression, anxiety, and suicidality among similar, severely depressed youth in the general population.
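Regression to the mean is easy to reproduce in a few lines. The sketch below assumes a simple model of my own choosing: each person has a stable underlying level plus independent measurement-to-measurement noise, and only people scoring above a cutoff at baseline are followed. The scale, cutoff, and variances are invented, and no treatment of any kind is applied.

```python
import random
import statistics

def regression_to_mean(n=50_000, cutoff=15.0, seed=2):
    """Return (baseline mean, follow-up mean) for those above cutoff at baseline."""
    rng = random.Random(seed)
    baseline_sel, followup_sel = [], []
    for _ in range(n):
        trait = rng.gauss(10, 4)             # stable underlying severity
        baseline = trait + rng.gauss(0, 4)   # first measurement, with noise
        followup = trait + rng.gauss(0, 4)   # later measurement, fresh noise
        if baseline >= cutoff:               # enrol only the severe scorers
            baseline_sel.append(baseline)
            followup_sel.append(followup)
    return statistics.mean(baseline_sel), statistics.mean(followup_sel)

baseline_mean, followup_mean = regression_to_mean()
print(f"selected group at baseline:  {baseline_mean:.1f}")
print(f"same group at follow-up:     {followup_mean:.1f}")
```

The selected group scores noticeably lower at follow-up even though nothing was done to them, while still remaining above the population average of 10. This is why a comparison group with similar baseline severity matters.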

When I dig into this “gender-affirming care” research, I find so far that it is of low quality. First and foremost is the absence, as far as I have found, of experimental studies, which are obviously what is called for. The sample sizes in what I have read are small as well, and there is apparent bias on the part of at least some researchers who are “rooting” for a particular outcome in which they already believe. I haven’t seen anything yet that persuades me that these treatments really are effective, at least for any reason other than that something is being done that people believe will be effective.

