Invisible in Data, Excluded from Research: A Literature Review of Sexual Orientation and Gender Identity Data
The lesbian, gay, bisexual, transgender, and queer (LGBTQ+) community comprises some of the most vulnerable populations in the United States, yet these populations are underrepresented in research. The social crises facing the LGBTQ+ community remain understudied. Despite persistent efforts by advocacy groups across the country, data on LGBTQ+ people are sparse and, when available, hard to access. Invisible in data and excluded from the research process, LGBTQ+ people are rendered missing from important conversations. The lack of data collected about LGBTQ+ people and issues materially affects policy decisions and the (re)distribution of resources. Disparities go ignored, and entire communities disappear from view. Comprehensive data are crucial for quantifying the challenges facing the LGBTQ+ community, from health outcomes to economic precarity to discrimination. Data help policymakers and activists advocate for LGBTQ+ rights and triage the issues facing their communities.
Data collection efforts inclusive of LGBTQ+ demographics are urgently needed at this moment in American history. As anti-LGBTQ+ animus has grown alongside waves of far-right extremism and white supremacy,1 anti-LGBTQ+ public demonstrations, threats of violence, and more have risen to the highest levels recorded.2 Hostile legislation has gained momentum, as seen in the surge of anti-trans bills that restrict trans people’s full participation in public life. In 2023, a record-breaking 600 anti-trans bills were introduced across 49 states, a nearly tenfold increase from the number introduced in 2020.3 All of this culminates in deadly violence that disproportionately takes the lives of Black trans women. An estimated 13 percent of the trans community is Black, yet nearly three-quarters of known victims of anti-trans homicides are Black trans women.4 The LGBTQ+ community faces interwoven crises of homophobia, transphobia, anti-Blackness, racism, ableism, and more. Swift action is needed.
Data are useful tools for LGBTQ+ advocacy. Sexual orientation and gender identity (SOGI) data are vital for ascertaining how homophobia and transphobia move along other axes of power and identity, such as race, class, and disability. Although public and private entities collect a wealth of data, demographic questions on LGBTQ+ people are routinely left out. For instance, the American Community Survey by the U.S. Census Bureau is a comprehensive survey that has been running since 2005, but it only began to collect data on same-sex couples in 2019.5 Furthermore, the Census, the gold standard of federal data collection, has failed to ask any SOGI questions despite continual pressure from LGBTQ+ organizations.6 Data invisibility of LGBTQ+ people is not a coincidence. It is manufactured.
Even when surveys attempt to collect information on gender, they often ask about sex assigned at birth but not gender identity, two distinct but frequently conflated concepts. Furthermore, most surveys treat sex and gender as binary, which erases nonbinary, two-spirit, and intersex respondents. Specific and distinct categories of LGBTQ+ identity are often combined, obscuring disparities within the community.
This literature review aims to address the ongoing erasure of LGBTQ+ people in datasets by comparing public and private databases and measuring how often SOGI data are collected and which specific SOGI questions are asked. We examine whether data were collected based on the following categories: sexual orientation (lesbian, gay, bisexual, queer, etc.); transgender identity; nonbinary identity; two-spirit identity; and intersex status. The goals of this document are to visualize data invisibility and assist researchers and activists in accessing datasets on LGBTQ+ populations.