How AI Models Gender and Sexual Orientation

by Algorithmic Bias (dot tech) | April 23rd, 2025

Too Long; Didn't Read

This study examines how generative AI models identify and categorize gender, sexual orientation, and race based on textual cues. It highlights limitations of this modeling, including challenges in representing non-binary, transgender, and racially diverse identities, and describes the word lists and reference datasets used to measure bias in AI-generated stories.

Authors:

(1) Evan Shieh, Young Data Scientists League ([email protected]);

(2) Faye-Marie Vassel, Stanford University;

(3) Cassidy Sugimoto, School of Public Policy, Georgia Institute of Technology;

(4) Thema Monroe-White, Schar School of Policy and Government & Department of Computer Science, George Mason University ([email protected]).

Abstract and 1 Introduction

1.1 Related Work and Contributions

2 Methods and Data Collection

2.1 Textual Identity Proxies and Socio-Psychological Harms

2.2 Modeling Gender, Sexual Orientation, and Race

3 Analysis

3.1 Harms of Omission

3.2 Harms of Subordination

3.3 Harms of Stereotyping

4 Discussion, Acknowledgements, and References


SUPPLEMENTAL MATERIALS

A OPERATIONALIZING POWER AND INTERSECTIONALITY

B EXTENDED TECHNICAL DETAILS

B.1 Modeling Gender and Sexual Orientation

B.2 Modeling Race

B.3 Automated Data Mining of Textual Cues

B.4 Representation Ratio

B.5 Subordination Ratio

B.6 Median Racialized Subordination Ratio

B.7 Extended Cues for Stereotype Analysis

B.8 Statistical Methods

C ADDITIONAL EXAMPLES

C.1 Most Common Names Generated by LM per Race

C.2 Additional Selected Examples of Full Synthetic Texts

D DATASHEET AND PUBLIC USE DISCLOSURES

D.1 Datasheet for Laissez-Faire Prompts Dataset

B EXTENDED TECHNICAL DETAILS

B.1 Modeling Gender and Sexual Orientation

We note that in studies of real-world individuals, the gold standard for assessing identity is voluntary self-identification [33, 53, 55]. Because our study concerns fictional characters generated by LMs, we instead measure observed identity [33] from the perspective of the LMs, to the degree that they may be considered “authors” of the text.


For modeling gender associations in textual cues, we utilize the concept of word lists, which have been used both in studies of algorithmic bias in language models [29, 54] and in social psychology [21, 22]. We extend prior word lists to capture nonbinary genders. Noting the potential volatility of such seed lexicons in bias research [84], we provide our complete list of gendered references, with a mapping to broad gender categories, in Table S6a.


Given the list of textual cues that we mine from each story (described in Supplemental B.3), we perform case- and punctuation-insensitive matching against the word lists above to label observed gender. With the exception of transgender identities, the resulting categories map onto Census surveys of categorical gender [85]. If none of a character's textual references match the above lists (e.g., as with first-person writing), we label gender as Unspecified. If we find matches across multiple gender categories, we label gender as Unsure. In the Love domain, we also measure bias against individuals by observed sexual orientation, derived from the observed genders of the characters (see Fig. 1).
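To illustrate this labeling step, the following is a minimal Python sketch. The word lists and category names here are abbreviated and hypothetical (the complete lists and mappings appear in Table S6a), and the sketch is not the authors' exact implementation.

```python
import string

# Hypothetical, abbreviated word lists; the paper's complete lists are in Table S6a.
GENDER_WORD_LISTS = {
    "Feminine": {"she", "her", "hers", "woman", "girl", "mother", "daughter"},
    "Masculine": {"he", "him", "his", "man", "boy", "father", "son"},
    "Nonbinary": {"they", "them", "theirs", "nonbinary"},
}

def normalize(token: str) -> str:
    """Lowercase and strip punctuation so matching is case- and punctuation-insensitive."""
    return token.lower().strip(string.punctuation)

def label_observed_gender(textual_cues: list[str]) -> str:
    """Map a character's mined textual cues to a broad gender category.

    Returns 'Unspecified' when no cue matches any list, and 'Unsure' when
    cues match more than one category.
    """
    matched_categories = set()
    for cue in textual_cues:
        for token in cue.split():
            token = normalize(token)
            for category, words in GENDER_WORD_LISTS.items():
                if token in words:
                    matched_categories.add(category)
    if not matched_categories:
        return "Unspecified"
    if len(matched_categories) > 1:
        return "Unsure"
    return matched_categories.pop()

# Example: cues mined for one character in a generated story
print(label_observed_gender(["She smiled,", "her notebook"]))   # Feminine
print(label_observed_gender(["I walked home alone."]))          # Unspecified
```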


We note several limitations of this approach to modeling gender and sexual orientation. First, categorical mapping on word lists does not capture real-world instances where people may choose gender pronouns from multiple categories (e.g., “they/she”) or neopronouns. Second, we are not able to reliably infer transgender identities, as such individuals may adopt pronouns or references from any of the above categories while maintaining a separate gender identity (furthermore, we observe no instances of the terms “transwoman” or “transman” in any of the generated stories). Third, our approach does not account for sexual orientations that cannot be directly inferred from single snapshots of gender references. To better capture broadly omitted gender populations, we utilize search keywords (e.g., “transgender”) to produce qualitative analyses (see Supplemental B.7). That said, our choice of keywords is far from exhaustive and warrants continued research. To support such efforts, we open-source our collected data (see Supplemental D).


Table S6: Word Lists Used for Matching


B.2 Modeling Race

For modeling racial associations in textual cues, we use fractional counting, which has been shown to avoid issues of bias and algorithmic undercounting that disproportionately impact minoritized races, in comparison to categorical modeling [55]. Following this approach, a fractional racial likelihood can be assigned to a name based on open-sourced datasets of real-world individuals reporting self-identified race in settings such as mortgage applications [86] or voter registrations [56]. Specifically, we define racial likelihood as the proportion of individuals with a given name self-identifying as a given race:

\[
\mathrm{RacialLikelihood}(r \mid n) = \frac{\#\{\text{individuals with name } n \text{ self-identifying as race } r\}}{\#\{\text{individuals with name } n\}}
\]
Modeling observed race at an aggregate level enables us to better capture real-world occurrences where any given name may be chosen by individuals from a wide distribution of races, albeit at different statistical likelihoods for a given context or time period. Therefore, the choice of dataset(s) influences the degree to which fractional counting can account for various factors that shape name distribution (e.g. trends in migration or culture).
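The following Python sketch shows how such fractional likelihoods could be computed from a reference dataset. The record structure and example names are hypothetical, and the paper's actual pipeline may differ.

```python
from collections import Counter, defaultdict

def racial_likelihoods_by_name(records):
    """Compute fractional racial likelihoods per first name.

    `records` is an iterable of (first_name, self_identified_race) pairs drawn
    from a reference dataset (e.g., voter registrations). The likelihood for
    (name, race) is the proportion of individuals with that name who
    self-identify as that race.
    """
    name_totals = Counter()
    name_race_counts = defaultdict(Counter)
    for name, race in records:
        key = name.strip().upper()
        name_totals[key] += 1
        name_race_counts[key][race] += 1
    return {
        name: {race: count / name_totals[name] for race, count in races.items()}
        for name, races in name_race_counts.items()
    }

# Toy example with made-up records
records = [("Maria", "Latine"), ("Maria", "Latine"), ("Maria", "White"), ("Keisha", "Black")]
print(racial_likelihoods_by_name(records)["MARIA"])  # Latine ≈ 0.67, White ≈ 0.33
```

Under fractional counting, a character named “Maria” in the toy example above would contribute roughly 0.67 to Latine counts and 0.33 to White counts, rather than being assigned a single category.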


Because the names returned in response to our prompts are predominantly first names, we are unable to use U.S. Census data, which releases only surname information. We therefore base our fractional counting on two complementary datasets that include first names. The first consists of open-sourced Florida Voter Registration Data from 2017 and 2022 [56], which contains names and self-identified races for 27,420,716 people comprising 447,170 unique first names. Of the seven racial categories in the latest OMB-proposed Census [53], the Florida Voter Registration Data contains five: White, Hispanic or Latino, Black, Asian Pacific Islander (API), and American Indian or Alaska Native (AI/AN). To be inclusive of non-binary genders, we refer to Hispanic or Latino as Latine. The two absent categories are Middle Eastern or North African (MENA) and Native Hawaiian or Pacific Islander (NH/PI), the latter of which is aggregated broadly into the “API” category. Omission or aggregation of these two races (e.g., into categories such as “Asian / Pacific Islander”) was a shortcoming of every large comparison dataset we considered that contains self-reported race by first name [56, 86, 87].


Therefore, in the absence of self-reported race information, we identified an additional data source to approximate observed racial likelihood for MENA and NH/PI. We build on the approach developed in [57], which constructs a dataset of named individuals from Wikipedia's Living People category to compare disparities in academic honorees by country of origin as an approximation of race. Our approach leverages OMB's proposed hierarchical race and ethnicity classifications to approximate race for the two missing categories, mapping existing country lists for both racial groups to Wikipedia's country taxonomy. For MENA, we build upon OMB's country list [53], which was proposed based on a study of MENA-identifying community members [88]. For NH/PI, we build upon guides for Asian American individuals in health settings that are intended for disaggregated analysis [89]. Our mappings are listed in Table S6b.


In total, the Wikipedia scrape [57] consists of 706,165 people comprising 75,450 unique first names. Based on the lists above, 26,738 individuals map to MENA (with 6,766 unique first names) and 2,797 individuals map to NH/PI (with 1,808 unique first names). Using these mappings, we can then calculate racial likelihoods by name for both categories (relative to individuals from countries not in either list).
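A sketch of how this country-based approximation could work is shown below. The country lists are abbreviated and hypothetical (the paper's full mappings are in Table S6b), and the (name, country) record structure is a simplification of the Wikipedia scrape.

```python
from collections import Counter, defaultdict

# Hypothetical, abbreviated country lists; the paper's full mappings are in Table S6b.
MENA_COUNTRIES = {"Egypt", "Morocco", "Lebanon", "Iran", "Iraq"}
NHPI_COUNTRIES = {"Samoa", "Tonga", "Fiji", "Marshall Islands"}

def country_based_likelihoods(records):
    """Approximate MENA and NH/PI likelihoods per first name from country of origin.

    `records` pairs each named individual from the Wikipedia Living People scrape
    with their listed country. A category's likelihood for a given name is the
    share of individuals with that name whose country maps to the category.
    """
    name_totals = Counter()
    category_counts = defaultdict(Counter)
    for name, country in records:
        key = name.strip().upper()
        name_totals[key] += 1
        if country in MENA_COUNTRIES:
            category_counts[key]["MENA"] += 1
        elif country in NHPI_COUNTRIES:
            category_counts[key]["NH/PI"] += 1
        else:
            category_counts[key]["Other"] += 1
    return {
        name: {cat: n / name_totals[name] for cat, n in cats.items()}
        for name, cats in category_counts.items()
    }
```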


In the absence of self-reported data, the datasets we use have several limitations. First, country of origin can only approximate race. Second, the way each dataset is created and collected skews its racial distribution, due to factors such as voting restrictions and the demographic bias of Wikipedia editors [90].


Using these datasets, we then perform exact string matching on first names to compute racial likelihoods. Across all 500K LM-generated stories, we observe 2,928 unique first names, of which we successfully match 2,868, associating racial likelihoods by first name with 612,085 out of 612,181 total named characters (99.98% coverage).
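A minimal sketch of this matching and coverage computation is given below, assuming a likelihood table keyed by upper-cased first name (as in the sketches above); the exact name normalization used by the authors is not specified and is an assumption here.

```python
def match_characters(character_names, likelihood_table):
    """Attach racial likelihoods to LM-generated characters by first-name match.

    `character_names` lists every named character across the generated stories
    (with repetition); `likelihood_table` maps a normalized first name to its
    racial likelihoods. Coverage mirrors the paper's reported statistic:
    matched named characters / total named characters.
    """
    matched, unmatched = [], []
    for name in character_names:
        key = name.strip().upper()  # normalization is an assumption, not stated in the paper
        if key in likelihood_table:
            matched.append((name, likelihood_table[key]))
        else:
            unmatched.append(name)
    coverage = len(matched) / len(character_names) if character_names else 0.0
    return matched, unmatched, coverage
```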


This paper is available on arXiv under a CC BY 4.0 DEED license.

