How AI Assigns Power Based on Race and Gender

by Algorithmic Bias (dot tech), April 22nd, 2025

Too Long; Didn't Read

Language models don’t just omit; they subordinate. When prompts include power dynamics, minoritized characters are overwhelmingly cast in weaker roles. Feminized characters dominate as “star students” in Learning, but become powerless in Labor. The more racialized a name, the higher the likelihood of subordination: in one model, Latine masculinized characters are over 1,300x more likely to be portrayed as struggling students than as star students. White names dominate neutral and powerful roles (Sarah and John appear thousands of times as leads), while names like Maria, Ahmed, and Jamal appear disproportionately as subordinates. This hierarchy, backed by statistically significant subordination ratios, shows how LMs amplify real-world power imbalances, turning race and gender into proxies for weakness.

Authors:

(1) Evan Shieh, Young Data Scientists League ([email protected]);

(2) Faye-Marie Vassel, Stanford University;

(3) Cassidy Sugimoto, School of Public Policy, Georgia Institute of Technology;

(4) Thema Monroe-White, Schar School of Policy and Government & Department of Computer Science, George Mason University ([email protected]).

Abstract and 1 Introduction

1.1 Related Work and Contributions

2 Methods and Data Collection

2.1 Textual Identity Proxies and Socio-Psychological Harms

2.2 Modeling Gender, Sexual Orientation, and Race

3 Analysis

3.1 Harms of Omission

3.2 Harms of Subordination

3.3 Harms of Stereotyping

4 Discussion, Acknowledgements, and References


SUPPLEMENTAL MATERIALS

A OPERATIONALIZING POWER AND INTERSECTIONALITY

B EXTENDED TECHNICAL DETAILS

B.1 Modeling Gender and Sexual Orientation

B.2 Modeling Race

B.3 Automated Data Mining of Textual Cues

B.4 Representation Ratio

B.5 Subordination Ratio

B.6 Median Racialized Subordination Ratio

B.7 Extended Cues for Stereotype Analysis

B.8 Statistical Methods

C ADDITIONAL EXAMPLES

C.1 Most Common Names Generated by LM per Race

C.2 Additional Selected Examples of Full Synthetic Texts

D DATASHEET AND PUBLIC USE DISCLOSURES

D.1 Datasheet for Laissez-Faire Prompts Dataset

3.2 Harms of Subordination

Representation of minoritized groups increases drastically when power dynamics are added to the prompts, specifically with the introduction of a subordinate character (Table 1). Broadly, we find that race- and gender-minoritized characters appear predominantly in portrayals where they are seeking help or powerless. We quantify their relative frequency using the subordination ratio (see Equation 4), which we define as the proportion of a demographic observed in the subordinate role compared to the dominant role. Fig. 2a displays overall subordination ratios at the intersection of race and gender.


Figure 2. Overall Subordination Ratios by Gender and Race. 2a shows subordination ratios across all domains and models, increasing from left to right. Ratios for each model are indicated by different symbols plotted on a log scale, with a bar showing the median across all five models. Redder colors represent greater degrees of statistical confidence (p-values for the ratio distribution), compared against the null hypothesis (subordination ratio = 1, dotted). 2b shows the median subordination values across all five models by gender, race, and domain. Values above 1 indicate greater degrees of subordination and values below 1 indicate greater degrees of domination.


This approach allows us to focus on relative differences in the portrayal of characters when power-laden prompts are introduced. If the subordination ratio is less than 1, we observe dominance; if the subordination ratio is greater than 1, we observe subordination; and if the subordination ratio is 1, then the demographic is neutral (independent from power dynamics).
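To make the computation concrete, here is a minimal sketch of the subordination ratio in Python. It assumes each power-laden story contributes exactly one dominant and one subordinate character, so the ratio of role proportions reduces to a ratio of raw counts; the `alpha` term is an add-one smoothing choice modeled on the Laplace imputation noted in the Figure 3 caption, not the paper's exact formulation (Equation 4, Supplement B.5).

```python
def subordination_ratio(sub_count: int, dom_count: int, alpha: float = 1.0) -> float:
    """Sketch of the subordination ratio (cf. Equation 4, Supplement B.5).

    Assumes each power-laden story yields one dominant and one
    subordinate character, so role proportions reduce to raw counts.
    Laplace smoothing (alpha) keeps the ratio finite when a
    demographic never appears in one of the two roles (cf. Fig. 3a).
    """
    return (sub_count + alpha) / (dom_count + alpha)


# Example with counts reported later in this section: in Learning,
# Maria appears 13,580 times subordinated and 364 times dominant.
print(subordination_ratio(13_580, 364))   # ~37.2  -> subordinated
print(subordination_ratio(3_005, 5_239))  # John: ~0.57 -> dominant
```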



Overall, feminized characters are generally dominant in the Learning domain (i.e., subordination < 1, meaning they are more likely to be portrayed as a “star student”). However, they hold broadly subordinated positions in the Labor domain (i.e., subordination > 1; see Fig. 2a,b). White feminized characters are uniformly dominant in stories across all five models in Learning (median subordination: 0.25), while White masculinized characters are uniformly dominant in Labor (median subordination: 0.69). For Love, all models except PaLM2 and ChatGPT4 portray White feminized characters as dominant (median subordination: 0.73). We observe that for any combination of domain and model, at least one of the White feminized or White masculinized groups is dominant (p < .001).
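The paper's exact significance procedure is specified in Supplement B.8 and is not reproduced here. As a rough illustration of the null hypothesis in Figure 2 (subordination ratio = 1), one stand-in is a two-sided binomial test asking whether a demographic is equally likely to land in either role:

```python
from scipy.stats import binomtest


def ratio_p_value(sub_count: int, dom_count: int) -> float:
    """Two-sided binomial test against the null that a character is
    equally likely to be cast subordinate or dominant (ratio = 1).
    A stand-in illustration only; the paper's actual statistical
    methods are specified in Supplement B.8.
    """
    n = sub_count + dom_count
    return binomtest(sub_count, n, p=0.5, alternative="two-sided").pvalue
```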


The same universal access to power is not afforded when considering other combinations of race and gender. Nonbinary intersections across all races tend to appear as more subordinated (although these results are not significant for most populations, due to omission as shown in Figure 1d). As shown in Figure 3, an even more striking result appears when we examine names that are increasingly likely to be associated with one race (measured using fractionalized counting – see Equation 1). With few exceptions (e.g., PaLM2 tends to repeat a single high-likelihood Black name, “Amari,” as a star student in Learning), the models respond to greater degrees of racialization with greater degrees of subordination for all races except White, as shown in Figures 3a and 3b (recall that LMs do not produce high-likelihood racialized names for NH/PI and AI/AN as shown in Fig. 1c, hence these two categories are missing from Figure 3).
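Fractionalized counting (Equation 1) assigns each name a likelihood per race rather than a single hard label. The sketch below illustrates the idea with a hypothetical name-frequency table; the paper's actual reference data and formula are described in Section 2.2 and Supplement B.2, and the counts for "Maria" are invented, chosen only to reproduce the 72.3% Latine likelihood cited later in this section.

```python
# Hypothetical name-to-race frequency table; the real reference data
# and Equation 1 are described in Supplement B.2.
NAME_RACE_COUNTS: dict[str, dict[str, int]] = {
    "Maria": {"Latine": 5_800, "White": 2_100, "Black": 120},
}


def racial_likelihoods(name: str) -> dict[str, float]:
    """Fractionalized counting: one occurrence of a name is split
    across races in proportion to the name's frequency per group,
    rather than assigned wholly to its single most likely race."""
    counts = NAME_RACE_COUNTS[name]
    total = sum(counts.values())
    return {race: c / total for race, c in counts.items()}


print(racial_likelihoods("Maria"))
# {'Latine': 0.723..., 'White': 0.261..., 'Black': 0.014...}
```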


To examine how the subordination ratio varies across names with increasing degrees of racialization, we introduce the median racialized subordination ratio, which aggregates subordination across a range of possible racial thresholds. First, we control for possible confounding effects of textual cues beyond name by conditioning on gender references (pronouns, titles, etc.). Then, for each intersection of race and gender, we compute the median of all subordination ratios for names above a variable likelihood threshold t, as defined in Equation (5). With sufficiently granular t, this statistic measures subordination while taking the spectrum of racial likelihoods into account. For our experiments, we set t ∈ {1, 2, …, 100}.
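A sketch of that computation follows, under two stated assumptions: the input names are already restricted to one race-gender intersection and conditioned on gender cues, and counts above each threshold are pooled before the ratio is taken (Equation 5 in Supplement B.6 gives the exact definition).

```python
import statistics


def median_racialized_subordination(
    name_stats: list[tuple[float, int, int]],  # (racial likelihood %, sub, dom)
    thresholds: range = range(1, 101),
    alpha: float = 1.0,
) -> float:
    """Sketch of the median racialized subordination ratio (cf.
    Equation 5, Supplement B.6). Assumes `name_stats` covers one
    race-gender intersection, already conditioned on gender cues
    (pronouns, titles, etc.), and pools counts above each threshold t
    before taking the ratio; the median is then taken over all t.
    """
    ratios = []
    for t in thresholds:
        sub = sum(s for likelihood, s, _ in name_stats if likelihood >= t)
        dom = sum(d for likelihood, _, d in name_stats if likelihood >= t)
        ratios.append((sub + alpha) / (dom + alpha))
    return statistics.median(ratios)
```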



Figure 3c shows intersectional median racialized subordination ratios by race and gender. We find large median subordination ratios for every binary gender intersection of Asian, Black, Latine, and MENA characters across nearly all models and domains (recall that for non-binary characters, LMs do not produce a significant number of high-likelihood racialized names for any race except White, hence our focus on binary genders for this analysis). In 86.67% of all cases (104 of 120 table cells), minoritized races are subordinated, compared to 3.3% of cases for White names (1 of 30 cells). The magnitude of the subordination ratios we observe is staggering, even at the median level. In Learning, Latine masculinized students are portrayed by Claude2.0 as 1,308.6 times more likely to be subordinated (i.e., a struggling student) than dominant (i.e., a star student). Asian feminized characters reach subordination levels of over 100 for three different models (172.6 for ChatGPT4 in Learning, 352.2 for Claude2.0 in Labor, and 160.6 for PaLM2 in Labor). Black and MENA masculinized characters are subordinated on a similar order of magnitude by PaLM2 (83.5 in Love and 350.7 in Labor, respectively).


Figure 3. Subordination Ratios by Name and Racial Likelihoods. 3a shows subordination ratios, increasing from left to right per plot, of unique first names across all LMs, by race for which likelihoods vary (the models do not generate high-likelihood NH/PI or AI/AN names, as shown in 1c). When a name has 0 occurrences in either dominant or subordinated roles, we impute using Laplace smoothing. 3b plots overall subordination across all models above a racial likelihood threshold ranging from 0 to 100. 3c shows the median subordination ratio taken across all integer thresholds from 0 to 100, controlling for the effects of gender and categorized by domain, model, race, and gender (for non-binary characters, the models do not generate high-likelihood racialized names, as shown in 1d).


Table 3: Most Common Highly Racialized Names by Race and Gender, Domain and Power Condition


To further illustrate this subordination by example, in Table 3 we provide counts for the most common highly racialized names across LMs by race, gender, domain, and power condition (baseline is power-neutral; dominant and subordinated are power-laden). Asian, Black, Latine, and MENA names are several orders of magnitude more likely to be subordinated when a power dynamic is introduced. By contrast, White names are several orders of magnitude more likely to appear than minoritized names in baseline and dominant positions. In the Learning domain, Sarah (74.9% White) and John (88.0% White) appear 11,699 and 5,915 times, respectively, in the baseline condition; and 10,925 and 5,239 times, respectively, in the dominant condition. The next most common name, Maria (72.3% Latine), is a distant third, appearing just 550 times in the baseline condition and 364 times in the dominant condition.


For subordinated roles, this dynamic is reversed. In Learning, Maria appears subordinated 13,580 times compared to 5,939 for Sarah and 3,005 for John (2.3 and 4.5 times as often, respectively). Whereas Maria is significantly more likely to be portrayed as a struggling student than a star student, the opposite is true for Sarah and John. This reversal extends to masculinized Latine, Black, MENA, and Asian names. For example, in the Learning domain, Juan (86.9% Latine) and Jamal (73.4% Black) are respectively 184.41 and 5.28 times more likely to hold a subordinated role than a dominant one. The most commonly occurring masculinized Asian name (Hiroshi, 66.7% Asian) and MENA name (Ahmed, 71.2% MENA) do not appear at all in either baseline or dominant positions in Learning, despite the latter appearing hundreds of times as subordinated. Of the most frequently occurring race-minoritized names, only two appear more frequently in dominant than subordinated roles: Amari (86.4% Black; 1,251 stories) and Priya (68.2% Asian; 52 stories), both in Learning (and these portrayals are generated exclusively by PaLM2). In Labor and Love, these exceptions disappear, and all of the most common minoritized names for both masculinized and feminized characters are predominantly subordinated. This pattern extends beyond the most common minoritized names (see Figure 3a); we provide a larger sample of names in Tables S10 and S11(a-e).


This paper is available on arXiv under the CC BY 4.0 DEED license.

