Authors:
(1) Evan Shieh, Young Data Scientists League;
(2) Faye-Marie Vassel, Stanford University;
(3) Cassidy Sugimoto, School of Public Policy, Georgia Institute of Technology;
(4) Thema Monroe-White, Schar School of Policy and Government & Department of Computer Science, George Mason University.
Table of Links
1.1 Related Work and Contributions
2.1 Textual Identity Proxies and Socio-Psychological Harms
2.2 Modeling Gender, Sexual Orientation, and Race
3 Analysis
4 Discussion, Acknowledgements, and References
SUPPLEMENTAL MATERIALS
A OPERATIONALIZING POWER AND INTERSECTIONALITY
B EXTENDED TECHNICAL DETAILS
B.1 Modeling Gender and Sexual Orientation
B.3 Automated Data Mining of Textual Cues
B.6 Median Racialized Subordination Ratio
B.7 Extended Cues for Stereotype Analysis
C ADDITIONAL EXAMPLES
C.1 Most Common Names Generated by LM per Race
C.2 Additional Selected Examples of Full Synthetic Texts
D DATASHEET AND PUBLIC USE DISCLOSURES
D.1 Datasheet for Laissez-Faire Prompts Dataset
2 METHODS AND DATA COLLECTION
We conduct our investigation on 500,000 synthetic texts generated by five publicly available generative language models: ChatGPT 3.5 and ChatGPT 4 (developed by OpenAI), Llama 2 (Meta), PaLM 2 (Google), and Claude 2.0 (Anthropic). We base our selection of models both on the sizable funding wielded by these companies and their investors (on the order of tens of billions of USD [47]) and on the prominent policy roles each company has played at the federal level. In July 2023, the US White House secured voluntary commitments from each of these “leading artificial intelligence companies” to ensure that “products are safe before introducing them to the public” [48].
We query the LMs with 100 unique open-ended prompts pertaining to 50 everyday scenarios across three core dimensions of social life, situated within the context of the United States. Several principles guided our prompt design. First, prompts were designed to reflect real-world use cases, including an AI writing assistant for students in the classroom [9, 12] and for screenwriters in entertainment [15]. Second, each prompt uses the colloquial term “American”, which in common parlance refers to people of the United States (i.e., “The American People”) regardless of their socio-economic background (e.g., race, ethnicity, citizenship, employment status). Although “American” is a misnomer in that it can also refer to people outside of the United States (e.g., individuals living in Central or South American nations), as we show in the results, language models likewise appear to interpret “American” to mean the United States, thereby furthering the U.S.-centric biases present in earlier technology platforms. Third, each domain is examined from an intersectional theoretical framework (see Supplement A), which describes how power is embedded in both social discourse and language [38]. Guided by this framework, we study how LMs respond to prompts that depict everyday power dynamics and “routinized forms of domination” [36]. For each scenario, we capture the effect of power by dividing our prompts into two treatments: one power-neutral condition and one power-laden condition, where the latter contains a dominant character and a subordinate one. Fourth, to obtain stories from a wide variety of contexts, our prompts span three primary domains that we call Learning (i.e., student interactions across K-12 academic subjects), Labor (i.e., workplace interactions across occupations from the U.S. Bureau of Labor Statistics), and Love (i.e., interpersonal interactions between romantic partners, friends, and siblings). In total, our study assesses 50 prompt scenarios: 15 for Learning, 15 for Labor, and 20 for Love (see Table 1 for examples).
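To make the paired-treatment design concrete, the sketch below shows one way a scenario and its two prompt conditions could be represented in code. The `Scenario` dataclass, its field names, and the prompt wording are illustrative assumptions only; the study's actual prompts appear in Tables S3-S5.

```python
from dataclasses import dataclass

# Illustrative sketch of the paired-prompt design: each of the 50 scenarios
# yields one power-neutral and one power-laden prompt (100 prompts total).
# The wording below is placeholder text, not the study's actual prompts.

@dataclass
class Scenario:
    domain: str        # "Learning", "Labor", or "Love"
    topic: str         # e.g., an academic subject, occupation, or relationship activity
    neutral: str       # power-neutral condition
    power_laden: str   # condition with a dominant and a subordinate character

example = Scenario(
    domain="Learning",
    topic="physics",   # hypothetical subject choice
    neutral="Write a story about two American students studying physics together.",
    power_laden="Write a story about an American student helping a struggling classmate with physics.",
)

prompts = [example.neutral, example.power_laden]  # both treatments are queried for every scenario
```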
Learning scenarios describe classroom interactions between students, spanning 15 academic subjects: nine (9) core subjects commonly taught in U.S. public K-12 schools, three (3) subjects from Career and Technical Education (CTE), and three (3) subjects from Advanced Placement (AP). Labor scenarios describe workplace interactions and span 15 occupations categorized by the U.S. Bureau of Labor Statistics (BLS). For both of these domains, we select subjects and occupations to reflect a diversity of statistical representations by gender, class, and race, including subjects and occupations in which minoritized groups are statistically overrepresented relative to the 2022 U.S. Census [83] (see Tables S1-S2). Love scenarios describe interpersonal interactions subcategorized as interactions between (a) romantic partners, (b) friends, or (c) siblings. For each of these three subcategories, we design six shared scenarios capturing everyday interpersonal interactions (ranging from going shopping to doing chores). For romantic partners, we add two extension scenarios that capture dynamics specific to intimate relationships: (1) going on a date, and (2) moving to a new city. We limit our scenarios to interpersonal interactions between two people in the interest of studying the effects of power (see next section). While these prompt scenarios do not reflect the full diversity of experiences that comprise interpersonal interactions, we believe this framework offers a beachhead for future studies to assess an even wider variety of culturally relevant prompts, both within the U.S. and beyond. For each LM, set to its default parameters, we collect 100K synthetic text generations (i.e., 1,000 samples for each of the 100 unique prompts). We provide a complete list of prompt scenarios in Tables S3, S4, and S5. Data collection was conducted from August 16th to November 7th, 2023.
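As a minimal sketch of the collection procedure described above, the loop below yields 5 models × 100 prompts × 1,000 samples = 500,000 texts. The `generate` wrapper and the record fields are hypothetical, since the paper does not describe its client code; each model is assumed to be queried at its default parameters, as stated.

```python
# Minimal sketch of the data-collection loop, assuming a hypothetical
# generate(model, prompt) wrapper around each vendor's API.

MODELS = ["ChatGPT 3.5", "ChatGPT 4", "Llama 2", "PaLM 2", "Claude 2.0"]
SAMPLES_PER_PROMPT = 1_000  # 1,000 generations per prompt per model

def collect(prompts, generate):
    """Collect generations for 100 prompts x 5 models x 1,000 samples = 500,000 texts."""
    records = []
    for model in MODELS:
        for prompt in prompts:                      # 100 prompts (50 scenarios x 2 power conditions)
            for i in range(SAMPLES_PER_PROMPT):
                text = generate(model, prompt)      # hypothetical API call, default parameters
                records.append({"model": model, "prompt": prompt, "sample": i, "text": text})
    return records
```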