A Novel Method for Analysing Racial Bias: Appendix: Toxicity Measurement

Written by escholar | Published 2024/05/14
Tech Story Tags: natural-language-processing | analysing-racial-bias | time-adjusted-toxicity | semantic-axes | socio-political-changes | google-ngrams-data | sentiment-analysis | bias-in-literature

TL;DR: In this study, researchers propose a novel method to analyze representations of African Americans and White Americans in books published between 1850 and 2000.

This paper is available on arXiv under a CC 4.0 license.

Authors:

(1) Muhammed Yusuf Kocyigit, Boston University;

(2) Anietie Andy, University of Pennsylvania;

(3) Derry Wijaya, Boston University.

Table of Links

Appendix: Toxicity Measurement

For toxicity measurement, we apply a filtering method to remove words that may have lost their toxic meaning over time. We use the conservative subset of Hurtlex (Bassignana, Basile, and Patti 2018) and include all categories of toxic words in our analysis. Table 2 lists the words removed from the toxic-word list. A few observations stand out. Words relating to prostitution are generally removed in the early decades: while they may still have been taboo, they differed enough in how they related to other words at the time that they were filtered out. Another interesting case is the word "fascist," which is filtered out for the 1930s and 1940s. These are the decades leading up to and including World War II, when nationalist sentiment was becoming mainstream worldwide, which could explain this shift in meaning.
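The filtering step above can be sketched as follows. This is a minimal illustration, not the authors' released code: it assumes per-decade word embeddings and a cosine-similarity threshold (the function name `filter_toxic_words`, the dictionary inputs, and the 0.5 threshold are all hypothetical choices). A toxic word is kept for a decade only if its decade-specific vector remains close to its reference (modern) vector; words whose usage drifted, such as "fascist" in the 1930s, would be filtered out.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two dense word vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def filter_toxic_words(toxic_words, decade_vectors, reference_vectors,
                       threshold=0.5):
    """Split a toxic-word list into (kept, removed) for one decade.

    A word is kept only if its decade-specific embedding stays within
    `threshold` cosine similarity of its reference embedding; words
    missing from either embedding space are removed as well.
    """
    kept, removed = [], []
    for word in toxic_words:
        if word not in decade_vectors or word not in reference_vectors:
            removed.append(word)
            continue
        sim = cosine(decade_vectors[word], reference_vectors[word])
        (kept if sim >= threshold else removed).append(word)
    return kept, removed

# Toy example: "fascist" has drifted in this decade, the other word has not.
decade_1930 = {"fascist": np.array([1.0, 0.0]),
               "slur_x":  np.array([0.9, 0.1])}
reference   = {"fascist": np.array([0.0, 1.0]),
               "slur_x":  np.array([1.0, 0.0])}
kept, removed = filter_toxic_words(["fascist", "slur_x"],
                                   decade_1930, reference)
```

In this toy setup, `kept` contains only `"slur_x"` and `removed` contains `"fascist"`, mirroring how a word whose decade usage diverges from its modern toxic sense drops out of the list for that decade.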

