A Novel Method for Analysing Racial Bias: Appendix: Correlation Over Time

Written by escholar | Published 2024/05/14
Tech Story Tags: natural-language-processing | analysing-racial-bias | time-adjusted-toxicity | semantic-axes | socio-political-changes | google-ngrams-data | sentiment-analysis | bias-in-literature

TL;DR: In this study, researchers propose a novel method to analyze representations of African Americans and White Americans in books between 1850 and 2000.

This paper is available on arXiv under a CC 4.0 license.

Authors:

(1) Muhammed Yusuf Kocyigit, Boston University;

(2) Anietie Andy, University of Pennsylvania;

(3) Derry Wijaya, Boston University.

Appendix: Correlation Over Time

We plot the correlation over time in Figure 5 to get a general picture of the embedding space. To support our observations, we also conduct a Kolmogorov–Smirnov (KS) test. We use the two-sided test, where the null hypothesis is that the two empirical distributions are the same. We take two columns from our heatmap, ignore the rows where either entry is 1, and take the absolute value of the difference between the two lists. The resulting list constitutes our samples from the first distribution for the KS test. The samples from the second distribution are the same lists for every other transition in our heatmap, appended together. Since the KS test does not depend on the two samples having the same size, we can run the test for each transition.
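The sample construction described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes the heatmap is available as a 2-D NumPy array whose columns are time steps, that a "transition" means two adjacent columns, and that entries equal to 1 are detected with a floating-point tolerance. The function names `transition_samples` and `ks_per_transition` are hypothetical; `scipy.stats.ks_2samp` is the standard SciPy two-sample KS test.

```python
import numpy as np
from scipy.stats import ks_2samp


def transition_samples(heatmap, i, j):
    """Absolute differences between columns i and j of the heatmap,
    skipping rows where either entry equals 1 (assumed self-similarity)."""
    a, b = heatmap[:, i], heatmap[:, j]
    keep = ~(np.isclose(a, 1.0) | np.isclose(b, 1.0))
    return np.abs(a[keep] - b[keep])


def ks_per_transition(heatmap):
    """Two-sided KS test of each adjacent-column transition against
    the pooled samples of all other transitions (an assumed reading
    of 'every other transition ... appended together')."""
    heatmap = np.asarray(heatmap, dtype=float)
    n_cols = heatmap.shape[1]
    transitions = [(i, i + 1) for i in range(n_cols - 1)]
    samples = {t: transition_samples(heatmap, *t) for t in transitions}

    results = {}
    for t in transitions:
        # Pool the samples of every transition except the one under test.
        rest = np.concatenate([samples[u] for u in transitions if u != t])
        res = ks_2samp(samples[t], rest, alternative="two-sided")
        results[t] = (res.statistic, res.pvalue)
    return results
```

The two samples passed to `ks_2samp` have different lengths here, which the two-sample test accepts; a small p-value for a transition suggests its differences are distributed unlike those of the remaining transitions.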

Tables 3 and 4 below give the results of the KS test. The test indicates whether the two empirical samples are likely to come from the same distribution. We observe two cases where we can reject the null hypothesis relatively safely: the White Americans heatmap between 1920 and 1930, and the African American heatmap between 1900 and 1910. For the first, the average similarity is well above the average similarity of the samples from distribution 2, signaling that the null hypothesis was rejected not because the difference in this transition is large but because it is unusually small. In support of our point, for the second case the average similarity is much smaller.
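Because a two-sided KS test does not say in which direction a transition deviates, a simple summary of the two samples is needed to interpret a rejection. The sketch below is one possible way to do that and is an assumption on our part: the text does not specify exactly which quantity is averaged, so this version compares the mean absolute difference of the tested transition with that of the pooled remaining transitions. The function name `describe_direction` and its labels are hypothetical.

```python
import numpy as np


def describe_direction(transition_diffs, other_diffs):
    """Compare the mean absolute difference of one transition with the
    mean over all other transitions (assumed proxy for 'stability')."""
    mean_t = float(np.mean(transition_diffs))
    mean_o = float(np.mean(other_diffs))
    if mean_t < mean_o:
        label = "smaller change than other transitions"
    elif mean_t > mean_o:
        label = "larger change than other transitions"
    else:
        label = "same mean change as other transitions"
    return {"mean_transition": mean_t, "mean_others": mean_o, "direction": label}
```

Under this reading, a rejected null hypothesis paired with a "smaller change" label would correspond to an unusually stable transition, and one paired with a "larger change" label to an unusually abrupt one.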


Published by HackerNoon on 2024/05/14