Media Slant: Researching the Text Similarity Between Newspapers and TV Stations


Abstract and 1. Introduction

2. Data

3. Measuring Media Slant and 3.1. Text pre-processing and featurization

3.2. Classifying transcripts by TV source

3.3. Text similarity between newspapers and TV stations and 3.4. Topic model

4. Econometric Framework

4.1. Instrumental variables specification

4.2. Instrument first stage and validity

5. Results

5.1. Main results

5.2. Robustness checks

6. Mechanisms and Heterogeneity

6.1. Local vs. national or international news content

6.2. Cable news media slant polarizes local newspapers

7. Conclusion and References


Online Appendices

A. Data Appendix

A.1. Newspaper articles

A.2. Alternative county matching of newspapers and A.3. Filtering of the article snippets

A.4. Included prime-time TV shows and A.5. Summary statistics

B. Methods Appendix, B.1. Text pre-processing and B.2. Bigrams most predictive for FNC or CNN/MSNBC

B.3. Human validation of NLP model

B.4. Distribution of Fox News similarity in newspapers and B.5. Example articles by Fox News similarity

B.6. Topics from the newspaper-based LDA model

C. Results Appendix

C.1. First stage results and C.2. Instrument exogeneity

C.3. Placebo: Content similarity in 1995/96

C.4. OLS results

C.5. Reduced form results

C.6. Sub-samples: Newspaper headquarters and other counties and C.7. Robustness: Alternative county matching

C.8. Robustness: Historical circulation weights and C.9. Robustness: Relative circulation weights

C.10. Robustness: Absolute and relative FNC viewership and C.11. Robustness: Dropping observations and clustering

C.12. Mechanisms: Language features and topics

C.13. Mechanisms: Descriptive Evidence on Demand Side

C.14. Mechanisms: Slant contagion and polarization

3.3. Text similarity between newspapers and TV stations

3.4. Topic model

To study the mechanisms in Section 6, we require topic labels for the news articles in order to classify them as either local or non-local news. We use the Latent Dirichlet Allocation (LDA) topic modeling approach of Blei et al. (2003). We build the topic model on a random sample of 1 million newspaper article snippets, specifying 128 topics. [7] The topics are labeled manually based on the words most associated with each topic (see Appendix B.6). We then use the trained model to assign topics to all newspaper articles.
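For concreteness, the sketch below shows one way such a pipeline could be set up in Python with gensim, whose LdaModel implements the online variational Bayes algorithm of Hoffman et al. (2010). It is an illustrative sketch rather than the authors' code: `tokenized_snippets` and `article_tokens` are hypothetical placeholders for the pre-processed texts from Section 3.1, and the dictionary-filtering thresholds and mini-batch size are assumed values.

```python
# Illustrative sketch (not the paper's code): fit a 128-topic LDA model
# with online variational Bayes (Hoffman et al., 2010) via gensim, then
# infer topic proportions for a single article.
from gensim.corpora import Dictionary
from gensim.models import LdaModel

# `tokenized_snippets`: hypothetical list of token lists, i.e., the random
# sample of article snippets after the pre-processing of Section 3.1.
dictionary = Dictionary(tokenized_snippets)
dictionary.filter_extremes(no_below=20, no_above=0.5)  # assumed thresholds
corpus = [dictionary.doc2bow(tokens) for tokens in tokenized_snippets]

lda = LdaModel(
    corpus=corpus,
    id2word=dictionary,
    num_topics=128,   # topic count used in the paper
    chunksize=4096,   # assumed mini-batch size for online updates
    random_state=0,
)

# Assign topic proportions to one (hypothetical) article's tokens.
article_bow = dictionary.doc2bow(article_tokens)
print(lda.get_document_topics(article_bow, minimum_probability=0.05))
# e.g., [(12, 0.41), (87, 0.22), ...]
```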


This paper is available on arXiv under a CC 4.0 license.


[7] LDA is the standard approach for topic modeling in social science (e.g., Hansen et al., 2018; Bybee et al., 2020). We use the online variational Bayes (VB) implementation by Hoffman et al. (2010). To select the number of topics, we started with 32 and doubled the count until the resulting topics were largely interpretable to humans.
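A minimal sketch of that selection procedure, under the same assumptions as the snippet above (it reuses the hypothetical `corpus` and `dictionary` built there), would simply re-fit the model at each candidate topic count and print the top words per topic for manual inspection:

```python
# Sketch of the topic-number search: double the topic count starting at 32
# and print each model's top words so a human can judge interpretability.
from gensim.models import LdaModel

for num_topics in (32, 64, 128):
    lda_k = LdaModel(corpus=corpus, id2word=dictionary,
                     num_topics=num_topics, random_state=0)
    print(f"--- {num_topics} topics ---")
    for topic_id, top_words in lda_k.show_topics(num_topics=num_topics,
                                                 num_words=10):
        print(topic_id, top_words)
```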

Authors:

(1) Philine Widmer, ETH Zürich and [email protected];

(2) Sergio Galletta, ETH Zürich and [email protected];

(3) Elliott Ash, ETH Zürich and [email protected].