Abstract and 1. Introduction
2. Data
3. Measuring Media Slant and 3.1. Text pre-processing and featurization
3.2. Classifying transcripts by TV source
3.3. Text similarity between newspapers and TV stations and 3.4. Topic model
4. Econometric Framework
4.1. Instrumental variables specification
4.2. Instrument first stage and validity
5. Results
6. Mechanisms and Heterogeneity
6.1. Local vs. national or international news content
6.2. Cable news media slant polarizes local newspapers
Online Appendices
A. Data Appendix
A.2. Alternative county matching of newspapers and A.3. Filtering of the article snippets
A.4. Included prime-time TV shows and A.5. Summary statistics
B. Methods Appendix, B.1. Text pre-processing and B.2. Bigrams most predictive for FNC or CNN/MSNBC
B.3. Human validation of NLP model
B.6. Topics from the newspaper-based LDA model
C. Results Appendix
C.1. First stage results and C.2. Instrument exogeneity
C.3. Placebo: Content similarity in 1995/96
C.8. Robustness: Historical circulation weights and C.9. Robustness: Relative circulation weights
C.12. Mechanisms: Language features and topics
C.13. Mechanisms: Descriptive evidence on demand side
C.14. Mechanisms: Slant contagion and polarization
This section describes how we construct the language measures used as outcomes in our regression analysis. We aim to capture the textual similarity between (i) the newspaper article snippets and (ii) the TV show transcripts. To that end, we implement a supervised machine-learning approach that predicts whether a newspaper article’s content resembles that of a particular TV station (FNC or CNN/MSNBC).[4]
First, we pre-process the newspaper articles and TV transcripts, stem all words, and form bigrams (two-word phrases); see Appendix B.1 for details.
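To make this step concrete, here is a minimal sketch of the pre-processing (our own illustration, not the authors' exact pipeline from Appendix B.1; the NLTK tokenizer/stemmer and the `to_bigrams` helper are our assumptions):

```python
# Minimal pre-processing sketch: lowercase, tokenize, stem, form bigrams.
# NLTK is an assumption; run nltk.download("punkt") once before tokenizing.
from nltk.stem.snowball import SnowballStemmer
from nltk.tokenize import word_tokenize

stemmer = SnowballStemmer("english")

def to_bigrams(text):
    """Return the stemmed two-word phrases (bigrams) of a document."""
    tokens = [stemmer.stem(t) for t in word_tokenize(text.lower()) if t.isalpha()]
    return [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]

print(to_bigrams("The senators debated the border security bill."))
# illustrative output: ['the_senat', 'senat_debat', 'debat_the', ...]
```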
We then apply a minimum-frequency threshold to the resulting bigrams. This threshold excludes infrequent bigrams that are highly distinctive for a given channel but carry little substantive political or topical information. The procedure produces a vocabulary V of 65,000 bigrams. Supervised learning models using n-grams are rarely sensitive to the specific pre-processing and featurization choices (e.g., Denny and Spirling, 2018).
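A hedged sketch of how such a featurized source classifier could look (we assume a scikit-learn logistic regression as the regularized linear model; the frequency cutoff, toy documents, and variable names below are illustrative, not the paper's actual values):

```python
# Sketch: bigram counts -> regularized linear classifier -> predicted
# probability that a text's source is FNC rather than CNN/MSNBC.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy pre-tokenized bigram documents (in practice: TV transcript snippets).
tv_docs = [["tax_cut", "border_secur"], ["climat_chang", "health_care"]]
tv_labels = [1, 0]  # 1 = FNC, 0 = CNN/MSNBC

vectorizer = CountVectorizer(
    analyzer=lambda doc: doc,  # documents are already bigram lists
    min_df=1,                  # illustrative; the paper uses a frequency threshold
    max_features=65_000,       # cap the vocabulary V at 65,000 bigrams
)
X_tv = vectorizer.fit_transform(tv_docs)

clf = LogisticRegression(penalty="l2", max_iter=1000).fit(X_tv, tv_labels)

# Score newspaper snippets: P(source = FNC | bigrams) as a similarity measure.
news_docs = [["tax_cut", "spend_bill"]]
fnc_score = clf.predict_proba(vectorizer.transform(news_docs))[:, 1]
```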
This paper is available on arXiv under a CC 4.0 license.
[4] The approach is related to that of Gentzkow et al. (2019b), who also use a regularized linear model with n-gram inputs. Our approach differs because our scientific objective differs: Gentzkow et al. (2019b) measure the level of polarization between groups in language, whereas we form a predicted probability of a document’s source in order to score influence in a second corpus. Other related methods are those of Peterson and Spirling (2018) and Osnabrügge et al. (2021).
[5] We have fewer snippets from FNC than from CNN/MSNBC. Thus, we randomly under-sample the snippets from the CNN/MSNBC corpus to match the number of snippets from FNC.
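For illustration, such random under-sampling might look as follows (a sketch with toy data; numpy and the variable names are our assumptions):

```python
# Randomly under-sample CNN/MSNBC snippets to match the FNC snippet count.
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility (our choice)

fnc_snippets = ["fnc snippet 1", "fnc snippet 2"]                      # toy data
cnn_msnbc_snippets = ["cm snippet 1", "cm snippet 2", "cm snippet 3"]  # toy data

keep = rng.choice(len(cnn_msnbc_snippets), size=len(fnc_snippets), replace=False)
balanced_cnn_msnbc = [cnn_msnbc_snippets[i] for i in keep]
```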
Authors:
(1) Philine Widmer, ETH Zürich and [email protected];
(2) Sergio Galletta, ETH Zürich and [email protected];
(3) Elliott Ash, ETH Zürich and [email protected].