Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • Aim to analyze political polarization in US political system
  • Use Language Models to help candidates make informed decisions
  • Help voters understand candidates views on economy, healthcare, education, etc.
  • Dataset extracted from Wikipedia spanning 120 years
  • Language model based method to analyze polarization
  • Data divided into 2 parts: background and political information
  • Hypothesis: political views should be based on reason and independent of factors like birthplace, alma mater, etc.
  • Data split into 4 phases chronologically to understand if/how polarization changes
  • Data cleaned to remove biases
  • Results from classical language models (Word2Vec, Doc2Vec)
  • Use Longformer (transformer based encoder) to find nearest neighbors of each candidate based on political view and background

Paper Content

Introduction

  • Polarization among Republican and Democratic parties has been studied for a long time
  • Online discussion has become polarized
  • Polarization can affect decision-making ability of a candidate
  • Natural Language techniques used to measure polarization
  • Classical methods (Word2Vec, Doc2Vec) used to find polarization
  • More sophisticated models (Transformers, Longformers) used to project candidate-specific data
  • Nearest neighbors used to measure how polarized a candidate is
  • KhudaBukhsh et al. 2021 talks about political polarization online
  • Bhatt et al. 2018 discusses the impacts of hyper-partisan websites
  • Overt support for either a Democrat or Republican is taken to be an indicator of the site being either Liberal or Conservative
  • KhudaBukhsh et al. 2022 shows the polarization in TV media and fringe new networks
  • DeSilver 2022 claims that the candidates become polarized and moved away from the center over the years
  • Belcastro et al. 2020 demonstrates that Political Polarization can be mapped with the help of Neural Networks
  • Khadilkar, KhudaBukhsh, and Mitchell 2022 looks into gender and racial bias in movies
  • Rajani et al. 2019 tries to improve speech-based models on their ability to verbalize the reasoning that they learned during training
  • Devlin et al. 2018 introduces BERT
  • Petroni et al. 2019 demonstrates the ability of pretrained high-capacity models like BERT and ELMo to be used as knowledge repositories
  • Palakodety, KhudaBukhsh, and Carbonell 2020 demonstrates the ability of BERT and similar LM’s to track community perception
  • Hamilton, Leskovec, and Jurafsky 2016 proposes a robust method by using embeddings to counter the problem of word meaning changing semantically with context

Data collection and processing

  • Scraped Wikipedia for list of politicians
  • Stored label, party, and metadata for each instance
  • Annotated data into 3 categories: Background, Political, Other
  • Wikipedia page sections not fixed format, grouped into single category
  • 1656 categories merged into 3 categories for 1631 instances
  • Data cleaned with NER model and regular expressions

Language model

  • Natural Language Processing based applications are dominated by transformer-based language models
  • Experiments were done using Doc2Vec and Word2Vec models
  • Doc2Vec model was built from scratch with raw data
  • Classification accuracy of 59.52% with political data and 61.846% with background data
  • Word2Vec tests were run on pretrained models and models built from scratch
  • Nearest neighbor approach led to interesting insights

Main analysis

  • RoBERTa caused a drop in classification
  • Longformer increases the score significantly
  • Longformer large is the best performing model
  • BigBird also used to analyze
  • Calculate global attention scores to understand political polarization
  • Background of a candidate can help identify political leaning

Results

  • Tested different models with annotated raw dataset to understand polarization in text
  • Leveraged attention mechanism of Longformer model to find words with highest attention scores
  • Designed interactive website to understand if polarization exists and estimate politician’s polarization

Future work

  • Aim to use other metrics to measure political polarization
  • Use attention tokens to look at ratio of tokens from background to political
  • Figures 1-6 show webscraping, background, political, and nearest neighboring words
  • Binary SVM Classification Results