Postdoctoral Research Fellow
London School of Economics
Machine learning is a fancy term for a set of statistical methods that are used to make predictions based on patterns in data
Unlike traditional statistical methods, machine learning methods are (usually) designed to make predictions rather than test hypotheses or estimate causal effects
Simple example: predicting an outcome from a predictor variable, e.g., predicting whether a person will turn out to vote based on their income (\(P(Y=\text{turnout} \mid \text{income})\))
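The turnout example can be sketched in a few lines. This is a minimal illustration, assuming a logistic model for the probability of turnout given income; the data are invented, and the helper names `sigmoid` and `fit_logistic` are hypothetical rather than taken from any library.

```python
import math

def sigmoid(z):
    """Map a real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, y, lr=0.1, epochs=5000):
    """Fit P(y=1 | x) = sigmoid(a + b*x) by batch gradient descent."""
    a, b = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        errors = [sigmoid(a + b * xi) - yi for xi, yi in zip(x, y)]
        grad_a = sum(errors) / n
        grad_b = sum(e * xi for e, xi in zip(errors, x)) / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b

# Toy data (made up): income in tens of thousands, turnout coded 0/1.
income = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5]
turnout = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1]

a, b = fit_logistic(income, turnout)
p_low = sigmoid(a + b * 1.0)   # predicted P(turnout) at the lowest income
p_high = sigmoid(a + b * 5.5)  # predicted P(turnout) at the highest income
print(f"slope={b:.2f}, P(low income)={p_low:.2f}, P(high income)={p_high:.2f}")
```

The goal here is the predicted probability itself, not a causal reading of the slope.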
Introduction to Machine Learning
Note: The Iris dataset contains 3 kinds of Iris flowers (Setosa, Versicolour, and Virginica), each described by 4 attributes: sepal length, sepal width, petal length, and petal width.
ML in Causal Inference
Propensity score matching & weighting (Rosenbaum and Rubin 1983; Horvitz and Thompson 1952)
Traditional matching methods: nearest-neighbor matching, radius matching, kernel matching, Mahalanobis distance matching, etc.
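Nearest-neighbor matching on the propensity score can be sketched as follows. This is a minimal illustration, assuming the propensity scores have already been estimated (for instance by a logistic regression of treatment on covariates) and using one-to-one matching with replacement; the units and the helper names `match_nearest` and `att` are invented for the example.

```python
# Each unit: (treated, propensity_score, outcome). All values are made up.
units = [
    (1, 0.80, 10.0),
    (1, 0.60, 8.0),
    (1, 0.30, 5.0),
    (0, 0.75, 7.0),
    (0, 0.55, 6.0),
    (0, 0.35, 4.0),
    (0, 0.10, 2.0),
]

def match_nearest(units):
    """Pair every treated unit with the control whose score is closest."""
    treated = [u for u in units if u[0] == 1]
    controls = [u for u in units if u[0] == 0]
    pairs = []
    for t in treated:
        closest = min(controls, key=lambda c: abs(c[1] - t[1]))
        pairs.append((t, closest))
    return pairs

def att(pairs):
    """Average treated-minus-control outcome difference across matched pairs."""
    return sum(t[2] - c[2] for t, c in pairs) / len(pairs)

pairs = match_nearest(units)
estimate = att(pairs)
print(f"Matched-pair estimate of the effect on the treated: {estimate:.2f}")
```

Radius or kernel matching would replace the single closest control with several controls inside a bandwidth, weighted by distance.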
ML in Natural Language Processing
"You shall know a word by the company it keeps" (Firth 1957)
We can model the relationship between words using word embeddings
King = [0.1, 0.2, 0.3, 0.4, 0.5]
Man = [0.2, 0.3, 0.4, 0.5, 0.6]
Woman = [0.3, 0.5, 0.7, 0.9, 1.1]
King - Man + Woman = [0.2, 0.4, 0.6, 0.8, 1.0] = Queen
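The arithmetic in this toy example can be checked directly. The sketch below uses the five-dimensional vectors from the slide (the Queen vector is taken from the stated result); the helper names `analogy` and `nearest` are hypothetical, and real embeddings have hundreds of dimensions learned from text.

```python
import math

# Toy embeddings from the example above (illustrative values, not learned).
embeddings = {
    "King":  [0.1, 0.2, 0.3, 0.4, 0.5],
    "Man":   [0.2, 0.3, 0.4, 0.5, 0.6],
    "Woman": [0.3, 0.5, 0.7, 0.9, 1.1],
    "Queen": [0.2, 0.4, 0.6, 0.8, 1.0],
}

def analogy(a, b, c):
    """Return the vector a - b + c, computed element by element."""
    return [x - y + z for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])]

def nearest(vector):
    """Return the vocabulary word whose embedding is closest in Euclidean distance."""
    return min(embeddings, key=lambda w: math.dist(embeddings[w], vector))

target = analogy("King", "Man", "Woman")
rounded = [round(x, 1) for x in target]
print(rounded)          # [0.2, 0.4, 0.6, 0.8, 1.0]
print(nearest(target))  # Queen
```

Rounding is only there to hide floating-point noise; the nearest-word lookup works on the unrounded vector.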
We can model how a word's meaning varies along another dimension (e.g., documents, time, party, or gender)
\[\mathbf{Y} = \mathbf{X} \beta + \mathbf{E} \]
where \(\mathbf{Y}\) is an \(n \times D\) matrix whose rows are the \(D\)-dimensional embeddings of each instance of the word, \(\mathbf{X}\) is an \(n \times k\) matrix of covariates, \(\beta\) is a \(k \times D\) matrix of coefficients, and \(\mathbf{E}\) is an \(n \times D\) matrix of errors.
Example word: Empire (Rodriguez, Spirling, and Stewart 2023).
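The regression of embeddings on covariates can be sketched with ordinary least squares. This is a minimal illustration, assuming an intercept plus a single hypothetical binary covariate (say, a party indicator) and fitting one regression per embedding dimension; the embeddings are invented, and the code is not the implementation used by Rodriguez, Spirling, and Stewart.

```python
def ols_simple(x, y):
    """Return (intercept, slope) from an OLS regression of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Y: one row per usage of the focal word, here with 3 embedding dimensions (made up).
Y = [
    [0.1, 0.4, 0.2],
    [0.3, 0.6, 0.2],
    [0.5, 0.1, 0.8],
    [0.7, 0.3, 0.6],
]
# x: hypothetical binary covariate for each usage (e.g., 0 = party A, 1 = party B).
x = [0, 0, 1, 1]

# Fit one regression per embedding dimension (per column of Y).
fits = [ols_simple(x, column) for column in zip(*Y)]
beta_intercept = [f[0] for f in fits]  # average embedding when x = 0
beta_covariate = [f[1] for f in fits]  # shift in the embedding when x = 1
print(beta_intercept, beta_covariate)
```

With a binary covariate, the intercept row is the mean embedding of the baseline group and the covariate row is the difference in group means, which is what makes the coefficients interpretable as shifts in meaning.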
Clustering method that groups words into topics based on their co-occurrence in documents
Logic: Given a set of documents, find a set of topics that best describes the documents
Text -> Embeddings -> Clustering -> Word Representations -> Topics (Summarised)
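The pipeline above can be sketched end to end on a toy corpus. This is a simplified illustration under loud assumptions: the two-dimensional document "embeddings" are written by hand rather than produced by a model, clustering is a small k-means with a deterministic start, and topics are summarised by raw word counts rather than a weighted word representation. The helper names `kmeans` and `top_words` are hypothetical.

```python
import math
from collections import Counter

docs = [
    "tax budget economy",
    "budget tax spending",
    "vote election campaign",
    "election vote turnout",
]
# Hand-written stand-ins for document embeddings (a real pipeline would use a model).
doc_vectors = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8)]

def kmeans(points, k, iters=10):
    """Cluster points with k-means; start from the first point and the farthest points."""
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(farthest)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its closest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels

def top_words(docs, labels, cluster, n=2):
    """Summarise a cluster by its n most frequent words."""
    counts = Counter(w for d, lab in zip(docs, labels) if lab == cluster for w in d.split())
    return [w for w, _ in counts.most_common(n)]

labels = kmeans(doc_vectors, k=2)
topics = {c: top_words(docs, labels, c) for c in sorted(set(labels))}
print(labels, topics)
```

Each cluster's most frequent words act as the summarised topic; swapping in learned embeddings and a weighting scheme changes the inputs but not the shape of the pipeline.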
Language models combine embeddings with neural networks to predict the next word in a sequence of words
Neural networks are trained on large datasets of text to learn patterns in language
Language models are used to generate text, answer questions, and perform other tasks
Example: ChatGPT
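The next-word objective can be illustrated without a neural network. The sketch below swaps in a count-based bigram model, which predicts the most frequent follower of the previous word; it shows the prediction task only, not how embeddings or neural networks solve it. The corpus and the helper names `train_bigrams` and `predict_next` are invented for the example.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus.
corpus = [
    "the voters went to the polls",
    "the voters went home",
    "the candidate went to the rally",
]

def train_bigrams(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never followed."""
    following = counts.get(word)
    if not following:
        return None
    return following.most_common(1)[0][0]

counts = train_bigrams(corpus)
print(predict_next(counts, "went"))   # to
print(predict_next(counts, "the"))    # voters
print(predict_next(counts, "rally"))  # None
```

A neural language model replaces the count table with learned parameters and conditions on far more than one previous word.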
Language Models
Language models can be used to generate text, answer questions, and perform other tasks
Pre-trained models are trained on large datasets of text and can be fine-tuned for specific tasks
Coding Example with BERT:
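# input:
from transformers import pipeline

# Build a fill-mask pipeline backed by the pre-trained bert-base-uncased model
unmasker = pipeline('fill-mask', model='bert-base-uncased')
# Ask the model to score candidate tokens for the [MASK] position
unmasker("The man worked as a [MASK].")
# output 1 (excerpt):
'score': 0.09747550636529922,
'token_str': 'carpenter'
# output 2 (excerpt):
'score': 0.04962705448269844,
'token_str': 'barber'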
Task-specific models are trained on specific datasets and are designed to perform specific tasks
Machine learning and NLP have the potential to transform the study of political behavior, but they also raise a number of ethical concerns
ML Ethics
Some practical considerations for researchers using ML:
ML Resources
What are some applications of machine learning you could incorporate in your research?
What types of data might you use?
What are some ethical considerations you should keep in mind when using machine learning in your research?
Machine Learning & NLP for the Study of Political Behavior