Data & Language Models
Many of the models I’ve trained for specific purposes in my research are available on the Hugging Face Model Hub.
Language models
Facebook's bart-large-cnn sequence-to-sequence model, fine-tuned to summarise policy positions from party press releases
Model available at: z-dickson/bart-large-cnn-climate-change-summarization
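A minimal sketch of loading this model through the Hugging Face `transformers` pipeline API (assumes `transformers` is installed; the model id is the one listed above, and "summarization" is the standard pipeline task for BART-style seq2seq summarizers):

```python
SUMMARIZER_ID = "z-dickson/bart-large-cnn-climate-change-summarization"

def load_summarizer(model_id: str = SUMMARIZER_ID):
    """Return a Hugging Face summarization pipeline for the given model id."""
    # Deferred import so this module can be read/tested without transformers.
    from transformers import pipeline
    return pipeline("summarization", model=model_id)

# Usage (downloads the model weights on first call):
# summarizer = load_summarizer()
# summary = summarizer(press_release_text, max_length=60)[0]["summary_text"]
```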
VinAI's bertweet-large model, fine-tuned to predict opposition to COVID-19 policies in US congressmembers' tweets
Model available at: z-dickson/US_politicians_covid_skepticism
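A similar sketch for the tweet classifier (assumes `transformers` is installed; "text-classification" is the usual pipeline task for fine-tuned BERTweet classifiers, an assumption here rather than something stated above):

```python
TWEET_CLASSIFIER_ID = "z-dickson/US_politicians_covid_skepticism"

def load_tweet_classifier(model_id: str = TWEET_CLASSIFIER_ID):
    """Return a Hugging Face text-classification pipeline for the given model id."""
    # Deferred import so this module can be read/tested without transformers.
    from transformers import pipeline
    return pipeline("text-classification", model=model_id)

# Usage (downloads the model weights on first call):
# classifier = load_tweet_classifier()
# label = classifier(tweet_text)[0]["label"]
```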
Bert-base-multilingual-cased model, fine-tuned to predict the CAP issue codes of political text (e.g. bills, speeches, tweets)
Details
Language model trained to predict the CAP issue code of political text. The model was trained on the universe of coded data from the Comparative Agendas Project (huge thanks!) and can accurately predict the CAP code of political text across multiple languages and domains. The model is available on the Hugging Face Model Hub: z-dickson/CAP_multilingual
Bert-base-multilingual-cased sentiment model trained on Polish, English, Spanish, Dutch and German newspaper headlines
Details
Language model trained to predict the sentiment of newspaper headlines in English, Polish, Spanish, Dutch and German. The model is available on the Hugging Face Model Hub: z-dickson/multilingual_sentiment_newspaper_headlines
Datasets
- UK Parliamentary Statutory Instruments: 1970-2021 [github]
- Parliamentary Bills - Dáil Éireann (Ireland): 1950-2020 [github]
- Parliamentary Bills - New Zealand Parliament: 1900-2020 [github]