Russian invasion of Ukraine - topical evaluation of world news sources with machine learning
Date
2022
Publisher
Tartu Ülikool
Abstract
On the morning of 24 February 2022, Russia launched a full-scale invasion of Ukraine.
Fighting erupted in many parts of the country, and Russian forces bombed the
infrastructure of almost every major city. As of August 2022, the conflict is still
ongoing.
The whole world is following the events unfolding in Ukraine through numerous
international news media. Different news sources can present the same event from
different perspectives, depending on factors such as their audience, political agenda,
and degree of freedom of speech.
The goal of this thesis was to collect a dataset of news from such sources and then build a
pipeline for topic modelling and sentiment classification to analyze the differences and
similarities between them. First, we selected several of the most prominent international
news sources and collected a dataset of their news articles. Second, we built a topic
modelling and sentiment analysis pipeline supported by visualization tools. Finally, we
analyzed the output of the pipeline and found differences in the most frequently discussed
topics, in their sentiment, and in how the popularity of these topics changed over time.
The practical contribution of the thesis is twofold: a novel dataset of news from various
sources covering the war, which can be used for further study, and a topical analysis
pipeline that comprises topic modelling and sentiment analysis components.
Keywords
Russia, Ukraine, war, topic modelling, sentiment analysis, text analysis, dataset collection