Russian invasion of Ukraine - topical evaluation of world news sources with machine learning
On the morning of 24 February 2022, Russia launched a full-scale invasion of Ukraine. Fighting broke out in many parts of the country, Russian forces bombed the infrastructure of almost every major city, and as of August 2022 the conflict is still ongoing. The attention of the whole world is focused on the events unfolding in Ukraine, which are reported by numerous international news media. Different news sources can cover the same event from different perspectives, depending on factors such as audience type, political agenda, and degree of freedom of speech. The goal of this thesis was to collect a dataset of news from such sources and to build a pipeline for topic modelling and sentiment classification in order to analyze the differences and similarities between them. First, we selected several of the most prominent international news sources and collected a dataset of news from them. Second, we created a topic modelling and sentiment analysis pipeline supported by visualization tools. Finally, we analyzed the output of the pipeline and found differences in the most frequently discussed topics, in their sentiment, and in how the popularity of these topics changed over time. The practical contribution of the thesis is twofold: a novel dataset of news from various sources covering the war, which can be used for further study, and a topical analysis pipeline consisting of topic modelling and sentiment analysis components.
Russia, Ukraine, war, topic modelling, sentiment analysis, text analysis, dataset collection