Sentiment analysis of selected works by Mark Twain with the statistical software R
Date
2020
Publisher
Tartu Ülikool
Abstract
The aim of this thesis is to explore the possibilities of sentiment analysis by
focusing on a selection of literary works by Mark Twain, such as The Adventures of
Huckleberry Finn; The Adventures of Tom Sawyer; The Prince and the Pauper; Tom
Sawyer Abroad; and Tom Sawyer, Detective. The sentiment analysis is carried out in the
programming language R, using the RStudio program and software packages
available through it. Since sentiment analysis in RStudio is not fully
automated, a script had to be written to perform it. The selected works are
analysed in three ways: firstly, by
identifying the nine most frequent negative and positive sentiment words in each book
separately; secondly, by visualizing the overall distribution of sentiment in each book in
order to categorize the narratives into certain types of stories; thirdly, by portraying the
number of negative and positive sentiment words used.
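
The three steps named in the abstract can be illustrated with a minimal sketch in base R. The thesis script itself is not reproduced here, and the abstract does not name the packages or the sentiment lexicon it relies on, so everything below is an assumption made for illustration: the sample text, the six-word lexicon, the helper `top_words`, and the chunk size of eight tokens are all hypothetical.

```r
# Toy input and toy lexicon: illustrative assumptions, not the thesis data.
text <- paste(
  "Tom was happy and brave. The cave was dark and he felt afraid.",
  "At last they were safe and glad. Happy days."
)
lexicon <- data.frame(
  word = c("happy", "brave", "safe", "glad", "dark", "afraid"),
  sentiment = c("positive", "positive", "positive", "positive",
                "negative", "negative"),
  stringsAsFactors = FALSE
)

# Tokenize: lowercase, then split on anything that is not a letter or apostrophe.
tokens <- unlist(strsplit(tolower(text), "[^a-z']+"))
tokens <- tokens[tokens != ""]

# Keep only tokens found in the lexicon, with their position and polarity.
idx <- match(tokens, lexicon$word)
found <- !is.na(idx)
hits <- data.frame(
  position = which(found),
  word = tokens[found],
  sentiment = lexicon$sentiment[idx[found]],
  stringsAsFactors = FALSE
)

# Step 1: most frequent sentiment words per polarity (hypothetical helper;
# n = 9 mirrors the "nine most frequent" words mentioned in the abstract).
top_words <- function(hits, polarity, n = 9) {
  counts <- sort(table(hits$word[hits$sentiment == polarity]), decreasing = TRUE)
  head(counts, n)
}

# Step 2: net sentiment (positive minus negative hits) per chunk of text,
# a rough proxy for how sentiment is distributed across a narrative.
# Chunks that contain no lexicon words do not appear in the result.
chunk_size <- 8  # assumed value; a real book would use a much larger chunk
chunk <- (hits$position - 1) %/% chunk_size + 1
score <- ifelse(hits$sentiment == "positive", 1, -1)
net_by_chunk <- tapply(score, chunk, sum)

# Step 3: total number of negative and positive sentiment words.
totals <- table(hits$sentiment)
```

Calling `top_words(hits, "positive")` and `top_words(hits, "negative")` gives the frequency tables for step one, and `barplot(net_by_chunk)` is one simple way to draw the distribution for step two. A real analysis would replace the toy lexicon with an established sentiment lexicon and the sample string with the full text of each book.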