Prediction of a movie’s box office using pre-release data
Date
2020
Publisher
Tartu Ülikool
Abstract
It is difficult to overestimate the impact of the film industry on our lives: it expands our
knowledge of the world and of culture, and it entertains. Going to the cinema has become an
important leisure activity. The total worldwide box office in 2018 reached $41B, which is not
surprising given that 11,911 feature-length films were released worldwide in that year alone.
Box office revenue from cinema ticket sales is the main source of profit for widely released
movies. However, not all movies are profitable when the cost of production is compared with
the total box office: 78% of movies released worldwide are not profitable, and 35% of
profitable movies earn 80% of the total profit. Given the importance of theatrical releases and
the tough competition for profit, we want to predict how successful a movie is going to be and
whether it is worth the investment risk. Only data available before release is used, so that a
prediction can be made at the earliest stages. We went through several stages typical of data
mining and machine learning to obtain what is possibly the largest and most feature-rich
dataset used in box office gross prediction. We use neural networks and gradient boosting
machines to predict the absolute box office gross, the range within which it is likely to fall,
and whether a movie will be profitable. The results obtained are very competitive in the domain.
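As an illustration only, the three prediction tasks named in the abstract (absolute gross, gross range, profitability) could be derived from a movie's production budget and box office gross roughly as follows. This is a minimal sketch, not the thesis's implementation: the function name `make_targets`, the boundaries in `RANGE_EDGES`, and the rule that a movie counts as profitable when its gross exceeds its budget are all assumptions made for the example.

```python
from bisect import bisect_right

# Hypothetical range boundaries in US dollars; the abstract does not
# state which ranges the thesis actually uses.
RANGE_EDGES = [1_000_000, 10_000_000, 100_000_000]


def make_targets(budget, gross):
    """Derive the three prediction targets for a single movie."""
    return {
        # Regression target: absolute box office gross.
        "gross": gross,
        # Multi-class target: index of the range the gross falls into,
        # from 0 (below the first edge) to len(RANGE_EDGES).
        "gross_range": bisect_right(RANGE_EDGES, gross),
        # Binary target (assumed definition): gross exceeds the budget.
        "profitable": gross > budget,
    }


example = make_targets(budget=20_000_000, gross=55_000_000)
print(example)
```

A regression model would then be trained on `gross`, and classifiers on `gross_range` and `profitable`, using only features known before release.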
Keywords
Regression, Classification, Motion pictures, Box office, Neural networks, LightGBM