The power of stars: An empirical analysis of successful and flop movies




Tartu Ülikool


Production companies collectively produce thousands of new movies every year. Some perform extremely well, while others turn out to be flops. Because producing a movie is expensive, a flop can be fatal to a production company, so it is essential to produce movies that are more likely to succeed. In this thesis, we explore the features of successful and unsuccessful movies through an empirical analysis of a large movie dataset. Movie data was collected from the Internet Movie Database (IMDb), The Movie Database (TMDb), and Box Office Mojo, and trailer data from the YouTube Data API v3. The final dataset contains 470,743 movies with 26 features and is about 2.1 GB in size. As expected, no single feature determines whether a movie is successful; the outcome depends on many different factors. However, association rule mining revealed many combinations of features that affect a movie's outcome, with the crew, cast, production company, membership in a collection, trailer, genre, and maturity rating all playing a major role.
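
To illustrate what association rule mining over movie features can look like, the following is a minimal sketch and not the thesis's actual pipeline. The toy records, the feature names (`star_cast`, `has_trailer`, `in_collection`), the outcome labels, and the thresholds are all hypothetical, and the rules are enumerated by brute force rather than with an algorithm such as Apriori or FP-growth, which a dataset of this size would need in practice.

```python
from itertools import combinations

# Hypothetical toy records: each movie is a set of categorical features
# plus one outcome label. None of these values come from the real dataset.
OUTCOMES = {"success", "flop"}
movies = [
    {"star_cast", "has_trailer", "success"},
    {"star_cast", "has_trailer", "success"},
    {"star_cast", "flop"},
    {"has_trailer", "flop"},
    {"star_cast", "has_trailer", "in_collection", "success"},
    {"flop"},
]


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= record for record in records) / len(records)


def mine_rules(records, target, min_support=0.3, min_confidence=0.8, max_len=2):
    """Enumerate rules `antecedent -> target` by brute force.

    Returns (antecedent, support, confidence, lift) tuples for rules that
    meet both thresholds.
    """
    # Candidate antecedent items are all features except the outcome labels.
    items = sorted(set().union(*records) - OUTCOMES)
    target_support = support({target}, records)
    rules = []
    for size in range(1, max_len + 1):
        for antecedent in combinations(items, size):
            antecedent_support = support(antecedent, records)
            rule_support = support(set(antecedent) | {target}, records)
            if antecedent_support == 0 or rule_support < min_support:
                continue
            confidence = rule_support / antecedent_support
            if confidence >= min_confidence:
                lift = confidence / target_support
                rules.append((antecedent, rule_support, confidence, lift))
    return rules


rules = mine_rules(movies, "success")
for antecedent, sup, conf, lift in rules:
    print(f"{set(antecedent)} -> success  "
          f"support={sup:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

On this toy data, neither `star_cast` nor `has_trailer` alone reaches the confidence threshold, but their combination does, which mirrors the observation that combinations of features, rather than any single feature, are associated with the outcome.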



successful movies, unsuccessful movies, movie characteristics, empirical analysis, association rule mining