Movie Quality Predictor


A study to find the secret of making high quality movies.

The Magic Formula


Ever since the first motion picture camera was invented roughly 120 years ago, film, one of the richest forms of human expression and now a gigantic industry generating tens of billions of dollars every year, has witnessed numerous technological and artistic developments and entertained almost everyone. Today, while many filmmakers strive to produce movies of the highest quality, the industry's stakeholders are increasingly focused on commercial success. As a result, very few studies have examined how to make high quality films, as opposed to how to achieve box office success. Aspiring to discover the formula for producing high quality movies, our team explores multiple features of film production and uses machine learning models to advance understanding of this topic. We believe our work will benefit film producers, directors, companies, and students seeking the magic formula for crafting high quality films.

Features & Data


Dataset

We compiled a list of over 8000 films produced in the USA since 1976 and, after applying a number of filters, arrived at 4476 movies for our study. All features are generated from the IMDb database or parsed from Wikipedia pages. (Please see our report for details.)

Labels

IMDb rating is widely used by critics and consumers as an indicator of movie quality that is relatively independent of commercial success, so we derive our movie quality labels directly from IMDb ratings. Given the distribution of IMDb ratings, we classified instances in two ways and conducted experiments on each.

Quaternary Quality Class Labels:

A: rating >= 7.1, top quartile
B: 6.4 <= rating < 7.1, second quartile
C: 5.7 <= rating < 6.4, third quartile
D: rating < 5.7, bottom quartile

Binary Quality Class Labels:

A: rating >= 6.7, top 40%, high quality
B: rating < 6.7, lower 60%, low quality
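The two labeling schemes above can be sketched as simple threshold functions (a hypothetical illustration; the cutoffs are the quartile and percentile boundaries quoted above):

```python
def quaternary_label(rating):
    """Assign A-D based on the quartile cutoffs of the IMDb rating distribution."""
    if rating >= 7.1:
        return "A"
    elif rating >= 6.4:
        return "B"
    elif rating >= 5.7:
        return "C"
    else:
        return "D"

def binary_label(rating):
    """Assign A (top 40%, high quality) or B (lower 60%) using the 6.7 cutoff."""
    return "A" if rating >= 6.7 else "B"
```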

Features

Nine features are used: Year, Runtime, Number of Languages, Main Genre, Number of Genres, Award Index of Director, Award Index of Top Three Actors and Actresses, Box Office, and Budget. The first five are queried from the IMDb database; the last four are parsed from Wikipedia pages. Award indices of directors, actors, and actresses are generated from counts of keywords such as Oscar, Golden Globe, and Awards.
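A minimal sketch of the award-index idea, assuming the index is a plain count of award-related keywords in a person's Wikipedia page text (the keyword list here is illustrative, not our exact one):

```python
import re

# Illustrative keyword list; the real pipeline may use a different set.
AWARD_KEYWORDS = ["Oscar", "Golden Globe", "Awards"]

def award_index(page_text):
    """Sum the occurrences of each award keyword in the given page text."""
    return sum(len(re.findall(re.escape(kw), page_text)) for kw in AWARD_KEYWORDS)
```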

Training & Testing


We conducted experiments on both the quaternary and binary class label datasets using a number of different models. The quaternary labels divide movies into finer, more informative quality classes, at the cost of lower prediction accuracy. This does not necessarily make the model less useful, since each classification carries more information. In both cases, the models show that features such as the award indices of the director and actors, as well as the budget, have significant predictive power for movie quality as captured by our rating classes.

We report the models that yielded considerable accuracy improvements over the base accuracy for each case (25% and 60%, given how we partitioned the quality classes). The parameters of each model were tuned to produce its best performance. The multilayer perceptron neural network achieves the best accuracy, with a slight advantage over decision tree algorithms, especially for the quaternary class labels. We attribute this difference to our dataset being too small for tree models to perform well on the larger output space (quaternary vs. binary). Please see the report for a more detailed discussion of the different models.
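The base accuracies quoted above are ZeroR baselines: always predict the majority class. A minimal sketch (assuming the quaternary classes are balanced quartiles and the binary split is 40/60, as described under Labels):

```python
from collections import Counter

def zero_r_accuracy(labels):
    """Accuracy of always guessing the most common label in the dataset."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)

# Four balanced quartile classes -> 25%; 40/60 binary split -> 60%.
```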

Future Steps


  • Include more features that have an empirical impact on movie quality, such as special effects techniques and sound/videography effects. In particular, since movies are made by many people in myriad roles, finding a way to factor in all the human contributions to a production should be very effective.

  • Refine the calculation of the award index for actors and directors. Award indices have proven very effective, but right now we do a simple count of a few keywords. This may capture the overall award concept, but it does not differentiate or weight different awards, distinguish nominations from wins, or account for the date of the award.

  • Expand our dataset to include more movies from different periods, regions, and kinds, and fill in missing data in the existing dataset. This should increase both the performance and the usability of our model.

  • Divide the labels into more classes, such as 6 or even 10. This will lower the testing accuracy, since the ZeroR base accuracy will be much lower, but the classification result will be more useful because the user can infer more information from it.

  • Experiment with more training methods. The neural network has proven to be the most effective, but there are still many settings of this model that should be tested in hopes of achieving better results.

Contact us


This project is created for EECS 349 Machine Learning at Northwestern University.

Guixing Lin
guixinglin2018@u.northwestern.edu

Junhan Liu
junhanliu2015@u.northwestern.edu

Ruohong Zhang
ruohongzhang2017@u.northwestern.edu

Yi Zhang
yizhang2017@u.northwestern.edu