Mathematical Statistics and Mathematics student Sean Cranston is celebrating the acceptance of his abstract by the National Conference on Undergraduate Research.
He carried out the research project “Build an artificial intelligence-based movie recommendation engine with machine learning algorithms” under the supervision of Dr. Mengshi Zhou, assistant professor of Data Science/Analytics.
In this project, Cranston drew on multiple data science tools and techniques, including recommendation systems, machine learning, data integration, and the R Shiny package.
In accepting the abstract, the National Conference on Undergraduate Research noted that the abstract demonstrates a unique contribution to the field of study.
The project was funded by the National Science Foundation ACCESS S-STEM Scholarship grant.
The National Conference on Undergraduate Research is a gathering of student scholars where undergraduate student achievements are celebrated and promoted by showcasing exemplary models of research, scholarship and creative activity.
Read the abstract:
Build an artificial intelligence-based movie recommendation engine with machine learning algorithms
Sean Cranston and Mengshi Zhou
In the technology age we have seen an explosion of information and data, which presents us with an obstacle: how do we find what we are looking for? Recommendation systems can solve the information overload problem by guiding people toward relevant information through predictive modeling. The goal of this study is to build an artificial intelligence (AI)-based recommendation engine to find movies of interest to users. Several articles can be found on recommendation systems. However, most of them studied a single class of recommendation algorithms. In this study, we systematically compared various types of algorithms and deployed them for real-world use. Our study includes data preprocessing, data visualization, hyperparameter optimization, model comparison, and deployment of models in a web app. We analyzed a large dataset consisting of 105,339 ratings on 10,329 movies. Cross-validation experiments were used to explore and compare the performance of different machine learning algorithms on the dataset. The popular algorithm, which recommends movies based on their popularity, was the most effective as measured by ROC curves and precision-recall curves. This result is important because it shows that we don’t always need a complicated algorithm to have the best model. We then built a recommendation engine using the popular algorithm. Finally, we developed a Shiny app for our AI-based recommendation engine.
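To give a sense of how simple a popularity-based recommender can be, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not say how popularity was measured, so this sketch assumes it means the number of ratings a movie has received. The function name `recommend_popular` and the toy ratings are hypothetical and invented for illustration only.

```python
from collections import Counter

# Toy (user, movie, rating) triples, invented for illustration only.
ratings = [
    ("ann", "Movie A", 4.0),
    ("ann", "Movie B", 3.5),
    ("bob", "Movie A", 5.0),
    ("bob", "Movie C", 2.0),
    ("cat", "Movie A", 4.5),
    ("cat", "Movie B", 4.0),
    ("dan", "Movie D", 3.0),
]


def recommend_popular(ratings, user, n=2):
    """Return up to n of the most-rated movies that `user` has not rated yet.

    Assumption: "popularity" is the count of ratings per movie.
    """
    counts = Counter(movie for _, movie, _ in ratings)
    seen = {movie for u, movie, _ in ratings if u == user}
    # Sort by rating count (descending); break ties by title so the
    # ordering is deterministic.
    ranked = sorted(counts, key=lambda m: (-counts[m], m))
    return [m for m in ranked if m not in seen][:n]


print(recommend_popular(ratings, "dan"))
```

Every user receives the same ranked list, minus the movies that user has already rated, which is why this approach needs no per-user model.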