Football match prediction through the use of statistical analysis and machine learning

Duyile, Akintunde (2017) Football match prediction through the use of statistical analysis and machine learning. [Dissertation (University of Nottingham only)]

Abstract

There is a large market for accurately predicting the results of football matches. Bookmakers thrive on this by laying bets based on their in-house models, even though the information they use is public. Academics in this field have made many attempts to develop accurate models using machine learning algorithms, but the majority have focused on a specific range of features.

Evidence suggests that football teams follow similar trends at particular intervals during a season. The aim of this study was therefore to determine whether more accurate models can be produced by using a historical trend metric as a feature for machine learning algorithms.

The problem was addressed by first determining which historical trend metric to use. After considering different intervals and different game sets, overall and monthly intervals were identified as the most suitable metric. Following this, matches from a number of previous seasons and different English leagues were modified to include different form features (varying the number of games considered for the form factor) and, in separate datasets, both the form features and the historical trend feature.
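
As a rough illustration of the two kinds of feature described above, the following Python sketch computes an N-game form value and an average monthly points-per-game figure for a single team. The function names, the input layout, and the 3/1/0 points scheme are assumptions made for illustration; the abstract does not state how the dissertation computed either feature.

```python
from collections import defaultdict

# Standard league points per result (assumed scoring scheme).
POINTS = {"W": 3, "D": 1, "L": 0}


def form_feature(results, n_games):
    """Points per game over the n_games matches before each match.

    results: chronological list of "W"/"D"/"L" for one team.
    Returns a list aligned with results; entries are None until
    n_games earlier matches exist, so no match sees its own outcome.
    """
    points = [POINTS[r] for r in results]
    form = []
    for i in range(len(points)):
        if i < n_games:
            form.append(None)
        else:
            window = points[i - n_games:i]
            form.append(sum(window) / n_games)
    return form


def monthly_points_per_game(matches):
    """Average points per game for each calendar month.

    matches: iterable of (month, result) pairs from past seasons,
    where month is 1-12 and result is "W"/"D"/"L".
    """
    totals = defaultdict(lambda: [0, 0])  # month -> [points, games]
    for month, result in matches:
        totals[month][0] += POINTS[result]
        totals[month][1] += 1
    return {m: pts / games for m, (pts, games) in totals.items()}


# Hypothetical data, not taken from the dissertation.
results = ["W", "W", "D", "L", "W"]
print(form_feature(results, 3))

history = [(8, "W"), (8, "D"), (9, "L"), (9, "W"), (9, "W")]
print(monthly_points_per_game(history))
```

Excluding the current match from its own form window keeps the feature free of information that would not be available before kick-off.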

After the datasets were prepared, they were used with the k-Nearest Neighbours and Random Forest machine learning algorithms. Both algorithms showed, first, that the 7-game form dataset produced the most accurate results and, ultimately, that the historical trend metric did in fact increase the accuracy of the models.
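
To make the modelling step more concrete, here is a minimal k-Nearest Neighbours classifier in plain Python. It is a sketch only: the feature vectors (home form, away form, trend value), the labels, and the choice of k are invented for illustration and are not data or settings from the dissertation, which also evaluated Random Forest.

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, query, k=3):
    """Predict a label for query by majority vote of its k nearest
    training points, using Euclidean distance."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical rows: (home form, away form, home trend metric).
train_X = [
    (2.4, 0.9, 2.1),
    (2.1, 1.0, 1.9),
    (1.0, 2.3, 0.8),
    (0.7, 2.0, 1.1),
    (1.4, 1.5, 1.3),
    (1.5, 1.4, 1.4),
]
train_y = ["H", "H", "A", "A", "D", "D"]  # home win, away win, draw

print(knn_predict(train_X, train_y, (2.2, 1.0, 2.0), k=3))
```

Comparing accuracy with and without the trend column on held-out matches is one way the effect of the extra feature could be measured.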

Item Type: Dissertation (University of Nottingham only)
Keywords: Average Monthly Points per Game, Comma-Separated Values, k-Nearest Neighbour classifier, Multinomial Logistic Regression, Naïve Bayes classifier, Positive Predictive Value, Random Forest, Support Vector Machine classifier, True Positive Rate
Depositing User: Gonzalez-Orbegoso, Mrs Carolina
Date Deposited: 15 Jan 2018 12:46
Last Modified: 17 Jan 2018 23:29
URI: https://eprints.nottingham.ac.uk/id/eprint/49105
