This project contains the source code and documentation for my third-year dissertation project, in which I attempted to predict the outcome of Premier League games using Twitter feeds as well as historical statistics.
- This setup assumes you have downloaded this package and that your current directory is the one containing this README.md.
- The first step is to install virtualenvwrapper using `pip install virtualenvwrapper`.
- Then enter the `mkvirtualenv <project_name>` command, replacing `<project_name>` with a name of your choice.
- The command might need to be sourced first. In that case, export some variables and source the script:
  - `export WORKON_HOME=$PWD/<project_name>`
  - `export PROJECT_HOME=$PWD`
  - `source /usr/local/bin/virtualenvwrapper.sh`
- Activate this new environment using the `source ./venv/<project_name>/bin/activate` command.
- Install the required dependencies using the provided pip: `./venv/<project_name>/bin/pip install -r requirements.txt`.
- Install nltk_data (used for Twitter analysis): `./venv/<project_name>/bin/python -m nltk.downloader all`. Note: this takes a while.
- Use the provided Python binary to start the server: `./venv/<project_name>/bin/python run.py`.
- Navigate to 127.0.0.1:5000 and begin using the project.
The `config.py` file contains the API keys and settings the project uses.
- `FOOTBALL_DATA_KEY` can be obtained from: http://www.football-data.org/
- `FOOTBALL_API_KEY` can be obtained from: https://football-api.com/
- `CROWD_SCORE_KEY` can be obtained from: http://fastestlivescores.com/live-scores-api-feed/
- Twitter keys and secrets can be obtained by creating a new app at: https://apps.twitter.com/
- `DEBUG` can be `True` or `False`, depending on whether you want logging info output to the console.
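
As a rough illustration of the settings listed above, a `config.py` might look like the sketch below. The three football key names and `DEBUG` come from this README. The four Twitter variable names are assumptions, since the README does not say what they are called, and every value is a placeholder.

```python
# config.py (illustrative sketch only; every value is a placeholder)

# Names taken from the list above.
FOOTBALL_DATA_KEY = "your-football-data-key"
FOOTBALL_API_KEY = "your-football-api-key"
CROWD_SCORE_KEY = "your-crowd-score-key"

# Hypothetical names: check the real config.py for what the
# Twitter keys and secrets are actually called.
TWITTER_CONSUMER_KEY = "your-consumer-key"
TWITTER_CONSUMER_SECRET = "your-consumer-secret"
TWITTER_ACCESS_TOKEN = "your-access-token"
TWITTER_ACCESS_TOKEN_SECRET = "your-access-token-secret"

# True prints logging info to the console; False keeps it quiet.
DEBUG = False
```

Real keys should be kept out of version control.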
There are three folders.
- The `data` folder contains a JSON file for each team once parsed. It is only created when `DEBUG` is `True`, as writing to file is expensive in terms of I/O operations.
- The `static` folder contains all the assets required to render an HTML page, such as CSS, JS and images. They can be cached for quicker loading.
- The `templates` folder contains a template for each route. Variables are passed to the template, which uses the data returned by functions to dynamically create the required HTML page.
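
The `DEBUG`-gated writing to the `data` folder could be sketched roughly as follows. This is not the project's actual code: the function name `dump_team`, its parameters and the one-file-per-team naming are assumptions made for illustration.

```python
import json
import os

DEBUG = True  # in the project this value would come from config.py


def dump_team(team_name, team_data, folder="data", debug=DEBUG):
    """Write team_data to <folder>/<team_name>.json, but only in debug mode.

    Returns the path written, or None when debug is off.
    (Hypothetical helper, not taken from the project source.)
    """
    if not debug:
        # Skip the file I/O entirely when debugging is disabled.
        return None
    os.makedirs(folder, exist_ok=True)  # create the folder on first use
    path = os.path.join(folder, team_name + ".json")
    with open(path, "w") as handle:
        json.dump(team_data, handle, indent=2)
    return path
```

With `debug=False` the function returns immediately, so neither the folder nor any file is created.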
The bulk of the work is done by the remaining five Python files.
- The entry point `run.py` contains the routes. It is the connection between the front end and the back end. Once it gets the path from the browser, it uses functions to obtain data, which it then bundles into variables to pass to the templates.
- The `run.py` file talks to two helper modules: `functions.py` and `util.py`. `util.py` is more of a look-up dictionary that avoids repeated code and contains utility objects. `functions.py`, on the other hand, does the calculations before returning the final values to `run.py`. Most of the API requests that determine the match information come from `functions.py`.
- `historic_data.py` is responsible for collecting match-specific data as well as in-depth information on specific teams. It then uses this information to create a score for win, loss or draw.
- `twitter_data.py` is similar to `historic_data.py` except that it analyses tweets to determine positive and negative scores. However, its output has the same form as that of `historic_data.py`, so that those values can be processed by `functions.py`.
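
Because both data modules return output of the same shape, a combining step in the style of `functions.py` could look something like the sketch below. This README does not describe how the scores are actually combined, so the dictionary layout (`win`, `draw`, `loss`), the weighted average, the default weight and the function names are all assumptions.

```python
OUTCOMES = ("win", "draw", "loss")


def combine_scores(historic, twitter, twitter_weight=0.3):
    """Blend two {"win", "draw", "loss"} score dicts and normalise the result.

    Hypothetical helper: twitter_weight is the share given to the Twitter
    scores, and the remainder goes to the historic scores.
    """
    blended = {
        outcome: (1 - twitter_weight) * historic[outcome]
        + twitter_weight * twitter[outcome]
        for outcome in OUTCOMES
    }
    total = sum(blended.values())
    if total == 0:
        # No evidence either way: fall back to equal shares.
        return {outcome: 1 / len(OUTCOMES) for outcome in OUTCOMES}
    return {outcome: value / total for outcome, value in blended.items()}


def predict(historic, twitter, twitter_weight=0.3):
    """Return the outcome with the highest combined score."""
    combined = combine_scores(historic, twitter, twitter_weight)
    return max(combined, key=combined.get)


historic = {"win": 0.5, "draw": 0.3, "loss": 0.2}
twitter = {"win": 0.2, "draw": 0.2, "loss": 0.6}
print(predict(historic, twitter))  # prints "win"
```

Keeping the two inputs in one shared format means the combining step does not need to know which module produced which scores.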