Simultaneous Time Series Forecasting on the World's COVID-19 Daily Vaccinations

About

Dataset: COVID-19 World Vaccination Progress
This is my project for the Data Mining Course (COSC-526). The main code is in this Jupyter Notebook.

Code Locations

The dataset is in the datasets/covid-world-vaccinations-progress directory
The metadata dataset is in the datasets/countries-of-the-world directory
The jupyter notebook used is the project.ipynb
Some custom packages used in the notebook are located in the data_mining directory:
- Project Utils:
  - NullsFixer: for inferring the nulls in the COVID-19 vaccination dataset
  - Preprocess: the preprocessing code of the dataset before training
  - BuildModel: contains all the functions related to the building of the TF model
  - Visualizer: the implementations of all the visualizations
- Configuration: it handles the yml configuration
- ColorizedLogger: code for formatted logging that saves output in log files
- timeit: ContextManager+Decorator for timing functions and code blocks
The project was compiled using my Template Cookiecutter project: https://github.com/drkostas/starter

Document Locations

The extended abstract and the poster are both located in the Documents folder.

Information About The Dataset

The COVID-19 Vaccination Progress Dataset contains information about the daily and total vaccinations of 193 different countries over 135 different dates. The data are being collected almost daily and of writing this (4/29), the dataset has 14230 rows and 15 different features.

The features of the dataset are the following:

Country: this is the country for which the vaccination information is provided
Country ISO Code: ISO code for the country
Date: date for the data entry; for some dates we have only the daily vaccinations, for others, only the (cumulative) total
Total number of vaccinations: this is the absolute number of total immunizations in the country
Total number of people vaccinated: a person, depending on the immunization scheme, will receive one or more (typically 2) vaccines; at a certain moment, the number of vaccination might be larger than the number of people
Total number of people fully vaccinated: this is the number of people that received the entire set of immunization according to the immunization scheme (typically 2); at a certain moment in time, there might be a certain number of people that received one vaccine and another number (smaller) of people that received all vaccines in the scheme
Daily vaccinations (raw): for a certain data entry, the number of vaccination for that date/country
Daily vaccinations: for a certain data entry, the number of vaccination for that date/country
Total vaccinations per hundred: ratio (in percent) between vaccination number and total population up to the date in the country
Total number of people vaccinated per hundred: ratio (in percent) between population immunized and total population up to the date in the country
Total number of people fully vaccinated per hundred: ratio (in percent) between population fully immunized and total population up to the date in the country
Number of vaccinations per day: number of daily vaccination for that day and country
Daily vaccinations per million: ratio (in ppm) between vaccination number and total population for the current date in the country
Vaccines used in the country: total number of vaccines used in the country (up to date)
Source name: source of the information (national authority, international organization, local organization etc.)
Source website: website of the source of information

For recalculating the per hundred people values we used another dataset that contains some metadata about the countries of the world, including their population.
Metadata Dataset: DataBank - World Development Indicators

Getting Started

These instructions will get you a copy of the project up and running on your machine.

Prerequisites

You need to have a machine with Python >= 3.6 and any Bash based shell (e.g. zsh) installed.

$ python3.6 -V
Python 3.6.13

$ echo $SHELL
/usr/bin/zsh

Setting Up

All the installation steps are being handled by the Makefile. The server=local flag basically specifies that you want to use conda instead of venv, and it can be changed easily in the lines #25-28. local is also the default flag, so you can omit it.

$ make install server=local

To update the COVID-19 vaccination dataset with the latest information, run:

$ make download_dataset server=local

Running the code

In order to run the code, you will only need to modify the yml file if you need to, and open a jupyter server.

Modifying the Configuration

There is an already configured yml file under confs/covid.yml with the following structure:

tag: project
covid-progress:
  - properties:
      data_path: datasets/covid-world-vaccination-progress/country_vaccinations.csv
      data_extra_path: datasets/world-bank/data.csv
      log_path: logs/covid_progress.log
    type: csv

Running Jupyter

After loading the cond environment with the command conda activate data_mining, run jupyter notebook and open the project.ipynb file.

TODO

Read the TODO to see the current task list.

Built With

Jupyter - An interactive computing framework
Tensorflow - A deep learning framework

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Thanks to PurpleBooth for the README template

Name		Name	Last commit message	Last commit date
Latest commit History 31 Commits
Documents		Documents
confs		confs
data_mining		data_mining
datasets		datasets
img		img
outputs		outputs
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
TODO.md		TODO.md
env_example		env_example
project.ipynb		project.ipynb
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Simultaneous Time Series Forecasting on the World's COVID-19 Daily Vaccinations

Table of Contents

About

Code Locations

Document Locations

Information About The Dataset

Getting Started

Prerequisites

Setting Up

Running the code

Modifying the Configuration

Running Jupyter

TODO

Built With

License

Acknowledgments

About

Releases

Packages

Contributors 2

Languages

License

drkostas/covid19-vaccinations-predict

Folders and files

Latest commit

History

Repository files navigation

Simultaneous Time Series Forecasting on the World's COVID-19 Daily Vaccinations

Table of Contents

About

Code Locations

Document Locations

Information About The Dataset

Getting Started

Prerequisites

Setting Up

Running the code

Modifying the Configuration

Running Jupyter

TODO

Built With

License

Acknowledgments

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages