Skip to content

Data visualization and prediction of crime data in Vancouver

Notifications You must be signed in to change notification settings

ywan3223/CrimeData-Vancouver

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Vancouver Crime Data Analysis and Prediction

Contents

  • SQL Scripts: Contains SQL scripts for database schema creation, including tables for crimes, perpetrators, and their relationships.
  • Data Warehousing: Includes files and documentation on the setup of Druid for storing and analyzing weather-related data.
  • ETL Processes: Scripts and workflows for extracting, transforming, and loading data into the database and data warehouse.
  • Data Visualization: Dashboards and visualizations created with Apache Superset to explore crime data in relation to weather conditions.
  • Predictive Analytics: Jupyter notebooks and scripts for machine learning models predicting neighborhood crime occurrences.

Installation

  • PostgreSQL for database management.
  • Druid and Apache Superset for data warehousing and visualization.
  • Python with libraries such as Pandas, Scikit-learn for data processing and machine learning.

Dataset

The dataset is collected from the Vancouver Open Data Catalogue, which contains the criminal records from 2003 to 2019 counted by the Vancouver Police Department. There are 530652 criminal records in it, excluding the total number of calls or complaints made to VPD, only including the crime categories described in the attributes. Some crimes are excluded for privacy and investigation reasons.

Logical model

cmo

Physical model

cmo

Data model

cmo

Data Warehouse Architecture

The project utilizes Vancouver Crime Data for data analysis, employing Airflow for ETL processes from operational databases to Druid-based data warehouse. It features an OLAP server for multi-dimensional analysis, identifying trends through a multidimensional model. Superset finalizes the architecture with intuitive data visualization and interactive dashboards, enhancing data exploration and insight discovery.

cmo

Data Warehouse population

Materialize Cubes

Visualization

cmo

cmo

cmo

cmo

About

Data visualization and prediction of crime data in Vancouver

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published