Project Title: Designing Advanced Data Architectures for Business Intelligence - Motor Vehicle Collisions/Crashes Analysis
Introduction: This project aims to design and implement advanced data architectures for analyzing motor vehicle collisions and crashes data from three major cities: New York, Chicago, and Austin. The data will be obtained from the respective Department of Transportation portals of each city. The project will involve data extraction, transformation, loading (ETL), dimensional modeling, and visualization using tools like Alteryx, Talend, Azure SQL Server/MySQL/SQL Server, Tableau, and Power BI.
Project Details:
- Data Sources:
- Motor Vehicle Collisions - Crashes | NYC Open Data (cityofnewyork.us)
- Austin Crash Report Data - Crash Level Records | Open Data | City of Austin Texas
- Traffic Crashes - Crashes | City of Chicago | Data Portal
Project Objectives:
- Determine the total number of accidents in each city.
- Present accident data effectively on a dashboard.
- Identify areas within each city with the highest number of accidents.
- Analyze accidents resulting in injuries.
- Investigate pedestrian involvement in accidents.
- Determine peak times for accidents (seasonality).
- Analyze injuries and fatalities among motorists.
- Identify areas with the highest fatality rates.
- Conduct time-based analysis of accidents.
- Analyze factors contributing to accidents.
Project Timeline:
- Part 1:
- Tasks:
- Data profiling using Alteryx/ydata profile
- Analysis document
- Data staging (Staging tables)
- ETL jobs using Talend
- Incorporation of standard practices
- Dimensional modeling (Facts and Dimensions)
- Tasks:
- Part 2:
- Tasks:
- Staging to Integration
- Validation of dimensional data
- Query dimensional data model for business questions
- Tasks:
- Part 3:
- Tasks:
- Visualization using Tableau and Power BI
- Report publication (optional)
- Submission of screenshots and source workbooks
- Tasks:
Project Deliverables:
- Data profiling reports
- Analysis document
- Staging tables
- ETL jobs scripts
- Dimensional model (Facts and Dimensions)
- Mapping document
- SQL scripts
- Validation scripts
- Visualization dashboards in Tableau : https://public.tableau.com/app/profile/yuga.santosh.kanse/viz/MotorVehicleCollisionAnalysis_17152030344110/Dashboard-Q104 and Power BI
- Submission in a zip file
Project Notes:
- Configure at least one dimension as SCD2 (Slowly Changing Dimension)
- Handle null values appropriately
- Maintain Source DIM table and audit columns
- Ensure row counts match the file rows
- Submit as a team, with one person responsible for submission
Project Support:
- Reach out for any clarification or assistance required.
- Utilize provided templates for mapping documents if needed.