NYC---bikes-analysis

Project for the Structured Data Processing course at the Warsaw University of Technology.

Authors: Łukasz Lepianka, Marta Szuwarska.

Data source: Citi Bike.

Łukasz:

I conducted two analyses in R:

  1. First, I looked at how the COVID-19 pandemic influenced bike travel. I extracted ride counts and ride times for every March from 2019 to 2022 and presented them on interactive plots.

  2. The second analysis presented the number of bike uses on an interactive map of NYC divided into neighbourhoods. I created several maps, two of which showed movement during the morning rush hours. From these I could tell which neighbourhoods are more residential and which are more focused on offices and entertainment facilities.
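The March comparison from the first analysis can be sketched in a few lines. The original analysis was done in R; this is only an illustrative Python version using the standard library. The field names `starttime` and `tripduration` (in seconds) are assumptions about the input, and the sample rides are made up.

```python
from collections import defaultdict
from datetime import datetime


def march_summary(rides):
    """Return {year: (ride_count, mean_ride_time_minutes)} for rides started in March."""
    totals = defaultdict(lambda: [0, 0.0])  # year -> [ride count, summed seconds]
    for ride in rides:
        started = datetime.fromisoformat(ride["starttime"])  # assumed field name
        if started.month != 3:
            continue  # only March rides are compared across years
        bucket = totals[started.year]
        bucket[0] += 1
        bucket[1] += ride["tripduration"]  # assumed to be in seconds
    return {
        year: (count, seconds / count / 60)
        for year, (count, seconds) in sorted(totals.items())
    }


# Made-up sample rides, for illustration only.
sample_rides = [
    {"starttime": "2019-03-01 08:15:00", "tripduration": 600},
    {"starttime": "2019-03-15 18:00:00", "tripduration": 1200},
    {"starttime": "2019-04-01 09:00:00", "tripduration": 300},
    {"starttime": "2020-03-20 12:00:00", "tripduration": 900},
]
summary = march_summary(sample_rides)
print(summary)
```

The resulting per-year pairs are the kind of values that could then be fed to a plotting library.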

Both analyses were transferred to a Shiny app.
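One possible way to read the morning rush-hour maps from the second analysis is through net flow: neighbourhoods that lose bikes in the morning look residential, while those that gain bikes look like office areas. The sketch below is an assumption about how that could be computed, not the method used in the R code. The 07:00 to 09:59 weekday window, the field names, and the sample rides are all hypothetical, and each ride is assumed to already carry its start and end neighbourhood.

```python
from collections import Counter
from datetime import datetime

MORNING_RUSH_HOURS = range(7, 10)  # assumed window: rides starting 07:00-09:59


def morning_net_flow(rides):
    """Return {neighbourhood: arrivals - departures} for weekday morning rides."""
    net = Counter()
    for ride in rides:
        started = datetime.fromisoformat(ride["starttime"])
        is_weekend = started.weekday() >= 5  # Saturday or Sunday
        if is_weekend or started.hour not in MORNING_RUSH_HOURS:
            continue
        net[ride["start_neighbourhood"]] -= 1  # a bike leaves this area
        net[ride["end_neighbourhood"]] += 1    # a bike arrives in this area
    return dict(net)


# Made-up sample rides; 2019-03-04 is a Monday and 2019-03-02 a Saturday.
sample_rides = [
    {"starttime": "2019-03-04 08:05:00", "start_neighbourhood": "Astoria", "end_neighbourhood": "Midtown"},
    {"starttime": "2019-03-04 08:40:00", "start_neighbourhood": "Astoria", "end_neighbourhood": "Midtown"},
    {"starttime": "2019-03-04 09:10:00", "start_neighbourhood": "Midtown", "end_neighbourhood": "Astoria"},
    {"starttime": "2019-03-04 14:00:00", "start_neighbourhood": "Midtown", "end_neighbourhood": "Astoria"},
    {"starttime": "2019-03-02 08:00:00", "start_neighbourhood": "Astoria", "end_neighbourhood": "Midtown"},
]
net_flow = morning_net_flow(sample_rides)
print(net_flow)
```

A negative value suggests a neighbourhood people leave in the morning, and a positive value one they travel to.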

While working on this project I mainly learned how to create a simple Shiny app, how to create interactive maps with leaflet, and how to work with spatial objects.

Marta:

  1. To start, I cleansed the data by deleting empty and invalid rows (e.g. rows with an age below 5 or an average speed over 40 km/h). All our analyses were conducted on the cleansed data.

[Figure: CleaningData2019]
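The two example rules above (age below 5, average speed over 40 km/h) can be expressed as a small filter. This is a minimal sketch using only the standard library, not the project's actual cleaning code. The column names `age`, `distance_km` and `duration_s` are hypothetical, the zero-duration check is an extra assumption added to avoid division by zero, and the sample rows are made up.

```python
MIN_AGE = 5             # from the text: ages below 5 are treated as invalid
MAX_SPEED_KMH = 40.0    # from the text: average speeds over 40 km/h are invalid
REQUIRED = ("age", "distance_km", "duration_s")  # hypothetical column names


def average_speed_kmh(row):
    """Average speed in km/h from distance in km and duration in seconds."""
    return row["distance_km"] / (row["duration_s"] / 3600)


def is_valid(row):
    if any(row.get(key) is None for key in REQUIRED):
        return False  # empty or incomplete row
    if row["duration_s"] <= 0:
        return False  # extra assumption: avoids division by zero
    return row["age"] >= MIN_AGE and average_speed_kmh(row) <= MAX_SPEED_KMH


def clean(rows):
    """Keep only the rows that pass every validity check."""
    return [row for row in rows if is_valid(row)]


# Made-up sample rows, for illustration only.
raw_rows = [
    {"age": 34, "distance_km": 3.0, "duration_s": 900},    # 12 km/h, kept
    {"age": 3, "distance_km": 1.0, "duration_s": 600},     # age below 5
    {"age": 28, "distance_km": 15.0, "duration_s": 600},   # 90 km/h
    {"age": None, "distance_km": 2.0, "duration_s": 700},  # missing age
    {"age": 45, "distance_km": 2.0, "duration_s": 0},      # zero duration
]
cleaned = clean(raw_rows)
print(len(raw_rows), "->", len(cleaned))
```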

  2. Then I moved on to an age analysis, in which I compared cyclists' ages with the distance they covered and their average speed.

[Figures: AgeAnalysisDistance, AgeAnalysisSpeed]
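An age comparison of this kind could, for example, group rides into age bands and average distance and speed per band. The sketch below assumes ten-year bands and hypothetical column names (`age`, `distance_km`, `speed_kmh`); the project may have grouped the data differently, and the sample rows are made up.

```python
from collections import defaultdict
from statistics import mean


def summarise_by_age_band(rows, band_width=10):
    """Return {band_start: (mean_distance_km, mean_speed_kmh)} per age band."""
    groups = defaultdict(list)
    for row in rows:
        band_start = row["age"] // band_width * band_width  # e.g. 24 -> 20
        groups[band_start].append(row)
    return {
        band: (
            mean(r["distance_km"] for r in members),
            mean(r["speed_kmh"] for r in members),
        )
        for band, members in sorted(groups.items())
    }


# Made-up sample rows, for illustration only.
sample_rows = [
    {"age": 24, "distance_km": 2.0, "speed_kmh": 14.0},
    {"age": 29, "distance_km": 4.0, "speed_kmh": 16.0},
    {"age": 51, "distance_km": 3.0, "speed_kmh": 11.0},
]
age_summary = summarise_by_age_band(sample_rows)
print(age_summary)
```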

  3. Finally, I built a simple predictive model using logistic regression to predict the user type (users with and without an annual subscription).

[Figure: usertypepredictionaccuracy2019]
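To illustrate what a logistic regression classifier for user type computes, here is a from-scratch sketch trained with plain gradient descent. It is a stand-in written with the standard library only; the project itself used scikit-learn, where `sklearn.linear_model.LogisticRegression` would normally do this work. The two features (ride duration in hours, weekend flag), the labels, and all sample values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(features, labels, learning_rate=0.5, epochs=5000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n_samples, n_features = len(features), len(features[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(features, labels):
            predicted = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            error = predicted - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def predict(weights, bias, x):
    """Return 1 (subscriber) if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) >= 0.5 else 0


# Hypothetical features: [ride duration in hours, 1 if weekend else 0].
# Hypothetical labels: 1 = annual subscriber, 0 = user without a subscription.
X = [[0.15, 0], [0.20, 0], [0.25, 0], [0.30, 1],
     [0.80, 1], [1.00, 1], [0.90, 0], [1.20, 1]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = fit_logistic(X, y)
training_accuracy = sum(predict(weights, bias, x) == t for x, t in zip(X, y)) / len(y)
print("training accuracy on the toy data:", training_accuracy)
```

The toy data is deliberately easy to separate, so its accuracy says nothing about the accuracy obtained on the real Citi Bike data.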

All my work was done in Python. This project allowed me to get the hang of predictive modelling with scikit-learn and improve my data analysis skills.
