Analyzing ridesharing data in GCP using PySpark to evaluate impact of UChicago free Lyft program on student travel behavior

hvpachisia/uchicago-lyft-rideshare

The UChicago Lyft Ride Smart Program: Effects on Ridesharing

By Harsh Vardhan Pachisia, Abe Burton, Ridhi Purohit, and Rohit Kandala

Business/Research Question

How much did UChicago’s Lyft Ride Smart program affect ridesharing in Hyde Park, and is it worth continuing?

Data

Raw Datasets:

  1. Transportation Network Providers (TNP) ridesharing dataset: each row is one ride, covering 2018-2023 (84.8 GB)
  2. Daily weather data (1 GB)

Processed Dataset:

  1. In-program dataset (rides in Hyde Park, Woodlawn, Kenwood): 10 GB
  2. Other parts of Chicago: 74 GB
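
The split between in-program rides and rides elsewhere in Chicago can be illustrated with a minimal plain-Python sketch. It is not the project's PySpark code: the field name `pickup_area` and the choice to classify rides by pickup location (rather than drop-off) are assumptions made for illustration only.

```python
# Community areas covered by the program, as listed in this README.
PROGRAM_AREAS = {"HYDE PARK", "KENWOOD", "WOODLAWN"}


def split_rides(rides, area_field="pickup_area"):
    """Split ride records into (in_program, elsewhere).

    `rides` is an iterable of dicts. `area_field` is a hypothetical
    column name holding the community area name of the pickup.
    """
    in_program, elsewhere = [], []
    for ride in rides:
        # Normalize the area name so "Hyde Park" and " hyde park " match.
        area = (ride.get(area_field) or "").strip().upper()
        if area in PROGRAM_AREAS:
            in_program.append(ride)
        else:
            elsewhere.append(ride)
    return in_program, elsewhere
```

On the full 84.8 GB dataset the same predicate would more plausibly be applied as a distributed filter (for example with a PySpark `DataFrame.filter`) instead of a Python loop.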

Project Architecture

[Image: project architecture diagram]

Findings

Unsupervised ML (clustering)

  • The program did not affect where people were calling rideshares from in Hyde Park.
  • On average, trip duration increased post-program.
  • On average, the number of trips increased post-program.
  • On average, fares increased post-program.
  • More late-night trips were taken post-program.
  • The proportion of rides taken at the end of the month increased markedly.
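
The README does not name the clustering algorithm used. As an illustration of how the first finding (pickup locations did not move) could be checked, here is a minimal k-means sketch in plain Python. K-means itself, the deterministic initialization, and the helper `max_centroid_shift` are all assumptions, not the project's implementation.

```python
import math


def kmeans(points, k, n_iter=50):
    """Tiny k-means over (lat, lon) tuples; returns k centroids.

    Assumption: the first k points seed the centroids so the result is
    deterministic. Euclidean distance on lat/lon is only a rough
    approximation, acceptable over an area as small as a neighborhood.
    """
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        new_centroids = []
        for c, members in enumerate(clusters):
            if members:
                # Mean of each coordinate across the cluster's members.
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def max_centroid_shift(pre_centroids, post_centroids):
    """Largest distance from any pre-program centroid to its nearest
    post-program centroid; a small value suggests pickup hot spots
    stayed where they were."""
    return max(
        min(math.dist(pre, post) for post in post_centroids)
        for pre in pre_centroids
    )
```

Clustering pre-program and post-program pickups separately and comparing the two sets of centroids gives a simple measure of whether the hot spots moved.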

Supervised ML (linear regression with elastic net)

We estimate the impact of the Lyft program on daily ridership counts within the program area (Hyde Park, Kenwood, Woodlawn) by predicting what ridership would have been without the program (the counterfactual).

  • There was a clear increase in ridership counts after the program was implemented.
  • The increase in usage works out to about 4 rides per student per month.
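
The counterfactual approach can be sketched as follows: fit an elastic-net regression on pre-program days only, predict post-program days, and treat the gap between observed and predicted counts as the program's effect. This is a minimal plain-Python coordinate-descent sketch rather than the project's PySpark model. The penalty parameterization (`alpha`, `l1_ratio`), the feature layout, and the helper names are assumptions, and the inputs to `rides_per_student_per_month` are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink `value` toward zero by `threshold` (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_iter=200):
    """Minimize (1/2n)*||y - Xw - b||^2 + alpha*l1_ratio*|w|_1
    + 0.5*alpha*(1 - l1_ratio)*||w||^2 by coordinate descent."""
    n, p = len(X), len(X[0])
    weights = [0.0] * p
    intercept = sum(y) / n
    for _ in range(n_iter):
        for j in range(p):
            rho, z = 0.0, 0.0
            for row, target in zip(X, y):
                full_pred = intercept + sum(w * x for w, x in zip(weights, row))
                # Residual with feature j's own contribution added back.
                partial_residual = target - (full_pred - weights[j] * row[j])
                rho += row[j] * partial_residual
                z += row[j] ** 2
            rho, z = rho / n, z / n
            denom = z + alpha * (1.0 - l1_ratio)
            weights[j] = (
                soft_threshold(rho, alpha * l1_ratio) / denom if denom else 0.0
            )
        # The intercept is unpenalized: set it to the mean residual.
        intercept = sum(
            t - sum(w * x for w, x in zip(weights, row)) for row, t in zip(X, y)
        ) / n
    return intercept, weights


def estimate_uplift(pre_X, pre_y, post_X, post_y, alpha=0.1, l1_ratio=0.5):
    """Per-day observed minus counterfactual ridership for post-program days."""
    intercept, weights = fit_elastic_net(pre_X, pre_y, alpha, l1_ratio)
    predicted = [
        intercept + sum(w * x for w, x in zip(weights, row)) for row in post_X
    ]
    return [actual - pred for actual, pred in zip(post_y, predicted)]


def rides_per_student_per_month(total_extra_rides, n_students, n_months):
    """Spread the total uplift over students and months (hypothetical inputs)."""
    return total_extra_rides / (n_students * n_months)
```

Because the penalty depends on feature scale, features such as temperature or day-of-week indicators would normally be standardized before fitting; that step is omitted here for brevity.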

Conclusion

  • More rides are taken later in the evening, which supports a safety-motivated hypothesis for Lyft usage.
  • Lyft Ride Smart amplifies existing student habits: destinations remain similar (campus, shopping, and apartments), but trips are more frequent and more often taken at night.
  • Rides are not superfluous spending: they facilitate necessary student trips across a wider range of times with increased safety, as our clustering analysis shows.
