A Bikesharing Analysis Using Tableau to Visualize Bike-Sharing Data.
The purpose of this project was to use New York City bike-sharing data to create a business proposal that would convince potential investors to invest in a bike-sharing company in Des Moines, Iowa.
- Data File: August 2019 Citi Bike data downloaded from https://ride.citibikenyc.com/system-data. Note: August data was used because ridership is likely higher during the summer months.
- Software: Tableau Public 2021.3, Python 3.7.10, and Jupyter Notebook 6.3.0
The analysis of the New York City bike-sharing data answers the following questions:
- How many trips were recorded during the month of August?
- What was the proportion of short-term customers to annual subscribers?
- What were peak riding hours in the month of August?
- What were the top bike stations in the city for starting a journey?
- What were the top bike stations for ending a journey?
- What was the gender breakdown of active riders?
- What was the average trip duration by age?
- How was birth year related to the length of a bike ride?
- Which bikes were most likely due for repair?
- How variable was bike utilization?
Since August is a beautiful time of the year to rent a bike, this data was used as a starting point to determine how many rides an investor could expect in the city of Des Moines. According to https://www.city-data.com, New York City had a population of 8,336,817 in 2019 and Des Moines had 214,237. See statistics below:
New York City Statistics
Des Moines Statistics
NYC recorded 2,344,224 rides in August, a count equal to about 28% of the city's population (roughly 0.28 rides per resident). Applying the same rate to the Des Moines population, we might expect about 60,241 rides in Des Moines.
Using the breakdown of rider types in New York City, we might expect 11,406 customers (19%) and 48,835 subscribers (81%) in Des Moines.
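The scaling arithmetic above can be reproduced with a short sketch. It assumes that rides scale linearly with population, which is the simplification the estimate relies on; the variable names are illustrative and do not come from the project notebook.

```python
# Figures quoted in the text above.
nyc_population = 8_336_817
dsm_population = 214_237
nyc_rides_august = 2_344_224

# Rides per resident in NYC for August 2019 (about 0.28).
rides_per_resident = nyc_rides_august / nyc_population

# Assumption: Des Moines residents ride at the same rate as NYC residents.
dsm_rides_estimate = round(rides_per_resident * dsm_population)

# Customer count reported in the text; subscribers are the remainder.
dsm_customers = 11_406
dsm_subscribers = dsm_rides_estimate - dsm_customers

print(f"Rides per resident: {rides_per_resident:.2f}")
print(f"Estimated Des Moines rides: {dsm_rides_estimate:,}")
print(f"Customer share: {dsm_customers / dsm_rides_estimate:.0%}")
print(f"Subscriber share: {dsm_subscribers / dsm_rides_estimate:.0%}")
```

A per-capita rate is a rough starting point only; differences in density, tourism, and climate between the two cities are not modeled here.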
Knowing the peak usage hours for the month of August gave a better idea of how many bikes would be needed in Des Moines and which parts of the day would need the most bikes. It also helps with planning: bike maintenance is best scheduled outside peak hours. Based on the bar chart, the top riding hours during August in New York City were between 5:00 p.m. and 7:00 p.m., and the quietest window, suggested for bike maintenance, was between 2:00 a.m. and 5:00 a.m.
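The same peak-hour question can be checked outside Tableau with pandas. The sketch below uses a tiny made-up sample rather than the real file, and it assumes the trip start timestamp lives in a column named `starttime`; adjust the name if the downloaded CSV differs.

```python
import pandas as pd

# Tiny illustrative sample; the real input would be the August 2019 CSV.
trips = pd.DataFrame({
    "starttime": [
        "2019-08-01 17:05:00",
        "2019-08-01 17:40:00",
        "2019-08-01 18:15:00",
        "2019-08-02 03:10:00",
        "2019-08-02 17:55:00",
    ]
})

# Parse the timestamps and count trip starts per hour of day (0-23).
start_hour = pd.to_datetime(trips["starttime"]).dt.hour
trips_per_hour = start_hour.value_counts().sort_index()

busiest_hour = trips_per_hour.idxmax()   # hour with the most trip starts
quietest_hour = trips_per_hour.idxmin()  # candidate maintenance window

print(trips_per_hour)
print("Busiest hour:", busiest_hour)
```

Hours with no trips at all do not appear in `trips_per_hour`, so on real data it may be worth reindexing to all 24 hours before looking for the quietest window.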
The following symbol map provided a visualization of the top 10 starting locations.
The following symbol map provided a visualization of the top 10 ending locations.
The following pie chart gave a breakdown of gender.
The following area chart gave the average trip duration by age. The general trend was that younger riders tended to use the bikes for longer periods of time.
The following treemap showed how often each bike was used; for example, Bike 38124 was used the most, with 479 uses.
The following packed bubbles visualization showed how long rides were and whether some bikes needed more attention than others.
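The per-bike usage count behind a chart like this can be sketched in pandas. The sample below is made up for illustration, and the column name `bikeid` is an assumption about the CSV schema.

```python
import pandas as pd

# Made-up sample: one row per trip, identified by the bike that was used.
trips = pd.DataFrame({"bikeid": [38124, 38124, 25001, 38124, 25001, 17333]})

# Number of trips per bike, sorted from most to least used.
uses_per_bike = trips["bikeid"].value_counts()

most_used_bike = uses_per_bike.idxmax()
print(uses_per_bike.head(10))
```

Bikes at the top of this ranking are reasonable first candidates for inspection, although trip count alone says nothing about distance or time in use.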
A Bike Trip Analysis was prepared to address the following questions for one of the key stakeholders:
- What was the length of time that bikes were checked out for all riders and genders?
- What was the number of bike trips for all riders and genders for each hour of each day of the week?
- What was the number of bike trips for each type of user and gender for each day of the week?
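The hour-by-weekday question maps naturally onto a cross-tabulation. The following is a minimal sketch on made-up rows, assuming a `starttime` column; it is not the workbook's actual calculation, which was done in Tableau.

```python
import pandas as pd

# Made-up sample of trip start times.
trips = pd.DataFrame({
    "starttime": pd.to_datetime([
        "2019-08-05 08:10:00",  # Monday
        "2019-08-05 08:45:00",  # Monday
        "2019-08-05 17:30:00",  # Monday
        "2019-08-10 11:00:00",  # Saturday
    ]),
})

# Derive the two heatmap dimensions from the timestamp.
trips["weekday"] = trips["starttime"].dt.day_name()
trips["hour"] = trips["starttime"].dt.hour

# Rows are hours, columns are weekdays, cells are trip counts.
trips_by_hour_and_day = pd.crosstab(trips["hour"], trips["weekday"])
print(trips_by_hour_and_day)
```

Passing a list such as `[trips["weekday"], trips["gender"]]` as the column argument would add a gender breakdown, provided the data has such a column.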
Using Python and Pandas functions, the "tripduration" column was converted from an integer to a datetime datatype to get the time in hours, minutes, and seconds (00:00:00) in our dataframe.
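A minimal sketch of such a conversion is shown below. The unit of the integer values is an assumption here (seconds); the project notebook may have used a different `unit` argument, and the sample durations are made up.

```python
import pandas as pd

# Made-up sample durations stored as integers.
trips = pd.DataFrame({"tripduration": [90, 3661, 45296]})

# Assumption: the integers are seconds. pd.to_datetime anchors them
# to 1970-01-01, so only the time-of-day part is meaningful.
trips["tripduration"] = pd.to_datetime(trips["tripduration"], unit="s")

# Render the time part as HH:MM:SS for inspection.
formatted = trips["tripduration"].dt.strftime("%H:%M:%S")
print(trips.dtypes)
print(formatted.tolist())
```

With this approach, a trip of 24 hours or longer rolls over into the next calendar date, so its formatted time no longer reflects the full duration.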
Using Tableau, the following visualizations were created:
- A line graph displaying the number of bikes checked out by duration for all users, and the graph can be filtered by the hour.
- A line graph displaying the number of bikes that are checked out by duration for each gender by the hour, and the graph can be filtered by the hour and gender.
- A heatmap showing the number of bike trips for each hour of each day of the week.
- A heatmap showing the number of bike trips by gender for each hour of each day of the week, and the heatmap can be filtered by gender.
- A heatmap showing the number of bike trips for each type of user and gender for each day of the week; this heatmap can be filtered only by user type and gender.
- Summary
- Two additional visualizations suggested for future analysis:
Select your questions. During this step, you'll consider which results you want to share with your audience. What do they want to see? How can we use that information to make their decision-making process easier?
Execute independent research. You'll need to look at other relevant pieces of information to build a bigger picture. Search other sources to find information that will make your visualization more powerful.
Craft your Tableau story. This is when you create your story, primarily from worksheets and other visuals, with descriptions for each of them.
Create a written analysis. The written analysis is intended to provide additional insight into what we're trying to convey to our audience. This is a good place to add extra detail so that everyone can get on the same page.
After you've practiced creating Tableau stories, let's create a story for our investors. The purpose of this story is to help them determine whether they should invest in a bike-sharing program in Des Moines.