
Tasks & To-Dos

  1. Data Munging and Cleaning - First, clean each data set as much as we can: filter to only what we want, rename variables to a clean format, and take out all null values from all the datasets. I have mostly done this, but I am still working on the 311 dataset. Sana has worked on the zipcode-conversion notebook. So in total we will have 3 Jupyter notebooks: sharedmobility, 311, and sharemobility_merged_zip.
  2. Merge zipcodes into the shared-mobility data. Sana finished this step today, but it would be helpful to tally and cross-check the data accuracy of the merged dataframe. Then analyze this merged data set and clean it up.
  3. Convert these two notebook datasets (shared_mobility_merged_zip and 311_complaints) to a clean_df.csv so that we can read them into a new notebook and start visualization from these CSV files instead of making API requests every time. Let's take up the visualization tasks on Tuesday.
  4. Identify all the hypothesis questions and tackle them one by one. Plotting comes as part of this.
  5. Decide on visualization and glyph types. Graphs should follow “More Data, Less Ink”: easy to read but not boring.
  6. Convert the dataframes to SQL. Need to watch Ed's video on this.
  7. Create a notebook ‘Presentation.ipynb’. This will be our final notebook for the presentation.
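The cleaning in step 1 could look something like this in pandas (the column names and filter here are made up for illustration; the real shared-mobility and 311 fields will differ):

```python
import pandas as pd

# Hypothetical raw frame -- the real dataset's columns will differ.
raw = pd.DataFrame({
    "Trip ID": [1, 2, 3],
    "Start Time": ["2021-01-01", None, "2021-01-03"],
    "Vehicle Type": ["scooter", "bike", "scooter"],
})

clean = (
    raw.rename(columns=lambda c: c.strip().lower().replace(" ", "_"))  # clean format
       .dropna()                                   # take out null values
       .query("vehicle_type == 'scooter'")         # filter only what we want
)
print(list(clean.columns))  # ['trip_id', 'start_time', 'vehicle_type']
print(len(clean))           # 2
```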
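For the cross-check in step 2, one option is a left merge with `indicator=True`, which makes tallying unmatched rows a one-liner (the toy frames and column names below are placeholders for the real trips and zipcode-lookup data):

```python
import pandas as pd

# Toy stand-ins for the shared-mobility data and the zipcode lookup.
trips = pd.DataFrame({"station_id": [1, 2, 3], "rides": [10, 20, 30]})
zips = pd.DataFrame({"station_id": [1, 2], "zipcode": ["53703", "53715"]})

# how="left" keeps every trip row; indicator=True adds a _merge column
# recording each row's match status, so we can tally merge accuracy.
merged = trips.merge(zips, on="station_id", how="left", indicator=True)
unmatched = (merged["_merge"] == "left_only").sum()
print(len(merged), unmatched)  # 3 rows total, 1 row with no zipcode match
```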
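Step 3's CSV round trip is a couple of lines; the one gotcha worth noting is keeping zipcodes as strings on the way back in (the filename and columns here are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"zipcode": ["53703", "53715"], "complaints": [42, 7]})
df.to_csv("clean_df.csv", index=False)  # index=False avoids a stray index column

# dtype=str for zipcode keeps it from being parsed as an integer
# (which would drop any leading zeros).
df2 = pd.read_csv("clean_df.csv", dtype={"zipcode": str})
print(df2.equals(df))  # True -- safe round trip, no API request needed
```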
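One concrete way to apply “More Data, Less Ink” from step 5 in matplotlib is to strip the frame lines that carry no information (the data and filename below are made up):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, fine for saving figures
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.bar(["53703", "53715"], [120, 85])
ax.set_ylabel("trips")
# "More Data, Less Ink": hide spines that add ink but no data
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
fig.savefig("trips_by_zip.png")
```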
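Pending Ed's video, one common route for step 6 is pandas' `to_sql` against a SQLite connection; a minimal sketch (table and column names are invented):

```python
import sqlite3
import pandas as pd

df = pd.DataFrame({"zipcode": ["53703", "53715"], "trips": [120, 85]})

# In-memory SQLite database; swap ":memory:" for a file path to persist it.
conn = sqlite3.connect(":memory:")
df.to_sql("trips_by_zip", conn, index=False, if_exists="replace")

busy = pd.read_sql("SELECT zipcode FROM trips_by_zip WHERE trips > 100", conn)
print(busy["zipcode"].tolist())  # ['53703']
conn.close()
```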

Notebooks

After cleaning and data munging:

  • shared_mobility_AODP
  • 311_AODP
  • shared_mobility_merged_zip (This can be done in shared_mobility_AODP too)

Convert these dataframes to CSV in their respective notebooks, then load them into a new notebook:

  • scooter_data_analysis (for analysis and plotting - this is our master notebook)

After saving plots and answering our various hypothesis questions:

  • presentation.ipynb