MahsaBakhtiari/Crowdfunding_ETL

crowdfunding ETL

Members: Mahsa Bakhtiari, Dylan Kelly, Neel Chunara, Lena Hammoudyour

Overall Summary

This repository contains a mini ETL project that demonstrates building an ETL pipeline with Python and Pandas. Data is extracted and transformed using regular expressions and Python dictionary methods. The pipeline produces four CSV files, which are used to develop an ERD and table schema that is uploaded into a Postgres database.

Category DataFrame

  • Creates a DataFrame with a "category_id" column, with entries ordered sequentially for each unique "cat" value
  • The DataFrame is exported to a CSV file named "category.csv"
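
As a rough illustration of this step, the sketch below builds a category table from a small made-up input. The input column name "category & sub-category", the "/" separator, the "category" output column, and the reading of "cat" as an ID prefix ("cat1", "cat2", ...) are all assumptions for the example, not details confirmed by this repository.

```python
import pandas as pd

# Hypothetical sample input; the column name and "/" separator are assumptions.
raw = pd.DataFrame({
    "category & sub-category": [
        "food/food trucks",
        "music/rock",
        "food/food trucks",
        "technology/web",
    ],
})

# Split the combined column into separate category and subcategory columns.
raw[["category", "subcategory"]] = raw["category & sub-category"].str.split(
    "/", n=1, expand=True
)

# Unique categories, in order of first appearance.
categories = raw["category"].unique()

# Assumed ID format: the prefix "cat" followed by a sequential number.
category_df = pd.DataFrame({
    "category_id": [f"cat{i}" for i in range(1, len(categories) + 1)],
    "category": categories,
})

# With no path, to_csv returns the CSV text; pass "category.csv" to write the file.
csv_text = category_df.to_csv(index=False)
print(csv_text)
```

The same pattern would presumably apply to the subcategory table, using the second half of the split column.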

Subcategory DataFrame

  • Creates a DataFrame with a "subcategory_id" column, with entries ordered sequentially for each unique "cat" value
  • The DataFrame is exported to a CSV file named "subcategory.csv"

Campaign DataFrame

  • Creates a DataFrame with the following columns: "cf_id", "contact_id", "company_id", "description", "goal", "pledged", "outcome", "backers_count", "country", "currency", "launch_date", and "end_date"
  • The DataFrame is exported to a CSV file named "campaign.csv"
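
A minimal sketch of the kind of transformation this table may involve is shown below. It covers only a few of the listed columns, and the raw column names ("blurb", "launched_at", "deadline"), the string-typed amounts, and the UNIX-timestamp dates are assumptions made for illustration; the actual source data may differ.

```python
import pandas as pd

# Hypothetical raw rows; input column names and value formats are assumptions.
raw = pd.DataFrame({
    "cf_id": [1, 2],
    "blurb": ["Example campaign A", "Example campaign B"],
    "goal": ["100", "1400"],
    "pledged": ["0", "14560"],
    "launched_at": [1600000000, 1600000000],
    "deadline": [1700000000, 1700000000],
})

# Rename raw columns to the names used in the campaign table.
campaign_df = raw.rename(columns={
    "blurb": "description",
    "launched_at": "launch_date",
    "deadline": "end_date",
})

# Convert monetary amounts from strings to floats.
campaign_df = campaign_df.astype({"goal": float, "pledged": float})

# Convert UNIX timestamps (seconds, UTC) to YYYY-MM-DD date strings.
for col in ("launch_date", "end_date"):
    campaign_df[col] = pd.to_datetime(campaign_df[col], unit="s").dt.strftime("%Y-%m-%d")

print(campaign_df)
```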

Contacts DataFrame

  • Creates a DataFrame with the following columns: "contact_id", "first_name", "last_name", and "email"
  • The DataFrame is exported to a CSV file named "contacts.csv"
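
The regular-expression approach mentioned in the summary could look roughly like the sketch below. The raw row layout (a JSON-like string with "contact_id", "name", and "email" keys) and the sample values are hypothetical; only the four output column names come from this README.

```python
import re

import pandas as pd

# Hypothetical raw rows; the layout of each string is an assumption.
rows = [
    '{"contact_id": 101, "name": "Ada Example", "email": "ada@example.com"}',
    '{"contact_id": 102, "name": "Bob Sample", "email": "bob@example.org"}',
]

# Capture the ID, full name, and email from each row.
pattern = re.compile(
    r'"contact_id":\s*(\d+),\s*"name":\s*"([^"]+)",\s*"email":\s*"([^"]+)"'
)

records = []
for row in rows:
    match = pattern.search(row)
    if match is None:
        continue  # skip rows that do not fit the assumed layout
    contact_id, name, email = match.groups()
    # Split the full name at the first space (assumes "First Last" names).
    first_name, last_name = name.split(" ", 1)
    records.append({
        "contact_id": int(contact_id),
        "first_name": first_name,
        "last_name": last_name,
        "email": email,
    })

contacts_df = pd.DataFrame(
    records, columns=["contact_id", "first_name", "last_name", "email"]
)
print(contacts_df)
```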

Crowdfunding Database

  • Creates the database schema, saved as "crowdfunding_db_schema.sql"
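
To give a sense of how tables like these might relate, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in for Postgres, so it runs without a database server. The column types, the reduced column list, and the foreign key from "campaign" to "contacts" are assumptions; the actual definitions live in "crowdfunding_db_schema.sql".

```python
import sqlite3

# Assumed, simplified schema; types and constraints are illustrative only.
schema = """
CREATE TABLE contacts (
    contact_id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name  TEXT NOT NULL,
    email      TEXT NOT NULL
);
CREATE TABLE campaign (
    cf_id       INTEGER PRIMARY KEY,
    contact_id  INTEGER NOT NULL REFERENCES contacts (contact_id),
    description TEXT,
    goal        REAL,
    pledged     REAL,
    outcome     TEXT
);
"""

# An in-memory SQLite database stands in for the Postgres instance.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(schema)

# List the tables that were created.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)
```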

Contribution Note

  • Contributions to this project are welcome. If you find any issues or have suggestions for improvement, please submit a pull request or open an issue on the GitHub repository.
