Dataset Resources #2

Open
rpodcast opened this issue Jul 13, 2023 · 1 comment

@rpodcast

rpodcast commented Jul 13, 2023

Lego Bricks

Download Links:

https://github.com/rfordatascience/tidytuesday/tree/master/data/2022/2022-09-06

https://rebrickable.com/downloads/

Predictive Analytics

Descriptive statistics / visualizations

Extra posts (not as useful)

Additional Kaggle data analyses: https://www.kaggle.com/datasets/rtatman/lego-database
https://www.cathrinewilhelmsen.net/ingest-explore-lego-datasets-using-azure-synapse-analytics/

@mthomas-ketchbrook

I landed three of the LEGO datasets (inventories, inventory_sets, and sets) in a public AWS S3 bucket today. I also added a script that shows how to connect to the bucket and pull the data down directly from S3.
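
For readers who want to try the download step, here is a minimal Python sketch of pulling the three files from a publicly readable bucket over HTTPS. It is not the script mentioned above. The bucket name, region, and object keys are hypothetical placeholders, since the comment does not state them, and it assumes each dataset is stored as `<name>.parquet` at the bucket root.

```python
from pathlib import Path
from urllib.parse import quote
from urllib.request import urlretrieve

# Hypothetical values: the thread does not name the bucket, region, or keys.
BUCKET = "example-lego-bucket"
REGION = "us-east-1"
DATASETS = ["inventories", "inventory_sets", "sets"]


def public_s3_url(bucket: str, key: str, region: str = REGION) -> str:
    """Build the virtual-hosted-style HTTPS URL for a publicly readable object."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


def download_datasets(dest_dir: str = "data") -> list[Path]:
    """Download each dataset's .parquet file into dest_dir (needs network access)."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    paths = []
    for name in DATASETS:
        target = dest / f"{name}.parquet"
        # Assumed key layout: one "<name>.parquet" object per dataset.
        urlretrieve(public_s3_url(BUCKET, f"{name}.parquet"), target)
        paths.append(target)
    return paths
```

Calling `download_datasets()` only works once `BUCKET` and `REGION` point at the real bucket. The downloaded files can then be opened with any Parquet reader.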

The script also contains some logic for automatically posting the Parquet data to S3, but I couldn't get this working, so I ended up writing the .parquet files out locally and uploading them to S3 manually.
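
As one possible way to automate the manual upload step described above, the sketch below pairs local `.parquet` files with S3 keys and pushes them with `boto3`. It does not diagnose why the original script failed. The function names, the `prefix` parameter, and the bucket argument are hypothetical, and it assumes `boto3` is installed and AWS credentials with write access are configured (a bucket that allows public reads still requires credentials for writes).

```python
from pathlib import Path


def parquet_upload_plan(local_dir: str, prefix: str = "") -> list[tuple[str, str]]:
    """Pair each local .parquet file with the S3 key it would be uploaded to."""
    files = sorted(Path(local_dir).glob("*.parquet"))
    return [(str(f), f"{prefix}{f.name}") for f in files]


def upload_parquet_files(local_dir: str, bucket: str, prefix: str = "") -> None:
    """Upload every planned file; needs boto3 and credentials with write access."""
    import boto3  # third-party; imported lazily so the planning step works without it

    s3 = boto3.client("s3")
    for filename, key in parquet_upload_plan(local_dir, prefix):
        # upload_file(Filename, Bucket, Key) copies a local file to the bucket.
        s3.upload_file(filename, bucket, key)
```

Checking the output of `parquet_upload_plan` before calling `upload_parquet_files` is a cheap way to confirm which files would be sent and under which keys.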
