Curated list of publicly available Big Data datasets. Uncompressed sizes are given in brackets. No blockchains.
- CelebFaces Attributes - 1.2 GB
  - Over 200k images of celebrities with 40 binary attribute annotations
- CommonCrawl (AWS) - A corpus of web crawl data composed of over 25 billion web pages.
  - Semi-Structured (includes Metadata): 250 TB
- DBpedia - curated Wikipedia data
- Freebase
  - Freebase: 22 GB (250 GB)
  - Freebase Deleted Triples: 2 GB (8 GB)
  - Freebase/Wikidata Mappings: 22 MB (243 MB)
- StackOverflow Data (BigQuery) - 182 GB
- Landsat 8 (AWS)
- Uber Self Driving Car Challenge
  - 200 GB+ (compressed)
- EU Open Data Portal
- data.gov
- US Census
- data.gov.uk
- CIA World Factbook
- healthdata.gov
- digital.nhs.uk
- Gapminder
- National Centers for Environmental Information
These pages might link to datasets that are already in the list.