| title | revision |
|---|---|
| Repository management roadmap | 2020-08-06 (Thu) 13:35:03 |
1. An awesome list of datasets to consider: https://github.com/datasets/awesome-data
The https://github.com/datasets organization holds many datasets, many of which appear to be outdated, i.e. not updated in many years. Nonetheless, they remain interesting because they record their sources. They should be investigated as a quick win for collecting datasets of interest, and possibly for feeding data back to the project.
As an example, https://github.com/datasets/airport-codes is simply a replicated (and converted) copy of a freely available online resource (https://ourairports.com/data/airports.csv). We have all the necessary CSV manipulation tools within AIT and consequently only need to reference the master source, which we can then fetch through GitHub Actions.
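Pulling straight from the master source could look roughly like the sketch below. It uses only the Python standard library rather than the AIT tooling, and the helper names `fetch_text` and `parse_csv` are hypothetical, introduced here for illustration.

```python
import csv
import io
import urllib.request

# Master source referenced in the paragraph above.
AIRPORTS_CSV_URL = "https://ourairports.com/data/airports.csv"


def fetch_text(url: str, timeout: float = 30.0) -> str:
    """Download a text resource and decode it as UTF-8 (assumed encoding)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_csv(text: str) -> list[dict[str, str]]:
    """Parse CSV text with a header row into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))
```

A scheduled GitHub Actions job could then call `parse_csv(fetch_text(AIRPORTS_CSV_URL))` and hand the rows to whatever conversion step is needed, so that no converted copy has to be maintained by hand.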
Below is a filtered list of datasets from this project that we should investigate further. Old and unmaintained datasets should probably be ignored, unless the master source is still updated and only the dataset repository is not.
- https://github.com/datasets/airport-codes
- https://github.com/datasets/cash-surplus-deficit
- https://github.com/datasets/co2-fossil-global
- https://github.com/datasets/co2-ppm
- https://github.com/datasets/cofog
- https://github.com/datasets/commodity-prices
- https://github.com/datasets/continent-codes
- https://github.com/datasets/corruption-perceptions-index
- https://github.com/datasets/country-codes
- https://github.com/datasets/cpi-change
- https://github.com/datasets/cpi-us
- https://github.com/datasets/cpi
- https://github.com/datasets/data
- https://github.com/datasets/diagnosed-diabetes-prevalence
- https://github.com/datasets/expenditure-on-research-and-development
- https://github.com/datasets/five-thirty-eight-datasets
- https://github.com/datasets/gdp
- https://github.com/datasets/genome-sequencing-costs
- https://github.com/datasets/geodata
- https://github.com/datasets/geoip2-ipv4
- https://github.com/datasets/glacier-mass-balance
- https://github.com/datasets/global-temp-anomalies
- https://github.com/datasets/global-temp
- https://github.com/datasets/glwd
- https://github.com/datasets/historical-adoption-of-technology
- https://github.com/datasets/house-prices-fr
- https://github.com/datasets/icc-incoterms
- https://github.com/datasets/imf-weo
- https://github.com/datasets/imo-imdg-codes
- https://github.com/datasets/inflation
- https://github.com/datasets/land-matrix
- https://github.com/datasets/lme-large-marine-ecosystems
- https://github.com/datasets/media-types
- https://github.com/datasets/membership-to-copyright-treaties
- https://github.com/datasets/nasdaq-listings
- https://github.com/datasets/nyse-other-listings
- https://github.com/datasets/openflights
- https://github.com/datasets/pharmaceutical-drug-spending
- https://github.com/datasets/population-city
- https://github.com/datasets/population-global-historical
- https://github.com/datasets/population-growth-estimates-and-projections
- https://github.com/datasets/population-reference-bureau
- https://github.com/datasets/population
- https://github.com/datasets/ppp
- https://github.com/datasets/race-and-ethnicity-codes-us
- https://github.com/datasets/s-and-p-500-companies
- https://github.com/datasets/sea-level-rise
- https://github.com/datasets/socrata-opendata
- https://github.com/datasets/top-level-domain-names
- https://github.com/datasets/un-locode
- https://github.com/datasets/unece-package-codes
- https://github.com/datasets/world-cities
- https://github.com/datasets/world-development-indicators
- https://github.com/datasets/world-wealth-and-income-database
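The triage rule stated before the list (ignore stale repositories unless their master source is still being updated) could be automated along the following lines. This is a minimal sketch under stated assumptions: the two-year `MAX_AGE` threshold and the function names are hypothetical, and a master source whose update time is unknown is treated as stale. The last push time is read from the `pushed_at` field of the GitHub REST API repository endpoint.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: "old" means no update for roughly two years; tune as needed.
MAX_AGE = timedelta(days=730)


def repo_pushed_at(owner: str, repo: str) -> datetime:
    """Return the time of the last push, read from the GitHub REST API."""
    url = f"https://api.github.com/repos/{owner}/{repo}"
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    # pushed_at is an ISO 8601 UTC timestamp ending in "Z".
    parsed = datetime.strptime(payload["pushed_at"], "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def worth_investigating(
    repo_updated: datetime,
    source_updated: Optional[datetime],
    now: datetime,
    max_age: timedelta = MAX_AGE,
) -> bool:
    """Keep a dataset if its repository or its master source is still fresh."""
    repo_is_fresh = now - repo_updated <= max_age
    # An unknown source update time (None) counts as stale in this sketch.
    source_is_fresh = (
        source_updated is not None and now - source_updated <= max_age
    )
    return repo_is_fresh or source_is_fresh
```

How to obtain `source_updated` depends on the master source. An HTTP `Last-Modified` header is one option where the server provides it, but that is not guaranteed, which is why the parameter is optional.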