Use Github API to extract license info #17

Open
andreww opened this issue Mar 28, 2018 · 4 comments

andreww commented Mar 28, 2018

GitHub scans repos for licence files and, if it finds one and recognises its contents, annotates the repository with the licence. This is exposed via the API: https://developer.github.com/v3/repos/

We could (and probably should) extract this information via the API call rather than trying to scrape the content of the repo. However, this leaves us with a question: what do we do if there is a licence file but GitHub didn't understand it? We should probably do a fallback check for this and tag such resources as having an "unknown" licence (rather than putting "no licence" in the output).
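
A minimal sketch of that three-way outcome, working on the already-parsed JSON for a repository. The function name `classify_licence` and the `has_licence_file` flag (the result of our own fallback check) are hypothetical. It also assumes GitHub reports an unrecognised licence file with the key `other` and an `spdx_id` of `NOASSERTION` or null, which should be confirmed against real responses.

```python
def classify_licence(repo, has_licence_file=False):
    """Return an SPDX id, "unknown", or "no licence" for a repository payload.

    `repo` is the parsed JSON for one repository. `has_licence_file` is a
    hypothetical flag set by our own fallback check for a licence file.
    """
    licence = repo.get("license")  # the API spells the key "license"
    if licence is None:
        # GitHub detected nothing, so fall back on our own check.
        return "unknown" if has_licence_file else "no licence"
    spdx_id = licence.get("spdx_id")
    # Assumption: an unrecognised licence file is reported as key "other"
    # with an spdx_id of "NOASSERTION" (or null).
    if licence.get("key") == "other" or spdx_id in (None, "NOASSERTION"):
        return "unknown"
    return spdx_id
```

Keeping the classification separate from the HTTP call means it can be checked against saved responses without touching the network.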

While we are at it, it would be worth seeing if there is any other information we can use. For example, the API exposes parents of forked repos.
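
For the fork case, a rough sketch of reading the parent from the parsed repository JSON. The helper name `fork_parent` is made up for illustration. As far as I know the `parent` object is only included when a single repository is requested, not in repository listings, so the sketch tolerates it being absent.

```python
def fork_parent(repo):
    """Return the parent's "owner/name" for a fork, or None if unavailable."""
    if not repo.get("fork"):
        return None  # not a fork, so there is no parent to report
    parent = repo.get("parent")
    # `parent` may be missing, e.g. when the payload came from a listing.
    return parent.get("full_name") if parent else None
```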

andreww commented Mar 28, 2018

This is also where we will find the last repository update time.
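
A small sketch of pulling the timestamps out of the parsed repository JSON. The response carries both `updated_at` and `pushed_at` as ISO 8601 strings in UTC; which of the two best matches "last update" for our purposes is still an open choice, so this hypothetical helper returns both.

```python
from datetime import datetime, timezone


def parse_github_time(value):
    """Parse a timestamp such as "2018-03-28T12:34:56Z" into an aware datetime."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def last_update_times(repo):
    """Return whichever of updated_at / pushed_at are present, parsed."""
    return {
        field: parse_github_time(repo[field])
        for field in ("updated_at", "pushed_at")
        if repo.get(field)
    }
```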

longr commented Mar 28, 2018

Just had a tinker with this earlier. It needs some playing with, as it is not clear which field in the returned data is the best one to use.

andreww commented Mar 28, 2018

The spdx_id value, I think.
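
To compare the candidate fields side by side, here is a sketch that fetches a repository with the standard library and picks out the `key`, `name` and `spdx_id` entries of the `license` object. The helper names and the `User-Agent` string are placeholders, and it makes an unauthenticated request, so rate limits apply.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def fetch_repo(owner, name):
    """Fetch the JSON description of one repository (unauthenticated)."""
    request = urllib.request.Request(
        f"{API_ROOT}/repos/{owner}/{name}",
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "licence-check-sketch",  # placeholder value
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def licence_fields(repo):
    """Return the candidate licence fields, with None where one is missing."""
    licence = repo.get("license") or {}  # `license` is null if none detected
    return {field: licence.get(field) for field in ("key", "name", "spdx_id")}
```

Calling `licence_fields(fetch_repo(owner, name))` on a few known repositories should show how the three fields differ. `spdx_id` is the one that maps onto the standard SPDX identifiers.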

andreww commented Apr 8, 2018

Hum - as far as I can see the update to pygithub to expose this bit of the API only got merged three days ago (PyGithub/PyGithub#734), so if this gets used we'll need an up-to-date pygithub!
