Add structure for licensing, possibly validated using Reuse or a similar standard? #21
Comments
Thank you, @ferdnyc, that makes a lot of sense! I wasn’t aware of the Reuse Project or its compliance tools, and after browsing through their website I’d be more than happy to integrate their tools (e.g. the pre-commit hook). Would you like to open a PR?
Yeah, I just learned of it a month or two ago. But they seem to have a good and level-headed take on this stuff. It can definitely seem a bit off and pointless at first, having to deal with adding license tags to things like a (Though it's certainly a lot easier to start off that way, than to have to retroactively figure all this stuff out for existing code. ...Speaking most definitely from experience.) But as I noted earlier, there are a lot of parallels to semantic versioning/releasing: It's not a thing being imposed on a project, it's an ideal that the project's developers may choose to commit to, if they see the benefits as outweighing any additional restrictions or effort involved.
A pre-commit hook is certainly an option, for developers who want to keep themselves honest that way. I've always been a bit wary of Git hooks because they don't always play well with everyone's preferred mode of interacting with git. (Some UIs and tools have been fully integrated with all aspects of the git model; they gracefully recover from rejected commit attempts and notify the user in a clear and actionable way. Others... not so much.) More interesting, in my view, are the various CI tools they offer, their Marketplace GitHub Action being the most obviously useful in this context. A CI check that flags any licensing issues and blocks PR merges is a no-brainer for a project that wants to stay Reuse-compliant. (It won't prevent project committers from potentially pushing code with issues, if they have direct commit access to a branch... but it would mean that PR checks start failing after they do, through no fault of the PR submitter. 🙄 )
Sure, I can do that. What do I need to know about the current contents of the repo, license-wise? Are there any files from external sources that have to be treated specially, or is it all original work covered by the blanket MIT license?
(I know I saw
Adding a license checker to this template repo is indeed a great idea. I agree with @ferdnyc that adding the checker as a GitHub Actions workflow would be preferable. While the Reuse Project seems interesting, another action that could be considered is License Eye, which checks for valid license headers. Another interesting aspect to consider is third-party license analysis for open source compliance and risk assessment. One tool in this space is BlackDuck, which enforces a predefined policy, but I couldn't find any open source version of it in the GitHub Actions Marketplace. We could alternatively consider this action for Python packages, which goes through the dependencies transitively to check their licenses. Here is an example output for some of the dependencies (which may not be direct) for the current repo:
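To give a sense of the raw data a dependency license scan works from, here is a minimal sketch that uses only the standard library's `importlib.metadata`. It is not how the linked action is implemented, and it is not the example output referred to above; the function names `license_of` and `installed_licenses` are made up for illustration. It assumes the project's dependencies are already installed in the current environment, which is what makes the listing transitive.

```python
from importlib import metadata


def license_of(dist) -> str:
    """Best-effort license string for one installed distribution."""
    meta = dist.metadata
    # Prefer trove classifiers such as "License :: OSI Approved :: MIT License".
    classifiers = [
        c.split(" :: ")[-1]
        for c in map(str, meta.get_all("Classifier") or [])
        if c.startswith("License ::")
    ]
    if classifiers:
        return "; ".join(classifiers)
    # Fall back to the free-form License field, which is often missing.
    return str(meta.get("License") or "").strip() or "UNKNOWN"


def installed_licenses() -> dict:
    """Map every installed distribution name to its declared license."""
    result = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:
            result[str(name)] = license_of(dist)
    return result
```

Printing `sorted(installed_licenses().items())` gives a quick overview. The values are whatever each package author declared, so they are free-form strings rather than validated identifiers.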
Thank you @behnazh for the great feedback! Looking at @pilosus’s pip license checker you mentioned, I wonder if it would make sense to map to the SPDX License List, at which point the output would become machine-readable again, in line with the Reuse Project’s intentions.
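As a rough sketch of what mapping to the SPDX License List could look like: the lookup table below is a hypothetical, hand-written sample of a few unambiguous names, and `to_spdx` is an invented helper, not part of any of the tools mentioned. A real mapping would be built from the SPDX License List itself.

```python
from typing import Optional

# Hypothetical, deliberately tiny alias table (lower-cased names -> SPDX IDs).
# The SPDX License List is the real source of truth.
SPDX_ALIASES = {
    "mit": "MIT",
    "mit license": "MIT",
    "apache license 2.0": "Apache-2.0",
    "apache 2.0": "Apache-2.0",
    "isc license (iscl)": "ISC",
    "mozilla public license 2.0 (mpl 2.0)": "MPL-2.0",
    "the unlicense (unlicense)": "Unlicense",
}


def to_spdx(raw: str) -> Optional[str]:
    """Return an SPDX identifier for a free-form license name, or None.

    None means "needs a human": for example, the classifier text
    "BSD License" does not say which BSD variant is meant.
    """
    key = " ".join(raw.lower().split())  # normalise case and whitespace
    return SPDX_ALIASES.get(key)
```

Returning `None` instead of guessing keeps ambiguous entries visible, which seems closer to the machine-readable goal than a best-effort string.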
@behnazh Interesting! Reuse and the license checkers you mentioned actually have completely different aims, so I would think that they're broadly compatible (though, really, they wouldn't be touching any of the same things, so maybe "complementary" is a better word). The focus with Reuse is on making your own code ready to be reused, by making sure all local files are tagged with a valid license identifier. It doesn't do anything with dependencies at all.
(Reuse also avoids trying to boil an entire package down to a single license, because as code accumulates and files are pulled in from outside sources, the fact is that a single repo can contain files under a mix of 2, 3, 5, who-knows-how-many different licenses. If a font is bundled, that's often a different license. Bundled icons, same thing. If there are data files included with the unit tests, those very often don't have any clear licensing at all. To assume they're licensed under the same terms as the overall project is often just wishful thinking.)
That being said, it would be an interesting extension to reuse, if there were a tool that could pull in dependency licenses and add that info to the local license metadata. Any additional license texts could even be added to the local
...BTW, am I the only one who finds it odd that
The one sticking point for me, with Reuse, is that by default it would want every repo file to contain a license tag and a copyright statement (just "Year Author", by default). But since this is a template project, it seems odd to pre-fill that data, since really the user creating a project from the template should be doing that with their info. Obviously, the template can start in a failing state and that'll be a reminder to the user that they need to update the files to insert their copyright... but that's kind of a pain given the diversity of file types, some of which Reuse can't automatically recognize to insert the appropriate type of comment.
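To make the per-file requirement concrete, here is a simplified sketch of the kind of check being discussed. It is not the Reuse tool and is stricter than it: it only looks for `SPDX-License-Identifier` and `SPDX-FileCopyrightText` tags near the top of each file, and ignores the other mechanisms Reuse supports (such as `.license` sidecar files or repo-level configuration). The names `missing_tags`, `untagged_files` and `main` are illustrative.

```python
import re
from pathlib import Path

LICENSE_TAG = re.compile(r"SPDX-License-Identifier:\s*\S+")
COPYRIGHT_TAG = re.compile(r"SPDX-FileCopyrightText:\s*\S+")
# Assumption: skip git internals and the directory holding license texts.
SKIP_DIRS = {".git", "LICENSES"}


def missing_tags(path: Path, head_bytes: int = 4096) -> list:
    """Return the names of the tags not found near the top of the file."""
    text = path.read_bytes()[:head_bytes].decode("utf-8", errors="replace")
    missing = []
    if not LICENSE_TAG.search(text):
        missing.append("SPDX-License-Identifier")
    if not COPYRIGHT_TAG.search(text):
        missing.append("SPDX-FileCopyrightText")
    return missing


def untagged_files(root: Path) -> dict:
    """Map each file under root that lacks a tag to the tags it lacks."""
    report = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        if SKIP_DIRS.intersection(path.relative_to(root).parts):
            continue
        missing = missing_tags(path)
        if missing:
            report[path] = missing
    return report


def main(root: str = ".") -> int:
    """Print a report and return a CI-style exit code (0 = all tagged)."""
    report = untagged_files(Path(root))
    for path, missing in report.items():
        print(f"{path}: missing {', '.join(missing)}")
    return 1 if report else 0
```

A CI step could call `sys.exit(main())`, so that a freshly generated project fails until its author fills in their own copyright lines, which is the "start in a failing state" option mentioned above.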
Sorry to interject,
Thank you all for your comments, I really appreciate that! It looks to me like the original thread has split into three somewhat independent facets:
I think point 1. was discussed above:
I think currently all files in the repo can be covered by the example blanket MIT license. But I also think it’d make sense to add a section to the README on how to move forward managing license information, with respect to all three points. As for point 2, we could consider identify or probably add the license checker action, and it looks like the checker’s
In this context, PEP 639 is currently being drafted and discussed here: python/peps#2164 (see this initial forum thread that spawned the PEP). I haven’t read all of it yet, though, but probably worth taking into consideration here.
Thanks! Let us know if you have any questions or feedback. The scope of that PEP has been limited to project-wide license expressions as opposed to formally specifying per-file convention, though the standard
As a general bit of feedback, for a template whose stated focus is on PEP 518 (which would tend to imply PEP 517), it's a little surprising to see no
Also, it may be worth considering cookiecutter (or a similar tool) to save a large amount of user effort and potential errors from manually replacing everything, which would be particularly important when it comes to standardized license headers, copyright lines, license metadata, etc.
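The idea of generating headers from the user's answers, rather than replacing them by hand, can be sketched as follows. cookiecutter itself renders Jinja2 templates; this sketch swaps in the standard library's `string.Template` as a stand-in, and the `render_header` helper and its parameters are hypothetical.

```python
from string import Template

# Header template with placeholders the project creator would fill in.
HEADER = Template(
    "# SPDX-FileCopyrightText: $year $author\n"
    "# SPDX-License-Identifier: $license_id\n"
)


def render_header(author: str, year: int, license_id: str = "MIT") -> str:
    """Fill the header template; raises KeyError if a placeholder is unset."""
    return HEADER.substitute(author=author, year=year, license_id=license_id)
```

For example, `render_header("Jane Doe", 2024)` yields a two-line comment header carrying that name, year and the MIT identifier. A template tool applies the same substitution across every file at project creation time, so the template repo never has to ship someone else's copyright line.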
Related discussion: PEP 639, Round 2: Improving license clarity with better package metadata
FWIW, I hope to finally do what is hopefully the last big PEP update in the next couple days, after being busy with my research, work and other FOSS things.
I opened PR #377 which uses the dependency-review-action which, in turn, has a
@acrobat888 suggested looking at LicenseCheck and pip-licenses as well.
Somewhat related is the flake8-copyright plugin.
And see also PEP 639 for related discussion.
Another interesting package is license-expression.
Interesting project, I think you've got a pretty good start here, but one thing that strikes me as lacking is a repo-wide view of license management.
Clearly establishing license terms is so important in open-source projects (both for the author(s) of the code, and for anyone who's considering making use of it in their own projects), yet it's often handled haphazardly as an afterthought — or, just as often, not handled at all until someone pesters the maintainer into acting out of frustration. Typically that kind of thing causes problems down the line for someone (probably not the originator of the... "bespoke" license in question).
Adopting a standard for licensing right out of the gate is a simpler, saner solution. And it can be as simple as committing to standard practices defined by something like the Reuse project. Just like most projects will commit to following semantic versioning practices, adopting reuse-verified license compliance requires nothing more than a commitment to put in the work.