Copyright detection would be amazing #389
Comments
I'm also very much interested in this. I found https://github.com/JD-CSTx/license-maven-plugin, which does exactly what is needed here for a different Maven plug-in. If @JD-CSTx agrees, I would volunteer to take his code and try to add it to the cyclonedx-maven-plugin.
Of course I agree. Besides, I couldn't disagree even if I wanted to: it's a fork of the MojoHaus License Maven Plugin (which was abandoned for a long time) and is under the LGPL 3.0 license: https://www.mojohaus.org/license-maven-plugin/licenses.html.
I started working on this at https://github.com/sithmein/cyclonedx-maven-plugin/tree/issue-389-copyright-detection . The Maven plug-in has a new configuration parameter This is only a first iteration, but you can already give it a try by installing it locally (I bumped the version) and then running the new version on a project. One open question is about the format when multiple copyrights are found. CycloneDX only has a text field for
You may want to check out ScanCode toolkit (which I co-maintain) for this. It is considered one of the best-in-class tools for copyright detection. It is written in Python, not Java, though. https://github.com/nexB/scancode-toolkit/tree/develop/src/cluecode
I already tried it, but the results were not really satisfactory. It reported quite a lot of nonsense in our case. And it took waaaay too much time, likely because it looked at each and every file. I don't believe this is necessary, though. If the publisher of an artifact doesn't bother providing copyright information in some usable way, you cannot expect users of that artifact to dig it up themselves by looking at every single file. My - totally non-legal - opinion.
@sithmein re:
That's a bug to me then. Do you still have the input you used?
This can't be implemented using regex. Been there and bought the T-shirt. Use a project like JavaParser and parse the comment nodes from the AST for Java. For other files, find a suitable tree-sitter implementation.
What do you mean by "it can't be implemented"? Obviously it works. It may not detect every kind of unusual copyright notice, but I doubt that any other approach will.
Since many open-source licenses (e.g. MIT) require publishing a list of copyright attributions from the dependencies you use, it'd be awesome to have support for detecting copyrights in this tool to populate the CycloneDX "copyright" field and comply with this common requirement.
This could be implemented by using a regex (user-configurable would be great) to detect copyright messages in various standard locations inside the jar (a configurable set of globs), e.g. NOTICES, META-INF/MANIFEST.MF, README.* etc.
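To make the idea concrete, here is a minimal sketch of such a scan in plain Java. It is not the plug-in's code: the class name `CopyrightScanner`, the default regex, and the default glob list are all assumptions made for illustration, and a real implementation would expose both as configuration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of cyclonedx-maven-plugin.
final class CopyrightScanner {
    // Assumed default: "Copyright", an optional (c) or copyright sign, a year, rest of line.
    static final Pattern DEFAULT_PATTERN =
        Pattern.compile("(?i)copyright\\s*(?:\\(c\\)|\\u00A9)?\\s*\\d{4}.*");

    // Assumed default locations; '*' does not cross directory boundaries.
    static final List<String> DEFAULT_GLOBS = List.of(
        "NOTICE*", "LICENSE*", "README.*",
        "META-INF/MANIFEST.MF", "META-INF/NOTICE*", "META-INF/LICENSE*");

    // True if the jar entry name matches at least one glob.
    static boolean isCandidate(String entryName, List<String> globs) {
        Path name = Paths.get(entryName);
        for (String glob : globs) {
            if (FileSystems.getDefault().getPathMatcher("glob:" + glob).matches(name)) {
                return true;
            }
        }
        return false;
    }

    // Returns the distinct matches, one per line at most, in order of appearance.
    static List<String> extract(String text, Pattern pattern) {
        Set<String> found = new LinkedHashSet<>();
        for (String line : text.split("\\R")) {
            Matcher m = pattern.matcher(line);
            if (m.find()) {
                found.add(m.group().trim());
            }
        }
        return new ArrayList<>(found);
    }

    // Scans only the entries selected by the globs, not every file in the jar.
    static List<String> scanJar(Path jar, Pattern pattern, List<String> globs) throws IOException {
        Set<String> found = new LinkedHashSet<>();
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory() || !isCandidate(entry.getName(), globs)) {
                    continue;
                }
                try (InputStream in = zip.getInputStream(entry)) {
                    String text = new String(in.readAllBytes(), StandardCharsets.UTF_8);
                    found.addAll(extract(text, pattern));
                }
            }
        }
        return new ArrayList<>(found);
    }
}
```

Calling `CopyrightScanner.scanJar(jarPath, CopyrightScanner.DEFAULT_PATTERN, CopyrightScanner.DEFAULT_GLOBS)` would return the distinct matching lines, which could then be joined into the single CycloneDX "copyright" text. Restricting the scan to a few well-known entries keeps it fast, at the cost of missing notices that only appear in source file headers.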
Even more amazing would be to download the associated source jar from Maven Central in case the binary doesn't contain copyrights (but even just binary scanning would be a big win).
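As a rough illustration of the source-jar fallback, the sketch below builds the conventional Maven Central URL for an artifact's `-sources.jar` and fetches it if it exists. The class name `SourceJarLocator` and its methods are hypothetical; a real plug-in would more likely resolve the `sources` classifier through Maven's own artifact resolution so that mirrors and private repositories are honored.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical helper, not part of cyclonedx-maven-plugin.
final class SourceJarLocator {
    static final String MAVEN_CENTRAL = "https://repo1.maven.org/maven2/";

    // Standard repository layout: group dots become slashes, classifier is appended to the file name.
    static String sourcesJarUrl(String groupId, String artifactId, String version) {
        return MAVEN_CENTRAL + groupId.replace('.', '/') + "/" + artifactId + "/" + version
            + "/" + artifactId + "-" + version + "-sources.jar";
    }

    // Downloads the source jar to a temp file; empty if the repository has none (non-200 response).
    static Optional<Path> download(String groupId, String artifactId, String version)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
            .followRedirects(HttpClient.Redirect.NORMAL)
            .build();
        HttpRequest request = HttpRequest.newBuilder(
            URI.create(sourcesJarUrl(groupId, artifactId, version))).GET().build();
        HttpResponse<byte[]> response = client.send(request, HttpResponse.BodyHandlers.ofByteArray());
        if (response.statusCode() != 200) {
            return Optional.empty();
        }
        Path target = Files.createTempFile(artifactId + "-" + version + "-sources", ".jar");
        Files.write(target, response.body());
        return Optional.of(target);
    }
}
```

The downloaded file could then be scanned the same way as the binary jar, only when the binary scan found nothing, so that the extra network round trip is the exception rather than the rule.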