Implement standard APIs to use Python frontend #2113

Closed
tuxology opened this issue Dec 18, 2022 · 5 comments

@tuxology
Contributor

Currently, the best way to use the Python frontend via APIs to generate a CPG seems to be the Py2CpgOnFileSystem.buildCpg() API here: https://github.com/joernio/joern/blob/master/joern-cli/frontends/pysrc2cpg/src/main/scala/io/joern/pysrc2cpg/Py2CpgOnFileSystem.scala#L23 This is then followed by running default overlays such as OSSdataflow separately.

However, for Javasrc2cpg as well as Jssrc2cpg, we have a more "standard" way of creating CPGs, such as JsSrc2cpg.createCpgWithAllOverlays() (https://github.com/joernio/joern/blob/master/joern-cli/frontends/jssrc2cpg/src/main/scala/io/joern/jssrc2cpg/JsSrc2Cpg.scala#L44). Therefore, we need a way to standardize the APIs for Python for better integration with downstream tools.

@ml86
Contributor

ml86 commented Dec 20, 2022

The method JsSrc2cpg.createCpgWithAllOverlays() is basically a hack which @fabsx00 added a while back. It is not the way we want to go because it means that the languages depend on the OSS data flow engine.
There is already a PR that aims to remove this method; it has been pending for quite a while now: #1879

@tuxology Can you elaborate on what you actually need as an API, or do you just want "one common way" of directly calling the frontend classes?

@tuxology
Contributor Author

@ml86 @fabsx00 In general, I need "one common way" in which I can reliably generate CPGs with and without all overlays. It seems to me that createCpg and createCpgWithAllOverlays are the most standard way right now; I see this standard API used in Javasrc2cpg, jssrc2cpg, c2cpg, etc. I find the current Python frontend's CPG generation quite hacky, and we can use these two APIs as the standard. Frontend CLIs can be adjusted to use the with/without overlay APIs based on OSSdataflow engine maturity.

I already managed to modify the Python frontend without too many deep changes in the code to use this API and adjusted the tests as well. Will send a PR soon.

@ml86
Contributor

ml86 commented Dec 20, 2022

@tuxology py2cpg should also implement the X2CpgFrontend trait. A PR for this would be very welcome. But this PR should not prevent the py2cpg tests from running without going via the filesystem, which was the reason to abstract over the input files via inputProviders here:

class Py2Cpg(inputProviders: Iterable[Py2Cpg.InputProvider], outputCpg: Cpg) {
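To illustrate why abstracting over inputs keeps tests off the filesystem, here is a minimal sketch. It does not use the real pysrc2cpg types: `InputPair`, `ToyFrontend`, `fromString`, and `fromFile` are hypothetical stand-ins, and the "frontend" only counts lines instead of building a CPG.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

// Hypothetical stand-in for the frontend's input: source text plus a relative file name.
final case class InputPair(content: String, relFileName: String)

object ToyFrontend {
  // A provider is a thunk, so the source is only loaded when the frontend asks for it.
  type InputProvider = () => InputPair
}

// Toy "frontend" (NOT the real Py2Cpg): it only reports the line count per input file.
class ToyFrontend(inputProviders: Iterable[ToyFrontend.InputProvider]) {
  def lineCounts(): Map[String, Int] =
    inputProviders.map { provider =>
      val pair = provider()
      pair.relFileName -> pair.content.linesIterator.size
    }.toMap
}

object InputProviderDemo {
  // In-memory provider for tests: nothing is written to or read from disk.
  def fromString(code: String, name: String): ToyFrontend.InputProvider =
    () => InputPair(code, name)

  // Filesystem provider for the CLI path: reads the file lazily, on demand.
  def fromFile(path: Path, root: Path): ToyFrontend.InputProvider =
    () => InputPair(
      new String(Files.readAllBytes(path), StandardCharsets.UTF_8),
      root.relativize(path).toString
    )

  def main(args: Array[String]): Unit = {
    val frontend = new ToyFrontend(Seq(fromString("x = 1\nprint(x)\n", "test.py")))
    println(frontend.lineCounts())
  }
}
```

Because the frontend only sees providers, a filesystem-based entry point can be layered on top by building providers with something like `fromFile`, while unit tests keep passing strings directly.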

As for createCpgWithAllOverlays: this is a non-standard API that currently exists only in jssrc2cpg. This entry point will most likely be removed in the future, so we won't add it for other frontends.
But I guess you meant createCpgWithOverlays, which comes in via X2CpgFrontend, instead of createCpgWithAllOverlays?
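As a rough sketch of the "one common way" under discussion: a shared trait can give every frontend the same pair of entry points while keeping overlays out of the language-specific code. Everything below is a simplified, hypothetical model (`ToyCpg`, `Overlay`, `ToyFrontendApi`, `PyConfig`, `ToyPyFrontend`, `BaseOverlay`). It is not the actual X2CpgFrontend signature, and passing overlays in as a parameter is an assumption made here to show the decoupling.

```scala
import scala.util.{Success, Try}

// Hypothetical stand-in for a CPG: node labels plus the names of applied overlays.
final case class ToyCpg(nodes: Set[String], overlays: List[String] = Nil)

// An overlay is any pass that enriches an already-built graph.
trait Overlay {
  def name: String
  def apply(cpg: ToyCpg): ToyCpg
}

// Hypothetical common frontend trait (not the real X2CpgFrontend).
trait ToyFrontendApi[Config] {
  // Each language frontend implements only this method.
  def createCpg(config: Config): Try[ToyCpg]

  // Shared by all frontends. The caller supplies the overlays, so no frontend
  // needs a compile-time dependency on a particular engine.
  def createCpgWithOverlays(config: Config, overlays: Seq[Overlay]): Try[ToyCpg] =
    createCpg(config).map(cpg => overlays.foldLeft(cpg)((graph, overlay) => overlay(graph)))
}

final case class PyConfig(inputPath: String)

// Toy Python frontend: emits a single FILE node for the configured input path.
object ToyPyFrontend extends ToyFrontendApi[PyConfig] {
  def createCpg(config: PyConfig): Try[ToyCpg] =
    Success(ToyCpg(Set(s"FILE:${config.inputPath}")))
}

// Toy overlay that only records that it ran.
object BaseOverlay extends Overlay {
  val name = "base"
  def apply(cpg: ToyCpg): ToyCpg = cpg.copy(overlays = cpg.overlays :+ name)
}

object FrontendDemo {
  def main(args: Array[String]): Unit = {
    println(ToyPyFrontend.createCpg(PyConfig("a.py")))
    println(ToyPyFrontend.createCpgWithOverlays(PyConfig("a.py"), Seq(BaseOverlay)))
  }
}
```

With this shape, downstream tools call the same two methods on any frontend, and a data flow overlay can be added by the caller without the frontend module depending on it.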

@tuxology
Contributor Author

@ml86 The new frontend API I am working on for Py2Cpg is using the X2CpgFrontend trait for sure. Here is a draft PR: #2124. I am not done with this yet, but if you want to leave early comments, please do. I did change the tests, and I removed the class that implemented filesystem-specific CPG operations (I didn't understand the need for it, since X2CpgFrontend already allows files to be created on the FS, exclusion of files, etc.). I did keep the method that implemented some basic filtering of files.

@tuxology
Contributor Author

tuxology commented Jan 2, 2023

Closing this since this is merged

@tuxology tuxology closed this as completed Jan 2, 2023