JavaSrc2Cpg: Infer type by Namespace and arg/parameter size #4434
This type inference is meant to be more sound than type recovery, so I'd prefer you add one more constraint: if we're ignoring argument types, no other method may have the same number of args; otherwise the call could resolve to either method.
candidateMethodsIter.find(isMatchingMethod(_, call, callNameParts, ignoreArgTypes = ignoreArgTypes)).flatMap { method =>
  val otherMatchingMethod =
    candidateMethodsIter.find(isMatchingMethod(_, call, callNameParts, ignoreArgTypes = ignoreArgTypes))
An iterator can only be used once, and I see it used twice. Rather get rid of candidateMethodsIter and just call candidateMethods.start or candidateMethods.iterator when you need it.
The reason for calling it twice is:
- The first find call on the iterator gives us the first match, and the iterator stops there
- We use the same iterator to continue the search and see whether there is another matching method
This is an optimization over starting the iteration from the beginning again.
No, iterators cannot be used more than once. I've verified this in a shell for you e.g.
scala> val x = Iterator(1, 2, 3)
val x: Iterator[Int] = <iterator>
scala> x.toList
val res2: List[Int] = List(1, 2, 3)
scala> x.toList
val res3: List[Int] = List()
This is intentional (maybe a comment is required in code to explain this) and is an optimisation to avoid having to traverse the entire iterator twice.
The first call to candidateMethodsIter.find consumes iterator elements until the first match, but stops at this point. The second find call then continues the search until a second match is found.
To illustrate with an example considering only the first find call:
scala> val x = Iterator(1, 2, 3, 2, 4)
val x: Iterator[Int] = <iterator>
scala> x.find(_ == 2)
val res3: Option[Int] = Some(2)
scala> x.toList
val res4: List[Int] = List(3, 2, 4)
And for both calls:
scala> val x = Iterator(1, 2, 3, 2, 4)
val x: Iterator[Int] = <iterator>
scala> x.find(_ == 2)
val res5: Option[Int] = Some(2)
scala> x.find(_ == 2)
val res6: Option[Int] = Some(2)
scala> x.toList
val res7: List[Int] = List(4)
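The "exactly one match" check discussed here can also be written as a small helper. This is only a sketch: `findUnique` and the `UniqueMatch` object are hypothetical names, not code from this PR. It uses just `hasNext` and `next`, since the Scala `Iterator` documentation generally advises against reusing an iterator after calling other methods on it, and it still stops as soon as a second match is seen.

```scala
object UniqueMatch {
  // Hypothetical helper: returns the single element satisfying `p`,
  // or None when there is no match or more than one match.
  // Only hasNext/next are called, and traversal stops at the second match.
  def findUnique[A](it: Iterator[A])(p: A => Boolean): Option[A] = {
    var first: Option[A] = None
    while (it.hasNext) {
      val a = it.next()
      if (p(a)) {
        if (first.isDefined) return None // second match found: ambiguous
        first = Some(a)
      }
    }
    first
  }
}
```

With the input from the session above, `UniqueMatch.findUnique(Iterator(1, 2, 3, 2, 4))(_ == 2)` gives `None` because 2 appears twice, and the iterator is left with only the trailing 4 unconsumed.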
Agree, find won't consume the iterator fully. Instead, when the condition is true, find returns the matched item to the flatMap for further processing.
Ah now I understand, TDIL! Thanks
Yes, this is already handled. The pass will only proceed and infer if we find only a single method matching the criteria
The changes in this PR seem to achieve the goal they intend to, but this was actually the way it was done in the
@fabsx00 and @DavidBakerEffendi I think it would be good to have another discussion about type inference in javasrc2cpg. We've run into a couple of situations where legitimate bugs were hidden by various type inference mechanisms (which would've been discovered easily if the signatures contained Ideally, the CPG created by javasrc2cpg would only contain type information we know from the This is a discussion to have separate from this PR though.
@khemrajrathore, based on @johannescoetzee's message, I think it would be faster to override this in Privado's side if this is an urgent feature.
@johannescoetzee Another option we discussed was if we added a flag that allows for this behaviour to be enabled, and is off by default. Would this be a good compromise?
@DavidBakerEffendi @khemrajrathore An off-by-default flag would work, although in my opinion this would be a temporary measure until we reach a decision on how to handle inference in general. As such, I suggest adding it as a hidden flag with a description saying this is temporary to avoid issues if we want to remove it later |
This PR has the following changes:
TypeInferencePass, a call is linked to a method if:
- Namespace matches, and
- Number of arguments in call and number of parameters in method match, and
- Types of arguments in call and types of parameters in method match
call to a method, we try to link if:
- Namespace matches, and
- Number of arguments in call and number of parameters in method match
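The two sets of conditions can be sketched roughly as follows. Everything here is an assumption made for illustration: `Method`, `Call`, `link` and the `LinkSketch` object are simplified stand-ins rather than the javasrc2cpg types, the method name is compared alongside the namespace, and both stages accept a result only when exactly one candidate matches, in line with the review comments.

```scala
object LinkSketch {
  // Hypothetical, simplified stand-ins for CPG nodes (not the real javasrc2cpg classes).
  final case class Method(namespace: String, name: String, paramTypes: List[String])
  final case class Call(namespace: String, name: String, argTypes: List[String])

  // Namespace, name and argument/parameter count must agree;
  // types are compared only when ignoreArgTypes is false.
  private def isMatching(m: Method, c: Call, ignoreArgTypes: Boolean): Boolean =
    m.namespace == c.namespace &&
      m.name == c.name &&
      m.paramTypes.size == c.argTypes.size &&
      (ignoreArgTypes || m.paramTypes == c.argTypes)

  // Accept a candidate only when it is the sole match (assumed for both stages).
  private def unique(candidates: List[Method], c: Call, ignoreArgTypes: Boolean): Option[Method] =
    candidates.filter(isMatching(_, c, ignoreArgTypes)) match {
      case single :: Nil => Some(single)
      case _             => None
    }

  // Stage 1: match including types. Stage 2 (fallback): match on count only.
  def link(candidates: List[Method], c: Call): Option[Method] =
    unique(candidates, c, ignoreArgTypes = false)
      .orElse(unique(candidates, c, ignoreArgTypes = true))
}
```

In this sketch a call to an overloaded method whose argument types are unknown stays unlinked, because two candidates share the same parameter count.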