-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Expand UnscopedFind
offenses in "graphql" namespaces
#30
feat: Expand UnscopedFind
offenses in "graphql" namespaces
#30
Conversation
node | ||
.ancestors | ||
.select { |n| n.class_type? || n.module_type? } | ||
.any? { |n| n.identifier.to_s.downcase.include?("graphql") } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: Just curious if we should match the string if it contains the "graphql"
string anywhere even if it's a subset of some larger string?
What if instead we used:
n.identifier.to_s.match?(/\bGraphQL\b/)
If we want to match GraphQL
and Graphql
or similar, then we could add the /i
regexp modifier to make it case insensitive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mm interesting, so I had intentionally written it to be at least flexible enough to match GraphQL
and Graphql
, thinking that I could easily imagine someone making the choice to downcase the QL
, and thinking that that decision didn't feel super relevant to me, as to whether this cop should still apply.
Down to hear any other opinions on the question! Either way, I think the regex is a nicer touch (particularly with the word boundary) so I can switch to the case insensitive version and then see what others think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should note that I think the odds of another string containing the subset graphql
, but it being something else, is pretty low. It's more as a matter of practice that I try to match strings in a way that would be correct in more instances.
I have seen production bugs caused by using String#include?
where a stricter match wouldn't have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left one question about the use of String#include?
and asking if we should be more precise in what we are matching.
platform tafn
/task https://app.asana.com/0/1200240516476402/1206132040061266/f will review this coming week! |
node | ||
.ancestors | ||
.select { |n| n.class_type? || n.module_type? } | ||
.any? { |n| n.identifier.to_s.match(/\bGraphQL\b/i) } | ||
.any? { |n| n.identifier.to_s.match(GRAPHQL_PATTERN) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: I would suggest using String#match?
because it satisfies the requirements, but also has a simpler return value; and is a "simpler primitive".
In code I tend to try to follow the principle of least power, which means that if I have 2 choices for a language construct, and both satisfy the requirements, I will choose the approach that has "fewer degrees of freedom" -- that is, it either has fewer or more constrained inputs or a more constrained return type.
In this case we are moving from a method with the MatchData return value to a Boolean. Sure, the evaluation of MatchData works due to how ruby evaluates non-nil objects in boolean context, but the Boolean return value of #match?
has even fewer ways it can be used (or misused!). At some point in the future another developer is going to read or change this code, and our intentions are more clear with a Boolean return value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
domain lgtm -- very straight forward change. thanks for putting this together!
Invalidated by push of 43ec18a
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
domain lgtm
def graphql_namespace?(node) | ||
return true if processed_source.path&.match?(GRAPHQL_PATTERN) | ||
|
||
node | ||
.ancestors | ||
.select { |n| n.class_type? || n.module_type? } | ||
.any? { |n| n.identifier.to_s.match?(GRAPHQL_PATTERN) } | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: This seems to be mixing concerns a bit, since it matches a file path against a pattern and then proceeds to match the stringified representation of the AST against the regexp. It works, but I would suggest splitting it out so it's more obvious.
The identifier.to_s
serialized to a longer string and I thought it might be better to match precisely on the name exactly.
What I would propose is to use RuboCop's built-in matching syntax to match the nodes rather than doing all of it by hand. The existing code uses some the matching syntax already so it's not much of a stretch, eg:
# Add under the two `def_node_matcher` calls, returns only nodes matching the regexp:
def_node_search :find_namespace_nodes, <<~PATTERN, name: GRAPHQL_PATTERN
(const _ %name)
PATTERN
# At the end of on_send:
add_offense(node) if find_param_arg(arg_nodes) || graphql_file? || graphql_namespace?(node)
# ...
def graphql_file?
processed_source.path&.match?(GRAPHQL_PATTERN)
end
def graphql_namespace?(node)
node
.each_ancestor(:class, :module)
.any? { |ancestor| find_namespace_nodes(ancestor).any? }
end
EDIT: I looked at the RuboCop pattern matching language and sadly there is no expression to say "match ancestors" to eliminate that extra #each_ancestors
call. I would've loved it if I could have #graphql_namespace?
just be something simple like: find_namespace_nodes(node).any?
.. unfortunately no luck. This is the simplest I could make it, but if I'm missing something lmk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice! This works well, and it's nice to use the intended tools for this. I also tried for a while to see if even the syntax you have here could be more concise using a smarter matcher instead of a full search, but ended up coming up with nothing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(also FYI, I changed your method name to find_graphql_namespace_nodes
since I felt like there was context missing from it)
Invalidated by push of 91f62da
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the improved name for the node search method.
platform lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
domain LGTM!
Summary
Currently, the
UnscopedFind
cop currently identifies violations of trust-root-chaining by looking for different types of custom finds that contain "user input"—specifically, that a non-trust-root-chained find contains a direct or indirect reference toparams
. For ruby applications using GraphQL, this cop will almost never find violations, since GraphQL implementations will idiomatically have no reference toparams
.Here, I'm attempting to bridge the
UnscopedFind
gap for GraphQL implementations by also considering relevant find nodes in violation with the cop when inside of a namespace matching "graphql". Since popular packages likegraphql-ruby
promote the usage ofGraphQL
namespaces, I think this is the highest-fidelity fix I can imagine to expand the relevance of this cop outside of REST, without fully removing theparams
-based logic for the current cop./domain @smudge @effron
/domain @lindan-betterment
/platform @Betterment/ruby-platform-reviewers