feat: Expand `UnscopedFind` offenses in "graphql" namespaces #30

willlockwood · 2023-12-08T01:02:50Z

Summary

Currently, the UnscopedFind cop currently identifies violations of trust-root-chaining by looking for different types of custom finds that contain "user input"—specifically, that a non-trust-root-chained find contains a direct or indirect reference to params. For ruby applications using GraphQL, this cop will almost never find violations, since GraphQL implementations will idiomatically have no reference to params.

Here, I'm attempting to bridge the UnscopedFind gap for GraphQL implementations by also considering relevant find nodes in violation with the cop when inside of a namespace matching "graphql". Since popular packages like graphql-ruby promote the usage of GraphQL namespaces, I think this is the highest-fidelity fix I can imagine to expand the relevance of this cop outside of REST, without fully removing the params-based logic for the current cop.

/domain @smudge @effron
/domain @lindan-betterment
/platform @Betterment/ruby-platform-reviewers

dkubb · 2023-12-08T01:22:42Z

lib/rubocop/cop/betterment/unscoped_find.rb

+          node
+            .ancestors
+            .select { |n| n.class_type? || n.module_type? }
+            .any? { |n| n.identifier.to_s.downcase.include?("graphql") }


question: Just curious if we should match the string if it contains the "graphql" string anywhere even if it's a subset of some larger string?

What if instead we used:

n.identifier.to_s.match?(/\bGraphQL\b/)

If we want to match GraphQL and Graphql or similar, then we could add the /i regexp modifier to make it case insensitive.

Mm interesting, so I had intentionally written it to be at least flexible enough to match GraphQL and Graphql, thinking that I could easily imagine someone making the choice to downcase the QL, and thinking that that decision didn't feel super relevant to me, as to whether this cop should still apply.

Down to hear any other opinions on the question! Either way, I think the regex is a nicer touch (particularly with the word boundary) so I can switch to the case insensitive version and then see what others think?

I should note that I think the odds of another string containing the subset graphql, but it being something else, is pretty low. It's more as a matter of practice that I try to match strings in a way that would be correct in more instances.

I have seen production bugs caused by using String#include? where a stricter match wouldn't have.

dkubb

I left one question about the use of String#include? and asking if we should be more precise in what we are matching.

platform tafn

6f6d6172 · 2023-12-08T22:15:41Z

/task https://app.asana.com/0/1200240516476402/1206132040061266/f

will review this coming week!

dkubb · 2023-12-11T18:09:41Z

lib/rubocop/cop/betterment/unscoped_find.rb

          node
            .ancestors
            .select { |n| n.class_type? || n.module_type? }
-            .any? { |n| n.identifier.to_s.match(/\bGraphQL\b/i) }
+            .any? { |n| n.identifier.to_s.match(GRAPHQL_PATTERN) }


suggestion: I would suggest using String#match? because it satisfies the requirements, but also has a simpler return value; and is a "simpler primitive".

In code I tend to try to follow the principle of least power, which means that if I have 2 choices for a language construct, and both satisfy the requirements, I will choose the approach that has "fewer degrees of freedom" -- that is, it either has fewer or more constrained inputs or a more constrained return type.

In this case we are moving from a method with the MatchData return value to a Boolean. Sure, the evaluation of MatchData works due to how ruby evaluates non-nil objects in boolean context, but the Boolean return value of #match? has even fewer ways it can be used (or misused!). At some point in the future another developer is going to read or change this code, and our intentions are more clear with a Boolean return value.

6f6d6172

domain lgtm -- very straight forward change. thanks for putting this together!

Invalidated by push of 43ec18a

6f6d6172

domain lgtm

dkubb · 2023-12-15T00:21:04Z

lib/rubocop/cop/betterment/unscoped_find.rb

+        def graphql_namespace?(node)
+          return true if processed_source.path&.match?(GRAPHQL_PATTERN)
+
+          node
+            .ancestors
+            .select { |n| n.class_type? || n.module_type? }
+            .any? { |n| n.identifier.to_s.match?(GRAPHQL_PATTERN) }
+        end


suggestion: This seems to be mixing concerns a bit, since it matches a file path against a pattern and then proceeds to match the stringified representation of the AST against the regexp. It works, but I would suggest splitting it out so it's more obvious.

The identifier.to_s serialized to a longer string and I thought it might be better to match precisely on the name exactly.

What I would propose is to use RuboCop's built-in matching syntax to match the nodes rather than doing all of it by hand. The existing code uses some the matching syntax already so it's not much of a stretch, eg:

# Add under the two `def_node_matcher` calls, returns only nodes matching the regexp: def_node_search :find_namespace_nodes, <<~PATTERN, name: GRAPHQL_PATTERN (const _ %name) PATTERN # At the end of on_send: add_offense(node) if find_param_arg(arg_nodes) || graphql_file? || graphql_namespace?(node) # ... def graphql_file? processed_source.path&.match?(GRAPHQL_PATTERN) end def graphql_namespace?(node) node .each_ancestor(:class, :module) .any? { |ancestor| find_namespace_nodes(ancestor).any? } end

EDIT: I looked at the RuboCop pattern matching language and sadly there is no expression to say "match ancestors" to eliminate that extra #each_ancestors call. I would've loved it if I could have #graphql_namespace? just be something simple like: find_namespace_nodes(node).any? .. unfortunately no luck. This is the simplest I could make it, but if I'm missing something lmk.

Very nice! This works well, and it's nice to use the intended tools for this. I also tried for a while to see if even the syntax you have here could be more concise using a smarter matcher instead of a full search, but ended up coming up with nothing.

(also FYI, I changed your method name to find_graphql_namespace_nodes since I felt like there was context missing from it)

Invalidated by push of 91f62da

dkubb

I like the improved name for the node search method.

platform lgtm

smudge

domain LGTM!

This PR updates the betterlint version to `1.7.0` to include #29 and #30 in a new gem release.

feat: Expand UnscopedFind offenses in "graphql" namespaces

ad7cd91

dkubb reviewed Dec 8, 2023

View reviewed changes

Incorporate regex suggestion

d3660d2

Add violation condition for when file path matches graphql pattern

8d76335

dkubb reviewed Dec 11, 2023

View reviewed changes

6f6d6172 previously approved these changes Dec 13, 2023

View reviewed changes

Use match? instead of match in graphql regex match

43ec18a

willlockwood requested review from dkubb and 6f6d6172 December 13, 2023 21:17

6f6d6172 previously approved these changes Dec 13, 2023

View reviewed changes

dkubb reviewed Dec 15, 2023

View reviewed changes

Use the intended rubocop node search

91f62da

dkubb approved these changes Dec 15, 2023

View reviewed changes

smudge approved these changes Dec 15, 2023

View reviewed changes

willlockwood mentioned this pull request Dec 18, 2023

chore: Update version to 1.7.0 #31

Merged

smudge merged commit 0040628 into Betterment:main Dec 18, 2023
4 checks passed

smudge pushed a commit that referenced this pull request Dec 18, 2023

chore: Update version to 1.7.0 (#31)

39dfc45

This PR updates the betterlint version to `1.7.0` to include #29 and #30 in a new gem release.

willlockwood deleted the lockwood/graphql-unscoped-find branch December 18, 2023 22:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Expand `UnscopedFind` offenses in "graphql" namespaces #30

feat: Expand `UnscopedFind` offenses in "graphql" namespaces #30

willlockwood commented Dec 8, 2023

dkubb Dec 8, 2023 •

edited

Loading

willlockwood Dec 8, 2023

dkubb Dec 11, 2023

dkubb left a comment •

edited

Loading

6f6d6172 commented Dec 8, 2023

dkubb Dec 11, 2023 •

edited

Loading

6f6d6172 left a comment

6f6d6172 left a comment

dkubb Dec 15, 2023 •

edited

Loading

willlockwood Dec 15, 2023

willlockwood Dec 15, 2023

dkubb left a comment

smudge left a comment

feat: Expand UnscopedFind offenses in "graphql" namespaces #30

feat: Expand UnscopedFind offenses in "graphql" namespaces #30

Conversation

willlockwood commented Dec 8, 2023

Summary

dkubb Dec 8, 2023 • edited Loading

Choose a reason for hiding this comment

willlockwood Dec 8, 2023

Choose a reason for hiding this comment

dkubb Dec 11, 2023

Choose a reason for hiding this comment

dkubb left a comment • edited Loading

Choose a reason for hiding this comment

6f6d6172 commented Dec 8, 2023

dkubb Dec 11, 2023 • edited Loading

Choose a reason for hiding this comment

6f6d6172 left a comment

Choose a reason for hiding this comment

6f6d6172 left a comment

Choose a reason for hiding this comment

dkubb Dec 15, 2023 • edited Loading

Choose a reason for hiding this comment

willlockwood Dec 15, 2023

Choose a reason for hiding this comment

willlockwood Dec 15, 2023

Choose a reason for hiding this comment

dkubb left a comment

Choose a reason for hiding this comment

smudge left a comment

Choose a reason for hiding this comment

feat: Expand `UnscopedFind` offenses in "graphql" namespaces #30

feat: Expand `UnscopedFind` offenses in "graphql" namespaces #30

dkubb Dec 8, 2023 •

edited

Loading

dkubb left a comment •

edited

Loading

dkubb Dec 11, 2023 •

edited

Loading

dkubb Dec 15, 2023 •

edited

Loading