feat: optimizer rewrite #5676

kylemumma · 2024-03-21T17:38:14Z

https://getsentry.atlassian.net/browse/SNS-2669
This PR should get all of Riccardo's broken tests passing.

What motivated the rewrite?

I needed to implement new functionality to get Riccardo's tests passing.
It was going to be a lot of work to do.
After talking to Enoch we thought of an alternative approach, which turned out to be a better solution that would be faster to implement.

Major Changes:

near-complete rewrite of the optimizer
optimizer feature flag is now on by default
fixed the broken integration tests

Here's how it works now:

Use a visitor to collect all the conditional aggregates (e.g. sumMergeIf) in the AST.
OR together all the conditions in these functions, and add the result to the WHERE clause.

Example:

SELECT sumIf(val, metric_id in [1,2,3] and status=200),
       avgIf(val, metric_id=7), 
       maxIf(val, metric_id=5 and status=400)
WHERE org_id=1
...

will generate the following condition:

(metric_id in [1,2,3] and status=200) or (metric_id=7) or (metric_id=5 and status=400)

and add it to the WHERE clause:

SELECT sumIf(val, metric_id in [1,2,3] and status=200),
       avgIf(val, metric_id=7), 
       maxIf(val, metric_id=5 and status=400)
WHERE org_id=1 and (
        (metric_id in [1,2,3] and status=200) or (metric_id=7) or (metric_id=5 and status=400)
    )
...
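
For illustration, the approach above can be sketched in a few lines of Python. This is a toy model and not the code in filter_in_select_optimizer.py: the Column, Literal, and Call classes and the collect_conditions, or_all, and add_filter helpers are hypothetical names made up for this sketch, and it assumes that a conditional aggregate is any function whose name ends in "If" and whose last argument is the condition. It also assumes every selected aggregate is conditional; a plain aggregate such as sum(val) needs all rows, and the sketch does not handle that case.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import List, Optional, Union


# Hypothetical mini-AST, standing in for the real query expression classes.
@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Literal:
    value: object


@dataclass(frozen=True)
class Call:
    name: str
    args: tuple


Expr = Union[Column, Literal, Call]


def collect_conditions(expr: Expr, found: List[Expr]) -> None:
    """Visit expr and record the condition of every conditional aggregate."""
    if not isinstance(expr, Call):
        return
    if expr.name.endswith("If") and expr.args:
        # Assumption: the condition is the last argument of an *If function.
        found.append(expr.args[-1])
        children = expr.args[:-1]
    else:
        children = expr.args
    for child in children:
        collect_conditions(child, found)


def or_all(conditions: List[Expr]) -> Expr:
    """Combine a non-empty list of conditions with OR, left to right."""
    result = conditions[0]
    for condition in conditions[1:]:
        result = Call("or", (result, condition))
    return result


def add_filter(select: List[Expr], where: Optional[Expr]) -> Optional[Expr]:
    """Return the WHERE clause extended with the OR of all found conditions."""
    found: List[Expr] = []
    for expr in select:
        collect_conditions(expr, found)
    if not found:
        return where
    domain = or_all(found)
    return domain if where is None else Call("and", (where, domain))


# The example query from this description, in the toy AST.
cond_sum = Call("and", (
    Call("in", (Column("metric_id"), Literal((1, 2, 3)))),
    Call("equals", (Column("status"), Literal(200))),
))
cond_avg = Call("equals", (Column("metric_id"), Literal(7)))
cond_max = Call("and", (
    Call("equals", (Column("metric_id"), Literal(5))),
    Call("equals", (Column("status"), Literal(400))),
))
select = [
    Call("sumIf", (Column("val"), cond_sum)),
    Call("avgIf", (Column("val"), cond_avg)),
    Call("maxIf", (Column("val"), cond_max)),
]
where = Call("equals", (Column("org_id"), Literal(1)))
new_where = add_filter(select, where)
```

Here new_where is org_id=1 AND ((sumIf condition OR avgIf condition) OR maxIf condition), which matches the rewritten WHERE clause shown above.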

codecov · 2024-03-21T17:59:22Z

Codecov Report

Attention: Patch coverage is 93.79845% with 8 lines in your changes are missing coverage. Please review.

Project coverage is 89.91%. Comparing base (4788281) to head (c744ea9).

✅ All tests successful. No failed tests found ☺️

Files	Patch %	Lines
...y/processors/logical/filter_in_select_optimizer.py	86.53%	7 Missing ⚠️
snuba/query/dsl.py	90.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #5676      +/-   ##
==========================================
+ Coverage   89.86%   89.91%   +0.04%     
==========================================
  Files         900      900              
  Lines       43770    43758      -12     
  Branches      301      301              
==========================================
+ Hits        39333    39344      +11     
+ Misses       4395     4372      -23     
  Partials       42       42


kylemumma · 2024-03-26T20:49:49Z

snuba/query/conditions.py

+    """This function is deprecated please use snuba.query.dsl.binary_condition"""
+    from snuba.query.dsl import binary_condition as dsl_binary_condition
+
+    return dsl_binary_condition(function_name, lhs, rhs)


I couldn't use this function in dsl.py because of a circular import with conditions.py. I thought it belonged in dsl.py anyway, so I moved it.

kylemumma · 2024-03-26T20:51:57Z

snuba/query/mql/parser.py

+            op="processor", description="filter_in_select_optimize"
+        ):
+            if settings is None:
+                FilterInSelectOptimizer().process_query(query, HTTPQuerySettings())


I would like to point out that in this context settings can be None, but the LogicalQueryProcessor interface requires settings to be non-None, so I had to create a dummy object.

Since settings aren't used by my processor it doesn't matter, but having to create this dummy instead of passing None smells to me.
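
A minimal, self-contained sketch of the pattern described here, for readers without the surrounding code. QuerySettings, NoopOptimizer, and optimize are hypothetical stand-ins invented for this example (for HTTPQuerySettings, the processor, and the call site in the parser); none of them are snuba APIs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuerySettings:
    """Hypothetical stand-in for HTTPQuerySettings."""

    dry_run: bool = False


class NoopOptimizer:
    """Hypothetical processor whose interface requires non-None settings."""

    def process_query(self, query: dict, settings: QuerySettings) -> None:
        # Settings are accepted to satisfy the interface but never read.
        query["optimized"] = True


def optimize(query: dict, settings: Optional[QuerySettings] = None) -> dict:
    # Substitute a default settings object when the caller has none,
    # because the processor interface does not accept None.
    effective = settings if settings is not None else QuerySettings()
    NoopOptimizer().process_query(query, effective)
    return query
```

The fallback keeps the processor interface strict, at the cost of constructing an object that is never read.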

kylemumma · 2024-03-26T20:56:35Z

tests/query/parser/test_formula_mql_query.py

I decided to just get these tests working rather than do a refactor that loosens the coupling between parsing and post-processing/optimization. Maybe it's something I could think about in the future.

Evan made an interesting point about end-to-end testing as a single source of truth and the increased confidence it brings in the pipeline. The tradeoff seems to be that this test file has to change any time another stage is added. I'm not sure what the right answer is.


getsentry-bot · 2024-03-28T17:54:04Z

PR reverted: af29c13

This reverts commit f746e1d. Co-authored-by: kylemumma <24424170+kylemumma@users.noreply.github.com>

kylemumma mentioned this pull request Mar 21, 2024

feat: mql test update for snuba change getsentry/sentry#67444

Merged
