Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Inline array filters #3028

Open
wants to merge 6 commits into
base: develop
Choose a base branch
from

Conversation

nasdf
Copy link
Member

@nasdf nasdf commented Sep 18, 2024

Relevant issue(s)

Resolves #2857

Description

This PR adds three new filter types for inline array types: _all, _any, and _none.

Tasks

  • I made sure the code is well commented, particularly hard-to-understand areas.
  • I made sure the repository-held documentation is changed accordingly.
  • I made sure the pull request title adheres to the conventional commit style (the subset used in the project can be found in tools/configs/chglog/config.yml).
  • I made sure to discuss its limitations such as threats to validity, vulnerability to mistake and misuse, robustness to invalidation of assumptions, resource requirements, ...

How has this been tested?

Added integration tests

Specify the platform(s) on which this was tested:

  • MacOS

@nasdf nasdf added feature New feature or request area/query Related to the query component labels Sep 18, 2024
@nasdf nasdf added this to the DefraDB v0.14 milestone Sep 18, 2024
@nasdf nasdf self-assigned this Sep 18, 2024
@nasdf nasdf requested a review from a team September 18, 2024 00:03
Copy link

codecov bot commented Sep 18, 2024

Codecov Report

Attention: Patch coverage is 94.40559% with 16 lines in your changes missing coverage. Please review.

Project coverage is 79.42%. Comparing base (5b58c19) to head (670f3b2).

Files with missing lines Patch % Lines
internal/connor/all.go 75.00% 5 Missing and 2 partials ⚠️
internal/connor/any.go 75.00% 5 Missing and 2 partials ⚠️
internal/connor/none.go 60.00% 1 Missing and 1 partial ⚠️
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #3028      +/-   ##
===========================================
+ Coverage    79.38%   79.42%   +0.04%     
===========================================
  Files          333      336       +3     
  Lines        25837    26072     +235     
===========================================
+ Hits         20509    20706     +197     
- Misses        3864     3889      +25     
- Partials      1464     1477      +13     
Flag Coverage Δ
all-tests 79.42% <94.41%> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
internal/connor/and.go 54.55% <ø> (ø)
internal/connor/connor.go 100.00% <100.00%> (ø)
internal/connor/eq.go 94.87% <100.00%> (+2.71%) ⬆️
internal/planner/mapper/mapper.go 88.83% <100.00%> (ø)
internal/request/graphql/schema/generate.go 88.06% <100.00%> (+0.10%) ⬆️
internal/request/graphql/schema/manager.go 98.67% <100.00%> (+0.23%) ⬆️
internal/request/graphql/schema/types/base.go 100.00% <100.00%> (ø)
internal/connor/none.go 60.00% <60.00%> (ø)
internal/connor/all.go 75.00% <75.00%> (ø)
internal/connor/any.go 75.00% <75.00%> (ø)

... and 15 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5b58c19...670f3b2. Read the comment docs.

Copy link
Collaborator

@fredcarle fredcarle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Very nice and thanks for all the tests.

Copy link
Contributor

@islamaliev islamaliev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I just requested few changes.

// matching if all of them match.
func all(condition, data any) (bool, error) {
switch t := data.(type) {
case []string:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: I think it's better to use reflection here to avoid repetitive code:

func all(condition, data any) {
    val := reflect.ValueOf(v)
    if val.Kind() != reflect.Slice {
        return false, client.NewErrUnhandledType("data", data)
    }

    for i := 0; i < val.Len(); i++ {
        m, err := eq(condition, val.Index(i).Interface())
            if err != nil {
                return false, err
            } else if !m {
                return false, nil
            }
        }
    }
    return true, nil
}

Or at least use generic:

func allInArr[T any](slice []T, condition any) (bool, error) {
	for _, item := range slice {
		m, err := eq(condition, item)
		if err != nil {
			return false, err
		}
		if !m {
			return false, nil
		}
	}
	return true, nil
}

func all(condition, data any) (bool, error) {
	switch t := data.(type) {
	case []string:
		return allInArr(t, condition)
	case []immutable.Option[string]:
		return allInArr(t, condition)
	case []int64:
		return allInArr(t, condition)
// ...

This will also improve test coverage.

Copy link
Collaborator

@fredcarle fredcarle Sep 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reflect is a bit slow and not fully supported in wasm so lets try to avoid it for now. I like the generic idea though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100% agree with Fred, we should avoid reflect, especially in a performance sensitive area like this. Extracting the shared logic to a generic func would be good though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// matching if any of them match.
func anyOp(condition, data any) (bool, error) {
switch t := data.(type) {
case []string:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: please use the same approach as with all

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 53 to 56
if arr.HasValue() {
data = arr.Value()
} else {
data = nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: this could also be turned into generic code

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines +48 to +61
Fields: gql.InputObjectConfigFieldMap{
"_any": &gql.InputObjectFieldConfig{
Description: anyOperatorDescription,
Type: op,
},
"_all": &gql.InputObjectFieldConfig{
Description: allOperatorDescription,
Type: op,
},
"_none": &gql.InputObjectFieldConfig{
Description: noneOperatorDescription,
Type: op,
},
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: this can be extracted into a function and used everywhere

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mixed feelings about the extraction of the full InputObjectFieldConfig blocks to a shared func (it couples them together, and hides what each InputObject contains, yet the enforcement of consistency is nice).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My preference is to keep them as is because it makes modifying each individual type easier.

testUtils "github.com/sourcenetwork/defradb/tests/integration"
)

func TestQueryInlineStringArrayWithAllFilter(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: we try to stick to the following test naming convention:
<Category>_<Condition|Prerequisite>_<Result|Expectation>
In this case it could be
TestQueryInlineStringArray_WithAllFilter_Succeeds

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: I did think we agreed to not force that suggestion upon developers though, especially the last block.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that that last part is optional

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines +145 to +147
Users(filter: {testScores: {_all: {_lt: 70}}}) {
name
}
Copy link
Contributor

@islamaliev islamaliev Sep 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: would be also great to see a reassurance (in form of another test) that this new op works with compound conditions. For example:
filter: {testScores: {_all: {_and: [{_lt: 70}, {_gte: 0}]}}}
And to do the same with other new ops.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added two compound tests.

notNullBooleanOpBlock := schemaTypes.NotNullBooleanOperatorBlock()
notNullStringOpBlock := schemaTypes.NotNullStringOperatorBlock()
notNullJSONOpBlock := schemaTypes.NotNullJSONOperatorBlock(jsonScalarType)
notNullBlobOpBlock := schemaTypes.NotNullBlobOperatorBlock(blobScalarType)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: I'm curious - why did you pull the existing stuff out of the return and into temp vars? And then add new func calls within the return?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It made sense to me organizationally. Also the temp vars are used as inputs to the calls in the return.

Name: "BooleanListOperatorBlock",
Description: "These are the set of filter operators available for use when filtering on [Boolean] values.",
Fields: gql.InputObjectConfigFieldMap{
"_any": &gql.InputObjectFieldConfig{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Independent of Islam's suggestion, it might be nice to extract the names to consts in the client/request package and then also reference them in the connor package as well as here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is out of scope for this PR imo, but I will create an issue for this.

Copy link
Contributor

@islamaliev islamaliev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/query Related to the query component feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enable documents filtering by inline array fields
4 participants