Commit

[cases](array_contains)add cases for array_contains supporting inverted index and fix stopwords as query string (apache#35299)

## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->

## Further comments

If this is a relatively large or complex change, kick off the discussion
at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why
you chose the solution you did and what alternatives you considered,
etc...
amorynan committed May 30, 2024
1 parent 3c11f09 commit fad30bf
Showing 48 changed files with 7,999 additions and 2 deletions.
23 changes: 21 additions & 2 deletions be/src/vec/functions/array/function_array_index.h
@@ -129,13 +129,32 @@ class FunctionArrayIndex : public IFunction {
             return Status::Error<ErrorCode::INVERTED_INDEX_EVALUATE_SKIPPED>(
                     "Inverted index evaluate skipped, param_value is nullptr or value is null");
         }
+        if (iter->get_inverted_index_reader_type() ==
+            segment_v2::InvertedIndexReaderType::FULLTEXT) {
+            // parser is not none we can not make sure the result is correct in expr combination
+            // for example, filter: !array_index(array, 'tall:120cm, weight: 35kg')
+            // here we have rows [tall:120cm, weight: 35kg, hobbies: reading book] which be tokenized
+            // but query is also tokenized, and FULLTEXT reader will catch this row as matched,
+            // so array_index(array, 'tall:120cm, weight: 35kg') return this rowid,
+            // but we expect it to be filtered, because we want row is equal to 'tall:120cm, weight: 35kg'
+            return Status::Error<ErrorCode::INVERTED_INDEX_EVALUATE_SKIPPED>(
+                    "Inverted index evaluate skipped, FULLTEXT reader can not support array_index");
+        }
         std::unique_ptr<InvertedIndexQueryParamFactory> query_param = nullptr;
         RETURN_IF_ERROR(InvertedIndexQueryParamFactory::create_query_value(
                 param_value->type, &param_value->value, query_param));
         if (is_string_type(param_value->type)) {
-            RETURN_IF_ERROR(iter->read_from_inverted_index(
+            Status st = iter->read_from_inverted_index(
                     data_type_with_name.first, query_param->get_value(),
-                    segment_v2::InvertedIndexQueryType::EQUAL_QUERY, num_rows, roaring));
+                    segment_v2::InvertedIndexQueryType::EQUAL_QUERY, num_rows, roaring);
+            if (st.code() == ErrorCode::INVERTED_INDEX_NO_TERMS) {
+                // if analyzed param with no term, we do not filter any rows
+                // return all rows with OK status
+                bitmap->addRange(0, num_rows);
+                return Status::OK();
+            } else if (st != Status::OK()) {
+                return st;
+            }
         } else {
             RETURN_IF_ERROR(iter->read_from_inverted_index(
                     data_type_with_name.first, query_param->get_value(),
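The comment in the hunk above describes two behaviors: a tokenizing (FULLTEXT) reader can report a row as matching even though no array element equals the query string, and a query that analyzes to no terms should not filter any rows. The following is a minimal sketch of both ideas under stated assumptions. The analyzer, the stopword list, and the names `analyze`, `tokenized_match`, `lookup`, and `candidate_rows` are hypothetical stand-ins for illustration and are not Doris APIs.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <numeric>
#include <set>
#include <string>
#include <vector>

// Hypothetical stopword list, for illustration only.
static const std::set<std::string> kStopwords = {"a", "an", "the", "and", "is"};

// Toy analyzer (assumption): lowercase, split on non-alphanumerics, drop stopwords.
std::vector<std::string> analyze(const std::string& text) {
    std::vector<std::string> terms;
    std::string cur;
    auto flush = [&]() {
        if (!cur.empty() && kStopwords.count(cur) == 0) terms.push_back(cur);
        cur.clear();
    };
    for (unsigned char c : text) {
        if (std::isalnum(c)) {
            cur.push_back(static_cast<char>(std::tolower(c)));
        } else {
            flush();
        }
    }
    flush();
    return terms;
}

// Tokenized match in the style of a full-text reader: the element matches
// when every query term occurs among the element's terms.
bool tokenized_match(const std::string& element, const std::string& query) {
    std::vector<std::string> elem_terms = analyze(element);
    std::set<std::string> have(elem_terms.begin(), elem_terms.end());
    for (const std::string& term : analyze(query)) {
        if (have.count(term) == 0) return false;
    }
    return true;
}

enum class LookupStatus { kOk, kNoTerms };

// Toy index lookup: reports kNoTerms when the analyzed query is empty,
// otherwise collects the ids of rows that match term by term.
LookupStatus lookup(const std::vector<std::string>& rows, const std::string& query,
                    std::vector<uint32_t>* hits) {
    assert(hits != nullptr);
    hits->clear();
    if (analyze(query).empty()) return LookupStatus::kNoTerms;
    for (uint32_t i = 0; i < rows.size(); ++i) {
        if (tokenized_match(rows[i], query)) hits->push_back(i);
    }
    return LookupStatus::kOk;
}

// Mirrors the no-terms branch in the diff: a query with no terms cannot
// narrow the candidate set, so every row id in [0, rows.size()) is kept.
std::vector<uint32_t> candidate_rows(const std::vector<std::string>& rows,
                                     const std::string& query) {
    std::vector<uint32_t> hits;
    if (lookup(rows, query, &hits) == LookupStatus::kNoTerms) {
        hits.resize(rows.size());
        std::iota(hits.begin(), hits.end(), 0u);
    }
    return hits;
}
```

With the row and query from the code comment, `tokenized_match("tall:120cm, weight: 35kg, hobbies: reading book", "tall:120cm, weight: 35kg")` returns true although the two strings are not equal, so a negated filter built on that result would wrongly drop the row. That is the case the early `INVERTED_INDEX_EVALUATE_SKIPPED` return guards against. In the same toy setting, a query made only of stopwords yields every row id from `candidate_rows`.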
1,000 changes: 1,000 additions & 0 deletions regression-test/data/inverted_index_p0/array_contains/documents-1000.json

Large diffs are not rendered by default.

@@ -0,0 +1,100 @@
-- This file is automatically generated. You should know what you did if you want to edit this
-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

@@ -0,0 +1,7 @@
-- This file is automatically generated. You should know what you did if you want to edit this
-- !sql --
2

-- !sql --
3

@@ -0,0 +1,46 @@
-- This file is automatically generated. You should know what you did if you want to edit this
-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0 0

-- !sql --
0 \N

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0

-- !sql --
0 0

-- !sql --
0 \N

-- !sql --
0

@@ -0,0 +1,6 @@
-- This file is automatically generated. You should know what you did if you want to edit this
-- !sql --
2 ["I am a person"]

-- !sql --

@@ -0,0 +1,5 @@
-- This file is automatically generated. You should know what you did if you want to edit this
-- !sql --

-- !sql --

@@ -0,0 +1,37 @@
-- This file is automatically generated. You should know what you did if you want to edit this
-- !select1 --
1 2017-10-01 10 1 ["Beijing China"] ["Software Developer"]
2 2017-10-01 10 1 ["Beijing China"] ["Communication Engineer"]
3 2017-10-01 10 1 ["Shanghai China"] ["electrical engineer"]
4 2017-10-02 10 0 ["Beijing China"] ["Both a teacher and a scientist"]
5 2017-10-02 10 1 ["Shenzhen China"] ["teacher"]
6 2017-10-03 10 1 ["Hongkong China"] ["Architectural designer"]

-- !select2 --
1 2017-10-01 10 1 ["Beijing China"] ["Software Developer"]
2 2017-10-01 10 1 ["Beijing China"] ["Communication Engineer"]
4 2017-10-02 10 0 ["Beijing China"] ["Both a teacher and a scientist"]

-- !select3 --
1 2017-10-01 10 1 ["Beijing China"] ["Software Developer"]
2 2017-10-01 10 1 ["Beijing China"] ["Communication Engineer"]

-- !select4 --

-- !select5 --
3 2017-10-01 10 1 ["Shanghai China"] ["electrical engineer"]

-- !select6 --
3 2017-10-01 10 1 ["Shanghai China"] ["electrical engineer"]

-- !select7 --

-- !select8 --

-- !select9 --

-- !select10 --
3 2017-10-01 10 1 ["Shanghai China"] ["electrical engineer"]

-- !select11 --

@@ -0,0 +1,31 @@
-- This file is automatically generated. You should know what you did if you want to edit this
-- !select1 --
1 2017-10-01 10 1 ["Beijing China"] ["Software Developer"]
2 2017-10-01 10 1 ["Beijing China"] ["Communication Engineer"]
3 2017-10-01 10 1 ["Shanghai China"] ["electrical engineer"]
4 2017-10-02 10 0 ["Beijing China"] ["Both a teacher and a scientist"]
5 2017-10-02 10 1 ["Shenzhen China"] ["teacher"]
6 2017-10-03 10 1 ["Hongkong China"] ["Architectural designer"]

-- !select2_v1 --

-- !select3_v1 --

-- !select4_v1 --

-- !select5_v1 --
3 2017-10-01 10 1 ["Shanghai China"] ["electrical engineer"]

-- !select6_v1 --

-- !select7_v1 --

-- !select8_v1 --

-- !select9_v1 --

-- !select10_v1 --
3 2017-10-01 10 1 ["Shanghai China"] ["electrical engineer"]

-- !select11_v1 --

@@ -0,0 +1,31 @@
-- This file is automatically generated. You should know what you did if you want to edit this
-- !select1 --
1 2017-10-01 10 1 ["Beijing China"] ["Software Developer"]
2 2017-10-01 10 1 ["Beijing China"] ["Communication Engineer"]
3 2017-10-01 10 1 ["Shanghai China"] ["electrical engineer"]
4 2017-10-02 10 0 ["Beijing China"] ["Both a teacher and a scientist"]
5 2017-10-02 10 1 ["Shenzhen China"] ["teacher"]
6 2017-10-03 10 1 ["Hongkong China"] ["Architectural designer"]

-- !select2_v1 --

-- !select3_v1 --

-- !select4_v1 --

-- !select5_v1 --
3 2017-10-01 10 1 ["Shanghai China"] ["electrical engineer"]

-- !select6_v1 --

-- !select7_v1 --

-- !select8_v1 --

-- !select9_v1 --

-- !select10_v1 --
3 2017-10-01 10 1 ["Shanghai China"] ["electrical engineer"]

-- !select11_v1 --
