various fixes for Elasticsearch 5 compatibility #1299

Merged 20 commits on Oct 25, 2024

@haarg haarg (Member) commented Oct 24, 2024

  • Fixes a few queries using nested.filter or not that I missed before.
  • Fix some ES parameters to use JSON booleans.
  • Ignore string to text/keyword type conversion done by Elasticsearch 5.
  • Monkey patch ElasticSearchX::Model to avoid search_type: scan.
  • Remove some options from the mapping that are no longer supported on Elasticsearch 5, but we don't appear to be using.
  • various other query fixes
  • remove payload from a completion field
  • don't try to limit _source in nested query
  • fix max aggregation to use syntax compatible with es5 scripting
  • update pod field
    • remove term_vector and fielddata settings, which seem unneeded and prevent upgrade to text or keyword
    • convert main pod field to analyzed, as we never need exact matches and indexing is problematic when large
    • disable doc_values as unneeded and problematic for long values
  • convert all queries to use _source rather than fields. Requires a hack in ESXM.
  • cope with hits.total being an object (ES7)

@haarg haarg force-pushed the haarg/es5-compat branch 2 times, most recently from 842b04a to 739a174 Compare October 24, 2024 14:53
haarg added 20 commits October 25, 2024 12:38
To support newer Elasticsearch versions, we want a new
Search::Elasticsearch version. It still supports older versions, as long
as the correct module is installed and it is instructed to use it.

We've already made the necessary changes to explicitly configure it to
use Search::Elasticsearch::Client::2_0::Direct, so we are now free to
upgrade.
Elasticsearch 5 wants these parameters to be JSON booleans.
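
A minimal sketch of the boolean fix, in Python for illustration only (the project itself is Perl). The helper name `as_json_bool` and the example parameters are hypothetical; the point is only that values such as `1` or `"true"` are coerced to real booleans before the request body is serialized.

```python
import json


def as_json_bool(value):
    """Coerce common truthy/falsy spellings to a real bool (hypothetical helper)."""
    if isinstance(value, bool):
        return value
    if isinstance(value, int):
        return value != 0
    if isinstance(value, str):
        lowered = value.strip().lower()
        if lowered in ("true", "1"):
            return True
        if lowered in ("false", "0", ""):
            return False
    raise ValueError(f"cannot interpret {value!r} as a boolean")


# Example parameters (assumed, not taken from the PR).
params = {"explain": 1, "track_scores": "false"}
body = {key: as_json_bool(val) for key, val in params.items()}

# json.dumps emits JSON true/false for Python bools, not 1 or "false".
print(json.dumps(body))
```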
Elasticsearch 5 doesn't support payloads on type: completion fields. We
were storing data in the payload, but not using it at all. Remove it.
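
As a rough illustration (Python, not the project's Perl), dropping the payload option from completion fields in a mapping could look like the sketch below. It assumes the option appears as a `payloads` key on fields with `type: completion`; the function name and the sample mapping are hypothetical.

```python
def strip_completion_payloads(mapping):
    """Return a copy of `mapping` with `payloads` removed from completion fields."""
    cleaned = {}
    for name, field in mapping.get("properties", {}).items():
        field = dict(field)  # shallow copy so the input is left untouched
        if field.get("type") == "completion":
            field.pop("payloads", None)
        if "properties" in field:
            # Recurse into object fields that carry their own properties.
            field = strip_completion_payloads(field)
        cleaned[name] = field
    return {**mapping, "properties": cleaned}


# Hypothetical mapping fragment.
mapping = {
    "properties": {
        "suggest": {"type": "completion", "payloads": True},
        "name": {"type": "string"},
    }
}
print(strip_completion_payloads(mapping))
```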
Limiting _source in a nested query isn't supported in the same way in
newer ES, and we aren't fetching an excessive amount of data, so it
doesn't really matter.
Use types from MooseX::Types::ElasticSearch instead of
ElasticSearchX::Model::Document::Types for ES and Location types. It's
just a reexport in ElasticSearchX::Model::Document::Types anyway.
ElasticSearchX::Model's set delete method uses search_type: scan, but
that is not supported in Elasticsearch 5. Add a module to monkey patch
ESXM to use sort: _doc instead.
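
The idea behind the patch can be sketched as a rewrite of the search parameters. This is an illustration in Python, not the actual ESXM monkey patch; `replace_scan` and the parameter layout are assumptions.

```python
def replace_scan(params):
    """Swap `search_type: scan` for a `_doc` sort (hypothetical helper)."""
    params = dict(params)  # copy so the caller's dict is not modified
    if params.get("search_type") == "scan":
        del params["search_type"]
        body = dict(params.get("body") or {})
        # Sorting by _doc is the cheapest order for scrolling over all hits.
        body["sort"] = ["_doc"]
        params["body"] = body
    return params


# Hypothetical scroll request as it might be built for a bulk delete.
old = {"search_type": "scan", "scroll": "1m", "body": {"query": {"match_all": {}}}}
print(replace_scan(old))
```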
Elasticsearch 5 will automatically convert string mapping types to be
text or keyword types. We want to support both ES 2.4 and 5, so keep the
declared mappings using the string type. When we check this against what
is deployed, ignore the difference if a string is converted to text or
keyword.
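
A sketch of such a tolerant comparison, written in Python for illustration. The function names are hypothetical, and the sketch only relaxes the `type` value; a real check would probably also need to account for related options that change during the conversion.

```python
def type_equivalent(declared, deployed):
    """Treat a declared `string` as matching a deployed `text` or `keyword`."""
    if declared == deployed:
        return True
    return declared == "string" and deployed in ("text", "keyword")


def mappings_equivalent(declared, deployed):
    """Recursively compare two mapping dicts, tolerating string conversion."""
    if isinstance(declared, dict) and isinstance(deployed, dict):
        if declared.keys() != deployed.keys():
            return False
        for key in declared:
            # Only a scalar `type` is a type name; a dict here would be a
            # field that happens to be called "type".
            if key == "type" and isinstance(declared[key], str):
                if not type_equivalent(declared[key], deployed[key]):
                    return False
            elif not mappings_equivalent(declared[key], deployed[key]):
                return False
        return True
    return declared == deployed


declared = {"properties": {"author": {"type": "string"}}}
deployed = {"properties": {"author": {"type": "keyword"}}}
print(mappings_equivalent(declared, deployed))
```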
As far as I can tell, the term_vector and fielddata settings are unused.
These fields are using the string type, which will automatically be
converted to text or keyword types. With term_vector and fielddata, this
automatic conversion can't be done. Removing them allows the same mapping
to work on both Elasticsearch 2.4 and 5.
ES5 doesn't support "fields" to select what to return. Instead we can
use _source.
Newer ES versions don't allow selecting fields. Instead, you need to use
stored_fields, but it works rather differently. _source will do
basically everything we need, and works on both old and new ES.
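
A small sketch of the conversion, in Python for illustration and not the hack applied to ESXM. The helper names are hypothetical. Note that values selected with `fields` come back under `hit["fields"]`, while with `_source` filtering they come back under `hit["_source"]`, so code reading the hits has to change as well.

```python
def use_source_filter(body):
    """Rewrite a search body that selects `fields` to filter `_source` instead."""
    body = dict(body)  # copy so the caller's dict is not modified
    if "fields" in body:
        body["_source"] = body.pop("fields")
    return body


def hit_data(hit):
    """Read the returned document data from a hit produced with `_source`."""
    return hit.get("_source", {})


# Hypothetical query body and response hit.
old_body = {"query": {"match_all": {}}, "fields": ["name", "author"]}
print(use_source_filter(old_body))

hit = {"_id": "1", "_source": {"name": "Moose", "author": "ETHER"}}
print(hit_data(hit)["name"])
```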
We don't need pod in both analyzed and non-analyzed form. We could
remove the pod.analyzed field, but it would be more complicated to
upgrade. Trying to index pod as non-analyzed with doc_values will
be a problem for large pod content.

Make both the main pod field and pod.analyzed analyzed, with doc_values
off. We can consider dropping the sub-field in the future.
Elasticsearch 7 returns hits.total as an object. Add a function that can
cope with both forms, and use it in all of the places we retrieve the
total.
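
A function of that kind might look like the following Python sketch. The name `hits_total` is hypothetical; it assumes the object form carries the count under a `value` key, as newer Elasticsearch responses do.

```python
def hits_total(response):
    """Return the total hit count from either response shape."""
    total = response["hits"]["total"]
    if isinstance(total, dict):
        # Newer form: {"value": 42, "relation": "eq"}
        return total["value"]
    # Older form: a plain integer
    return total


old_style = {"hits": {"total": 42, "hits": []}}
new_style = {"hits": {"total": {"value": 42, "relation": "eq"}, "hits": []}}
print(hits_total(old_style), hits_total(new_style))
```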
We're nearly prepared to upgrade to Elasticsearch 5, and the code should
work with the new version. Add the client prerequisite, so using ES5 is
just a configuration change.
mickeyn previously approved these changes Oct 25, 2024
@haarg haarg dismissed mickeyn’s stale review October 25, 2024 15:33

The merge-base changed after approval.

@haarg haarg merged commit a7c947c into master Oct 25, 2024
1 check passed
@haarg haarg deleted the haarg/es5-compat branch October 25, 2024 21:31