Reindex all questions when adding a new one #2526

Merged
javitonino merged 9 commits into main from conversation_index on Oct 9, 2024

javitonino (Contributor) commented Oct 9, 2024

Changes

Always send all splits to the index each time

When a new question is added, we now send the entire conversation field to the index. The index then works at field granularity and doesn't need to know about splits, which could simplify our code.

Remove splits from paragraphs_to_delete and sentences_to_delete

We always send all splits, so we can reindex the entire field.
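
A minimal sketch of the field-granularity idea described above, not the actual nucliadb code. `ToyIndex`, `reindex_field`, `add_question`, and the `"<field>/<split>"` key layout are all hypothetical names chosen for illustration: adding a question drops everything stored under the field and re-sends every split, so the index never needs a per-split deletion list.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyIndex:
    """Hypothetical in-memory stand-in for a paragraph index."""

    # Maps "<field_key>/<split_id>" to the indexed text (assumed key layout).
    paragraphs: dict[str, str] = field(default_factory=dict)

    def delete_field(self, field_key: str) -> None:
        """Drop every entry stored under the given field."""
        prefix = field_key + "/"
        for key in [k for k in self.paragraphs if k.startswith(prefix)]:
            del self.paragraphs[key]

    def reindex_field(self, field_key: str, splits: dict[str, str]) -> None:
        """Replace the whole field: delete it, then send all of its splits."""
        self.delete_field(field_key)
        for split_id, text in splits.items():
            self.paragraphs[f"{field_key}/{split_id}"] = text


def add_question(
    index: ToyIndex,
    field_key: str,
    conversation: dict[str, str],
    split_id: str,
    text: str,
) -> None:
    """Append one question, then reindex the entire conversation field."""
    conversation[split_id] = text
    index.reindex_field(field_key, conversation)
```

The trade-off in this sketch is more data sent per write in exchange for a simpler contract: the caller only ever says "this is the full content of the field".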

Remove per-paragraph deletion from paragraphs_to_delete

This removal was started a few months ago, and the changes have already propagated everywhere.

Remove vectors on field deletion

Previously, deleting a field called delete_metadata(), which removed entries only from the paragraph index and not from the vectors index. Field deletion now removes the vectors as well.
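
As an illustration of the fix described above, here is a small sketch in which field deletion clears both indexes. It does not reproduce the real `delete_metadata()` implementation; `delete_by_prefix`, `delete_field`, and the dictionary-backed indexes are hypothetical and exist only to show the intended behaviour.

```python
from __future__ import annotations


def delete_by_prefix(entries: dict[str, object], prefix: str) -> int:
    """Remove every key starting with `prefix`; return how many were removed."""
    stale = [key for key in entries if key.startswith(prefix)]
    for key in stale:
        del entries[key]
    return len(stale)


def delete_field(
    paragraph_index: dict[str, object],
    vector_index: dict[str, object],
    field_key: str,
) -> tuple[int, int]:
    """Delete a field from both indexes (assumed "<field_key>/..." key layout).

    Returns the number of entries removed from the paragraph index and from
    the vectors index, in that order.
    """
    prefix = field_key + "/"
    return (
        delete_by_prefix(paragraph_index, prefix),
        delete_by_prefix(vector_index, prefix),
    )
```

Clearing only the first dictionary would leave orphaned vectors behind, which is the situation the change above addresses.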

codecov bot commented Oct 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.22%. Comparing base (099a07a) to head (b805ac8).
Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2526      +/-   ##
==========================================
- Coverage   86.28%   86.22%   -0.07%     
==========================================
  Files         376      376              
  Lines       23580    23566      -14     
==========================================
- Hits        20347    20320      -27     
- Misses       3233     3246      +13     
Flag                  Coverage Δ
nucliadb              73.60% <61.53%> (-0.05%) ⬇️
nucliadb-ingest       28.72% <100.00%> (-0.04%) ⬇️
nucliadb-reader       24.72% <53.84%> (-0.01%) ⬇️
nucliadb-search       37.78% <61.53%> (+0.01%) ⬆️
nucliadb-standalone   46.86% <46.15%> (-0.02%) ⬇️
nucliadb-train        44.91% <46.15%> (+0.02%) ⬆️
nucliadb-writer       39.32% <53.84%> (-0.01%) ⬇️
nucliadb_dataset      55.45% <ø> (ø)
nucliadb_models       85.11% <ø> (ø)
nucliadb_sdk          80.11% <ø> (ø)
nucliadb_sidecar      87.24% <ø> (-1.03%) ⬇️
nucliadb_telemetry    86.55% <ø> (ø)
nucliadb_utils        84.03% <ø> (ø)

replace_field was always True
replace_splits contained deleted_splits; we now always delete them when
replacing the field (which is always the case)
javitonino merged commit 2d7269a into main on Oct 9, 2024
51 checks passed
javitonino deleted the conversation_index branch on October 9, 2024 10:52