Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(search): Rips out SOLR indexing pipeline #4791

Merged
merged 70 commits into from
Dec 11, 2024

Conversation

ERosendo
Copy link
Contributor

@ERosendo ERosendo commented Dec 5, 2024

This PR addresses #4726

This PR includes the following changes:

  • Removes scorched dependency.

  • Updates the settings module to clean up SOLR env vars.

  • Removes Solr health check

  • Removes custom save and as_search_dict methods from models.

  • Refactors AudioFactory class to stop using the index argument in hooks.

  • Updates admin classes to remove custom methods that were previously used to index data to Solr.

  • Removes Solr-specific admin utility functions.

  • Removes the normalize_search_dicts helper.

  • Renames solr_list to extract_solr_field_values to better reflect its purpose.

  • Removes the following test classes: EmptySolrTestCase, SolrTestCase and IndexedSolrTestCase

  • Deletes the following commands:

    • add_parallel_citations.py
    • import_columbia.py
    • cl_get_ftm_ids.py
    • cl_update_index.py

Copy link

semgrep-app bot commented Dec 5, 2024

Semgrep found 4 use-django-environ findings:

You are using environment variables inside django app. Use django-environ as it a better alternative for deployment.

Ignore this finding from use-django-environ

cl/scrapers/tasks.py Outdated Show resolved Hide resolved
This commit streamlines the send_alerts_and_webhooks method by removing the return of Recap document IDs. It now returns an empty list to indicate successful completion.
This commit removes the unnecessary type import from Scorched.
This commit removes the helper method for building SOLR date ranges, as it's no longer needed.
@ERosendo ERosendo force-pushed the 4726-feat-rip-out-solr-indexing-pipeline branch from 14762fb to 05d233d Compare December 10, 2024 14:21
@ERosendo ERosendo marked this pull request as ready for review December 10, 2024 17:26
Copy link
Member

@mlissner mlissner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing major. A couple more things we can delete, one feature we should upgrade, and I think we're there. Pretty easy to review, really.

Comment on lines 209 to 210
# transcation block. Trigger a single update as well, if
# required.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole step is removed, so we can remove the comment about it.

Suggested change
# transcation block. Trigger a single update as well, if
# required.
# transcation block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alright!


# Opinion
search_list = []
text_template = loader.get_template("indexes/opinion_text.txt")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think everything in the cl.search.templates.indexes directory can be removed, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct. I'll remove the directory

@@ -17,15 +16,13 @@ def health_check(request: HttpRequest) -> JsonResponse:
"""Check if we can connect to various services."""
is_redis_up = check_redis()
is_postgresql_up = check_postgresql()
is_solr_up = check_solr()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add an Elastic health check here, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! That's a great idea.

@ERosendo
Copy link
Contributor Author

@mlissner I've implemented the changes you suggested.

@ERosendo ERosendo requested a review from mlissner December 11, 2024 02:05
Copy link
Member

@mlissner mlissner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AWESOME! 💀

@mlissner mlissner enabled auto-merge December 11, 2024 05:55
@mlissner mlissner disabled auto-merge December 11, 2024 06:16
@mlissner mlissner merged commit b6d2f60 into main Dec 11, 2024
15 checks passed
@mlissner mlissner deleted the 4726-feat-rip-out-solr-indexing-pipeline branch December 11, 2024 06:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants