Call "insert" method when reindexing #53

Open
thejcannon opened this issue Jun 7, 2019 · 0 comments

Realistically, there should be little difference between calling insert_foo and update_foo, since the writer's insert and update are essentially the same thing (and are interchangeable). However, there is a semantic difference between those two methods that clients might care about.

As an example, I might write my own Whoosheer that uses two models, User and Review, and is used to search for Reviews (in this example the user's name is indexed too, so people can search for reviews by the user's name).
insert_review is straightforward.
insert_user is a no-op (since there aren't any reviews for that user yet).
update_review is also straightforward.
update_user is a bit more complex: in update_user I loop through all the user's reviews and update the document for each review.

Now, if a user changes something irrelevant to the index (say, their email preferences), the update_user code will still trigger, and I'll loop through all their reviews and update the documents for a net-zero change. To mitigate this, I've implemented a short-circuit in the update_foo methods to bail early if none of the fields we care about have changed. I've left the insert_foo methods unchanged because we'll always add to the document when a Review gets inserted.

Now, when I reindex, because flask_whooshee calls update_foo instead of insert_foo, my "clever" code notices that no fields have changed and bails early for every object, leaving my index barren and dry.
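To make the failure mode concrete, here is a minimal, dependency-free sketch of the situation. Everything in it is hypothetical and only stands in for the real pieces: `FakeWriter` imitates an index writer, `Review.dirty` imitates change tracking, and `reindex` imitates a reindex loop that can be told which method family to call. It is not flask_whooshee's actual code.

```python
class FakeWriter:
    """Hypothetical stand-in for an index writer; stores documents by id."""

    def __init__(self):
        self.docs = {}

    def add_document(self, **fields):
        self.docs[fields["review_id"]] = fields

    def update_document(self, **fields):
        self.docs[fields["review_id"]] = fields


class Review:
    """Hypothetical model; `dirty` holds names of fields changed since load."""

    def __init__(self, id, text):
        self.id = id
        self.text = text
        self.dirty = set()


class ReviewWhoosheer:
    fields_i_care_about = ("text",)

    @classmethod
    def insert_review(cls, writer, review):
        # Inserts always write a document.
        writer.add_document(review_id=review.id, text=review.text)

    @classmethod
    def update_review(cls, writer, review):
        # Short-circuit: skip the write when no indexed field has changed.
        if not review.dirty.intersection(cls.fields_i_care_about):
            return
        writer.update_document(review_id=review.id, text=review.text)


def reindex(whoosheer, reviews, method_prefix):
    """Rebuild an index from scratch using insert_* or update_* methods."""
    writer = FakeWriter()
    method = getattr(whoosheer, method_prefix + "_review")
    for review in reviews:
        method(writer, review)
    return writer.docs


# Objects freshly loaded from the database have no pending changes.
reviews = [Review(1, "Great"), Review(2, "Meh")]
via_update = reindex(ReviewWhoosheer, reviews, "update")
via_insert = reindex(ReviewWhoosheer, reviews, "insert")
print(len(via_update), len(via_insert))  # prints: 0 2
```

Rebuilding through the update path yields an empty index because every object looks unchanged, while the insert path indexes both reviews.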


P.S. If you're interested in the "unchanged fields" optimization, it looks something like this (I've implemented it as a decorator, but omitted that part for brevity):
from sqlalchemy import inspect

fields_i_care_about = [...]

def update_foo(cls, writer, target):
    # Keys on this instance that have no uncommitted changes.
    unmodified_fields = inspect(target).unmodified
    # Bail early if none of the fields we index have changed.
    if all(key in unmodified_fields for key in fields_i_care_about):
        return

    ...
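The decorator form was omitted above, so the following is only a guess at what such a wrapper could look like, not the author's implementation. The names `skip_if_unchanged` and `get_unmodified` are made up for illustration; `get_unmodified` is injectable so the sketch runs without a database, and its default falls back to SQLAlchemy's `inspect(target).unmodified` as in the snippet above.

```python
import functools


def _sqlalchemy_unmodified(target):
    # Imported lazily so the sketch can run without SQLAlchemy installed.
    from sqlalchemy import inspect
    return inspect(target).unmodified


def skip_if_unchanged(*fields, get_unmodified=_sqlalchemy_unmodified):
    """Hypothetical decorator: skip the wrapped update when none of
    `fields` has uncommitted changes on the target."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(cls, writer, target):
            unmodified = get_unmodified(target)
            if all(field in unmodified for field in fields):
                return None  # nothing relevant changed; skip the write
            return func(cls, writer, target)

        return wrapper

    return decorator


# Usage with a stub in place of SQLAlchemy state (assumption: plain dicts).
calls = []


@skip_if_unchanged("name", get_unmodified=lambda t: t["unmodified"])
def update_user(cls, writer, target):
    calls.append(target["id"])


update_user(None, None, {"id": 1, "unmodified": {"name", "email"}})  # skipped
update_user(None, None, {"id": 2, "unmodified": {"email"}})  # runs
print(calls)  # prints: [2]
```

On a real Whoosheer, where the update methods are classmethods, a decorator like this would presumably sit beneath `@classmethod` so that it wraps the plain function.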
thejcannon pushed a commit to thejcannon/flask-whooshee that referenced this issue Jun 7, 2019