Make SpeciesImageService output usable with bie-index #896

Open
adam-collins opened this issue Apr 11, 2024 · 1 comment

@adam-collins

To make the biocache-service species autocomplete output similar to that of bie-index, an image field was included.

It has come to my attention that

  1. This is unnecessary for the autocomplete response as images are not shown in autocomplete drop downs today.
  2. It does not apply the configurable preferred and required image fq filters that bie-index uses.
  3. It does not include the images from the preferred/hidden species lists that contain this information.

From a bie-index perspective, querying biocache-service for this information requires an estimated 1 million requests. Biocache-service accomplishes something similar internally in about a minute. It is therefore reasonable to expect this update to take place every time occurrences change, i.e. daily.

The refactored bie-index can take advantage of the output of an improved SpeciesImageService. Expected benefits:

  1. shorter downtime waiting for bie-index index creation
  2. daily updates to bie-index image field instead of weekly (for the biocache-service source, not the lists based sources)
  3. significantly lower traffic to biocache-service during an images update

Currently this information is stored using lft values. It is a goal (#885) to replace these. For this task, a mapping of id:[lft, rgt] must exist. The current way to get this mapping is to query namematching-ws one id at a time, i.e. to download the contents of the lucene names index via namematching-ws every time the mapping is required.
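As a rough illustration of how an id:[lft, rgt] mapping could be used once it is available in bulk, here is a minimal Python sketch. It assumes a CSV with a header row `id,lft,rgt` (the header, the sample ids and the function names are all hypothetical) and the usual nested-set convention that a descendant's range sits strictly inside its ancestor's range. It is not the biocache-service implementation.

```python
import csv
import io


def load_ranges(csv_text):
    """Parse 'id,lft,rgt' rows into a dict of id -> (lft, rgt)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["id"]: (int(row["lft"]), int(row["rgt"])) for row in reader}


def is_descendant(ranges, child_id, ancestor_id):
    """Nested-set test: the child's range lies strictly inside the ancestor's."""
    a_lft, a_rgt = ranges[ancestor_id]
    c_lft, c_rgt = ranges[child_id]
    return a_lft < c_lft and c_rgt < a_rgt


# Hypothetical sample data; real ids would come from the names index.
SAMPLE = (
    "id,lft,rgt\n"
    "genus-1,1,6\n"
    "species-1,2,3\n"
    "species-2,4,5\n"
    "genus-2,7,8\n"
)

ranges = load_ranges(SAMPLE)
```

With a single file loaded this way, lft values stored against images or counts can be mapped back to ids without a per-id request to namematching-ws.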

Tasks

  1. For SpeciesImageService, add config for preferred (fq list ordered by preference) and required (list of fq) image fqs.
  2. For SpeciesImageService, when fetching images, apply all requiredImageFqs
  3. For SpeciesImageService, when fetching images, apply, one at a time, preferredImageFqs.
  4. Add webservice (or extend suitable existing service) to respond with SpeciesImageService data (SpeciesImagesDTO)
  5. Add webservice (or extend suitable existing service) to respond with SpeciesCountsService data (SpeciesCountDTO)
  6. Look at how lft/rgt is stored in the lucene index and how namematching-ws responds with it. This is with a view to adding a csv file containing id,lft,rgt in archives.ala, beside the lucene index, for use by biocache-service (optional to lower traffic) and the refactored bie-index (because the DwCA does not have this information).
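Tasks 2 and 3 can be sketched as follows. This is a Python illustration only, not the SpeciesImageService code: `search` is a hypothetical callable standing in for the index query, the `field:value` fq strings and field names are invented, and the fallback to the required fqs alone when no preferred fq matches is an assumption rather than something stated above.

```python
def pick_image(search, taxon_query, required_fqs, preferred_fqs):
    """Return the first image found, trying preferred fqs in order.

    Every request carries all required fqs; preferred fqs are added
    one at a time, in order of preference.
    """
    for preferred in preferred_fqs:
        image = search(taxon_query, list(required_fqs) + [preferred])
        if image is not None:
            return image
    # Assumed fallback: required fqs only.
    return search(taxon_query, list(required_fqs))


def make_search(records):
    """Build an in-memory stand-in for the index query (hypothetical)."""
    def search(taxon_query, fqs):
        pairs = [fq.split(":", 1) for fq in fqs]
        for record in records:
            if record["taxon"] != taxon_query:
                continue
            if all(record.get(field) == value for field, value in pairs):
                return record["image"]
        return None
    return search


# Hypothetical records and field names, for illustration only.
RECORDS = [
    {"taxon": "t1", "image": "img-a", "type": "photo", "source": "x"},
    {"taxon": "t1", "image": "img-b", "type": "photo", "source": "museum"},
    {"taxon": "t1", "image": "img-c", "type": "sound", "source": "museum"},
]
search = make_search(RECORDS)
```

Here `pick_image(search, "t1", ["type:photo"], ["source:museum", "source:x"])` selects the museum photo because the first preferred fq already matches, while a taxon with no record satisfying the required fqs yields no image at all.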
@adam-collins adam-collins self-assigned this Apr 11, 2024
adam-collins added a commit that referenced this issue Apr 12, 2024
…ecies-image-service

#896 allow preference filters for SpeciesImageService
@adam-collins adam-collins added this to the 3.5.0 milestone Apr 19, 2024
@adam-collins

now working for atlas-index
