Improve ability to handle concurrent requests #270

paulalbert1 · 2018-08-13T15:56:00Z

Top: current system for retrieval. The bottleneck occurs when saving PubMed articles.

Bottom: proposed design in which both retrieval and storage of PubMed articles are handled by a pool of "workers" (most likely EC2 instances).

Also, the "retrieval" call from ReCiter to the PubMed service will be async. ReCiter will be notified of the success/failure of retrieval (list of pmids retrieved successfully, list of pmids retrieved unsuccessfully) via a callback.
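To make the callback idea concrete, here is a minimal sketch, not the actual ReCiter code. It assumes an in-process thread pool; in the real design the callback would presumably be an HTTP call between services. `fetch_article`, `retrieve_async`, and `on_done` are hypothetical names, and the rule that negative pmids fail exists only to exercise the failure path.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_article(pmid):
    """Hypothetical stand-in for the real PubMed fetch; negative pmids fail."""
    if pmid < 0:
        raise ValueError(f"bad pmid {pmid}")
    return {"pmid": pmid}


def retrieve_async(pmids, callback, executor):
    """Start retrieval in the background and return immediately.

    When the job finishes, callback(succeeded, failed) is invoked with the
    two lists of pmids.
    """
    def job():
        succeeded, failed = [], []
        for pmid in pmids:
            try:
                fetch_article(pmid)
                succeeded.append(pmid)
            except Exception:
                failed.append(pmid)
        callback(succeeded, failed)

    return executor.submit(job)


# Usage: the caller supplies a callback and is free to do other work meanwhile.
results = {}


def on_done(succeeded, failed):
    results["succeeded"] = succeeded
    results["failed"] = failed


with ThreadPoolExecutor(max_workers=1) as executor:
    # .result() is only used here so the demo waits for the callback to fire.
    retrieve_async([1, -2, 3], on_done, executor).result()
```

The caller never blocks on the fetch itself; it only learns the outcome through the two lists passed to the callback.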

The PubMed service will orchestrate the retrieval "workers" (which retrieve PubMed articles) and the storage "workers" (which store PubMed articles in DynamoDB). The PubMed Retrieval Service still does what it does now: it accepts PubMed queries and returns their results.
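A rough sketch of that orchestration, under loud assumptions: threads stand in for the EC2 workers, a plain dict stands in for the DynamoDB table, and `retrieve`, `make_store`, and `orchestrate` are hypothetical names rather than anything in the ReCiter codebase.

```python
from concurrent.futures import ThreadPoolExecutor


def retrieve(pmid):
    """Hypothetical retrieval worker: returns an article record for a pmid."""
    return {"pmid": pmid, "title": f"article {pmid}"}


def make_store(table):
    """Build a storage worker that writes into `table` (a dict standing in for DynamoDB)."""
    def store(article):
        table[article["pmid"]] = article
        return article["pmid"]
    return store


def orchestrate(pmids, table, n_retrievers=4, n_storers=4):
    """Run retrieval and storage on two separate worker pools.

    Returns the sorted list of pmids that were stored.
    """
    store = make_store(table)
    with ThreadPoolExecutor(n_retrievers) as retrieval_pool, \
            ThreadPoolExecutor(n_storers) as storage_pool:
        # Hand each article to the storage pool as its retrieval result
        # is yielded (map yields results in input order).
        store_futures = [
            storage_pool.submit(store, article)
            for article in retrieval_pool.map(retrieve, pmids)
        ]
        return sorted(f.result() for f in store_futures)
```

Keeping the two pools separate means a slow storage step does not block further retrievals, which is the bottleneck the proposal is trying to remove.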

Q: How do you divvy up work among the workers? 100 each?
A: Evenly: (# pubmed_articles) / (# workers) articles per worker.
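For illustration, an even split of pmids across workers could look like the sketch below. `split_among_workers` is a hypothetical helper; handing the remainder to the first few workers is one reasonable choice, not something the thread specifies.

```python
def split_among_workers(pmids, num_workers):
    """Split pmids into num_workers contiguous batches of nearly equal size."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    base, extra = divmod(len(pmids), num_workers)
    batches = []
    start = 0
    for i in range(num_workers):
        # The first `extra` workers each take one additional pmid.
        size = base + (1 if i < extra else 0)
        batches.append(pmids[start:start + size])
        start += size
    return batches
```

With 10 pmids and 3 workers this yields batches of 4, 3, and 3; with fewer pmids than workers, some batches are empty.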

paulalbert1 assigned jl987-Jie Aug 13, 2018

sarbajitdutta added the enhancement and Priority labels Apr 19, 2019

sarbajitdutta self-assigned this Apr 19, 2019

paulalbert1 removed the Priority label Nov 25, 2020

paulalbert1 unassigned sarbajitdutta Jul 5, 2022
