Migrate Transform Manager To Its Own Microservice #792

Open
BenGalewsky opened this issue Jun 15, 2024 · 1 comment

Comments

@BenGalewsky
Contributor

BenGalewsky commented Jun 15, 2024

Problem

Several critical, high-frequency transform operations are made as REST calls through the single-threaded Flask app. This causes several problems:

  1. Messages can be lost when the flask server is busy
  2. Microservices and the client have to implement retry logic in all of their interactions with the app
  3. We need to scale up instances of the app even though much of its time is spent blocked on I/O operations
  4. The app has become quite complicated since it provides so many services

Approach

Migrate the transformer manager functionality out of the app and into a message-driven microservice. Microservice architecture calls for a clear separation of database concerns within each service. We will respect this by having the app only read transform-related tables, while the new transformer manager service will perform all of the writes on:

  • Transform Requests
  • Transform Results
  • Datasets
  • Dataset Files
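
The read/write split described above could be illustrated roughly as follows. This is only a sketch under loud assumptions: SQLite stands in for the real database, and the `transform_requests` table, its columns, and the `app_try_write` helper are hypothetical names invented for the example. The point is simply that the app's connection can read rows the transformer manager wrote but cannot modify them.

```python
import os
import sqlite3
import tempfile

# Hypothetical database file; SQLite is only a stand-in for illustration.
db_path = os.path.join(tempfile.mkdtemp(), "servicex_sketch.db")

# Transformer manager side: a normal read-write connection owns all writes.
writer = sqlite3.connect(db_path)
writer.execute(
    "CREATE TABLE transform_requests (request_id TEXT PRIMARY KEY, status TEXT)"
)
writer.execute(
    "INSERT INTO transform_requests VALUES (?, ?)", ("req-1", "Submitted")
)
writer.commit()

# App side: the connection is opened read-only via SQLite's URI syntax.
reader = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
rows = reader.execute(
    "SELECT request_id, status FROM transform_requests"
).fetchall()


def app_try_write() -> bool:
    """Return True if the app-side connection managed to write (it should not)."""
    try:
        reader.execute("UPDATE transform_requests SET status = 'Running'")
        return True
    except sqlite3.OperationalError:
        # SQLite rejects writes on a connection opened with mode=ro.
        return False
```

In a production database the same rule would more likely be enforced with role-level privileges rather than a connection flag; the sketch only shows the intended ownership boundary.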

Assumptions

  1. We will implement the new Transformer Manager microservice using the Celery framework
  2. The SQLAlchemy models will need to be shared between the app and the transformer manager. They will be shared via a new directory at the top level of the ServiceX monorepo so we don't have to introduce a new library to our build pipeline
  3. The Celery task definitions must be shared between the Transformer Manager service and the other services. We will use the signatures feature of Celery to accomplish this.
  4. The transform submit REST POST operation will be hollowed out so it returns quickly. It will perform some high-level validation, generate the request ID, and put the record on a queue before returning the request ID back to the client.
  5. The DID finder library will be modified to use Celery tasks instead of PUTs to the app
  6. The transformer sidecar will be modified to use Celery tasks instead of PUTs to the app
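
The hollowed-out submit operation in assumption 4 might look roughly like the sketch below. It is framework-free on purpose and is not ServiceX code: `submit_transform`, `REQUIRED_FIELDS`, the field names, and the in-process `queue.Queue` (standing in for the Celery broker) are all hypothetical choices made for illustration.

```python
import queue
import uuid

# Hypothetical stand-in for the message broker; the real service would
# publish to whatever queue the Transformer Manager consumes from.
transform_queue = queue.Queue()

# Hypothetical fields checked by the high-level validation.
REQUIRED_FIELDS = ("did", "selection")


def submit_transform(payload: dict) -> tuple:
    """Validate, assign a request ID, enqueue, and return immediately.

    Returns a (response_body, http_status) pair.
    """
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        return {"message": "missing fields: " + ", ".join(missing)}, 400

    request_id = str(uuid.uuid4())
    # No database write happens here; the queue consumer owns all writes.
    transform_queue.put({"request_id": request_id, **payload})
    return {"request_id": request_id}, 202
```

The handler does nothing that can block on the database, so the client gets its request ID back as soon as the message has been handed to the queue.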
@gordonwatts
Collaborator

This looks wonderful - we've been talking around the edges of this, and I like how you've clearly drawn a border around what should be extracted here.

One thing I'd add - there is no mention of getting error messages back to the user when various things fail. Before the re-write above occurs, it would be good to at least understand how that will work. Losing, for example, errors from the codegen part would make life a lot worse. This might mean really thinking about this and the database and having some clear internal rules for how errors are reported.

I don't know that this has to be part of this work - but this work has the potential to make the UX worse than it is now.
