SQS: Add support for external back-pressure management #1251

Open
loicrouchon opened this issue Oct 3, 2024 · 0 comments
Type: Feature

Is your feature request related to a problem? Please describe.
The current SQS listener implementation has options to limit concurrency within the application; however, it does not support an external data source reporting back-pressure from downstream services.

The use case I have in mind is to stop processing new messages from an upstream queue once a downstream queue has reached a certain number of visible messages.
In our case, several producers publish messages to a downstream queue. Some producers are less critical than others and should only publish to the downstream queue during off-peak times. Those producers need to be informed of the downstream queue's back-pressure to decide whether to produce new messages (i.e. consume from the upstream queue).

Another use case would be the one described in #481

Describe the solution you'd like
The solution I'm thinking of is a hook for reporting downstream back-pressure into some BackPressureHandler/BatchAwareBackPressureHandler implementation.
I would leave the concrete hook implementations (downstream queue's number of visible messages, rate limiter, ...) out of scope; at least for now, they would be implemented by applications.

I can offer to implement such a feature, but before doing so I'd like to discuss the direction.

So far I've only found one implementation of BatchAwareBackPressureHandler: the SemaphoreBackPressureHandler.
My idea is to build something around it (either a wrapper, a subclass, or an evolution of the current implementation with more features; we'll see during the implementation phase).
The reason for keeping the initial SemaphoreBackPressureHandler is that it limits concurrency within the application, while the downstream back-pressure concept is there to stop or slow down processing when downstream systems cannot keep up.

So this new implementation would be a kind of SemaphoreBackPressureHandler which would call a BackPressureMeter in the request(int) and requestBatch() methods. Depending on the current downstream back-pressure and the configured maxDownstreamBackPressure, it would increase (up to totalPermits) or reduce (down to 0) the number of permits on the internal semaphore accordingly. The drain(Duration) method might also need to be updated to take the reduced capacity into account.
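
To make the idea more concrete, here is a rough, hypothetical sketch of the gating logic such a handler could apply before delegating to the existing semaphore-based logic. It only relies on the BackPressureMeter interface proposed below and the proposed maxDownstreamBackPressure option; the names and the proportional-scaling strategy are illustrative, not a final design.

public class DownstreamBackPressureLimiter {

    private final BackPressureMeter meter; // proposed interface, see below
    private final int maxDownstreamBackPressure; // proposed ContainerOptions setting
    private final int totalPermits; // maximum concurrency already configured today

    public DownstreamBackPressureLimiter(BackPressureMeter meter, int maxDownstreamBackPressure, int totalPermits) {
        this.meter = meter;
        this.maxDownstreamBackPressure = maxDownstreamBackPressure;
        this.totalPermits = totalPermits;
    }

    /**
     * Number of permits the semaphore-based handler should be allowed to grant right now,
     * between 0 (downstream saturated) and totalPermits (downstream idle).
     */
    public int allowedPermits() {
        int currentBackPressure = meter.currentBackPressure();
        if (currentBackPressure >= maxDownstreamBackPressure) {
            return 0; // stop requesting new messages until the downstream system recovers
        }
        // scale the permits proportionally to the remaining downstream headroom
        double headroom = (maxDownstreamBackPressure - currentBackPressure) / (double) maxDownstreamBackPressure;
        return (int) Math.ceil(totalPermits * headroom);
    }
}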

New methods would be required on the ContainerOptions interface:

  • #downstreamBackPressureMeter(BackPressureMeter meter): specifies how to measure the back-pressure of downstream systems.
  • #maxDownstreamBackPressure(int maxDownstreamBackPressure): specifies the maximum back-pressure that can be put on the downstream system.

The BackPressureMeter interface could look like this.

public interface BackPressureMeter {

    int currentBackPressure();
}
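
As a usage illustration, the application-side wiring could then look roughly like the snippet below. This is only a sketch assuming the container factory's configure(options -> ...) hook; maxConcurrentMessages is an existing option, while downstreamBackPressureMeter and maxDownstreamBackPressure are the options proposed above and do not exist today.

import io.awspring.cloud.sqs.config.SqsMessageListenerContainerFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import software.amazon.awssdk.services.sqs.SqsAsyncClient;

@Configuration
public class SqsBackPressureConfiguration {

    @Bean
    SqsMessageListenerContainerFactory<Object> defaultSqsListenerContainerFactory(SqsAsyncClient sqsAsyncClient,
            BackPressureMeter downstreamQueueMeter) {
        return SqsMessageListenerContainerFactory.builder()
                .sqsAsyncClient(sqsAsyncClient)
                .configure(options -> options
                        .maxConcurrentMessages(10) // existing option: in-application concurrency limit
                        .downstreamBackPressureMeter(downstreamQueueMeter) // proposed in this issue
                        .maxDownstreamBackPressure(1000)) // proposed in this issue
                .build();
    }
}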

For applications with multiple replicas, the effect would depend on how the back-pressure is measured by the provided BackPressureMeter implementation. For example:

  • A local back-pressure meter (e.g. an in-memory rate limiter, ...) would only account for the back-pressure generated by the current application instance. If there are multiple replicas of your application, you would need to configure maxDownstreamBackPressure accordingly.
  • A distributed back-pressure meter (e.g. a distributed rate limiter, the SQS number of visible messages, ...; see the sketch after this list) would provide an actual view of the downstream system, but with multiple replicas of your application, each might consider it has full latitude to work up to maxDownstreamBackPressure. The actual downstream back-pressure could therefore go slightly over maxDownstreamBackPressure, but then the BackPressureMeter would kick in, consumption would stop, and it would resume on the next poll once the downstream back-pressure has gone down. Under load, this would make the actual downstream back-pressure oscillate around maxDownstreamBackPressure.
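
For illustration, a distributed meter backed by the downstream queue's ApproximateNumberOfMessages attribute could be sketched as follows. This is a naive, hypothetical implementation; a real one would likely cache the value for a short period to avoid calling GetQueueAttributes on every poll.

import software.amazon.awssdk.services.sqs.SqsClient;
import software.amazon.awssdk.services.sqs.model.GetQueueAttributesRequest;
import software.amazon.awssdk.services.sqs.model.QueueAttributeName;

public class VisibleMessagesBackPressureMeter implements BackPressureMeter {

    private final SqsClient sqsClient;
    private final String downstreamQueueUrl;

    public VisibleMessagesBackPressureMeter(SqsClient sqsClient, String downstreamQueueUrl) {
        this.sqsClient = sqsClient;
        this.downstreamQueueUrl = downstreamQueueUrl;
    }

    @Override
    public int currentBackPressure() {
        // report the approximate number of visible messages in the downstream queue
        GetQueueAttributesRequest request = GetQueueAttributesRequest.builder()
                .queueUrl(downstreamQueueUrl)
                .attributeNames(QueueAttributeName.APPROXIMATE_NUMBER_OF_MESSAGES)
                .build();
        String visibleMessages = sqsClient.getQueueAttributes(request)
                .attributes()
                .get(QueueAttributeName.APPROXIMATE_NUMBER_OF_MESSAGES);
        return Integer.parseInt(visibleMessages);
    }
}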

@tomazfernandes and @maciejwalkowiak, tagging you for feedback as you both participated in #479

Regards,
