gRPC is here! Starting with Vanilla's ratelimit provider. The old message-broker-based RPC has been kept for now, for backwards compatibility.
My idea is that new (public) proto files go in `/proto` and are symlinked into the projects that use them. For instance, `ratelimit.proto`
is symlinked into both Vanilla and Tea. This ensures both projects always work on the same file, without accidentally drifting apart.

Furthermore, the gRPC ratelimit implementation uses a non-blocking approach, as opposed to the previous one, which delayed sending back the ratelimit grant until the actual time it had been granted. Now, the `ReserveQuota`
RPC request returns immediately with a millisecond-precision unix timestamp, indicating the time at which the client is allowed to send the request. It is then up to the client to delay until that timestamp.

For this, ratelimits are internally stored as partially filled slots; in the case of the global ratelimit, one slot per second. Each slot has a capacity of, say, 50 quotas. Each incoming ratelimit request during the same second takes up one unit of capacity from the first slot that still has capacity left. Once the current slot has no capacity left, a new slot is created for the next second, and the caller receives a timestamp one second (or more, depending on how many slots are filled) in the future.
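On the client side, honoring that timestamp boils down to sleeping until it is reached. A minimal sketch (the function name and its surroundings are assumptions for illustration, not part of this PR):

```kotlin
// Hypothetical helper: block until the grant timestamp returned by
// ReserveQuota has been reached, then let the caller proceed.
fun awaitQuota(sendAtMillis: Long) {
    val waitMillis = sendAtMillis - System.currentTimeMillis()
    if (waitMillis > 0) Thread.sleep(waitMillis)
    // The caller may now send its request to the ratelimited API.
}
```

In coroutine-based code this would be `delay()` instead of `Thread.sleep()`, but the idea is the same: the server hands out a point in time, and waiting for it is entirely the client's job.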
So if, e.g., all slots are empty and suddenly 120 ratelimit requests arrive at the same time for some reason, the first 50 will receive a delay of 0, the next 50 a delay of 1 second, and the last 20 a delay of 2 seconds. If another ratelimit request comes in two seconds later, the then-current slot will of course still hold the 20 quotas reserved before, leaving only 30 free quotas. One more second later, all slots will be free again. I hope this explanation made sense.
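The slot bookkeeping above can be sketched in a few lines. This is an illustration of the idea, not the actual implementation (the class and method names, and the capacity of 50, are assumptions):

```kotlin
// Illustrative slot-based limiter: every slot before `slotStart` is full,
// every slot after it is empty, and `used` tracks the partially filled one.
class SlotLimiter(
    private val capacityPerSlot: Int,
    private val slotMillis: Long = 1000,
) {
    private var slotStart = 0L // unix millis of the slot currently being filled
    private var used = 0       // quotas already reserved in that slot

    /** Returns the unix timestamp (millis) at which the caller may send. */
    @Synchronized
    fun reserveQuota(nowMillis: Long): Long {
        val currentSlot = nowMillis / slotMillis * slotMillis
        if (currentSlot > slotStart) { // the partially filled slot is in the past
            slotStart = currentSlot
            used = 0
        }
        if (used >= capacityPerSlot) { // slot full: open one for the next second
            slotStart += slotMillis
            used = 0
        }
        used++
        return maxOf(slotStart, nowMillis)
    }
}
```

With a capacity of 50, the 120-request burst from the example resolves exactly as described: 50 grants for the current second, 50 for the next, and 20 for the one after.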
This has multiple advantages. It avoids keeping connections and associated resources open for too long, and the caller is not left wondering whether their quota request even arrived in the first place. It also allows implementing batch quota requests in the future, which should be helpful with e.g. the bans service (if that gets implemented at some point), so you could request e.g. 10 quotas at once.
Kotlin has also been upgraded to v2 btw.