fix: no timeout for events route #98

xstefanox · 2023-09-20T19:23:38Z

The current implementation defines a 10m server timeout for every route exposed by the Docker API.

Detected issue

There are some containers that need to listen for Docker events and trigger particular action when an event is received, for example swarm-cronjob.

The /events route is meant to stream forever but the timeout imposed by HAProxy is abruptly closing the connection when the timeout is reached.

Proposed solution

With this change, HAProxy will be configured with no timeout only for the /events route, while preserving the existing configuration for other routes.

Implementation details

A dedicated backend with timeout set to 0 is configured in HAProxy; this backend is used to handle only the /events route, which is matched using the same regular expression already used to choose whether the route is accessible or not depending on the related environment variable.

pedrobaeza · 2023-09-21T06:24:58Z

A good practice is to always set a timeout for everything, so changing this for all the uses is not correct. If you provide a non-default way to set for your case no timeout, it can be accepted, but not unconditionally.

xstefanox · 2023-09-21T07:25:54Z

Hi @pedrobaeza,
no, we are not going to remove the timeout for every route, but only for the /events route.

The reason why I didn't provide an environment variable to override this timeout is because this API call is meant to never terminate by design.
The /events route is a sort of custom implementation of Server Sent Events made by Docker: the client asks for the streaming of events emitted by Docker and listens to them forever, because an event stream is not meant to terminate unless the server is shutdown or restarted.

If you manually GET /events from your terminal connecting directly to the Docker daemon, you will find out that your client never terminates; instead, if you connect through the Docker Socket Proxy, the connection is terminated after 10 minutes in a non-clean way, which is an incorrect behaviour for an event stream.

Example output using HTTPie:

❯ http GET http://localhost:2375/events
HTTP/1.1 200 OK
api-version: 1.42
content-type: application/json
date: Wed, 20 Sep 2023 07:00:27 GMT
docker-experimental: false
ostype: linux
server: Docker/23.0.6 (linux)
transfer-encoding: chunked


http: error: ChunkedEncodingError: ("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes read))

This behaviour causes the crash of some services that expect to consume the Docker events stream, for example the one that I mentioned in my previous post, which terminates with a fatal error and is then automatically restarted exactly every 10 minutes.

I agree with you that setting reasonable network timeouts is a good practice, but for this particular use case, applying a timeout of any value is a nonsense, that's the reason why I'm simply proposing to remove the timeout at all.

pedrobaeza · 2023-09-21T16:13:08Z

OK, thanks for the explanations. Seems reasonable. @yajo as you were the creator of this, do you see it correct?

yajo approved these changes Sep 22, 2023

View reviewed changes

fix: no timeout for events route

0185748

pedrobaeza force-pushed the fix/events-timeout branch from b9ba05e to 0185748 Compare September 25, 2023 15:20

pedrobaeza merged commit bb675df into Tecnativa:master Sep 25, 2023
3 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: no timeout for events route #98

fix: no timeout for events route #98

xstefanox commented Sep 20, 2023

pedrobaeza commented Sep 21, 2023

xstefanox commented Sep 21, 2023

pedrobaeza commented Sep 21, 2023

fix: no timeout for events route #98

fix: no timeout for events route #98

Conversation

xstefanox commented Sep 20, 2023

Detected issue

Proposed solution

Implementation details

pedrobaeza commented Sep 21, 2023

xstefanox commented Sep 21, 2023

pedrobaeza commented Sep 21, 2023