GH-5124 configurable http timeouts #5125

hmottestad · 2024-09-13T12:56:56Z

GitHub issue resolved: #5124

Briefly describe the changes proposed in this PR:

introduced timeout for the underlying Apache HttpClient
made them configurable
made different timeouts for when using with a SPARQL SERVICE call

PR Author Checklist (see the contributor guidelines for more details):

my pull request is self-contained
I've added tests for the changes I made
I've applied code formatting (you can use mvn process-resources to format from the command line)
I've squashed my commits where necessary
every commit message starts with the issue number (GH-xxxx) followed by a meaningful description of the change

hmottestad · 2024-09-18T08:17:09Z

	/**
 	 * Default HTTP connection timeout in milliseconds for general use. Set to 1 hour.
 	 */
 	public static final int DEFAULT_CONNECTION_TIMEOUT = 60 * 60 * 1000; // 1 hour

 	/**
 	 * Default HTTP connection request timeout in milliseconds for general use. Set to 10 days.
 	 */
 	public static final int DEFAULT_CONNECTION_REQUEST_TIMEOUT = 10 * 24 * 60 * 60 * 1000; // 10 days

 	/**
 	 * Default HTTP socket timeout in milliseconds for general use. Set to 10 days.
 	 */
 	public static final int DEFAULT_SOCKET_TIMEOUT = 10 * 24 * 60 * 60 * 1000; // 10 days

 	// Default timeout values for SPARQL SERVICE calls

 	/**
 	 * Default HTTP connection timeout in milliseconds for SPARQL SERVICE calls. Set to 10 minutes.
 	 */
 	public static final int DEFAULT_SPARQL_CONNECTION_TIMEOUT = 10 * 60 * 1000; // 10 minutes

 	/**
 	 * Default HTTP connection request timeout in milliseconds for SPARQL SERVICE calls. Set to 6 hours.
 	 */
 	public static final int DEFAULT_SPARQL_CONNECTION_REQUEST_TIMEOUT = 6 * 60 * 60 * 1000; // 6 hours

 	/**
 	 * Default HTTP socket timeout in milliseconds for SPARQL SERVICE calls. Set to 6 hours.
 	 */
 	public static final int DEFAULT_SPARQL_SOCKET_TIMEOUT = 6 * 60 * 60 * 1000; // 6 hours

hmottestad · 2024-09-18T08:20:26Z

connection timeout is the amount of time it will wait for the TCP connection to be established
socket timeout is the amount of time it will wait for data on the socket, network level
connection request timeout is the amount of time it will wait to get a connection from the pool

hmottestad · 2024-09-18T08:32:52Z

The socket timeout should be sufficiently long that a complex query won't timeout while waiting for a response. When queries are sent using the SERVICE call then we assume that there is a remote server that we don't really have control over, so we don't want to wait as long as we otherwise would on a query to our own workbench/server.

The connection timeout should be long enough not to be disruptive for background tasks or batch request that we would prefer to wait for the server to respond rather than error out. We don't want it to be too long though, in case the server actually isn't responsive. For SERVICE calls we want to fail a bit faster, but still give the server time to respond if it is overloaded.

Finally the connection request timeout is how long we are willing to wait for a connection from the pool. Since we own the connection pool we should be in control of how many connections we have and should be patient if the pool is empty. Here it's important not to break things for our users. Since the timeout was infinite before, then we should be fairly generous now. Since SERVICE calls are performed as part of a SPARQL query we should consider it more acceptable to timeout sooner.

All 6 timeouts are configurable through system properties:

	/**
 	 * Configurable system property {@code org.eclipse.rdf4j.client.http.connectionTimeout} for specifying the HTTP
 	 * connection timeout in milliseconds for general use. Default is 1 hour.
 	 *
 	 * <p>
 	 * The connection timeout determines the maximum time the client will wait to establish a TCP connection to the
 	 * server. A default of 1 hour is set to allow for potential network delays without causing unnecessary timeouts.
 	 * </p>
 	 */
 	public static final String CONNECTION_TIMEOUT_PROPERTY = "org.eclipse.rdf4j.client.http.connectionTimeout";

 	/**
 	 * Configurable system property {@code org.eclipse.rdf4j.client.http.connectionRequestTimeout} for specifying the
 	 * HTTP connection request timeout in milliseconds for general use. Default is 10 days.
 	 *
 	 * <p>
 	 * The connection request timeout defines how long the client will wait for a connection from the connection pool. A
 	 * longer timeout is acceptable here since operations like large file uploads may need to wait for an available
 	 * connection.
 	 * </p>
 	 */
 	public static final String CONNECTION_REQUEST_TIMEOUT_PROPERTY = "org.eclipse.rdf4j.client.http.connectionRequestTimeout";

 	/**
 	 * Configurable system property {@code org.eclipse.rdf4j.client.http.socketTimeout} for specifying the HTTP socket
 	 * timeout in milliseconds for general use. Default is 10 days.
 	 *
 	 * <p>
 	 * The socket timeout controls the maximum period of inactivity between data packets during data transfer. A longer
 	 * timeout is appropriate for large data transfers, ensuring that operations are not interrupted prematurely.
 	 * </p>
 	 */
 	public static final String SOCKET_TIMEOUT_PROPERTY = "org.eclipse.rdf4j.client.http.socketTimeout";

 	// System property constants for SPARQL SERVICE timeouts

 	/**
 	 * Configurable system property {@code org.eclipse.rdf4j.client.sparql.http.connectionTimeout} for specifying the
 	 * HTTP connection timeout in milliseconds when used in SPARQL SERVICE calls. Default is 10 minutes.
 	 *
 	 * <p>
 	 * A shorter connection timeout is set for SPARQL SERVICE calls to quickly detect unresponsive endpoints in
 	 * federated queries, improving overall query performance by avoiding long waits for unreachable servers.
 	 * </p>
 	 */
 	public static final String SPARQL_CONNECTION_TIMEOUT_PROPERTY = "org.eclipse.rdf4j.client.sparql.http.connectionTimeout";

 	/**
 	 * Configurable system property {@code org.eclipse.rdf4j.client.sparql.http.connectionRequestTimeout} for specifying
 	 * the HTTP connection request timeout in milliseconds when used in SPARQL SERVICE calls. Default is 6 hours.
 	 *
 	 * <p>
 	 * This timeout controls how long the client waits for a connection from the pool when making SPARQL SERVICE calls.
 	 * A shorter timeout than general use ensures that queries fail fast if resources are constrained, maintaining
 	 * responsiveness.
 	 * </p>
 	 */
 	public static final String SPARQL_CONNECTION_REQUEST_TIMEOUT_PROPERTY = "org.eclipse.rdf4j.client.sparql.http.connectionRequestTimeout";

 	/**
 	 * Configurable system property {@code org.eclipse.rdf4j.client.sparql.http.socketTimeout} for specifying the HTTP
 	 * socket timeout in milliseconds when used in SPARQL SERVICE calls. Default is 6 hours.
 	 *
 	 * <p>
 	 * The socket timeout for SPARQL SERVICE calls is set to a shorter duration to detect unresponsive servers during
 	 * data transfer, ensuring that the client does not wait indefinitely for data that may never arrive.
 	 * </p>
 	 */
 	public static final String SPARQL_SOCKET_TIMEOUT_PROPERTY = "org.eclipse.rdf4j.client.sparql.http.socketTimeout";

…sible defaults.

aschwarte10

Thanks for providing the PR for configurable timeouts. Some feedback from my side

I would suggest to decrease the default timeouts, specifically the connect timeout (see inside comments)
it should be understood that the instance is shared via the repository manager to all managed SPARQL or HTTP repositories

Note that in our application we have made the timeouts also configurable, essentially by supplying a custom HttpClientBuilder to the client session manager

Maybe you find some helpful insights here.

aschwarte10 · 2024-09-28T05:28:13Z

.../http/client/src/main/java/org/eclipse/rdf4j/http/client/SharedHttpClientSessionManager.java

 				.setRetryHandler(retryHandlerStale)
 				.setServiceUnavailableRetryStrategy(serviceUnavailableRetryHandler)
 				.useSystemProperties()
-				.setDefaultRequestConfig(RequestConfig.custom().setCookieSpec(CookieSpecs.STANDARD).build())
+				.setDefaultRequestConfig(requestConfig)


we are also using a configurable value for the max number of connections

.setMaxConnPerRoute(maxConnections) .setMaxConnTotal(maxConnections)

That's a very good idea. What are your defaults?

I think that part of the issue is simply that the default number of connections is too low, maybe even 1, which is quite terrible for the way that RDF4J typically does joins.

We have currently 25 connections as default value (for quite some time). Note that we haven't done systematic benchmarks on the impact of adjusting the number of connections.

aschwarte10 · 2024-09-28T05:31:22Z

.../http/client/src/main/java/org/eclipse/rdf4j/http/client/SharedHttpClientSessionManager.java

+	/**
+	 * Default HTTP connection timeout in milliseconds for general use. Set to 1 hour.
+	 */
+	public static final int DEFAULT_CONNECTION_TIMEOUT = 60 * 60 * 1000; // 1 hour


the connection timeout should be typically rather small (in the range of seconds): this is the timeout for establishing the first connection. In our application we have set the default to 5 seconds

Maybe you can explain the different timeouts also better in the javadoc?

https://www.baeldung.com/httpclient-timeout#properties

There is an explanation on the timeout variables, but not on the default constants.

I'm essentially afraid of breaking existing code, since the default timeouts for the Apache HttpClient is infinite.

How about having the default be 60 seconds?

How about having the default be 60 seconds?

I think for 99% of practical cases this is very long: this timeout is only for establishing the first connection (which is usually really fast or does not work at all). As a user you would want to know rather fast if the connection cannot be established (i.e. if the packets go into the nirvana). See also my comment w.r.t query timeout below

60s is definitely better than 1 hour though

aschwarte10 · 2024-09-28T05:37:17Z

.../http/client/src/main/java/org/eclipse/rdf4j/http/client/SharedHttpClientSessionManager.java

+	/**
+	 * Default HTTP socket timeout in milliseconds for general use. Set to 10 days.
+	 */
+	public static final int DEFAULT_SOCKET_TIMEOUT = 10 * 24 * 60 * 60 * 1000; // 10 days


I would suggest to define this value in the range of expected query timeouts, 10 days sounds unreasonable.

What the value does:

the time waiting for data – after establishing the connection; maximum time of inactivity between two data packets

so, in essence how long the connection remains open waiting between two packets in an established transfer. We have configured default values of 60 seconds for this.

What I'm worried about here is what happens if you have some computationally heavy operation on the server side, SHACL validation or maybe inference. If the users actions trigger this then I'm curious what happens if the timeout is low.

I know we have some sort of ping thread that makes sure that one or maybe both parties are still alive during a transaction, but if that thread is sharing the same http client pool then it might be a bit brittle.

It's essentially the same situation as for the other timeouts. The default in the Apache HttpClient is infinite, and I don't want to break any existing code.

Ok, so the main use-case is to distinguish interactive from long-running queries. I am not sure whether this is possible, but can the read timeout be co-related with the query timeout on the query that is using the specific connection?

hmottestad · 2024-09-28T08:58:53Z

I'm a bit wary of setting the timeouts low since the defaults in the Apache HttpClient are, very unfortunately, infinite. And that's what our users have been living with until now. So 10 days timeout is essentially me hoping that there is no unforeseen use case that today runs for several days. I agree that 10 days is unreasonably long for a timeout.

Maybe we can do a phased rollout of the timeouts. Have them very lax now, then tighten them for RDF4J 5.2 and then again for 5.3?

I haven't really looked into other ways of making this configurable since the user that reported the issue is using the Workbench and needs something that can be configured on the command line.

hmottestad · 2024-09-28T09:58:16Z

And thank you for the review @aschwarte10 :)

aschwarte10 · 2024-09-29T06:08:37Z

Maybe some additional thougts of direction:

From our experience we also use "low" timeouts on the HTTP level, as the query timeout configured on RDF4J API level currently only becomes active once the first result item is transferred (i.e. the result is wrapped in a timelimit iteration). If something goes wrong on the HTTP level and there are no timeouts, the user will not get any feedback and the query processing actually just continues.

A second thought regarding configurability: typically you would want to distinguish interactive from long-running analytical queries. The knowledge is only known by the user, who can on the RDF4J side set an appropriate query timeout on the query (or more globally). Maybe it is worth taking this into consideration, i.e. is it possible to co-relate the HTTP level timeouts with the query timeouts set on the RDF4J level - not sure though if the HTTP timeouts can really be adjusted per query execution.

An alternate way: maybe in the workbench you start with explicitly configuring longer default timeouts, while the default timeouts in the implementation code are reasonably small. Not sure if there is a common entry point for starting up the workbench, when looking at the container there clearly is, for plain-processes I do not know how it is usually started.

Might also be worth to get another opinion on timeouts. I guess the most important is as always to have some defined timeout to avoid endless processing without any feedback.

…squashed commit) Squashed commits: [9aa87b594c] GH-5124 introduce number of connections and also reduce timeouts

GH-5124 make the three main HttpClient timeouts configurable with sen…

20416e2

…sible defaults.

hmottestad force-pushed the GH-5124-configurable-http-timeouts branch from 16db967 to 20416e2 Compare September 18, 2024 08:53

hmottestad marked this pull request as ready for review September 18, 2024 08:54

hmottestad requested review from barthanssens and aschwarte10 September 18, 2024 08:54

aschwarte10 reviewed Sep 28, 2024

View reviewed changes

hmottestad added 2 commits October 21, 2024 15:31

GH-5124 introduce number of connections and also reduce timeouts (+1 …

2ece387

…squashed commit) Squashed commits: [9aa87b594c] GH-5124 introduce number of connections and also reduce timeouts

GH-5134 add test

00f8465

hmottestad enabled auto-merge October 21, 2024 20:27

hmottestad merged commit 5d25e15 into develop Oct 21, 2024
9 checks passed

hmottestad deleted the GH-5124-configurable-http-timeouts branch October 21, 2024 20:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GH-5124 configurable http timeouts #5125

GH-5124 configurable http timeouts #5125

hmottestad commented Sep 13, 2024 •

edited

Loading

hmottestad commented Sep 18, 2024 •

edited

Loading

hmottestad commented Sep 18, 2024

hmottestad commented Sep 18, 2024

aschwarte10 left a comment

aschwarte10 Sep 28, 2024

hmottestad Sep 28, 2024

aschwarte10 Sep 29, 2024

aschwarte10 Sep 28, 2024

hmottestad Sep 28, 2024

hmottestad Sep 28, 2024

aschwarte10 Sep 29, 2024

aschwarte10 Sep 28, 2024

hmottestad Sep 28, 2024

aschwarte10 Sep 29, 2024

hmottestad commented Sep 28, 2024 •

edited

Loading

hmottestad commented Sep 28, 2024

aschwarte10 commented Sep 29, 2024

GH-5124 configurable http timeouts #5125

GH-5124 configurable http timeouts #5125

Conversation

hmottestad commented Sep 13, 2024 • edited Loading

hmottestad commented Sep 18, 2024 • edited Loading

hmottestad commented Sep 18, 2024

hmottestad commented Sep 18, 2024

aschwarte10 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hmottestad commented Sep 28, 2024 • edited Loading

hmottestad commented Sep 28, 2024

aschwarte10 commented Sep 29, 2024

hmottestad commented Sep 13, 2024 •

edited

Loading

hmottestad commented Sep 18, 2024 •

edited

Loading

hmottestad commented Sep 28, 2024 •

edited

Loading