cbtls_info() to report "Need to write more data" when called on exit … #5342
base: v3.2.x
…of a handshake function

The current code does not appear to handle the cases where a TLS handshake function exits while data remains that must be extracted from the BIO, sent to the other side, or both. Differentiating and logging such cases within cbtls_info() might help debug the stalled handshakes, followed by timed-out requests, that we see with the current 3.2.x head.
It may also be useful to check for It may also be good to apply a similar patch to |
Perhaps this patch might help, too? I haven't tried it though.
Thanks for having a look at this. I don't get any more information with the patches applied, but the server seems to stall some time after attempting a RadSec connection to a home_server. I'm providing more info from the previous, unpatched run.

This is an "SSL_connect() returned WANT_WRITE" with "(TLS) has 0 bytes in the buffer" situation. TLS handshake magic? AFAIU, for BIO_read() to be called from tls_handshake_send(), there currently has to be some unencrypted data in 'clean_in'. The condition in tls_handshake_send() looks like this:

Excerpt from a debug log follows: ...
On Jun 19, 2024, at 12:39 PM, martinsta ***@***.***> wrote:
Thanks for having a look at this. I don't seem to get more info with patches applied but the server seems to stall some time after attempting a RadSec connection to a home_server. I'm providing more info from the previous non-patched run. This is an "SSL_connect() returned WANT_WRITE" with "(TLS) has 0 bytes in the buffer" situation. TLS handshake magic?
Yes. There appears to be data in SSL which we need to write out.
I've attached a patch from the current v3.2.x branch. It should replace the previous patch for src/main/tls_listen.c
I *hope* this works. Looking at the code and OpenSSL, it should be the correct fix.
Thanks, I'd be happy to test the new patch. I'm afraid the attachment's gone astray.
This patch should work.
Thanks for another round. With the patch applied and BIO_read() getting called, the server reports "Debug: (TLS) SSL_connect() writing 0 bytes to the network".

What's that? SSL_connect() is left in the handshake, SSL_get_error() returns WANT_WRITE, and there is no data available to read from the BIO? Does it need to retry? Or is this a different kind of issue?

30 seconds later the original request times out and gets rejected; the server then logs "Debug: Proxy SSL socket has data to read" (possibly also in relation to another request) and locks up.

Excerpt from a debug log follows:
Wed Jun 19 23:12:13 2024 : Debug: (TLS) Trying new outgoing proxy connection to proxy (0.0.0.0, 0) -> home_server (a.b.c.d, 2083)
On Jun 20, 2024, at 7:07 AM, martinsta ***@***.***> wrote:
Thanks for another round. With the patch applied and BIO_read() getting called the server reports "Debug: (TLS) SSL_connect() writing 0 bytes to the network".
That's annoying.
What's that? SSL_connect() is left in the handshake, SSL_get_error() returns WANT_WRITE, and there is no data available to read from the BIO? Does it need to retry? Or is this a different kind of issue?
WANT_WRITE means that OpenSSL wants to write more data to the network.
But when we call BIO_read() to get data from OpenSSL, it returns "no more data".
So I don't know what's going on.
30 seconds later the original request times out and gets rejected; the server then logs "Debug: Proxy SSL socket has data to read" (possibly also in relation to another request) and locks up.
That's unfortunate.
You can edit src/main/tls_listen.c, in the proxy_tls_recv() function. Maybe have it print out how much data was read from the network.
And also instrument proxy_tls_read() to see what's going on there, and why it is likely returning "no data".
The main issue is that I can't reproduce the issue here, so it's very hard to debug it.
…sync parameter

fd is currently missing the O_NONBLOCK property, which should be set via the nested fr_socket_client_tcp() -> fr_nonblock() -> fcntl() calls. This will fix issue FreeRADIUS#5342.
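For reference, setting and verifying O_NONBLOCK with fcntl() can be sketched as follows. This is a generic POSIX illustration, not the body of fr_nonblock(); the helper names `set_nonblocking`, `is_nonblocking`, and `demo_nonblocking_socket` are hypothetical:

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: add O_NONBLOCK to fd, keeping its other status flags. */
int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0) return -1;
	if (flags & O_NONBLOCK) return 0;	/* already set */
	return (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) ? -1 : 0;
}

/* Hypothetical helper: 1 if fd is non-blocking, 0 if blocking, -1 on error. */
int is_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0) return -1;
	return (flags & O_NONBLOCK) != 0;
}

/*
 * Demo: a freshly created TCP socket is blocking by default; after
 * set_nonblocking() the flag must be visible via F_GETFL.
 * Returns 0 if both observations hold, -1 otherwise.
 */
int demo_nonblocking_socket(void)
{
	int before, rc, after;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0) return -1;

	before = is_nonblocking(fd);
	rc = set_nonblocking(fd);
	after = is_nonblocking(fd);
	close(fd);

	return (before == 0 && rc == 0 && after == 1) ? 0 : -1;
}
```

A check like `is_nonblocking(fd)` placed just before the TLS handshake starts would show whether the descriptor really carries the flag at that point.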