disabling ClientChannel without waiting for pending requests #144

Open
xlukem opened this issue Jul 2, 2024 · 9 comments

Comments

@xlukem

xlukem commented Jul 2, 2024

when communicating with one of our modbus server devices via a rodbus client, we noticed that this specific server device is unable to handle multiple modbus connections at once.
this should not be an issue.

however, during an active connection via rodbus, rodbus does not seem to register a connection fault when the connection is interrupted due to a third device starting to communicate over modbus with the modbus server.

rodbus only reports response_timeouts, as modbus requests won't get answered anymore, while there is still an active tcp connection and modbus requests still get acknowledged by the server with a TCP ACK

now we wanted to resolve this issue by reconnecting the rodbus client by disabling and re-enabling the rodbus ClientChannel
however, this does not work, as there are still requests piled up in the rodbus queue which all seem to need to be "timed out" before the client channel gets disabled

is there a way to either disable the ClientChannel directly or let all queued up requests fail at once?

also, unfortunately we are unable to change the behaviour of this specific modbus server device
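
To make the "let all queued up requests fail at once" idea concrete, here is a minimal sketch of a request queue that can be drained with a single error. It is not rodbus code and does not reflect how rodbus stores its requests internally: `RequestQueue`, `PendingRequest`, `RequestError` and `fail_all` are all hypothetical names invented for this illustration.

```rust
use std::collections::VecDeque;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical error type for this sketch; not rodbus's real error enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RequestError {
    ResponseTimeout,
    ChannelDisabled,
}

/// Reply type delivered to the caller: register values or an error.
pub type Reply = Result<Vec<u16>, RequestError>;

/// A request that is queued but has not been written to the socket yet.
pub struct PendingRequest {
    pub id: u32,
    reply: Sender<Reply>,
}

/// Hypothetical queue of pending requests.
#[derive(Default)]
pub struct RequestQueue {
    pending: VecDeque<PendingRequest>,
}

impl RequestQueue {
    /// Queue a request and hand back the receiver on which its reply arrives.
    pub fn push(&mut self, id: u32) -> Receiver<Reply> {
        let (tx, rx) = channel();
        self.pending.push_back(PendingRequest { id, reply: tx });
        rx
    }

    /// Fail every queued request immediately instead of letting each one
    /// wait for its own response timeout. Returns how many were failed.
    pub fn fail_all(&mut self, err: RequestError) -> usize {
        let count = self.pending.len();
        for request in self.pending.drain(..) {
            // The caller may already have dropped its receiver; ignore that.
            let _ = request.reply.send(Err(err));
        }
        count
    }

    /// Number of requests still waiting in the queue.
    pub fn len(&self) -> usize {
        self.pending.len()
    }
}
```

With a queue like this, a disable request could call `fail_all(RequestError::ChannelDisabled)` before closing the socket, so callers see one prompt error per request rather than a series of timeouts.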

@jadamcrain
Member

@xlukem Does the server not send a TCP FIN or RST? It just stops answering requests but keeps the connection open?

I agree that Rodbus should be able to enable/disable in this situation. I have a good idea of how this should be implemented on the main task loop.

That said, I wish there was a good way to detect this condition and gracefully handle the poor behavior from this device without the user (you) having to monitor for this condition and initiate an enable / disable. One potential solution would be for the main task loop to implement this logic, i.e. have a "maximum number of request timeouts" parameter after which the current connection is dropped and a re-connection happens.

@xlukem
Author

xlukem commented Jul 2, 2024

there actually does seem to be a TCP RST frame.. however, the destination port seems weird, i can't find this port anywhere else in the trace
but yes, this seems to be an issue with the modbus server we are dealing with, it is generally very poorly designed

image

IP .204: modbus server
IP .1: rodbus
IP .44: third device interrupting connection

> I agree that Rodbus should be able to enable/disable in this situation. I have a good idea of how this should be implemented on the main task loop.

awesome!

> One potential solution would be for the main task loop to implement this logic, i.e. have a "maximum number of request timeouts" parameter after which the current connection is dropped and a re-connection happens.

yes, that's what we currently try to do by keeping track of failed requests
having this implemented by the library would be a nice addition
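
As an illustration of the "maximum number of request timeouts" idea, the following is a rough sketch of a counter for consecutive response timeouts. It is written independently of rodbus and is not a proposed implementation: `TimeoutTracker`, `Outcome` and `Action` are hypothetical names, and the assumption that any response (even a Modbus exception) resets the counter is a design choice made for this example.

```rust
/// Outcome of a single request as seen by the application (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    /// The server answered, with data or with a Modbus exception.
    Responded,
    /// No answer arrived within the response timeout.
    ResponseTimeout,
}

/// What the caller should do after recording an outcome (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    Continue,
    Reconnect,
}

/// Counts consecutive response timeouts and asks for a reconnect
/// once a configurable limit is reached.
pub struct TimeoutTracker {
    max_consecutive_timeouts: u32,
    consecutive_timeouts: u32,
}

impl TimeoutTracker {
    pub fn new(max_consecutive_timeouts: u32) -> Self {
        Self {
            max_consecutive_timeouts,
            consecutive_timeouts: 0,
        }
    }

    /// Record the outcome of one request and decide what to do next.
    pub fn record(&mut self, outcome: Outcome) -> Action {
        match outcome {
            // Any answer proves the peer is still alive on this connection.
            Outcome::Responded => self.consecutive_timeouts = 0,
            Outcome::ResponseTimeout => {
                self.consecutive_timeouts += 1;
                if self.consecutive_timeouts >= self.max_consecutive_timeouts {
                    // Start counting from zero on the new connection.
                    self.consecutive_timeouts = 0;
                    return Action::Reconnect;
                }
            }
        }
        Action::Continue
    }
}
```

On `Action::Reconnect`, application code would perform whatever reconnect step it has available, for example the disable / enable cycle discussed in this issue.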

@xlukem
Author

xlukem commented Jul 3, 2024

another thing i have noticed is that rodbus does not report a connectivity problem when the ethernet connection is interrupted

image

here rodbus only reports single timeouts for each message sent until the request queue is empty or the modbus server is connected again (and sends an RST) (also, this is another modbus server that's working more reliably than the IP .204)

would it be possible to implement a ClientChannel specific channel timeout?
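
One way to read "channel timeout" is an inactivity watchdog: if no response of any kind has arrived for a configured duration, the connection is treated as dead. The sketch below shows that idea in isolation. It is not part of rodbus, `ChannelWatchdog` and its methods are hypothetical names, and the current time is passed in explicitly so the logic can be exercised without sleeping.

```rust
use std::time::{Duration, Instant};

/// Hypothetical inactivity watchdog for a single connection.
pub struct ChannelWatchdog {
    timeout: Duration,
    last_response: Instant,
}

impl ChannelWatchdog {
    /// Start the watchdog at `now`, e.g. when the connection is established.
    pub fn new(timeout: Duration, now: Instant) -> Self {
        Self {
            timeout,
            last_response: now,
        }
    }

    /// Call whenever any response is received from the server.
    pub fn on_response(&mut self, now: Instant) {
        self.last_response = now;
    }

    /// True once no response has been seen for at least `timeout`.
    pub fn is_expired(&self, now: Instant) -> bool {
        // Saturates to zero if `now` is earlier than the last response.
        now.saturating_duration_since(self.last_response) >= self.timeout
    }
}
```

A watchdog like this should only be evaluated while requests are actually being sent; on an idle channel, silence from the server says nothing about the state of the connection.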

@jadamcrain
Member

jadamcrain commented Jul 4, 2024

Not sure I quite understand what you mean by "when the ethernet connection is interrupted". Are you pulling the ethernet cable in this scenario or is there a network failure of sorts? Does Rodbus detect that the connection is down via the channel state callbacks or does it think that the connection is there, but the remote device just isn't responding?

@xlukem
Author

xlukem commented Jul 4, 2024

yes, i interrupted the connection by pulling the ethernet cable
in this scenario rodbus does not report any changes about the ClientState via the PrintingClientStateListener and the request callbacks only report a response_timeout

@jadamcrain
Member

Thanks for the additional info. I hope to be able to look at this next week.

@jadamcrain
Member

jadamcrain commented Jul 7, 2024

I believe this is the classic problem of detecting a dead socket, i.e. one where the peer disappears without a graceful shutdown.

I'm not surprised that the client would return response_timeout in this situation, nor do I think ClientStateListener would fire an event immediately.

Eventually, the OS would decide that the socket is dead for one of a couple of reasons:

  1. Transmitted data is not being acknowledged. Usually writing to a socket will eventually get the OS to time it out.

  2. The TCP keep-alive kicks in. This shouldn't be the case here since you are writing.

Does the connection time out eventually, just not in a reasonable time period?
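
The application-level view of this problem can be reproduced with the standard library alone. In the sketch below, a local listener that accepts the connection but never replies stands in for the unresponsive peer; this is an assumption made for the demonstration and is not identical to a pulled cable. The read ends with a timeout error kind rather than a connection error, which matches the observation that only response timeouts are reported. The function names are invented for this example.

```rust
use std::io::{ErrorKind, Read};
use std::net::{TcpListener, TcpStream};
use std::time::Duration;

/// Returns Ok(true) if the read ended with a timeout, Ok(false) if data
/// or EOF arrived, and Err for any other I/O error (e.g. a reset).
pub fn read_times_out(stream: &mut TcpStream, timeout: Duration) -> std::io::Result<bool> {
    stream.set_read_timeout(Some(timeout))?;
    let mut buf = [0u8; 256];
    match stream.read(&mut buf) {
        Ok(_) => Ok(false),
        // Unix reports WouldBlock, Windows reports TimedOut.
        Err(e) if matches!(e.kind(), ErrorKind::WouldBlock | ErrorKind::TimedOut) => Ok(true),
        Err(e) => Err(e),
    }
}

/// Connects to a local listener that accepts the connection but stays silent.
pub fn silent_peer_demo() -> std::io::Result<bool> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let mut client = TcpStream::connect(listener.local_addr()?)?;
    // Keep the accepted socket alive so the connection stays open.
    let (_server_side, _) = listener.accept()?;
    read_times_out(&mut client, Duration::from_millis(50))
}
```

From the client's side, a timeout alone cannot distinguish a slow server from a vanished one, which is why counting timeouts or relying on the OS's own socket timeout are the remaining options.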

The best solution to this situation might be the same solution we proposed before:

After a configurable number of response timeouts, we can close the connection and force a reconnect.

@xlukem
Author

xlukem commented Jul 9, 2024

> Does the connection time out eventually, just not in a reasonable time period?

thank you for your advice, after further inspection i have seen that the connection does time out eventually
on our production device this timeout seems to be about 15 min
on another device i have seen a 1 min timeout, so it does seem to be device specific

> After a configurable number of response timeouts, we can close the connection and force a reconnect.

we'd love to see that as part of the library in the future

@jadamcrain
Member

@xlukem Thanks for confirming the behavior is as expected. We definitely want this feature for the next release.
