Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WebSocket connection error with errno 115 with v4.3.3 #3149

Open
vikashm05 opened this issue May 21, 2024 · 13 comments
Open

WebSocket connection error with errno 115 with v4.3.3 #3149

vikashm05 opened this issue May 21, 2024 · 13 comments

Comments

@vikashm05
Copy link

vikashm05 commented May 21, 2024

Hi @Team,

Earlier, I was using version 3.1.1 and the websocket connection was established successfully but when we upgraded the websocket version to 4.3.3, getting an error "Connection in progress" i.e. 115. It stucks inside the connecting state for infinite. So, please suggest.

Please find the logs below:

2024-05-21 08:22:04:LWS: [2024/05/21 08:22:04:8163] D: [wsicli|9b|WS/h1/default/us-east.cloud.homenetwo]: lws_client_connect_via_info: adoption cb 19 to h1 websock_protocol

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5338] D: _realloc: size 1192: __lws_wsi_create_with_role

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5338] D: : lws_role_transition: wsistate 0x200, ops (unset)

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5339] D: : lws_vhost_bind_wsi: vh default: wsi none/none, count_bound_wsi 157

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5339] D: _realloc: size 368: client ws struct

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5339] D: : lws_role_transition: wsistate 0x10000200, ops h1

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5340] I: : lws_client_connect_via_info: role binding to h1

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5340] I: : lws_client_connect_via_info: vh default protocol binding to websock_protocol

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5341] D: _realloc: size 4: user space

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5342] I: : lws_client_connect_via_info: : h1 websock_protocol entry

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5342] D: _realloc: size 289: client stash

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5342] N: __lws_lc_tag: ++ [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo] (157)

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5343] I: lws_header_table_attach: [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: ah (nil) (tsi 0, count = 156) in

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5343] D: _realloc: size 984: ah struct

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5344] D: _realloc: size 4096: ah data

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5344] I: _lws_create_ah: created ah 0x5599c819cb70 (size 4096): pool length 157

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5344] I: lws_header_table_attach: did attach wsi [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: ah 0x5599c819cb70: count 157
(on exit)

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5346] D: [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: __lws_set_timeout: 10 secs, reason 25

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5346] D: [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: lws_http_client_connect_via_info2: stash 0x5599c816ff30

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5346] D: _realloc: size 45: strdup

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5347] D: [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: lws_client_connect_2_dnsreq: new conn on no pipeline flag

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5347] D: [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: lwsi_set_state: lwsi_set_state 0x10000200 -> 0x10000201

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5348] I: [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: lws_client_connect_2_dnsreq: lookup us-east.cloud.homen
etworks.com:443

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5355] I: [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: lws_getaddrinfo46: getaddrinfo 'us-east.cloud.homenetwo
rks.com' says 0

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5356] I: [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: lws_sort_dns: sort_dns: 0x5599c8079660

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5356] D: _realloc: size 104: lws_sort_dns

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5357] I: [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: lws_sort_dns: unsorted entry (af 2) 184.73.182.58

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5357] D: _realloc: size 104: lws_sort_dns

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5357] I: [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: lws_sort_dns: unsorted entry (af 2) 44.194.17.211

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5358] I: [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: lws_sort_dns_dump: 1: (2)184.73.182.58, gw (0)(unset), idi: 0, lbl: 0
, prec: 0

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5359] I: [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: lws_sort_dns_dump: 2: (2)44.194.17.211, gw (0)(unset), idi: 0, lbl: 0
, prec: 0

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5359] I: [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: lws_client_connect_3_connect: peer_route_uidx 0

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5359] D: [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: lws_client_connect_3_connect: WAITING_CONNECT

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5360] D: [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: lwsi_set_state: lwsi_set_state 0x10000201 -> 0x10000202

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5360] D: [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: __insert_wsi_socket_into_fds: tsi=0, sock=167, pos-in-fds=158

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5361] D: [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: _lws_change_pollfd: fd 167 events 1 -> 1

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5361] D: [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: __lws_set_timeout: 20 secs, reason 2

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5362] D: [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: lws_client_connect_3_connect: connect: fd 167 errno: 115

2024-05-21 08:23:12:LWS: [2024/05/21 08:23:12:5363] D: [wsicli|9c|WS/h1/default/us-east.cloud.homenetwo]: _lws_change_pollfd: fd 167 events 1 -> 5

Thanks and Regards,
Vikash

@lws-team
Copy link
Member

What platform is it? Same problem on main branch?

@vikashm05
Copy link
Author

vikashm05 commented May 21, 2024 via email

@vikashm05
Copy link
Author

from attached logs, any pointers as i am not able to figure out whether its client side issue or server side.

@lws-team
Copy link
Member

It won't be strange if we start the comnection and if we check, her 'connection in progress', ie, it's ongoing.

Did you look with tcpdump or whatever what's actually happening on the wire?

@vikashm05
Copy link
Author

I am able to see the packets sending/receiving from client to server and vice versa but not sure why connection has not been established.

@vikashm05
Copy link
Author

Please confirm any specific API changes from 3.1.1 to 4.3.3 that would be the source of this problem, if at all possible.

@vikashm05
Copy link
Author

Hi Team,

I have added lws_sul_scheduler API to resolve the websocket connection error i.e. Connection in progress "errno 115" but after adding this API, I am getting error at initial stage only i.e. "callback_websocket: connection attempts exhausted".

Also, I tried to downgrade the websocket version but having same issue "Connection in progress "errno 115"" after 3.1 available version
Not sure , how to fix it. please advise if its familiar.

Please find detail lws log:

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6041] D: _realloc: size 7416: context

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6042] N: lws_create_context: LWS: 4.3.3-LastSuccessful-gpon-onl-development-1.2.0.47, NET CLI SRV H1 H2 WS ConMon IPV6-on

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6043] I: lws_create_context: Event loop: poll

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6043] D: _realloc: size 64: lws_smd_register

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6044] I: lws_smd_register: peer 0x55fb0370d3b0 (count 1) registered

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6044] D: _realloc: size 8192: fds table

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6045] I: lws_create_context: ctx: 6152B (2056 ctx + pt(1 thr x 4096)), pt-fds: 1024, fdmap: 8192

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6045] I: lws_create_context: http: ah_data: 4096, ah: 984, max count 1024

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6046] D: _realloc: size 8192: lws_lookup

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6046] I: lws_plat_init: mem: platform fd map: 8192 B

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6047] D: _realloc: size 1264: __lws_wsi_create_with_role

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6047] D: : lws_role_transition: wsistate 0x200, ops pipe

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6048] N: __lws_lc_tag: ++ [wsi|0|pipe] (1)

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6048] D: [wsi|0|pipe]: __insert_wsi_socket_into_fds: tsi=0, sock=4, pos-in-fds=0

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6049] I: rops_pt_init_destroy_netlink: creating netlink skt

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6049] D: _realloc: size 1264: __lws_wsi_create_with_role

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6049] D: : lws_role_transition: wsistate 0x200, ops netlink

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6050] N: __lws_lc_tag: ++ [vh|0|netlink] (1)

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6050] D: [vh|0|netlink]: __insert_wsi_socket_into_fds: tsi=0, sock=5, pos-in-fds=1

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6051] D: rops_pt_init_destroy_netlink: starting netlink coldplug wait

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6052] I: lws_context_init_ssl_library: Compiled with OpenSSL support

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6052] I: lws_context_init_ssl_library: Doing SSL library init

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6053] I: lws_server_get_canonical_hostname: canonical_hostname = TCX08-20-03-02

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6053] D: _realloc: size 896: lws_create_vhost

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6054] N: __lws_lc_tag: ++ [vh|1|default||-1] (2)

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6054] D: _realloc: size 168: vh plugin table

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6055] D: _realloc: size 48: same vh list

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6055] I: [vh|1|default||-1]: lws_create_vhost: Creating Vhost 'default' (serving disabled), 2 protocols, IPv6 on

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6059] D: _realloc: size 72: client ctx tcr

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6059] I: lws_tls_client_create_vhost_context: vh default: created new client ctx 0

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6060] I: created client ssl context for default

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6061] E: callback_websocket: connection attempts exhausted

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6062] I: lws_context_init_extensions: LWS_MAX_EXTENSIONS_ACTIVE: 1

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6062] I: lws_create_context: mem: per-conn: 1264 bytes + protocol rx buf

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6063] I: lws_plat_drop_app_privileges: not changing group

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6063] I: lws_plat_drop_app_privileges: not changing user

2024-05-29 05:30:43:LWS: [2024/05/29 05:30:43:6064] D: lws_cancel_service:

@vikashm05
Copy link
Author

Our Websocket server is running on quite old version [not sure about the exact version as its running over cloud]. So, is there compatibility issue with the latest version?

Note: I am able to successfully establish connection with 3.1.0 and only getting "connection in progress" error after 3.2.0 or onwards version.

@lws-team
Copy link
Member

So, is there compatibility issue with the latest version?

Even you don't know the version of your server, I don't know what I can say to this.

Around that time the code changed in lws to have a more compatible approach (ie, with lwip) for async client connection checking. You should look at what's happening here

lws_client_connect_check(struct lws *wsi, int *real_errno)

there's also a dependency on your libc and kernel version to define the errno meanings. If you see it arrives at a non 115 errno but doesn't understand it's connected, that might not be hard to solve.

@vikashm05
Copy link
Author

We have added the debug traces in the code and found that " lws_client_connect_check" is not calling in our code. Any dependency on compilation flags to call the mentioned function?

@lws-team
Copy link
Member

It should create an event on the fd when the connect (later in that file I think) completes, then it will come and check the connection status using that function. It sounds like on your platform (kernel, tcp stack, libc), something else happening.

@vikashm05
Copy link
Author

vikashm05 commented Jun 3, 2024

Thanks for the input.
We have added a debug log in our code and found that below check is returning a failure.
Filename: connect3.c : lws_client_connect_3_connect
if (lws_fi(&wsi->fic, "conn_cb_rej") ||
user_callback_handle_rxflow(wsi->a.protocol->callback, wsi,
LWS_CALLBACK_CONNECTING, wsi->user_space,
(void *)(intptr_t)wsi->desc.sockfd, 0)) {

            lwsl_wsi_debug(wsi, "goto failed1");
            goto failed1;
    }

Could you please suggest something with above code as I am not able to figure out.

@lws-team
Copy link
Member

lws-team commented Jun 3, 2024

It's calling your wsi callback function, if it's ws connection, it's literally your protocol callback function for the ws protocol. This should return 0 for everything not handled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants