Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No callbacks: no successful callbacks and no failed callbacks #8387

Open
Alexmego opened this issue Apr 26, 2024 · 4 comments
Open

No callbacks: no successful callbacks and no failed callbacks #8387

Alexmego opened this issue Apr 26, 2024 · 4 comments
Labels
bug Bug in existing code

Comments

@Alexmego
Copy link

Alexmego commented Apr 26, 2024

No Reproducible! No Reproducible! No Reproducible

Environment:

okhttp version : 4.9.3 & the application protocol is Http2.0.
okhttp timeout configure
截屏2024-04-26 上午10 35 45

Describe:

After the OkHttp network request is sent out, there were no callbacks, no successful callbacks, and no failed callbacks.
The probability of this occurrence is approximately 0.5%.

Process:

Below is the complete event sequence printed by the EventListener for two requests.

  • 
Between 17:07:32 --- 17:07:51, there were no callbacks.

  • 
After 17:07:51, another okhttp request was sent and callback arrived.

OkHttpPrint.txt

loginfo:

9 results - 2 files

~/Downloads/A_log_0311_20240424_170747:
803283: 04-24 17:07:32.949266 3489 3489 I XX_APP_OkHttpPrint: callStart: https://xx.cn/api/search/haha/v1/rgeo?centerPoint=40.05594%2C116.26804
803349: 04-24 17:07:32.964995 3489 16482 I XX_APP_OkHttpPrint: connectStart:
805021: 04-24 17:07:32.983984 3489 16482 I XX_APP_OkHttpPrint: secureConnectStart:

~/Downloads/A_log_0312_20240424_170809:
709118: 04-24 17:07:51.892152 3489 16482 I XX_APP_OkHttpPrint: dnsStart:
709626: 04-24 17:07:51.894966 3489 16482 I XX_APP_OkHttpPrint: dnsEnd: 3
709657: 04-24 17:07:51.895477 3489 16482 I XX_APP_OkHttpPrint: connectStart:
713087: 04-24 17:07:51.929039 3489 16482 I XX_APP_OkHttpPrint: secureConnectStart:
719294: 04-24 17:07:51.967419 3489 16482 I XX_APP_OkHttpPrint: connectFailed:
719313: 04-24 17:07:51.967720 3489 16482 I XX_APP_OkHttpPrint: callFailed: https://xx.cn/api/search/haha/v1/rgeo?centerPoint=40.05607%2C116.26761

log.txt

Note:

Occasional, low-probability, cannot provide a reproducible program.

No Reproducible

@Alexmego Alexmego added the bug Bug in existing code label Apr 26, 2024
@yschimke
Copy link
Collaborator

There is not enough to go on. Try upgrading to 4.12 or even 5 alphas, and see if it still happens.

But closing as we can't help with this without a repro.

Maybe try a site like stack overflow.

@swankjesse
Copy link
Collaborator

Seeing secureConnectStart() but no secureConnectEnd() or connectFailed() tells us a lot about where it’s getting stuck. I wonder if the TLS handshake is broken in a way that’s causing them to not return.

Could you capture a thread dump of a process that has demonstrated this behavior? That’d be amazing to diagnose this further.

@Alexmego
Copy link
Author

Seeing secureConnectStart() but no secureConnectEnd() or connectFailed() tells us a lot about where it’s getting stuck. I wonder if the TLS handshake is broken in a way that’s causing them to not return.

Could you capture a thread dump of a process that has demonstrated this behavior? That’d be amazing to diagnose this further.

In a release environment, the probability is too low to effectively capture the thread state and actively attempt to reproduce it, but it will not occur.

However, in this case, I can be certain that the timeout is invalid, there are no callbacks, and no Exception information. If a network request is initiated again, everything is normal again.

In my app, there are over 40000 requests, and 10-20 of them are in this situation. Although the impact is not significant, QA has caught abnormal phenomena, and R&D needs to provide a reasonable explanation for this. Currently, I have not made effective progress.

@Endeavour233
Copy link
Contributor

Endeavour233 commented May 5, 2024

secureConnectStart

The reason why connectTimeout does not work is that the time consumed in TLS handshake is not counted into connection time in the current implementation. And because the connection is not successfully established, readTimeout and writeTimeout don't take effect. Can you try to set callTimeout? I think your eventlistener would observe a canceled event then(though there was still no any callback - because when it's timeout, we just cancel the call, which would not trigger a CancelException in the case when the timeout is caused by a never-return method).

// RealCall.kt
  private val timeout =
    object : AsyncTimeout() {
      override fun timedOut() {
        this@RealCall.cancel()
      }
    }.apply {
      timeout(client.callTimeoutMillis.toLong(), MILLISECONDS)
    }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bug in existing code
Projects
None yet
Development

No branches or pull requests

4 participants