Calling /v1/chat/completions under JMeter load with 10 concurrent threads crashes xinference within 1 minute; xinference==0.11.3 #1811
Comments
When I use vLLM's /v1/chat/completions endpoint directly, it works fine, and it is also faster than xinference.
This is a bug in the xoscar library; the fix has been merged into version 0.3.2: xorbitsai/xoscar#87
After upgrading, the problem seems to persist, and the error message is basically the same. It looks like
Did you restart xinference after upgrading? Please paste the error log.
The version is
What about switching to the vLLM engine? Previously, after an InvalidStateError: invalid state, the whole API would go down and stop responding to any requests, even though the inference engine itself was still healthy.
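For context on that error: asyncio raises InvalidStateError when a result is set on a Future that is already done, for example one cancelled by a client disconnect during streaming. A minimal standalone illustration of the mechanism (not xinference's actual code):

```python
import asyncio

async def deliver_after_disconnect() -> str:
    # A streaming handler typically awaits a Future for the next chunk.
    # If the client disconnects, that Future can be cancelled...
    fut = asyncio.get_running_loop().create_future()
    fut.cancel()
    try:
        # ...while the producer side still tries to deliver a chunk.
        fut.set_result("chunk")
    except asyncio.InvalidStateError:
        return "InvalidStateError"
    return "ok"

print(asyncio.run(deliver_after_disconnect()))  # prints "InvalidStateError"
```

If such an exception propagates out of the serving loop unhandled, it can take the whole API endpoint down even though the underlying engine is fine, which matches the behavior described above.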
This issue is stale because it has been open for 7 days with no activity.
Using the vLLM engine, the same problem occurs. Is there any known solution?
+1
Upgrade to the latest version.
Still the same on the latest version. Under load, requests start failing at 7 concurrent threads, and at 16 concurrent threads every request fails.
Please paste the error log.
2024-09-25 02:58:02,170 xinference.api.restful_api 1 INFO Disconnected from client (via refresh/close) Address(host='127.0.0.1', port=38154) during chat.
Error for prompt with length 5520: Traceback (most recent call last):
Is there a solution for this?
Can you reproduce it with our benchmark as well?
Yes, this was with xinference's own benchmark/benchmark_serving.py, on 0.15.2.
Describe the bug
During load testing of xinference (V100, 2 GPUs, model qwen-14b-chat), we found that calling /v1/chat/completions with stream=True under JMeter at 10 concurrent threads crashes xinference within 1 minute. With stream=False it works fine.
Error log
requirements.txt