I built an LLM inference topology with preprocessing, inference, and postprocessing nodes. On each iteration the inference node is supposed to output only the latest token_id to the postprocessing node, but sometimes the postprocessing node receives a large batch of token_ids at once, for example:
[8908, 8908, 234, 8908, 8908, 234, 114, 8908, 8908, 234, 8908, 8908, 234, 114, 103081, 8908, 8908, 234, 8908, 8908, 234, 114, 8908, 8908, 234, 8908, 8908, 234, 114, 103081, 99662, 8908, 8908, 234, 8908, 8908, 234, 114, 8908, 8908, 234, 8908, 8908, 234, 114, 103081, 8908, 8908, 234, 8908, 8908, 234, 114, 8908, 8908, 234, 8908, 8908, 234, 114, 103081, 99662, 99808, 8908, 8908, 234, 8908, 8908, 234, 114, 8908, 8908, 234, 8908, 8908, 234, 114, 103081, 8908, 8908, 234, 8908, 8908, 234, 114, 8908, 8908, 234, 8908, 8908, 234, 114, 103081, 99662, 8908, 8908, 234, 8908, 8908, 234, 114, 8908, 8908, 234, 8908, 8908, 234, 114, 103081, 8908, 8908, 234, 8908, 8908, 234, 114, 8908, 8908, 234, 8908, 8908, 234, 114, 103081, 99662, 99808, 99219, 9909]
When I request the inference node on its own, I never see a response like this. The behavior looks like memory duplication: the number of token_ids received by postprocessing roughly doubles on every model iteration, so it would eventually grow into a token_id buffer with billions of elements.
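For what it's worth, the payload above has an exact structure: each iteration's output is the previous output concatenated with itself, followed by one genuinely new token (S_n = S_{n-1} + S_{n-1} + [t_n], giving lengths 1, 3, 7, 15, ... = 2^n - 1). A minimal, framework-free Python sketch (all names hypothetical, just to illustrate the suspected duplication) reproduces this growth:

```python
def simulate(tokens, buggy):
    """Simulate what the postprocessing node receives on each iteration."""
    buffer, payloads = [], []
    for t in tokens:
        if buggy:
            # Suspected bug: the output buffer is concatenated with itself
            # (e.g. a shared/duplicated memory region that is never cleared)
            # before the new token is appended, so the payload roughly
            # doubles on every step: len = 2^n - 1.
            buffer = buffer + buffer + [t]
        else:
            # Intended behavior: send only the newest token_id.
            buffer = [t]
        payloads.append(list(buffer))
    return payloads

# First few "new" tokens taken from the payload above.
for step, p in enumerate(simulate([8908, 234, 114, 103081, 99662], buggy=True), 1):
    print(step, len(p), p)
# Lengths: 1, 3, 7, 15, 31 — and the step-3 payload
# [8908, 8908, 234, 8908, 8908, 234, 114] matches the prefix of the
# list received by the postprocessing node.
```

If this reading is right, it suggests the inference node's output tensor (or the buffer between the two nodes) is being appended to itself rather than reset between iterations.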