Batch inference issue and left padding #35
I was also considering performing left padding at training time. But in most cases we may not need batch inference, which requires no padding at all, and inference could be thrown off if the model were trained that way.
Hi @bshao001, the padding mask also works at inference time, so you can run inference with right padding.
Thanks for your quick response. I will give it a try once the model trained on a much larger dataset is ready.
Hi there,
Thanks for the project, first of all. I saw a method called get_padding_mask in the tf_utils.py file, which is combined with the attention mask. Is it designed to resolve the padding issue only during training, or for batch inference as well?
I see that with only a little left padding the model can still make good predictions/generations, but if I use a large batch for inference, which causes long left padding, the predictions become very incorrect. Do you have any suggestions for that?
Looking forward to your response. Thanks.
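For context, here is a minimal sketch of how a padding mask like the one discussed above typically works: padded key positions get a large negative bias added to the attention scores before softmax, so they receive (near-)zero attention weight regardless of whether padding is on the left or the right. The names `get_padding_mask` and `masked_attention_weights`, the `PAD_ID` value, and the shapes are illustrative assumptions, not the repo's actual API.

```python
import numpy as np

PAD_ID = 0  # assumed padding token id (illustrative)

def get_padding_mask(token_ids):
    """Return 1.0 where the token is padding, 0.0 elsewhere. Shape: (batch, seq_len)."""
    return (token_ids == PAD_ID).astype(np.float32)

def masked_attention_weights(scores, token_ids):
    """Add a large negative bias at padded key positions, then softmax over keys."""
    mask = get_padding_mask(token_ids)            # (batch, seq_len)
    scores = scores + mask[:, None, :] * -1e9     # broadcast over the query axis
    exp = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)

# Right-padded batch: two sequences, the second has two pad tokens at the end.
batch = np.array([[5, 6, 7, 8],
                  [9, 4, 0, 0]])
scores = np.zeros((2, 4, 4))                      # uniform raw attention scores
weights = masked_attention_weights(scores, batch)

print(weights[1, 0])  # the two padded keys get ~0 attention weight
```

Masking keeps padded positions out of the attention averages, which is why right padding with a mask is usually safe at inference. Long left padding can still hurt generation for a different reason: it shifts absolute position indices of the real tokens away from what the model saw during training.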