The generated results are different when using greedy search during generation #65

Open
FrostML opened this issue Mar 14, 2023 · 4 comments

Comments

FrostML commented Mar 14, 2023

Thank you very much for your work. I got a problem when I ran BLOOM-176B on 8*A100.

I followed the README.md and executed the following command. Specifically, I set do_sample = true and top_k = 1, which I thought was equivalent to greedy search:

python -m inference_server.cli --model_name bigscience/bloom --model_class AutoModelForCausalLM --dtype bf16 --deployment_framework hf_accelerate --generate_kwargs '{"min_length": 100, "max_new_tokens": 100, "do_sample": true, "top_k": 1}'

However, the generated outputs of several forward passes differed even with identical inputs. This happened occasionally.

Do you have any clues or ideas about this?

My env info:

CUDA 11.7
nccl 2.14.3

accelerate 0.17.1
Flask 2.2.3
Flask-API 3.0.post1
gunicorn 20.1.0
pydantic 1.10.6
huggingface-hub 0.13.2
@mayank31398
Collaborator

Hi, do_sample = true with top_k = 1 should be fine, but the correct way to do greedy search is simply do_sample = False.
This is weird. I don't think this is a bug in the code in this repository, but I will try to give it a shot.
Can you try with just do_sample = False?
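For reference, the greedy-search invocation would presumably look like this (same flags as the original command, with only the generate_kwargs changed; untested sketch):

```shell
python -m inference_server.cli --model_name bigscience/bloom --model_class AutoModelForCausalLM --dtype bf16 --deployment_framework hf_accelerate --generate_kwargs '{"min_length": 100, "max_new_tokens": 100, "do_sample": false}'
```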


FrostML commented Mar 20, 2023

Hi @mayank31398 Sorry for the late reply.
It worked with do_sample=False; the results were all identical.
But I still can't figure out why sampling doesn't work properly here. Do you know whom, or which repo, I could turn to for help?

@richarddwang

Refer to https://huggingface.co/blog/how-to-generate. Sampling is designed to introduce randomness into picking the next word.


FrostML commented Mar 22, 2023

But k is 1, so there shouldn't be any randomness. @richarddwang
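For what it's worth, here is a minimal sketch (plain Python, hypothetical toy logits, not the actual transformers implementation) of why top-k filtering with k = 1 followed by multinomial sampling should collapse to argmax. If the outputs still differ, a plausible suspect is the logits themselves changing between runs (e.g. ties, or nondeterministic bf16 reduction kernels across GPUs flipping the argmax) rather than the sampling step:

```python
import math
import random

def top_k_filter(logits, k):
    # Keep the k largest logits and mask the rest to -inf.
    threshold = sorted(logits, reverse=True)[k - 1]
    return [x if x >= threshold else float("-inf") for x in logits]

def sample(logits):
    # Softmax over the (filtered) logits, then one multinomial draw.
    m = max(logits)
    weights = [math.exp(x - m) for x in logits]  # exp(-inf) == 0.0
    return random.choices(range(len(weights)), weights=weights)[0]

# Toy logits (hypothetical values): index 1 is the unique argmax.
logits = [1.3, 4.2, 0.7, 3.9]
greedy = max(range(len(logits)), key=lambda i: logits[i])

# With k = 1 only the argmax survives the filter, so every draw is identical.
assert all(sample(top_k_filter(logits, 1)) == greedy for _ in range(1000))
```

So with a fixed logit vector, do_sample = true plus top_k = 1 is deterministic; nondeterminism has to enter upstream of the sampler.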


3 participants