Reminder
System Info
...
Reproduction
Problem description

When running batch prediction, how can I keep the extrainfo1 and extrainfo2 fields from the input data in the generated_predictions.jsonl file? I also noticed that the prompt field in the output does not contain the <image> token that is present in the input.

Command used

torchrun ${DISTRIBUTED_ARGS} src/train.py \
    --stage sft \
    --do_predict \
    --predict_with_generate \
    --use_fast_tokenizer \
    --flash_attn auto \
    --model_name_or_path ${MODEL_NAME_OR_PATH} \
    --eval_dataset ${eval_dataset} \
    --output_dir $OUTPUT_PATH \
    --template qwen2_vl \
    --finetuning_type full \
    --do_sample False \
    --max_new_tokens 4 \
    --repetition_penalty 1 \
    --length_penalty 1 \
    --num_beams 1 \
    --overwrite_cache \
    --overwrite_output_dir \
    --per_device_eval_batch_size 2 \
    --ddp_timeout 9000 \
    --logging_steps 1 \
    --cutoff_len 4096 \
    --bf16

Input data format

Each line of the eval dataset looks like:

{ "messages": [ {"content": "...", "role": "user"}, {"content": "...", "role": "assistant"} ], "images": [], "extrainfo1": "...", "extrainfo2": "..." }

Expected output

I would like generated_predictions.jsonl to keep the extrainfo1 and extrainfo2 fields, so each generated record would include:

{ "prompt": "...", "label": "...", "predict": "...", "extrainfo1": "...", "extrainfo2": "..." }
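As a stopgap, a record of that shape could be produced by post-processing. Below is a minimal sketch, assuming generated_predictions.jsonl preserves the row order of the eval dataset; merge_extra_fields and the file names are illustrative and not part of LLaMA-Factory:

import json

def merge_extra_fields(eval_path, pred_path, out_path, extra_keys=("extrainfo1", "extrainfo2")):
    # Join the eval dataset and the predictions line by line, assuming both
    # files contain the same number of rows in the same order.
    with open(eval_path, encoding="utf-8") as f_eval, \
         open(pred_path, encoding="utf-8") as f_pred, \
         open(out_path, "w", encoding="utf-8") as f_out:
        for eval_line, pred_line in zip(f_eval, f_pred):
            source = json.loads(eval_line)   # original row carrying extrainfo1/extrainfo2
            record = json.loads(pred_line)   # {"prompt": ..., "label": ..., "predict": ...}
            for key in extra_keys:
                record[key] = source.get(key)
            f_out.write(json.dumps(record, ensure_ascii=False) + "\n")

merge_extra_fields("eval_dataset.jsonl", "generated_predictions.jsonl", "merged_predictions.jsonl")

This only works if nothing reorders or drops rows between the input and the prediction file; having the fields carried through natively would still be preferable.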
Current behavior

generated_predictions.jsonl is missing the extrainfo1 and extrainfo2 fields, and the prompt field has lost the <image> token.

Related question

Should the <image> token appear in the prompt field?

Expected behavior
No response
Others
No response