Some tools not well used with BREAK_ON_TOOLUSE=false #421

ErikBjare · 2025-01-24T15:01:15Z

I was testing deepseek/deepseek-reasoner with GPTME_BREAK_ON_TOOLUSE=false and it didn't use the right tmux session because it proposed 4 commands in one go, but it needed the output of the first command (session id) to use the others, so it hallucinated a bad id which got the wrong tmux session contents.

Includes all non-streaming usage as well, since the behavior is equivalent to BREAK_ON_TOOLUSE=false.

This might be partly caused by prompting that it will immediately get responses to all tooluses, which isn't true when BREAK_ON_TOOLUSE=false or --no-stream.

Also just an idea: We could force the break on tooluse in the non-streaming API by simply stripping everything after and pretend we didn't get it, but it would waste a lot of tokens.

The text was updated successfully, but these errors were encountered:

ErikBjare · 2025-01-24T15:03:40Z

@gptme what do you think? read the relevant code and give a thoughtful but concise reply, including possible prompt adjustments or other fixes

ErikBjare · 2025-01-24T15:06:32Z

@gptme you need to read gptme/llm/init.py to find the relevant code, you can find the tmux stuff in gptme/tools/tmux.py

ErikBjare · 2025-01-24T17:00:30Z

Deepseek R1 was also confused when it tried to apply 4 patches but the last one failed, it then retried several of the patches, leading to duplication.

bjsi · 2025-01-26T14:33:45Z

I'm not up to date on this, but maybe consider passing stop sequences? Like:

stop=['</tool-use>']

ErikBjare · 2025-01-26T15:11:46Z

@bjsi The current stop/abort method works fine, this is for when we don't want to break on tool-use, letting it make multiple tooluses per message.

But I guess the stop token might be a good addition for the non-streaming endpoint :)

bjsi · 2025-01-26T16:11:52Z

Ohhh gotcha, makes sense

…

On Sun, 26 Jan 2025 at 15:12, Erik Bjäreholt ***@***.***> wrote: @bjsi <https://github.com/bjsi> The current stop/abort method works fine, this is for when we *don't* want to break on tool-use, letting it make multiple tooluses per message. — Reply to this email directly, view it on GitHub <#421 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AN3UCA6DTZKFMBUMV7LOKET2MT3MPAVCNFSM6AAAAABVZ3W366VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMMJUGQ3DGMZQGE> . You are receiving this because you were mentioned.Message ID: ***@***.***>

This comment has been minimized.

Sign in to view

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Some tools not well used with BREAK_ON_TOOLUSE=false #421

Some tools not well used with BREAK_ON_TOOLUSE=false #421

ErikBjare commented Jan 24, 2025 •

edited

Loading

ErikBjare commented Jan 24, 2025

This comment has been minimized.

ErikBjare commented Jan 24, 2025

This comment has been minimized.

ErikBjare commented Jan 24, 2025

bjsi commented Jan 26, 2025 •

edited

Loading

ErikBjare commented Jan 26, 2025 •

edited

Loading

bjsi commented Jan 26, 2025 via email

Some tools not well used with BREAK_ON_TOOLUSE=false #421

Some tools not well used with BREAK_ON_TOOLUSE=false #421

Comments

ErikBjare commented Jan 24, 2025 • edited Loading

ErikBjare commented Jan 24, 2025

This comment has been minimized.

ErikBjare commented Jan 24, 2025

This comment has been minimized.

ErikBjare commented Jan 24, 2025

bjsi commented Jan 26, 2025 • edited Loading

ErikBjare commented Jan 26, 2025 • edited Loading

bjsi commented Jan 26, 2025 via email

ErikBjare commented Jan 24, 2025 •

edited

Loading

bjsi commented Jan 26, 2025 •

edited

Loading

ErikBjare commented Jan 26, 2025 •

edited

Loading