Some tools not well used with BREAK_ON_TOOLUSE=false #421

Open
ErikBjare opened this issue Jan 24, 2025 · 8 comments

@ErikBjare
Owner

ErikBjare commented Jan 24, 2025

I was testing deepseek/deepseek-reasoner with GPTME_BREAK_ON_TOOLUSE=false, and it didn't use the right tmux session. It proposed 4 commands in one go, but the later commands needed the output of the first one (the session id). Since that output wasn't available yet, it hallucinated a bad id, which got the wrong tmux session's contents.

This affects all non-streaming usage as well, since its behavior is equivalent to BREAK_ON_TOOLUSE=false.

This might be partly caused by the prompt telling the model that it will immediately get a response to each tool use, which isn't true when BREAK_ON_TOOLUSE=false or --no-stream is used.

Also just an idea: we could force the break on tool use in the non-streaming API by simply stripping everything after the first tool use and pretending we didn't get it, but that would waste a lot of tokens.
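
A minimal sketch of that stripping idea, assuming tool uses show up in the response as fenced code blocks with a language tag. The regex and the name `truncate_after_first_tooluse` are hypothetical and not taken from the gptme codebase:

```python
import re

# Built at runtime so this example contains no literal triple backticks.
FENCE = "`" * 3

# Assumption: a tool use is a fenced block opened with a language tag
# (e.g. "shell") and closed by a bare fence on its own line.
TOOLUSE_RE = re.compile(
    rf"^{FENCE}\w+[^\n]*\n.*?^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def truncate_after_first_tooluse(response: str) -> str:
    """Keep everything up to and including the first tool-use block."""
    match = TOOLUSE_RE.search(response)
    if match is None:
        # No tool use found, so there is nothing to strip.
        return response
    return response[: match.end()]
```

Everything after the first block is discarded, so the tokens spent generating it are still paid for, which is the cost mentioned above.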

@ErikBjare
Owner Author

@gptme what do you think? read the relevant code and give a thoughtful but concise reply, including possible prompt adjustments or other fixes

This comment has been minimized.

@ErikBjare
Owner Author

@gptme you need to read gptme/llm/__init__.py to find the relevant code; you can find the tmux stuff in gptme/tools/tmux.py

This comment has been minimized.

@ErikBjare
Owner Author

Deepseek R1 was also confused when it tried to apply 4 patches and the last one failed. It then retried several of the patches that had already succeeded, leading to duplication.
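
One possible mitigation, sketched under assumptions: run the tool uses in order and report an explicit status for each one, so the model can see which patches were already applied instead of guessing. `ToolResult` and `run_tooluses_in_order` are hypothetical names for illustration, not existing gptme APIs:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ToolResult:
    index: int
    status: str  # "ok", "failed", or "skipped"
    output: str


def run_tooluses_in_order(tooluses: List[Callable[[], str]]) -> List[ToolResult]:
    """Run tool uses sequentially; skip everything after the first failure."""
    results: List[ToolResult] = []
    failed = False
    for i, tooluse in enumerate(tooluses):
        if failed:
            # Later tool uses may depend on the failed one, so don't run them.
            results.append(
                ToolResult(i, "skipped", "not run: an earlier tool use failed")
            )
            continue
        try:
            results.append(ToolResult(i, "ok", tooluse()))
        except Exception as exc:
            failed = True
            results.append(ToolResult(i, "failed", str(exc)))
    return results
```

Feeding the per-tool-use statuses back in the next message would make it explicit that only the failed (and skipped) patches need a retry.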

@bjsi
Contributor

bjsi commented Jan 26, 2025

I'm not up to date on this, but maybe consider passing stop sequences? Like:

stop=['</tool-use>']

@ErikBjare
Owner Author

ErikBjare commented Jan 26, 2025

@bjsi The current stop/abort method works fine. This issue is about when we don't want to break on tool use and instead let the model make multiple tool uses per message.

But I guess the stop token might be a good addition for the non-streaming endpoint :)
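
A rough sketch of what the stop-sequence variant could look like for a non-streaming request. The `stop` parameter is a standard option of OpenAI-style chat completion APIs, but the tag names, `build_request`, and `restore_closing_tag` are assumptions made up for illustration:

```python
from typing import Any, Dict, List

# The closing tag comes from the suggestion above; the opening tag is an
# assumption about what a matching tool-use format would look like.
OPEN_TAG = "<tool-use>"
STOP_TAG = "</tool-use>"


def build_request(model: str, messages: List[Dict[str, str]]) -> Dict[str, Any]:
    """Build kwargs for a non-streaming chat completion that halts at the
    end of the first tool use."""
    return {
        "model": model,
        "messages": messages,
        "stream": False,
        "stop": [STOP_TAG],
    }


def restore_closing_tag(text: str) -> str:
    """APIs typically omit the stop sequence from the returned text, so
    re-append the closing tag when a tool use was left open."""
    if text.count(OPEN_TAG) > text.count(STOP_TAG):
        return text + STOP_TAG
    return text
```

With an OpenAI-compatible client, the kwargs could be passed along as `client.chat.completions.create(**build_request(model, messages))`. Unlike stripping the output afterwards, generation stops at the first tool use, so no tokens are wasted on the rest.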

@bjsi
Contributor

bjsi commented Jan 26, 2025 via email
