Jailbreak Role Settings prompts began to fail after 2-3k tokens of messages. #3225
Henry-Suen
started this conversation in
LLM Usage | 语言模型研究
I've tried jailbreak prompts with GPT-4o and Claude 3.5 Sonnet.
They all work well for NSFW story writing over the first few dozen messages.
But they start to fail after 2-3k tokens of messages, which is far less than the context length these models support.
I've read that "The attention mechanism might focus more on recent or relevant information," so my guess is that the role settings get ignored once the conversation grows long.
So I'd like to propose a feature: "Resend the role settings prompts every N tokens". Before that, though, I'm open to any other solutions.
I've tried "Limit history message count", which works, but at the expense of cutting off the earlier context of the story.
Other combinations of chat settings don't work well.