Malformed JSON Error with codegemma-7b and Hugging Face TGI in VS Code #2380

Open

CMobley7 opened this issue Sep 24, 2024 · 4 comments

Labels: area:autocomplete, ide:vscode, kind:bug, needs-triage, priority:medium

@CMobley7
CMobley7 commented Sep 24, 2024

Before submitting your bug report

Relevant environment info

- OS: Mac 14.7
- Continue: v0.8.52
- IDE: VS Code 1.93.1
- Model: `codegemma-7b` (served via Hugging Face TGI on port 80)
- config.json:
  
{
  "models": [
    {
      "title": "CodeGemma Chat",
      "provider": "huggingface-tgi",
      "model": "codegemma-7b-it",
      "apiBase": "http://ip_address
    }
  ],
  "tabAutocompleteModel": {
    "title": "CodeGemma Code Completion",
    "provider": "huggingface-tgi",
    "model": "codegemma-7b",
    "apiBase": "http://ip_address"
  },
  "tabAutocompleteOptions": {
    "maxPromptTokens": 1000,
    "useCache": true,
    "multilineCompletions": "auto",
    "debounceDelay": 200
  },
  "allowAnonymousTelemetry": false,
  "customCommands": [
    {
      "name": "test",
      "prompt": "{{{ input }}}\n\nWrite a comprehensive set of unit tests for the selected code. It should setup, run tests that check for correctness including important edge cases, and teardown. Ensure that the tests are complete and sophisticated. Give the tests just as chat output, don't edit any file.",
      "description": "Write unit tests for highlighted code"
    }
  ],
  "contextProviders": [
    {
      "name": "code",
      "params": {}
    },
    {
      "name": "docs",
      "params": {}
    },
    {
      "name": "diff",
      "params": {}
    },
    {
      "name": "terminal",
      "params": {}
    },
    {
      "name": "problems",
      "params": {}
    },
    {
      "name": "folder",
      "params": {}
    },
    {
      "name": "codebase",
      "params": {}
    }
  ],
  "slashCommands": [
    {
      "name": "edit",
      "description": "Edit selected code"
    },
    {
      "name": "comment",
      "description": "Write comments for the selected code"
    },
    {
      "name": "share",
      "description": "Export the current chat session to markdown"
    },
    {
      "name": "cmd",
      "description": "Generate a shell command"
    },
    {
      "name": "commit",
      "description": "Generate a git commit message"
    }
  ]
}

Description

I am encountering a persistent error when using the codegemma-7b model for tab autocompletion with the Hugging Face TGI provider in VS Code:

Malformed JSON sent from server: {"error":"Input validation error: `stop` supports up to 4 stop sequences. Given: 16","error_type":"validation"}

The error indicates that the stop parameter exceeds the allowed number of stop sequences, despite various attempts to configure it correctly.

Additional context

  • Both my models (codegemma-7b for autocompletion and codegemma-7b-it for chat) are running through Hugging Face TGI on port 80.
  • The chat feature with codegemma-7b-it works without any issues.
  • The problem seems isolated to the tab autocompletion functionality with codegemma-7b.

Possible issue

While it's most likely user error, it appears that Continue might not be correctly handling the stop sequences defined in the codegemmaFimTemplate when the codegemma or gemma template is used. This could lead to Continue sending an incorrect number of stop sequences in the request to the TGI server, causing the "Malformed JSON" error.
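For illustration, TGI validates the stop parameter of its /generate endpoint against a configurable limit (4 by default), so a request shaped roughly like the one below, with more than four stop strings, is rejected with the validation error quoted above. The prompt and stop values here are placeholders, not the exact payload Continue sends:

{
  "inputs": "<|fim_prefix|>def add(a, b):<|fim_suffix|><|fim_middle|>",
  "parameters": {
    "max_new_tokens": 1024,
    "stop": [
      "<|fim_prefix|>",
      "<|fim_suffix|>",
      "<|fim_middle|>",
      "<|file_separator|>",
      "<end_of_turn>",
      "<eos>"
    ]
  }
}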

To reproduce

  1. Initial Configuration: Started with the following basic configuration in config.json:
"tabAutocompleteModel": {
    "title": "CodeGemma Code Completion",
    "provider": "huggingface-tgi",
    "model": "codegemma-7b",
    "apiBase": "http://ip_address" 
  }
  2. Gradual Adjustments: Made the following incremental changes to the configuration based on troubleshooting suggestions, documentation, and analysis of the codegemmaFimTemplate (a combined example is sketched after this list):

    • Added "template": "codegemma" (also tried "template": "gemma")
    • Set "maxStopWords": 4 (later changed to 2 and then 1)
    • Modified "completionOptions.stop" to various combinations, including:
      • ["<|fim_prefix|>", "<|fim_suffix|>", "<|fim_middle|>", "<|file_separator|>", "<end_of_turn>", "<eos>"]
      • ["<end_of_turn>", "<eos>"]
      • ["<eos>"]
  3. Error persists: The same "Malformed JSON" error occurred after each configuration change, with the number of given stop sequences varying (e.g., 15 in the initial error message).
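For reference, one combined form of the adjustments in step 2 looked roughly like this (apiBase redacted as above; illustrative only, since the error still occurred with it):

"tabAutocompleteModel": {
    "title": "CodeGemma Code Completion",
    "provider": "huggingface-tgi",
    "model": "codegemma-7b",
    "apiBase": "http://ip_address",
    "template": "codegemma",
    "maxStopWords": 4,
    "completionOptions": {
      "stop": ["<end_of_turn>", "<eos>"]
    }
  }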

Expected behavior

The tab autocompletion should function correctly without any "Malformed JSON" errors.

@sestinj sestinj self-assigned this Sep 24, 2024
@dosubot dosubot bot added the area:autocomplete, ide:vscode, kind:bug, and priority:medium labels Sep 24, 2024
@sestinj
Contributor

sestinj commented Sep 29, 2024

@CMobley7 thank you for the detailed write up! I just committed here (3d1f577) so that we will by default have a limit of 4 and here (c932454) so that you can further set maxStopWords yourself. This will be in the next pre-release!
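For anyone landing here later, the option goes on the model entry in config.json; a minimal illustrative snippet (values are placeholders):

"tabAutocompleteModel": {
    "title": "CodeGemma Code Completion",
    "provider": "huggingface-tgi",
    "model": "codegemma-7b",
    "apiBase": "http://ip_address",
    "maxStopWords": 4
  }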

@sestinj
Contributor

sestinj commented Sep 29, 2024

Also for reference I opened an issue with TGI so that this behavior can be improved out of the box: huggingface/text-generation-inference#2584

@CMobley7
Author

Hi @sestinj

Thank you for the quick fix on the Malformed JSON error! I really appreciate it.

I restarted my TGI server with --max-stop-sequences set to a high number and, thankfully, am no longer running into the Malformed JSON error. 👍
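For reference, the relaunch looked something like the following (illustrative; the exact invocation depends on how the server is deployed, and the limit value is arbitrary):

text-generation-launcher \
    --model-id google/codegemma-7b \
    --port 80 \
    --max-stop-sequences 20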

However, I'm now running into a different problem. While the chat functionality works perfectly with my chat model in continue_tutorial.py, the autocomplete feature only returns <eos>.

I've tried this on both the pre-release and regular release versions of Continue. I also restarted my TGI server with --max-stop-sequences set to various values between 20 and 100. While I no longer get the Malformed JSON error, the autocomplete still only generates <eos> when using continue_tutorial.py and my own python files.

Here are some of the things I've tried:

"tabAutocompleteModel": {
    "title": "CodeGemma Code Completion",
    "provider": "huggingface-tgi",
    "model": "codegemma-7b",
    "apiBase": "http://ip_address",
    "completionOptions": {
      "stop": [
        "<|fim_prefix|>",
        "<|fim_suffix|>",
        "<|fim_middle|>",
        "<|file_separator|>",
        "<end_of_turn>",
        "<eos>"
      ]
    }
  },
  "allowAnonymousTelemetry": false,
  "tabAutocompleteOptions": {
    "multilineCompletions": "never",
    "template": "You are a helpful assistant.<|fim_prefix|>{{{ prefix }}}<|fim_suffix|>{{{ suffix }}}<|fim_middle|>"
  },

This resulted in the same <eos> output. I then simplified the stop list to just "<end_of_turn>", "<eos>", then just "<eos>", as well as just "<|fim_prefix|>", "<|fim_suffix|>", "<|fim_middle|>", "<|file_separator|>", but the problem persisted.

Here's a snippet from the TGI server logs:

generate_stream{parameters=GenerateParameters { best_of: Some(1), temperature: Some(0.01), repetition_penalty: None, frequency_penalty: None, top_k: None, top_p: None, typical_p: None, do_sample: false, max_new_tokens: Some(2048), return_full_text: None, stop: ["<fim_prefix>", "<fim_suffix>", "<fim_middle>", "<file_sep>", "<|endoftext|>", "</fim_middle>", "</code>", "\n\n", "\r\n\r\n", "/src/", "#- coding: utf-8", "```", "\ndef", "\nclass", "\n\"\"\"#"], truncate: None, watermark: false, details: false, decoder_input_details: false, seed: None, top_n_tokens: None, grammar: None, adapter_id: None } total_time="29.455474ms" validation_time="748.47µs" queue_time="55.637µs" inference_time="28.651475ms" time_per_token="28.651475ms" seed="Some(18379601338758716305)"}

Could this be related to how Continue is handling the response from TGI, or perhaps something specific to the codegemma model when used for autocompletion?

Any further suggestions you have would be greatly appreciated!

@CMobley7
Author

CMobley7 commented Oct 1, 2024

@sestinj, I switched to using vLLM containers due to wider continue.dev community adoption than TGI. While I still ran into the issue mentioned in #2388, I was able to solve it by changing my provider from vllm to openai. I'm happy to help debug the <eos> issue a little more, or to close it; let me know what you'd prefer.
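For anyone else following this path, the working setup points Continue's openai provider at vLLM's OpenAI-compatible endpoint, roughly like this (values are placeholders):

"tabAutocompleteModel": {
    "title": "CodeGemma Code Completion",
    "provider": "openai",
    "model": "codegemma-7b",
    "apiBase": "http://ip_address/v1"
  }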

Thanks for all your hard work! You have a great product/project!

@RomneyDa RomneyDa added the needs-triage label Oct 31, 2024