Build your own plugin
Optillm supports a simple plugin system to extend the capabilities of the proxy. You can use it to run any code as part of the request/response cycle to an LLM.
To do so, create a Python file and put it in the /plugins folder.
The plugin needs only two things:
- A SLUG (a unique string that serves as the name of the plugin; this is what the user passes to invoke the plugin from the proxy)
- A run method (the method receives the initial query, the system prompt, an API client (optional), and a model (optional), and returns a tuple with the final response and the number of tokens used)
```python
from typing import Tuple

SLUG = "your_plugin_name"

def run(system_prompt: str, initial_query: str, client=None, model=None) -> Tuple[str, int]:
    # Implement your code here
    return final_response, tokens_used
```
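As a concrete illustration, here is a minimal self-contained plugin that does not call the model at all; it simply responds with statistics about the query. The SLUG `word_count` and its behavior are hypothetical, chosen for this sketch, not part of optillm:

```python
from typing import Tuple

# Hypothetical example plugin: responds with simple statistics about the
# query instead of calling the model. SLUG and behavior are illustrative only.
SLUG = "word_count"

def run(system_prompt: str, initial_query: str, client=None, model=None) -> Tuple[str, int]:
    words = len(initial_query.split())
    response = f"Your query contains {words} words."
    # This plugin performs no LLM call, so no completion tokens are consumed.
    return response, 0
```

Because the client and model arguments are optional, a plugin like this works even when no upstream LLM is configured.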
When the proxy starts, it loads all the files from the /plugins folder. It then routes requests based on the SLUG you set and calls the run method in that file. In the above example, running the proxy with your_plugin_name-gpt-4o-mini will route the request to the plugin's run method. There are several plugins that are already implemented as shown here. You can use them as reference.
Note
Plugins may do anything; they do not have to call an LLM or return its response. E.g. the readurls plugin fetches the content of all the URLs in the request and adds it to the context.
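A plugin along these lines would first need to find the URLs in the query before fetching them. A minimal sketch of that extraction step, assuming a naive regex; this is not the actual readurls implementation:

```python
import re
from typing import List

# Hypothetical helper: find URLs in the query text so their content can
# later be fetched and appended to the context. The pattern is naive and
# does not strip trailing punctuation; illustrative only.
URL_PATTERN = re.compile(r"https?://\S+")

def extract_urls(text: str) -> List[str]:
    return URL_PATTERN.findall(text)
```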
Plugins can also be chained together with the & or | operators. For instance, if you want to read the content of all URLs in a request and then process the request with memory (because the context may become too large to process directly), you can combine them as readurls&memory-gpt-4o-mini. The & operator runs the plugins one after the other in a pipeline, taking the output of the previous stage as the input to the next one.
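Conceptually, the & operator composes the plugins' run functions, feeding each stage's response text into the next stage as its query. A simplified sketch of that pipeline semantics, using hypothetical stand-in stages rather than optillm internals:

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins for two plugin run() functions.
def stage_upper(system_prompt, query, client=None, model=None) -> Tuple[str, int]:
    return query.upper(), 0

def stage_exclaim(system_prompt, query, client=None, model=None) -> Tuple[str, int]:
    return query + "!", 0

def run_pipeline(stages: List[Callable], system_prompt: str, query: str) -> Tuple[str, int]:
    # The '&' operator runs stages sequentially: each stage's response
    # becomes the input query of the next, and token counts accumulate.
    total_tokens = 0
    for stage in stages:
        query, tokens = stage(system_prompt, query)
        total_tokens += tokens
    return query, total_tokens
```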
On the other hand, the | operator runs both plugins/approaches in parallel and returns a response with multiple completions in a list. E.g. if you want to run the request with the rto approach and also with the executecode plugin, you can use rto|executecode-gpt-4o-mini. This will return a list with 2 completions, one for rto and another for executecode.
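In contrast to the pipeline, the | operator gives every approach the same input and collects all the results. A simplified sketch of that fan-out semantics, again with hypothetical stand-in functions rather than the real approaches:

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins for two approaches run in parallel.
def approach_a(system_prompt, query, client=None, model=None) -> Tuple[str, int]:
    return f"a:{query}", 0

def approach_b(system_prompt, query, client=None, model=None) -> Tuple[str, int]:
    return f"b:{query}", 0

def run_parallel(approaches: List[Callable], system_prompt: str, query: str) -> Tuple[List[str], int]:
    # The '|' operator runs each approach on the same input and returns
    # all completions as a list, one entry per approach.
    completions = []
    total_tokens = 0
    for approach in approaches:
        response, tokens = approach(system_prompt, query)
        completions.append(response)
        total_tokens += tokens
    return completions, total_tokens
```

The client then receives one completion per approach and can pick, compare, or merge them.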