
Dockerfile POC #55

Closed · wants to merge 1 commit

Conversation

@jhamon (Collaborator) commented Oct 2, 2023

I was playing around with Docker this weekend to refresh my memory on some things I used to know, and realized it was fairly simple to build a container that runs the resin server. It's up to you whether to merge this, but I thought it might be educational for anyone who hasn't worked much with Docker to see some of the commands in action.

Problem

We need a Docker image of this project so that people can run and deploy it more easily.

Solution

Implement a basic Dockerfile. Build the image and run a container locally to verify it works.
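
For reference, a minimal Dockerfile along these lines might look something like the sketch below. The base image, dependency file, and start command here are illustrative assumptions, not necessarily what this PR actually contains.

# Illustrative sketch only — base image, dependency file, and start command
# are assumptions, not this PR's actual Dockerfile.
FROM python:3.10-slim
WORKDIR /app
# Copy and install dependencies first so Docker can cache this layer between builds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy the rest of the project source
COPY . .
# The server listens on port 8000 (see the run output below)
EXPOSE 8000
# Hypothetical start command for the resin server
CMD ["resin", "start"]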

How to use it

# Build the image from the Dockerfile in the current directory (.)
$ docker build -t resin-proxy .

# Run a container using the resin-proxy image
#    -it keeps STDIN open and allocates a TTY so we can watch output interactively and stop the server with CTRL+C
#    --rm tells Docker to delete the container after it stops
#    --env-file tells it where to find environment variables (e.g. OPENAI_API_KEY)
#    -p publishes container port 8000 on host port 8000 (host:container)
$ docker run -it --rm --env-file ./.env -p 8000:8000 resin-proxy
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
WARNING:root:Failed to import splade encoder. If you want to use splade, install the splade extra dependencies by running: `pip install pinecone-text[splade]`
Starting Resin service on 0.0.0.0:8000
INFO:     Started server process [1]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
# Now, in another terminal window, try hitting the resin endpoint.
$ curl -X POST http://localhost:8000/context/chat/completions -H 'Content-Type: application/json' -d '{"stream": false, "messages": [{ "role": "user", "content": "What do you know about Pinecone?" }]}' | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1431  100  1333  100    98    252     18  0:00:05  0:00:05 --:--:--   328
{
  "id": "544076fa-83a2-4b18-bbc1-c9edca2d82f6",
  "object": "chat.completion",
  "created": 1696216058,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Pinecone is a vector database designed for storing and querying high-dimensional vectors. It provides fast and efficient semantic search over these vector embeddings. By integrating OpenAI's LLMs (large language models) with Pinecone, it combines deep learning capabilities for embedding generation with efficient vector storage and retrieval. Pinecone surpasses traditional keyword-based search by offering contextually-aware and precise results. It is ideal for various use cases such as semantic text search, question-answering, visual search, recommendation systems, and more. Pinecone can handle very large scales of hundreds of millions and even billions of vector embeddings. It is a fully managed, cloud-native vector database with a simple API and no infrastructure hassles. It offers benefits like ultra-low query latency, live index updates, and the ability to combine vector search with metadata filtering or keyword search for more relevant results. Pinecone is SOC 2 Type II compliant and GDPR-ready. Source: Pinecone documentation"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 2393,
    "completion_tokens": 192,
    "total_tokens": 2585
  }
}

@jhamon jhamon marked this pull request as ready for review October 2, 2023 03:22
@jhamon (Collaborator, Author) commented Oct 2, 2023

This POC actually surfaces a security problem, IMO. Callers to the proxied OpenAI endpoint should still have to pass a -H "Authorization: Bearer $OPENAI_API_KEY" request header; otherwise you're just opening up a free-for-all OpenAI endpoint.
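
For illustration, a call to the proxied endpoint would then look something like this (a sketch of the suggested check — the server doesn't enforce this header today):

$ curl -X POST http://localhost:8000/context/chat/completions \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{"stream": false, "messages": [{ "role": "user", "content": "..." }]}'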

@miararoy (Contributor) left a comment

@jhamon Thanks for this 🙏

@miararoy (Contributor) commented Oct 2, 2023

@jhamon
Re: security.

This is a design choice (namely, not to authenticate on the server), not a bug: in this project, authentication is the user's responsibility. You're right that if someone runs this on a public server, anyone could use the LLM underneath. But any real deployment will probably use resin as an internal API rather than exposing it directly, and will rely on cloud-level security (IP whitelists and instance profiles).
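
For illustration (a sketch, not something this PR implements): one way to avoid exposing the server directly is to publish the port only on the host's loopback interface, so nothing outside the host can reach it.

# Illustrative only: bind the published port to 127.0.0.1 so the server is
# reachable from the host machine but not from the network
$ docker run -it --rm --env-file ./.env -p 127.0.0.1:8000:8000 resin-proxy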

@igiloh-pinecone (Contributor) commented

@jhamon thank you very much for your contribution!!! This is definitely a step in the direction we're planning to take.

We actually have a very similar (but slightly more elaborate) Dockerfile template that we're currently using for Pinecone's support bot. We are planning to port it to this project as well.
I'm closing this PR for now, but we'll definitely come back to this feature very shortly.

@igiloh-pinecone igiloh-pinecone deleted the jhamon/docker-poc branch November 6, 2023 16:17