Updated remaining references of pinecone to remote #22

Open · wants to merge 1 commit into base: pgvr/main
20 changes: 10 additions & 10 deletions README.md
@@ -22,7 +22,7 @@ with the power of remote vector databases, by introducing a new remote vector in

```sql
CREATE TABLE products (name text, embedding vector(1536), price float);
-CREATE INDEX my_remote_index ON products USING pinecone (embedding, price) with (host = 'my-pinecone-index.pinecone.io');
+CREATE INDEX my_remote_index ON products USING remote (embedding, price) with (host = 'my-pinecone-index.pinecone.io');
-- [insert, update, and delete billions of records in products]
SELECT * FROM products WHERE price < 40.0 ORDER BY embedding <-> '[...]' LIMIT 10; -- pinecone performs this query, including the price predicate

@@ -78,19 +78,19 @@ sudo apt-get install libcurl4-openssl-dev

Set the pinecone API key in the postgres configuration. For example,
```sql
-ALTER DATABASE mydb SET pinecone.api_key = 'xxxxxxxx-xxxx-xxxx-xxxx–xxxxxxxxxxxx';
+ALTER DATABASE mydb SET remote.pinecone_api_key = 'xxxxxxxx-xxxx-xxxx-xxxx–xxxxxxxxxxxx';
```

## Index Creation

There are two ways to specify the pinecone index:
- By providing the host of an existing pinecone index. For example,
```sql
-CREATE INDEX my_remote_index ON products USING pinecone (embedding) with (host = 'example-23kshha.svc.us-east-1-aws.pinecone.io');
+CREATE INDEX my_remote_index ON products USING remote (embedding) with (host = 'example-23kshha.svc.us-east-1-aws.pinecone.io');
```
- By specifying the `spec` of the pinecone index. For example,
```sql
-CREATE INDEX my_remote_index ON products USING pinecone (embedding) with (spec = '"spec": {
+CREATE INDEX my_remote_index ON products USING remote (embedding) with (spec = '{
"serverless": {
"region": "us-west-2",
"cloud": "aws"
@@ -103,16 +103,16 @@ All spec options can be found [here](https://docs.pinecone.io/reference/api/cont

- Place your pinecone index in the same region as your postgres instance to minimize latency.
- Make use of connection pooling to run queries in postgres concurrently. For example, use `asyncpg` in python.
-- Records are sent to the remote index in batches. Therefore pgvector-remote performs a local scan of the unflushed records before every query. To disable this set `pinecone.max_buffer_scan` to 0. For example,
+- Records are sent to the remote index in batches. Therefore pgvector-remote performs a local scan of the unflushed records before every query. To disable this set `remote.max_buffer_scan` to 0. For example,
```sql
-ALTER DATABASE mydb SET pinecone.max_buffer_scan = 0;
+ALTER DATABASE mydb SET remote.max_buffer_scan = 0;
```
-- You can adjust the number of vectors sent in each request and the number of concurrent requests per batch using `pinecone.vectors_per_request` and `pinecone.requests_per_batch` respectively. For example,
+- You can adjust the number of vectors sent in each request and the number of concurrent requests per batch using `remote.pinecone_vectors_per_request` and `remote.requests_per_batch` respectively. For example,
```sql
-ALTER DATABASE mydb SET pinecone.vectors_per_request = 100; --default
-ALTER DATABASE mydb SET pinecone.requests_per_batch = 40; --default
+ALTER DATABASE mydb SET remote.pinecone_vectors_per_request = 100; --default
+ALTER DATABASE mydb SET remote.requests_per_batch = 40; --default
```
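The two settings above bound how much data each flush can move. Assuming (from the parameter names; this is not verified against the source) that one batch issues up to `requests_per_batch` concurrent requests, each carrying up to `vectors_per_request` vectors, the defaults work out to:

```python
# Illustrative arithmetic only. Assumption: a batch is capped at
# requests_per_batch concurrent requests * vectors_per_request vectors each.
vectors_per_request = 100  # remote.pinecone_vectors_per_request default
requests_per_batch = 40    # remote.requests_per_batch default

max_vectors_per_batch = vectors_per_request * requests_per_batch
print(max_vectors_per_batch)  # 4000 vectors per flush, under this assumption
```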
-- You can control the number of results returned by pinecone using `pinecone.top_k`. Lowering this parameter can decrease latencies, but keep in mind that setting this too low could cause fewer results to be returned than expected.
+- You can control the number of results returned by pinecone using `remote.top_k`. Lowering this parameter can decrease latencies, but keep in mind that setting this too low could cause fewer results to be returned than expected.
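The connection-pooling tip above can be sketched with `asyncpg`. This is a hedged illustration, not part of pgvector-remote: the DSN and pool sizes are placeholders, and the `products`/`embedding` schema is borrowed from the README example.

```python
# Sketch: run several ANN queries concurrently through an asyncpg pool.
# Assumptions: placeholder DSN; the products(name, embedding) table from the
# README example above; pgvector's '<->' distance operator.
import asyncio

KNN_SQL = "SELECT name FROM products ORDER BY embedding <-> $1 LIMIT 10"

def to_pgvector_literal(vec):
    # pgvector accepts a '[x, y, ...]' text literal for vector parameters.
    return "[" + ", ".join(str(x) for x in vec) + "]"

async def knn_many(dsn, query_vectors):
    import asyncpg  # third-party driver; imported lazily so the sketch reads standalone
    pool = await asyncpg.create_pool(dsn, min_size=4, max_size=16)
    try:
        async def one(vec):
            async with pool.acquire() as conn:
                return await conn.fetch(KNN_SQL, to_pgvector_literal(vec))
        # gather() overlaps the queries across pooled connections
        return await asyncio.gather(*(one(v) for v in query_vectors))
    finally:
        await pool.close()

# usage (requires a running database):
# results = asyncio.run(knn_many("postgresql://localhost/mydb", vectors))
```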

## Docker

6 changes: 3 additions & 3 deletions test.ipynb
@@ -63,8 +63,8 @@
"source": [
"# set the pinecone api key\n",
"cur = conn.cursor()\n",
-"cur.execute(\"ALTER SYSTEM SET pinecone.api_key TO 'your-api-key-here'\")\n",
-"cur.execute(\"SHOW pinecone.api_key\")\n",
+"cur.execute(\"ALTER SYSTEM SET remote.pinecone_api_key TO 'your-api-key-here'\")\n",
+"cur.execute(\"SHOW remote.pinecone_api_key\")\n",
"print(cur.fetchall())"
]
},
@@ -78,7 +78,7 @@
"cur = conn.cursor()\n",
"import json\n",
"basic_spec = {'serverless': {'cloud': 'aws', 'region': 'us-west-2'}}\n",
-"cur.execute(\"CREATE INDEX test_index ON test USING pinecone (vec) with (spec = '%s')\" % json.dumps(basic_spec))\n"
+"cur.execute(\"CREATE INDEX test_index ON test USING remote (vec) with (spec = '%s')\" % json.dumps(basic_spec))\n"
]
},
{