Assertion Error when running the server and client locally #69

Open
AaronDinesh opened this issue Aug 6, 2024 · 8 comments

@AaronDinesh

Hello! I am currently using vectordb in a personal project and wanted to get the server and client running. I started both the server and the client as described in the README. The code I use is below:

# server.py
import numpy as np
from docarray import BaseDoc, DocList
from docarray.typing import NdArray
from vectordb import InMemoryExactNNVectorDB

class ToyDoc(BaseDoc):
  text: str = ''
  embedding: NdArray[128]

# Specify your workspace path
db = InMemoryExactNNVectorDB[ToyDoc](workspace='./workspace_path')

# Index a list of documents with random embeddings
doc_list = [ToyDoc(text=f'toy doc {i}', embedding=np.random.rand(128)) for i in range(1000)]
db.index(inputs=DocList[ToyDoc](doc_list))

with db.serve(protocol='grpc', port=12345, replicas=1, shards=1) as service:
    service.block()

# client.py
import numpy as np
from docarray import BaseDoc, DocList
from docarray.typing import NdArray
from vectordb import Client

class ToyDoc(BaseDoc):
  text: str = ''
  embedding: NdArray[128]

# Instantiate a client connected to the server. In practice, replace 0.0.0.0 with the server IP address.
client = Client[ToyDoc](address='grpc://0.0.0.0:12345')

# Build a query document. The original post did not show how `query` was defined;
# this definition is an assumption.
query = ToyDoc(text='query', embedding=np.random.rand(128))

# Perform a search query
results = client.search(inputs=DocList[ToyDoc]([query]), limit=10)

However, when I run the server and the client, I get an AssertionError from the server:

# Last line of server output

           assert len(docs) == len(matched_documents) == len(matched_scores)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       AssertionError

Here is the full error from the server:

ERROR  indexer/rep-0@28106 AssertionError()                                                                                              [08/06/24 13:52:39]
        add "--quiet-error" to suppress the exception details
       Traceback (most recent call last):
         File "/home/ubuntu/miniconda3/envs/woc/lib/python3.11/site-packages/jina/serve/runtimes/worker/request_handling.py", line 1106,
       in process_data
           result = await self.handle(
                    ^^^^^^^^^^^^^^^^^^
         File "/home/ubuntu/miniconda3/envs/woc/lib/python3.11/site-packages/jina/serve/runtimes/worker/request_handling.py", line 720,
       in handle
           return_data = await self._executor.__acall__(
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
         File "/home/ubuntu/miniconda3/envs/woc/lib/python3.11/site-packages/jina/serve/executors/__init__.py", line 749, in __acall__
           return await self.__acall_endpoint__(req_endpoint, **kwargs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
         File "/home/ubuntu/miniconda3/envs/woc/lib/python3.11/site-packages/jina/serve/executors/__init__.py", line 881, in
       __acall_endpoint__
           return await exec_func(
                  ^^^^^^^^^^^^^^^^
         File "/home/ubuntu/miniconda3/envs/woc/lib/python3.11/site-packages/jina/serve/executors/__init__.py", line 839, in exec_func
           return await get_or_reuse_loop().run_in_executor(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
         File "/home/ubuntu/miniconda3/envs/woc/lib/python3.11/concurrent/futures/thread.py", line 58, in run
           result = self.fn(*self.args, **self.kwargs)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
         File "/home/ubuntu/miniconda3/envs/woc/lib/python3.11/site-packages/jina/serve/executors/decorators.py", line 325, in
       arg_wrapper
           return fn(executor_instance, *args, **kwargs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
         File "/home/ubuntu/miniconda3/envs/woc/lib/python3.11/site-packages/vectordb/db/executors/inmemory_exact_indexer.py", line 54,
       in search
           return self._search(docs, *args, **kwargs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
         File "/home/ubuntu/miniconda3/envs/woc/lib/python3.11/site-packages/vectordb/db/executors/inmemory_exact_indexer.py", line 42,
       in _search
           assert len(docs) == len(matched_documents) == len(matched_scores)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       AssertionError

Is there something I am missing? Or is this a known issue?
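
For readers unfamiliar with the failing line: the assert is a chained comparison, so it passes only when all three lengths are equal, that is, when the search produced exactly one list of matches and one list of scores per query document. The traceback does not say which length was off. Below is a minimal sketch of a more informative check. It is plain Python, independent of vectordb, and the function name check_search_lengths is hypothetical.

```python
def check_search_lengths(docs, matched_documents, matched_scores):
    """Raise a descriptive error if the three sequences differ in length.

    Same condition as the chained assert in the traceback, but the message
    reports each length so the mismatch is visible.
    """
    lengths = {
        "docs": len(docs),
        "matched_documents": len(matched_documents),
        "matched_scores": len(matched_scores),
    }
    if len(set(lengths.values())) != 1:
        raise ValueError(f"length mismatch in search results: {lengths}")
    return lengths["docs"]


# Illustrative case only: one query document, but no match lists came back.
try:
    check_search_lengths(docs=["query"], matched_documents=[], matched_scores=[])
except ValueError as err:
    print(err)
```

A check like this would only help with diagnosis; it does not change why the lengths differ.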

@JoanFM
Member

JoanFM commented Aug 6, 2024

Is there anything indexed?

@AaronDinesh
Author

AaronDinesh commented Aug 10, 2024

> Is there anything indexed?

Things should be indexed. From my understanding, that is what db.index does in server.py?

Is there any documentation for this library outside of the README in the GitHub repo?

@JoanFM
Member

JoanFM commented Aug 10, 2024

If you did not index any documents, there will be none.

There is no other documentation except for this repo.

@AaronDinesh
Author

> If you did not index any documents, there will be none.
>
> There is no other documentation except for this repo.

But I did index some documents. In the server.py file I have this line

db.index(inputs=DocList[ToyDoc](doc_list))

To the best of my knowledge, this is the indexing operation?

@JoanFM
Member

JoanFM commented Aug 10, 2024

OK, I missed that. I will check. Can you share the list of versions of the vectordb, jina and docarray dependencies?

@AaronDinesh
Author

vectordb==0.0.21
  docarray==0.40.0
    numpy==1.26.1
    orjson==3.10.6
    pydantic==1.10.17
    rich==13.7.1
    types-requests==2.31.0.6
    typing-inspect==0.9.0
  jina==3.27.2
    aiofiles==24.1.0
    aiohttp==3.10.1
    docarray==0.40.0
    docker==7.1.0
    fastapi==0.112.0
    filelock==3.15.4
    grpcio==1.57.0
    grpcio-health-checking==1.57.0
    grpcio-reflection==1.57.0
    jcloud==0.3
    jina-hubble-sdk==0.39.0
    numpy==1.26.1
    opentelemetry-api==1.19.0
    opentelemetry-exporter-otlp==1.19.0
    opentelemetry-exporter-otlp-proto-grpc==1.19.0
    opentelemetry-exporter-prometheus==0.41b0
    opentelemetry-instrumentation-aiohttp-client==0.40b0
    opentelemetry-instrumentation-fastapi==0.40b0
    opentelemetry-instrumentation-grpc==0.40b0
    opentelemetry-sdk==1.19.0
    packaging==24.1
    pathspec==0.12.1
    prometheus_client==0.20.0
    protobuf==4.25.4
    pydantic==1.10.17
    python-multipart==0.0.9
    PyYAML==6.0.1
    requests==2.32.3
    urllib3==1.26.19
    uvicorn==0.23.1
    uvloop==0.19.0
    websockets==12.0

Above are all the dependencies for vectordb, jina and docarray. The main versions are vectordb==0.0.21, docarray==0.40.0 and jina==3.27.2.
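
For anyone reproducing this, the three main versions can also be read from within the environment itself. The sketch below uses only the standard library's importlib.metadata; the helper name installed_versions is hypothetical, and it assumes the distributions are installed under the names vectordb, docarray and jina.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Print one pip-style line per package, or a note if it is not installed.
for name, ver in installed_versions(["vectordb", "docarray", "jina"]).items():
    print(f"{name}=={ver}" if ver else f"{name} is not installed")
```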

@JoanFM
Member

JoanFM commented Aug 19, 2024

What works is to index after the server has started, so move the indexing call inside the with block:

#db.index(inputs=DocList[ToyDoc](doc_list))

with db.serve(protocol='grpc', port=12345, replicas=1, shards=1) as service:
    service.index(inputs=DocList[ToyDoc](doc_list))  
    service.block()

@JoanFM
Member

JoanFM commented Aug 19, 2024

I see it may not be so clear in the docs, but the DB behaves slightly differently when it is started as a service than when it is used as a plain Python object, so you have to stick to one mode: if you search through the service, index through the service as well.
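
The difference between the two modes can be pictured with a toy model. This is not vectordb's implementation: the classes ToyDB and ToyService are hypothetical, and the sketch assumes, as the thread suggests but does not confirm, that the served instance does not see documents indexed on the local Python object.

```python
class ToyService:
    """Stand-in for a served DB; it owns its own, initially empty, index."""

    def __init__(self):
        self._index = []

    def index(self, inputs):
        self._index.extend(inputs)

    def num_docs(self):
        return len(self._index)


class ToyDB:
    """Stand-in for the DB used as a plain Python object."""

    def __init__(self):
        self._index = []

    def index(self, inputs):
        self._index.extend(inputs)

    def num_docs(self):
        return len(self._index)

    def serve(self):
        # ASSUMPTION: the served instance starts from its own state and does
        # not inherit documents indexed on this local object.
        return ToyService()


docs = [f"toy doc {i}" for i in range(1000)]

# Mixed usage: index locally, then serve. The service has nothing to search.
db = ToyDB()
db.index(docs)
mixed = db.serve()
print(db.num_docs(), mixed.num_docs())  # prints: 1000 0

# Consistent usage: index through the service, as in the fix above.
service = ToyDB().serve()
service.index(docs)
print(service.num_docs())  # prints: 1000
```

Under that assumption, the original server.py corresponds to the mixed case, and moving the index call inside the with block corresponds to the consistent one.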
