Extend repository schema to include sentence embedder #15

Merged
merged 8 commits into main from extend-weaviate-schema
Aug 21, 2024

Conversation

mehmetcanay
Member

@mehmetcanay mehmetcanay commented Aug 13, 2024

Extended both SQL and Weaviate repository schemas to include sentence embedders.
Made necessary changes to functions utilizing these classes.
Added a new function allowing the user to filter based on the terminology and model.
Added new tests for each repository for the function retrieving all the stored sentence embedders.
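
As an illustration of the filtering described above, here is a minimal, dependency-free sketch. The `Mapping` record, its field names, and `filter_mappings` are hypothetical and are not taken from the repository classes in this pull request; the sketch only shows the idea of optional filters on terminology and model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mapping:
    # Hypothetical record; field names are illustrative, not the project's schema.
    text: str
    terminology: str
    sentence_embedder: str


def filter_mappings(
    mappings: List[Mapping],
    terminology: Optional[str] = None,
    sentence_embedder: Optional[str] = None,
) -> List[Mapping]:
    """Return the mappings that match every filter that is not None."""
    return [
        m
        for m in mappings
        if (terminology is None or m.terminology == terminology)
        and (sentence_embedder is None or m.sentence_embedder == sentence_embedder)
    ]
```

Passing neither filter returns every mapping, so the same function can serve both the filtered and the unfiltered lookup.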

@mehmetcanay mehmetcanay linked an issue Aug 13, 2024 that may be closed by this pull request
@mehmetcanay mehmetcanay added the enhancement New feature or request label Aug 13, 2024


 class MPNetAdapter(EmbeddingModel):
     def __init__(self, model="sentence-transformers/all-mpnet-base-v2"):
         logging.getLogger().setLevel(logging.INFO)
-        self.mpnet_model = SentenceTransformer(model)
+        self.model = SentenceTransformer(model)
Member

The model property is a str in GPT4 and an object in MPNet; that shouldn't be the case if both implement the same class.

Member Author

Addressed the issue by renaming the model attribute of GPT4Adapter to model_name.
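
One way to keep attribute types consistent across adapters is sketched below. It is not the code from this pull request: `FakeLocalAdapter`, `FakeRemoteAdapter`, `get_embedding`, and the remote model name are hypothetical stand-ins. The point is only that every subclass exposes `model_name` as a str, while an adapter that holds a loaded model object keeps it under a separate attribute.

```python
from abc import ABC, abstractmethod
from typing import List


class EmbeddingModel(ABC):
    """Base class: every adapter exposes the model identifier as a str."""

    def __init__(self, model_name: str) -> None:
        self.model_name = model_name

    @abstractmethod
    def get_embedding(self, text: str) -> List[float]:
        ...


class FakeLocalAdapter(EmbeddingModel):
    """Hypothetical adapter that also holds a loaded model object."""

    def __init__(self, model_name: str = "sentence-transformers/all-mpnet-base-v2") -> None:
        super().__init__(model_name)
        # A real adapter would load the model here; a lambda keeps the sketch
        # free of third-party dependencies.
        self.model = lambda text: [float(len(text))]

    def get_embedding(self, text: str) -> List[float]:
        return self.model(text)


class FakeRemoteAdapter(EmbeddingModel):
    """Hypothetical adapter that only needs the model name for an API call."""

    def __init__(self, model_name: str = "hypothetical-remote-model") -> None:
        super().__init__(model_name)

    def get_embedding(self, text: str) -> List[float]:
        # Placeholder for a remote call; returns a fixed-size dummy vector.
        return [0.0, 0.0, 0.0]
```

With this layout, code that stores the embedder in a repository can always read `adapter.model_name` without checking which adapter it received.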

@@ -18,6 +17,15 @@ def __init__(self, name: str, id: str) -> object:
        self.id = id


class SentenceEmbedder(Base):
Member

We don't need a separate table for a single string property.

Member

The embedding model can just be a single string property in the embedding class.

Member Author

Refactored the code accordingly.
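
The suggestion can be sketched as follows. This uses the standard-library `sqlite3` module instead of the project's ORM so that it runs without extra dependencies; the table name, column names, and helper functions are assumptions made for illustration. The embedder is stored as a plain text column on the embedding row, and the distinct values are read back with a single query.

```python
import json
import sqlite3
from typing import List


def create_schema(conn: sqlite3.Connection) -> None:
    # The embedder name lives inline on the embedding row; no separate table.
    conn.execute(
        """
        CREATE TABLE embedding (
            id INTEGER PRIMARY KEY,
            vector TEXT NOT NULL,
            sentence_embedder TEXT NOT NULL
        )
        """
    )


def add_embedding(conn: sqlite3.Connection, vector: List[float], sentence_embedder: str) -> None:
    # The vector is serialized as JSON text purely to keep the sketch simple.
    conn.execute(
        "INSERT INTO embedding (vector, sentence_embedder) VALUES (?, ?)",
        (json.dumps(vector), sentence_embedder),
    )


def get_all_sentence_embedders(conn: sqlite3.Connection) -> List[str]:
    # DISTINCT collapses repeated embedder names; ORDER BY gives stable output.
    rows = conn.execute(
        "SELECT DISTINCT sentence_embedder FROM embedding ORDER BY sentence_embedder"
    ).fetchall()
    return [row[0] for row in rows]
```

A separate table would only pay off if the embedder carried more attributes than its name.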

        for item in result['data']['Get']['Mapping']:
            sentence_embedders.add(item["hasSentenceEmbedder"])
    except Exception as e:
        raise RuntimeError(f"Failed to fetch terminologies: {e}")
Member

Why does the error message say "Failed to fetch terminologies" here?

Member Author

@mehmetcanay mehmetcanay Aug 21, 2024

A typo on my side, sorry. Fixed in the latest push.
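
For reference, a self-contained sketch of the lookup with an error message that matches what is being fetched. The exact wording used in the final commit is not shown in this conversation, so the message below is an assumption, and `run_query` is a hypothetical callable standing in for the client call that returns the GraphQL-style response.

```python
from typing import Any, Callable, Dict, List


def get_all_sentence_embedders(run_query: Callable[[], Dict[str, Any]]) -> List[str]:
    """Collect the distinct sentence embedders from a GraphQL-style response.

    run_query is a hypothetical stand-in for the real client call.
    """
    sentence_embedders = set()
    try:
        result = run_query()
        for item in result["data"]["Get"]["Mapping"]:
            sentence_embedders.add(item["hasSentenceEmbedder"])
    except Exception as e:
        # The message names what was actually being fetched.
        raise RuntimeError(f"Failed to fetch sentence embedders: {e}") from e
    return sorted(sentence_embedders)
```

Chaining with `from e` keeps the original traceback attached to the `RuntimeError`, which helps when the response shape is not what the loop expects.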

@tiadams tiadams merged commit fa768b9 into main Aug 21, 2024
4 checks passed
@tiadams tiadams deleted the extend-weaviate-schema branch August 21, 2024 13:07
Labels
enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

Extend Repository Schema to Store Embedding Model
2 participants