Create a Semantic Search Engine with Amazon SageMaker and Amazon OpenSearch

This repository provides the infrastructure to create a semantic search engine using Amazon SageMaker and Amazon OpenSearch Service. The overall goal is to use Large Language Models (both pretrained and fine-tuned) to improve the relevancy and accuracy of search results on geo.ca.

Model Fine-Tuning

Semantic search engines go beyond keyword matching to interpret the intent and contextual meaning of search queries. Unlike traditional keyword search, semantic search can handle natural language queries and complex requests, recognizing synonyms and variations. We fine-tuned Sentence-Transformer models to improve relevancy for geospatial metadata search. Details on model fine-tuning are available in the repository semantic-search-model-evaluation.
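To illustrate the idea of matching by meaning rather than by keywords, here is a minimal sketch of ranking documents by cosine similarity between embedding vectors. The three-dimensional vectors, the document IDs, and the function names are hypothetical toy values for illustration only; in this project the vectors would come from a Sentence-Transformer model and have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hand-made toy embeddings (assumption: not produced by any real model).
docs = {
    "flood-extent-maps": [0.9, 0.1, 0.0],
    "road-network": [0.0, 0.2, 0.9],
}
# Stands in for an embedded query such as "areas at risk of flooding",
# which shares no keyword with the ID "flood-extent-maps".
query = [0.8, 0.3, 0.1]

ranking = rank_by_similarity(query, docs)
print(ranking[0][0])  # prints "flood-extent-maps"
```

The query ranks the flood dataset first because its vector points in a similar direction, even though the two share no keyword.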

Semantic Search Architecture

The Semantic Search Architecture is a serverless setup consisting of the following steps:

1. Fine-tuning the chosen sentence-transformer models on HPC.
2. Saving the fine-tuned models in an AWS S3 bucket.
3. Loading the model into SageMaker and hosting a SageMaker Endpoint that serves the fine-tuned model.
4. Creating a Vector Index in the Amazon OpenSearch domain, embedding the text dataset (in this case, geospatial metadata) into vectors using the fine-tuned models, and loading the vectors into the Vector Index using a Lambda function.
5. Creating a Lambda function that calls the SageMaker Endpoint to generate embeddings from user search queries, performs a K-Nearest Neighbors (KNN) search on the OpenSearch Vector Index, and sends the query results back to the API Gateway.
6. The API Gateway returns the search results to the frontend, which displays them to users.
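The index creation and query steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code deployed by this repository: the field names `vector` and `title`, the query parameter `q`, the `{"inputs": ...}` request payload, and the assumption that the endpoint returns a flat JSON list of floats are all hypothetical and depend on how the model container and index are actually configured. The SageMaker runtime client and the OpenSearch search call are passed in as arguments, so the sketch runs without AWS access.

```python
import json


def build_knn_index_body(dimension, vector_field="vector"):
    """Index settings and mapping for an OpenSearch k-NN vector index.

    `dimension` must equal the embedding size of the deployed model.
    """
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                vector_field: {"type": "knn_vector", "dimension": dimension},
                "title": {"type": "text"},  # hypothetical metadata field
            }
        },
    }


def build_knn_query(query_vector, k=10, vector_field="vector"):
    """Request body for a k-NN search returning the k nearest documents."""
    return {
        "size": k,
        "query": {"knn": {vector_field: {"vector": query_vector, "k": k}}},
    }


def embed_query(runtime_client, endpoint_name, text):
    """Call a SageMaker endpoint and return the embedding for `text`.

    `runtime_client` is expected to behave like
    boto3.client("sagemaker-runtime"). The payload and response formats
    are assumptions about the model container.
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"inputs": text}),
    )
    return json.loads(response["Body"].read())


def handle_search(event, embed, search, k=10):
    """Lambda-style handler: embed the query, run k-NN, return JSON.

    `embed` maps text to a vector; `search` takes a query body and
    returns a list of hits. Both are injected so they can be swapped
    for real clients or test doubles.
    """
    query_text = event["queryStringParameters"]["q"]  # hypothetical name
    vector = embed(query_text)
    hits = search(build_knn_query(vector, k=k))
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"results": hits}),
    }
```

In a deployed Lambda function, `embed` would typically wrap `embed_query` with a real SageMaker runtime client, and `search` would send the query body to the OpenSearch domain. Injecting both keeps the handler easy to test locally.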

CloudFormation Deployment

Details to be added.

Canadian-Geospatial-Platform/semantic-search-with-amazon-opensearch
