inf1-sentence-transformers

Sentence Transformers on EC2 Inf1 and Amazon SageMaker

EC2 Inf1

trace-model.py takes a static batch size and the name of the model to build.
inference.py runs the traced model on a single core (run it 4 times to start 4 separate processes).
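
Because the model is traced with a static batch size, a traced Neuron model generally expects every call to supply inputs of the same shape it was traced with. The sketch below shows one way to split an arbitrary list of sentences into fixed-size batches and pad the last one. The helper name `make_static_batches` and the empty-string padding are assumptions for illustration, not code from this repo.

```python
from typing import List, Tuple


def make_static_batches(
    sentences: List[str], batch_size: int, pad: str = ""
) -> List[Tuple[List[str], int]]:
    """Split sentences into batches of exactly batch_size items.

    Returns (batch, n_real) pairs, where n_real is how many entries in the
    batch are real inputs; the rest are padding to drop from the output.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batches = []
    for start in range(0, len(sentences), batch_size):
        chunk = sentences[start:start + batch_size]
        n_real = len(chunk)
        # Pad the final, shorter chunk up to the static batch size.
        chunk = chunk + [pad] * (batch_size - n_real)
        batches.append((chunk, n_real))
    return batches


# Hypothetical input: five sentences, static batch size of 2.
batches = make_static_batches(["a", "b", "c", "d", "e"], batch_size=2)
for batch, n_real in batches:
    print(batch, n_real)
```

After running the model on a padded batch, keep only the first `n_real` embeddings so the padding entries never reach the caller.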

Amazon SageMaker

Build the traced models for the static batch size.
Put the models on EFS in SageMaker Studio (just upload the model files).
Update the batch size and deploy.

Usage

EC2 - Build

To run on EC2, after installing the relevant packages:

    python trace-model.py --batch_size 50

You can also change the model with the --model_id option.

EC2 - Inference

Modify the batch_size and run the following command four times. Each process runs on a separate Neuron core, so it has to be started in the background 4 times.

    NEURON_RT_NUM_CORES=1 python inference.py &

Use the Neuron Top (neuron-top) utility on EC2 Inf1 to watch the cores.
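
The four background launches described above can also be scripted. The following is a minimal sketch, assuming `inference.py` is in the current directory and that the Neuron runtime places each one-core process on a free core. The helper names `build_launch_plan` and `launch_workers` are hypothetical and not part of this repo.

```python
import os
import subprocess
import sys
from typing import Dict, List, Tuple


def build_launch_plan(
    num_procs: int = 4, script: str = "inference.py"
) -> List[Tuple[List[str], Dict[str, str]]]:
    """Return one (command, environment) pair per worker process."""
    plan = []
    for _ in range(num_procs):
        env = dict(os.environ)
        # Same setting as the shell command: each process asks for one core.
        env["NEURON_RT_NUM_CORES"] = "1"
        plan.append(([sys.executable, script], env))
    return plan


def launch_workers(plan: List[Tuple[List[str], Dict[str, str]]]) -> List[int]:
    """Start every worker without blocking, then wait for all of them."""
    procs = [subprocess.Popen(cmd, env=env) for cmd, env in plan]
    return [p.wait() for p in procs]


plan = build_launch_plan()
print(len(plan), "workers planned")
```

On an Inf1 instance, calling `launch_workers(plan)` would start the processes and return their exit codes once all of them finish.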

SageMaker Inference

Upload the model folders and the notebook, then run the notebook in SageMaker.

Neuron Top output

Four copies of the same model loaded onto four different cores of an inf1.xlarge instance on EC2.

DarkSector/inf1-sentence-transformers