Best place to host Whisper #327
Replies: 6 comments 5 replies
-
Hello, I have implemented faster-whisper with the large-v2 model in a professional environment. It is hosted on old hardware with 8 GB RAM and a GTX 1060 GPU with 6 GB VRAM, running Ubuntu. This is an excellent price-performance ratio.
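For anyone wanting to reproduce this: a minimal sketch of what such a setup might look like with the faster-whisper Python API. The input file name and beam size are placeholders, and the int8 compute type is my assumption for fitting large-v2 into 6 GB of VRAM; the poster didn't share their exact configuration.

```python
from faster_whisper import WhisperModel

# int8 quantization is an assumption here: large-v2 at fp16 needs more
# than 6 GB of VRAM, so a GTX 1060 likely requires a quantized compute type.
model = WhisperModel("large-v2", device="cuda", compute_type="int8")

# "audio.wav" is a placeholder input file.
segments, info = model.transcribe("audio.wav", beam_size=5)
print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```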
-
AWS g4dn.xlarge EC2
-
Are there any updates on this topic? I'm interested in hosting either faster-whisper or whisper.cpp. As I understand it, whisper.cpp could be more cost-effective because it can run quickly on inexpensive VMs. However, faster-whisper is faster when used with a high-end GPU-based VM.
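To make the whisper.cpp side of that comparison concrete, here is a rough sketch of driving its CLI from Python on a CPU-only VM. It assumes you have already built whisper.cpp (`make` in the repo) and fetched a ggml model with the repo's download script; the binary path, model path, thread count, and input file are all placeholders.

```python
import subprocess

# Placeholder paths: adjust to wherever whisper.cpp was built and the
# ggml model was downloaded. whisper.cpp expects 16 kHz mono WAV input.
WHISPER_BIN = "./main"
MODEL_PATH = "models/ggml-large-v2.bin"

result = subprocess.run(
    [WHISPER_BIN, "-m", MODEL_PATH, "-f", "audio.wav", "-t", "4"],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout)  # transcript with timestamps, as printed by whisper.cpp
```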
-
Use an AWS g4dn.xlarge or g5.xlarge EC2 instance; both work great.
-
Have you tried an AWS Inf1 instance?
-
We use 10 AWS g4dn.xlarge EC2 instances that turn on and off depending on traffic levels.
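The poster didn't describe how the switching works, so this is only a hypothetical sketch of one way to do it with boto3: size the running pool from an SQS backlog and start or stop instances accordingly. The queue URL, instance IDs, and jobs-per-instance capacity are all made up.

```python
import boto3

# All of these values are hypothetical placeholders.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/transcribe-jobs"
INSTANCE_IDS = ["i-0123456789abcdef0", "i-0123456789abcdef1"]  # worker pool
JOBS_PER_INSTANCE = 20  # assumed throughput per g4dn.xlarge worker

sqs = boto3.client("sqs")
ec2 = boto3.client("ec2")

def scale_pool() -> None:
    """Start or stop workers based on the queue backlog (run periodically)."""
    attrs = sqs.get_queue_attributes(
        QueueUrl=QUEUE_URL, AttributeNames=["ApproximateNumberOfMessages"]
    )
    backlog = int(attrs["Attributes"]["ApproximateNumberOfMessages"])
    wanted = min(len(INSTANCE_IDS), -(-backlog // JOBS_PER_INSTANCE))  # ceil

    active, idle = INSTANCE_IDS[:wanted], INSTANCE_IDS[wanted:]
    if active:
        ec2.start_instances(InstanceIds=active)  # no-op if already running
    if idle:
        # A real setup would drain in-flight jobs before stopping workers.
        ec2.stop_instances(InstanceIds=idle)
```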
-
For a medical app, I'm looking for either a HIPAA-compliant Whisper API endpoint (I believe OpenAI's is not) or a way to self-host, either on our own GPU hardware or on a cloud instance from AWS or Azure. What is the most cost-effective way to host the large-v2 model?
Does anyone have performance metrics for different instances running whisper, faster-whisper, or whisper.cpp? For the large model, I would like to know whether running on a GPU is generally faster and more cost-effective (instance cost vs. runtime), and which type of cloud instance is recommended for this model. We want to be able to scale if needed without too much hassle.
Thanks so much.
Mark