Add docs for scale-to-zero

(cherry picked from commit 296d0b9)
cortexlabs · Jul 6, 2021 · 7218f12
1 parent 5f14c9c

Showing 4 changed files with 6 additions and 5 deletions.
diff --git a/docs/workloads/async/async.md b/docs/workloads/async/async.md
@@ -10,7 +10,7 @@ Async APIs are a good fit for users who want to submit longer workloads (such as
 * retrieve status and response via HTTP endpoint
 * autoscale based on queue length
 * avoid cold starts
-* scale to 0
+* scale to zero
 * perform rolling updates
 * automatically recover from failures and spot instance termination
 

diff --git a/docs/workloads/async/autoscaling.md b/docs/workloads/async/autoscaling.md
@@ -6,11 +6,11 @@ Cortex auto-scales AsyncAPIs on a per-API basis based on your configuration.
 
 ### Autoscaling configuration
 
-**`min_replicas`**: The lower bound on how many replicas can be running for an API.
+**`min_replicas`** (default: 1): The lower bound on how many replicas can be running for an API. Scale-to-zero is supported.
 
 <br>
 
-**`max_replicas`**: The upper bound on how many replicas can be running for an API.
+**`max_replicas`** (default: 100): The upper bound on how many replicas can be running for an API.
 
 <br>
 

diff --git a/docs/workloads/realtime/autoscaling.md b/docs/workloads/realtime/autoscaling.md
@@ -18,11 +18,11 @@ In addition to the autoscaling configuration options (described below), there ar
 
 ### Autoscaling configuration
 
-**`min_replicas`**: The lower bound on how many replicas can be running for an API.
+**`min_replicas`** (default: 1): The lower bound on how many replicas can be running for an API. Scale-to-zero is supported (experimental).
 
 <br>
 
-**`max_replicas`**: The upper bound on how many replicas can be running for an API.
+**`max_replicas`** (default: 100): The upper bound on how many replicas can be running for an API.
 
 <br>
 

diff --git a/docs/workloads/realtime/realtime.md b/docs/workloads/realtime/realtime.md
@@ -9,6 +9,7 @@ Realtime APIs are a good fit for users who want to run stateless containers as a
 * respond to requests synchronously
 * autoscale based on request volume
 * avoid cold starts
+* scale to zero
 * perform rolling updates
 * automatically recover from failures and spot instance termination
 * perform A/B tests and canary deployments
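
To illustrate the `min_replicas` / `max_replicas` bounds documented in this commit, here is a minimal Python sketch of how a desired replica count might be clamped to those bounds, with `min_replicas=0` permitting scale-to-zero. The helper name `clamp_replicas` and its signature are hypothetical and this is not Cortex's actual autoscaler logic; only the default values (1 and 100) come from the docs changed above.

```python
def clamp_replicas(desired: int, min_replicas: int = 1, max_replicas: int = 100) -> int:
    """Clamp a desired replica count to the range [min_replicas, max_replicas].

    Hypothetical helper for illustration only; the defaults mirror the
    documented ones (min_replicas: 1, max_replicas: 100).
    """
    if min_replicas < 0:
        raise ValueError("min_replicas must be >= 0")
    if max_replicas < min_replicas:
        raise ValueError("max_replicas must be >= min_replicas")
    # Raise the count to the lower bound, then cap it at the upper bound.
    return max(min_replicas, min(desired, max_replicas))


# With the default lower bound, an idle API keeps one replica running.
idle_default = clamp_replicas(0)

# With min_replicas=0, the same idle API may scale to zero.
idle_scale_to_zero = clamp_replicas(0, min_replicas=0)
```

Under this sketch, setting the lower bound to 0 is the only change needed for an idle API to drop to no replicas; the upper bound behaves the same either way.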