Commit

Add docs for scale-to-zero
(cherry picked from commit 296d0b9)
deliahu committed Jul 6, 2021
1 parent 5f14c9c commit 7218f12
Showing 4 changed files with 6 additions and 5 deletions.
2 changes: 1 addition & 1 deletion docs/workloads/async/async.md
@@ -10,7 +10,7 @@ Async APIs are a good fit for users who want to submit longer workloads (such as
* retrieve status and response via HTTP endpoint
* autoscale based on queue length
* avoid cold starts
-* scale to 0
+* scale to zero
* perform rolling updates
* automatically recover from failures and spot instance termination

4 changes: 2 additions & 2 deletions docs/workloads/async/autoscaling.md
@@ -6,11 +6,11 @@ Cortex auto-scales AsyncAPIs on a per-API basis based on your configuration.

### Autoscaling configuration

-**`min_replicas`**: The lower bound on how many replicas can be running for an API.
+**`min_replicas`** (default: 1): The lower bound on how many replicas can be running for an API. Scale-to-zero is supported.

<br>

-**`max_replicas`**: The upper bound on how many replicas can be running for an API.
+**`max_replicas`** (default: 100): The upper bound on how many replicas can be running for an API.

<br>
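
The two bounds above can be pictured as a clamp on whatever replica count the autoscaler requests. The following is a minimal sketch under stated assumptions, not Cortex's implementation: `clamp_replicas` and its `desired` argument are hypothetical names, and only the defaults (1 and 100) come from the documentation text in this diff.

```python
def clamp_replicas(desired: int, min_replicas: int = 1, max_replicas: int = 100) -> int:
    """Clamp a requested replica count to [min_replicas, max_replicas].

    Hypothetical helper for illustration only; the defaults mirror the
    documented defaults for `min_replicas` and `max_replicas`.
    """
    if min_replicas < 0:
        raise ValueError("min_replicas must be non-negative")
    if max_replicas < min_replicas:
        raise ValueError("max_replicas must be >= min_replicas")
    # Raise the request to the lower bound, then cap it at the upper bound.
    return max(min_replicas, min(desired, max_replicas))
```

With the default `min_replicas` of 1, a request for zero replicas is raised to one; setting `min_replicas` to 0 is what would allow an idle API to scale to zero in this sketch.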

4 changes: 2 additions & 2 deletions docs/workloads/realtime/autoscaling.md
@@ -18,11 +18,11 @@ In addition to the autoscaling configuration options (described below), there ar

### Autoscaling configuration

-**`min_replicas`**: The lower bound on how many replicas can be running for an API.
+**`min_replicas`** (default: 1): The lower bound on how many replicas can be running for an API. Scale-to-zero is supported (experimental).

<br>

-**`max_replicas`**: The upper bound on how many replicas can be running for an API.
+**`max_replicas`** (default: 100): The upper bound on how many replicas can be running for an API.

<br>

1 change: 1 addition & 0 deletions docs/workloads/realtime/realtime.md
@@ -9,6 +9,7 @@ Realtime APIs are a good fit for users who want to run stateless containers as a
* respond to requests synchronously
* autoscale based on request volume
* avoid cold starts
+* scale to zero
* perform rolling updates
* automatically recover from failures and spot instance termination
* perform A/B tests and canary deployments