Merge pull request #45 from huggingface/Michelle-endpoint-updates
Update autoscaling.mdx
Michellehbn authored Nov 27, 2023
2 parents 6a3dd65 + 82b95f4 commit c4daa76
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion docs/source/autoscaling.mdx
@@ -10,7 +10,7 @@ The autoscaling process is triggered based on the accelerator's utilization metr

- **GPU Accelerators**: A new replica is added when the average GPU utilization of all replicas over a 2-minute window reaches 80%.

- It's important to note that the scaling up process takes place every 3 minutes, while the scaling down process takes 5 minutes. This frequency ensures a balance between responsiveness and stability of the autoscaling system.
+ It's important to note that the scaling up process takes place every minute, while the scaling down process takes 2 minutes. This frequency ensures a balance between responsiveness and stability of the autoscaling system, with a stabilization of 300 seconds once scaled up or down.
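
As a reading aid for the updated wording, here is a minimal Python sketch of the GPU scale-up rule it describes. The 80% threshold, the 2-minute averaging window, and the 300-second stabilization come from the text; the function name `should_add_replica`, the sample format, and the reading of "stabilization" as "no new scaling action within 300 seconds of the previous one" are assumptions made for illustration, not the service's actual implementation.

```python
# Hypothetical sketch of the documented GPU scale-up rule.
# Constants mirror the numbers in the docs; everything else is assumed.
GPU_SCALE_UP_THRESHOLD = 0.80   # average GPU utilization that triggers a new replica
WINDOW_SECONDS = 120            # 2-minute averaging window
STABILIZATION_SECONDS = 300     # assumed: no scaling within 300 s of the last scale event


def should_add_replica(samples, now, last_scale_time=None):
    """Decide whether a replica should be added at time `now` (in seconds).

    `samples` is a list of (timestamp_seconds, utilization) pairs, where
    utilization is the average across all replicas, between 0.0 and 1.0.
    """
    # Assumed reading of "stabilization": hold off after a recent scale event.
    if last_scale_time is not None and now - last_scale_time < STABILIZATION_SECONDS:
        return False

    # Keep only the samples that fall inside the 2-minute window.
    window = [u for t, u in samples if now - WINDOW_SECONDS <= t <= now]
    if not window:
        return False

    return sum(window) / len(window) >= GPU_SCALE_UP_THRESHOLD
```

Under these assumptions, a check that runs once per minute would call `should_add_replica` with the latest samples and record the time of any scaling action, so that the next checks are skipped until the stabilization period has passed.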

## Considerations for Effective Autoscaling
