From 82b95f41ea7aaab65eabbb85ff87eca8f16bf692 Mon Sep 17 00:00:00 2001
From: Michelle Habonneau <83347449+Michellehbn@users.noreply.github.com>
Date: Mon, 27 Nov 2023 13:26:32 +0100
Subject: [PATCH] Update autoscaling.mdx

---
 docs/source/autoscaling.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/autoscaling.mdx b/docs/source/autoscaling.mdx
index dce38e6..23fd841 100644
--- a/docs/source/autoscaling.mdx
+++ b/docs/source/autoscaling.mdx
@@ -10,7 +10,7 @@ The autoscaling process is triggered based on the accelerator's utilization metr

 - **GPU Accelerators**: A new replica is added when the average GPU utilization of all replicas over a 2-minute window reaches 80%.

-It's important to note that the scaling up process takes place every 3 minutes, while the scaling down process takes 5 minutes. This frequency ensures a balance between responsiveness and stability of the autoscaling system.
+It's important to note that the scaling up process takes place every minute, while the scaling down process takes 2 minutes. This frequency ensures a balance between responsiveness and stability of the autoscaling system, with a stabilization of 300 seconds once scaled up or down.

 ## Considerations for Effective Autoscaling