How’s the cold start time of EC2 with instance store #9

Open · Weixuanf opened this issue Jul 17, 2024 · 4 comments

@Weixuanf

Because I want to scale down to 0 instances when there are no requests.

How long does it take to cold start an EC2 instance from 0? I think an instance-store EC2 instance is slower to boot than an EBS-backed one?

Also, downloading models from S3 to the instance store takes extra time. What does the download speed from S3 to the instance store look like?

Thanks for this amazing template!

@Shellmode (Contributor)

A cold start might take from several minutes up to about ten minutes. Here are the steps:

  1. Karpenter finds provisionable pod(s) and starts to spin up an EC2 instance (negligible time, less than 1s).
  2. The EC2 instance initializes and runs the user-data script defined in the Karpenter EC2NodeClass custom resource (which syncs all models from S3 to the instance store), then starts kubelet and the other components needed to get the node ready. (This may take a few minutes, depending on how much you need to sync from S3; see the sketch after this list.)
  3. The node pulls the image from ECR. (This may also take around 5 minutes, depending on how large the image is.)
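
For illustration only: the template does the step-2 sync in the node's user-data (typically with the AWS CLI), but a minimal Python/boto3 sketch of the same idea looks roughly like this. The bucket name, prefix, and instance-store mount point below are placeholders, not values from the template; the script also prints an approximate S3-to-instance-store throughput, which is what the question above asks about.

```python
# Rough sketch of the step-2 model sync, assuming boto3 is available.
# BUCKET, PREFIX, and DEST are placeholders, not values from the template.
import os
import time

import boto3

BUCKET = "my-comfyui-models"           # placeholder bucket name
PREFIX = "models/"                     # placeholder key prefix
DEST = "/mnt/instance-store/models"    # placeholder instance-store mount point


def sync_models(bucket: str, prefix: str, dest: str) -> None:
    s3 = boto3.client("s3")
    total_bytes = 0
    start = time.time()
    # Walk every object under the prefix and download it to the local path.
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):      # skip "directory" placeholder keys
                continue
            local_path = os.path.join(dest, os.path.relpath(key, prefix))
            os.makedirs(os.path.dirname(local_path), exist_ok=True)
            s3.download_file(bucket, key, local_path)
            total_bytes += obj["Size"]
    elapsed = max(time.time() - start, 1e-6)
    print(f"Synced {total_bytes / 1e9:.1f} GB in {elapsed:.0f}s "
          f"(~{total_bytes / 1e6 / elapsed:.0f} MB/s)")


if __name__ == "__main__":
    sync_models(BUCKET, PREFIX, DEST)
```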

Actually, all of these GPU instance types (like g4dn/g5/g6) come with an instance store, and boot time is the same whether you use it or not.

The solution uses the instance store to improve model loading and switching performance (inside ComfyUI).

It's a tradeoff: we spend more time setting up the environment in exchange for better runtime performance.

@Weixuanf (Author)

Thanks very much for your reply. I want to run serverless ComfyUI servers that scale down to 0 when there are no requests, to save cost. So cold start time is very important for me; I'm hoping for a < 5s cold start (excluding ComfyUI boot time itself). I'm thinking of EC2 + EBS, stopping and starting the EC2 instance to get better cold start times than an Auto Scaling group. If you have other suggestions, please let me know!

> Actually, all of these GPU instance types (like g4dn/g5/g6) come with an instance store, and boot time is the same whether you use it or not.

Oh, so even if I use EKS, the node will still have an instance store?
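
A hedged sketch of that stop/start idea, which is not part of this template: with an existing EBS-backed instance, boto3 can start it and wait for the running state, which also gives a quick way to measure how long that phase of the cold start takes. The instance ID below is a placeholder.

```python
# Minimal sketch: start a stopped, EBS-backed instance and time the wait.
# INSTANCE_ID is a placeholder; OS boot and ComfyUI startup still add time
# on top of what this measures.
import time

import boto3

INSTANCE_ID = "i-0123456789abcdef0"  # placeholder instance ID

ec2 = boto3.client("ec2")

t0 = time.time()
ec2.start_instances(InstanceIds=[INSTANCE_ID])

# Block until EC2 reports the instance as "running".
ec2.get_waiter("instance_running").wait(InstanceIds=[INSTANCE_ID])
print(f"Instance reached 'running' after {time.time() - t0:.1f}s")
```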

@Shellmode (Contributor)

Yes, g4dn, g5, and g6 all have an instance store (refer to Amazon EC2 instance store); you can use it or just ignore it (it comes at no extra cost).

EC2 with EBS will have a shorter boot time because there's no image pulling or model syncing, but you need to handle EC2 scale-in/out yourself. Besides that, loading models from EBS into GPU memory might take more time than loading from the instance store.
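
If you want to check that last point on your own instance, one rough way is to time raw reads of the same model file from an EBS path and from an instance-store path. The paths below are placeholders, and you'd want to drop the OS page cache between runs for a fair comparison.

```python
# Rough disk-read comparison: EBS path vs. instance-store path.
# Both paths are placeholders; copy the same model file to each volume first,
# and drop the page cache between runs (e.g. `echo 3 > /proc/sys/vm/drop_caches`).
import time

PATHS = {
    "ebs": "/data/models/model.safetensors",                           # placeholder
    "instance-store": "/mnt/instance-store/models/model.safetensors",  # placeholder
}

CHUNK = 64 * 1024 * 1024  # read in 64 MiB chunks

for name, path in PATHS.items():
    t0 = time.time()
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK):
            total += len(chunk)
    elapsed = max(time.time() - t0, 1e-6)
    print(f"{name}: read {total / 1e9:.1f} GB in {elapsed:.1f}s "
          f"(~{total / 1e6 / elapsed:.0f} MB/s)")
```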

@PeterTF656

> Thanks very much for your reply. I want to run serverless ComfyUI servers that scale down to 0 when there are no requests, to save cost. So cold start time is very important for me; I'm hoping for a < 5s cold start (excluding ComfyUI boot time itself). I'm thinking of EC2 + EBS, stopping and starting the EC2 instance to get better cold start times than an Auto Scaling group. If you have other suggestions, please let me know!
>
> > Actually, all of these GPU instance types (like g4dn/g5/g6) come with an instance store, and boot time is the same whether you use it or not.
>
> Oh, so even if I use EKS, the node will still have an instance store?

Have you thought about mounting EFS to your instances?
