How does enable_cpu_offload work? #10703
CaledoniaProject asked this question in Q&A (answered by asomoza)
Does anyone know how enable_cpu_offload works? Specifically, what strategy does it follow for memory usage when it's enabled?
Answered by asomoza on Feb 3, 2025
Hi, it just offloads the models that aren't in use to RAM. It's a very basic memory optimization: only the model or models needed at the current inference step are kept in VRAM. There isn't a memory usage strategy; this happens regardless of whether you have 80GB or 16GB of VRAM.
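To make the behaviour described above concrete, here is a minimal toy sketch in plain Python. It is not the library's implementation: the `ToyModel` and `OffloadingPipeline` classes, the component names, and the sizes in GB are all hypothetical, chosen only to illustrate that whichever model is needed for the current step sits in VRAM while everything else is moved back to RAM, with no check of how much VRAM is actually free.

```python
from dataclasses import dataclass


@dataclass
class ToyModel:
    # Hypothetical stand-in for a pipeline component (not a real library class).
    name: str
    size_gb: float
    device: str = "cpu"  # everything starts in system RAM


class OffloadingPipeline:
    """Toy illustration of 'keep only the active model in VRAM'."""

    def __init__(self, models):
        self.models = {m.name: m for m in models}
        self.peak_vram_gb = 0.0

    def _activate(self, name):
        # Unconditional swap: no check of available VRAM is made, which
        # mirrors "this happens whether you have 80GB or 16GB".
        for m in self.models.values():
            m.device = "cuda" if m.name == name else "cpu"
        used = sum(m.size_gb for m in self.models.values() if m.device == "cuda")
        self.peak_vram_gb = max(self.peak_vram_gb, used)

    def run(self, steps):
        # Record which models are resident in VRAM at each step.
        trace = []
        for name in steps:
            self._activate(name)
            on_gpu = sorted(m.name for m in self.models.values() if m.device == "cuda")
            trace.append((name, on_gpu))
        return trace


# Made-up sizes, purely for illustration.
pipe = OffloadingPipeline([
    ToyModel("text_encoder", 1.0),
    ToyModel("unet", 5.0),
    ToyModel("vae", 0.5),
])
trace = pipe.run(["text_encoder", "unet", "unet", "vae"])
total_gb = sum(m.size_gb for m in pipe.models.values())
print(f"peak VRAM: {pipe.peak_vram_gb} GB of {total_gb} GB total")
```

In this sketch the peak VRAM use equals the size of the largest single component rather than the sum of all components, which is the whole benefit of the optimization. The cost, which the toy does not model, is the time spent copying weights between RAM and VRAM at each swap.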
Answer selected by CaledoniaProject