I have the following timings for a 30B LLaMA model:
```
llama_print_timings:        load time =  62270.78 ms
llama_print_timings:      sample time =    681.50 ms /   203 runs   (    3.36 ms per run)
llama_print_timings: prompt eval time =  60647.60 ms /   323 tokens (  187.76 ms per token)
llama_print_timings:        eval time =  46631.52 ms /   202 runs   (  230.85 ms per run)
llama_print_timings:       total time = 109586.98 ms
```
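For reference, the per-run and per-token averages follow directly from the reported totals. Here is a minimal sketch, using only the figures printed above, that re-derives them:

```python
# Re-derive the per-run / per-token averages from the reported totals
# (all numbers are taken from the llama_print_timings output above).
timings = {
    "sample":      (681.50, 203),    # total ms, runs
    "prompt eval": (60647.60, 323),  # total ms, tokens
    "eval":        (46631.52, 202),  # total ms, runs
}

for name, (total_ms, count) in timings.items():
    print(f"{name:>11}: {total_ms / count:7.2f} ms per token")
# sample: 3.36 ms, prompt eval: 187.76 ms, eval: 230.85 ms
```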
Although the time per token is reasonable, the load time is significant. Do you see similar timings? I ran this on an M1 Max with 64 GB of RAM.
I ask because I have not seen such a long load time mentioned in other discussions, and I am wondering whether I have missed something.
I am using commit d2beca9, if that helps, but this long load time has been present for a while.