
Copy does not create a new chunk #19

Open
ggerganov opened this issue Jan 31, 2025 · 4 comments
Labels
bug Something isn't working

Comments

@ggerganov
Member

When a block is copied with Ctrl+C or Cmd+C, the copied text should be added as a chunk in the extra context. Currently, this does not seem to happen. To reproduce: copy a chunk of text and wait to see a request to the server; no such request is made.

Version: 0.0.6
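For reference, the expected flow is roughly the following. All names below are illustrative placeholders, not the actual llama.vscode implementation:

```typescript
// Rough sketch of the expected copy-to-chunk flow; `ExtraContext` and
// `onCopy` are hypothetical names, not the real llama.vscode code.
interface Chunk { text: string; }

class ExtraContext {
  readonly chunks: Chunk[] = [];

  // Expected to run whenever a block is copied with Ctrl+C / Cmd+C:
  // store the text as a new chunk and signal that a server request
  // (e.g. to /infill) should follow.
  onCopy(text: string): boolean {
    this.chunks.push({ text });
    return true;
  }
}

const ctx = new ExtraContext();
const requested = ctx.onCopy("const x = 42;");
```

The bug is that the second half of this flow (the server request) is never observed.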

@ggerganov ggerganov added the bug Something isn't working label Jan 31, 2025
@igardev
Collaborator

igardev commented Jan 31, 2025

The chunk is added to the extra context and is visible if you retrieve it with Ctrl+Shift+,. However, it seems there is a problem with the request. I will look into it.

@igardev
Collaborator

igardev commented Feb 1, 2025

I've noticed that sometimes the llama.cpp server enters a specific state and becomes unresponsive (not yet sure when and how). While in this state, the log is:
srv update_slots: all slots are idle
request: POST /infill 172.22.224.1 200
slot launch_slot_: id 0 | task 343 | processing task
slot update_slots: id 0 | task 343 | new prompt, n_ctx_slot = 32768, n_keep = 0, n_prompt_tokens = 32186
slot update_slots: id 0 | task 343 | kv cache rm [0, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 1024, n_tokens = 1024, progress = 0.031815
slot update_slots: id 0 | task 343 | kv cache rm [1024, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 2048, n_tokens = 1024, progress = 0.063630
slot update_slots: id 0 | task 343 | kv cache rm [2048, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 3072, n_tokens = 1024, progress = 0.095445
slot update_slots: id 0 | task 343 | kv cache rm [3072, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 4096, n_tokens = 1024, progress = 0.127260
slot update_slots: id 0 | task 343 | kv cache rm [4096, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 5120, n_tokens = 1024, progress = 0.159075
slot update_slots: id 0 | task 343 | kv cache rm [5120, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 6144, n_tokens = 1024, progress = 0.190890
slot update_slots: id 0 | task 343 | kv cache rm [6144, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 7168, n_tokens = 1024, progress = 0.222706
slot update_slots: id 0 | task 343 | kv cache rm [7168, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 8192, n_tokens = 1024, progress = 0.254521
slot update_slots: id 0 | task 343 | kv cache rm [8192, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 9216, n_tokens = 1024, progress = 0.286336
slot update_slots: id 0 | task 343 | kv cache rm [9216, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 10240, n_tokens = 1024, progress = 0.318151
slot update_slots: id 0 | task 343 | kv cache rm [10240, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 11264, n_tokens = 1024, progress = 0.349966
slot update_slots: id 0 | task 343 | kv cache rm [11264, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 12288, n_tokens = 1024, progress = 0.381781
slot update_slots: id 0 | task 343 | kv cache rm [12288, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 13312, n_tokens = 1024, progress = 0.413596
slot update_slots: id 0 | task 343 | kv cache rm [13312, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 14336, n_tokens = 1024, progress = 0.445411
slot update_slots: id 0 | task 343 | kv cache rm [14336, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 15360, n_tokens = 1024, progress = 0.477226
slot update_slots: id 0 | task 343 | kv cache rm [15360, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 16384, n_tokens = 1024, progress = 0.509041
slot update_slots: id 0 | task 343 | kv cache rm [16384, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 17408, n_tokens = 1024, progress = 0.540856
slot update_slots: id 0 | task 343 | kv cache rm [17408, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 18432, n_tokens = 1024, progress = 0.572671
slot update_slots: id 0 | task 343 | kv cache rm [18432, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 19456, n_tokens = 1024, progress = 0.604486
slot update_slots: id 0 | task 343 | kv cache rm [19456, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 20480, n_tokens = 1024, progress = 0.636302
slot update_slots: id 0 | task 343 | kv cache rm [20480, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 21504, n_tokens = 1024, progress = 0.668117
slot update_slots: id 0 | task 343 | kv cache rm [21504, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 22528, n_tokens = 1024, progress = 0.699932
slot update_slots: id 0 | task 343 | kv cache rm [22528, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 23552, n_tokens = 1024, progress = 0.731747
slot update_slots: id 0 | task 343 | kv cache rm [23552, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 24576, n_tokens = 1024, progress = 0.763562
slot update_slots: id 0 | task 343 | kv cache rm [24576, end)
slot update_slots: id 0 | task 343 | prompt processing progress, n_past = 25600, n_tokens = 1024, progress = 0.795377
^C

There are other cases where a chunk is not added and no request is sent: if the chunk is too small, or if it is identical to an existing chunk. Most probably, though, that is not the cause of this bug.
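The skip conditions above can be sketched like this. The threshold and function names are assumptions for illustration, not the exact llama.vscode code:

```typescript
// Illustrative version of the skip conditions: a copied chunk triggers
// no server request if it is too small or duplicates an existing chunk.
// MIN_CHUNK_CHARS is a hypothetical threshold, not the real value.
const MIN_CHUNK_CHARS = 16;

function shouldAddChunk(text: string, existing: string[]): boolean {
  if (text.length < MIN_CHUNK_CHARS) return false; // too small
  if (existing.includes(text)) return false;       // identical duplicate
  return true;
}
```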

@ggerganov
Member Author

ggerganov commented Feb 2, 2025

@igardev This is normal behavior when more than one VS Code instance is running at the same time. Each instance accumulates a different context in its ring buffer, and when you switch between them, llama-server has to recompute the entire context instead of reusing it. This can become very slow for contexts of several thousand tokens.
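A toy model of why this is slow: the server can reuse cached work only for the common prefix of the previous prompt and the new one, and two instances' ring buffers share almost no prefix. This is an illustrative model, not llama-server's actual cache code:

```typescript
// Length of the shared token prefix between the cached prompt and the
// new prompt; only this part of the KV cache can be reused.
function commonPrefixLen(a: number[], b: number[]): number {
  let n = 0;
  while (n < a.length && n < b.length && a[n] === b[n]) n++;
  return n;
}

// Prompt tokens that must be re-processed given what is already cached.
// Switching instances makes the prefix tiny, so nearly the whole prompt
// (e.g. the ~32k tokens in the log above) is recomputed.
function tokensToRecompute(cached: number[], prompt: number[]): number {
  return prompt.length - commonPrefixLen(cached, prompt);
}
```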

@igardev
Collaborator

igardev commented Feb 2, 2025

@ggerganov Yes, I have two running instances of VS Code. We could add a note about that to the README.md file.
