Repetitive non-fatal ConflictError: "arbiter is already running %s command" #697

Closed
LazySheeeeep opened this issue Nov 19, 2023 · 3 comments

Comments

@LazySheeeeep

```
2023-11-19T23:09:00+0800 [ERROR] [cli] Exception in callback <bound method Arbiter.manage_watchers of <circus.arbiter.Arbiter object at 0x7faf281ac310>>
Traceback (most recent call last):
  File "/tmp2/t12902101/miniconda3/envs/ol/lib/python3.11/site-packages/tornado/ioloop.py", line 919, in _run
    val = self.callback()
          ^^^^^^^^^^^^^^^
  File "/tmp2/t12902101/miniconda3/envs/ol/lib/python3.11/site-packages/circus/util.py", line 1038, in wrapper
    raise ConflictError("arbiter is already running %s command"
circus.exc.ConflictError: arbiter is already running arbiter_start_watchers command
```

@LazySheeeeep (Author)

I simply ran `openllm start llama --model-id NousResearch/llama-2-13b-chat-hf -p <any>` on the department workstation server after creating the necessary environment, and this seems to happen with every model, every time I want to run a server.

@LazySheeeeep (Author)

The query doesn't work either:
```
(ol) t12902101@ws2 [~] openllm query --endpoint http://localhost:23456 "hi?"
^[[A^[[B^C^C^C^C^C^C^C
^C^C^CTraceback (most recent call last):
  File "/tmp2/t12902101/miniconda3/envs/ol/bin/openllm", line 5, in <module>
    from openllm_cli.entrypoint import cli
  File "/tmp2/t12902101/miniconda3/envs/ol/lib/python3.11/site-packages/openllm_cli/entrypoint.py", line 92, in <module>
^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C  File "/tmp2/t12902101/miniconda3/envs/ol/lib/python3.11/site-packages/openllm_cli/_factory.py", line 28, in <module>
    class _OpenLLM_GenericInternalConfig(LLMConfig):
  File "/tmp2/t12902101/miniconda3/envs/ol/lib/python3.11/site-packages/openllm_core/_configuration.py", line 978, in __init_subclass__
^C^C^Z
[1]+  Stopped (SIGTSTP)  openllm query --endpoint http://localhost:23456 "hi?"
```

Sometimes it cannot even be aborted with Ctrl-C.
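
For what it's worth, a quick way to check whether anything is actually listening on the endpoint before querying (a sketch assuming a BentoML-based server, which typically exposes a `/readyz` health endpoint; if yours doesn't, probing the root path works the same way):

```bash
# Probe the server's readiness endpoint; a hung or zombied server usually won't answer at all
curl -v --max-time 5 http://localhost:23456/readyz || echo "server not responding"
```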

@aarnphm (Collaborator) commented Nov 19, 2023

You need to kill the circusd process; one of its child processes has become a zombie.
This is a known issue in BentoML. See bentoml/BentoML#4193
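
For example, a minimal sketch of one way to locate and terminate a lingering `circusd` process on Linux (the exact process name may vary with how BentoML spawned it):

```bash
# List lingering circus daemon processes; [c]ircusd keeps grep from matching itself
ps aux | grep '[c]ircusd'

# Ask them to exit (SIGTERM), then force-kill (SIGKILL) anything that survives
pkill -f circusd
sleep 2
pkill -9 -f circusd
```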

@aarnphm closed this as completed Nov 19, 2023