[Bug]: Errors when another modular service is running on port 8000 #6

Open

russfellows opened this issue Feb 20, 2025 · 1 comment
Labels: bug (Something isn't working)

russfellows commented Feb 20, 2025

Recipe Name

max-serve-openai-embeddings

Operating System

Linux

What happened?

In the examples, everything uses a global environment. The underlying pixi environment manager (aka magic) can handle multiple project directories, so it would be good to show examples with custom pixi / magic environments, where each running service can be given its own port instead of all of them trying to use port 8000.
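For example, something like a per-recipe .env could carry the port override (a sketch only; the MAX_SERVE_* variable names are taken from the workaround further down, and the recipe would need to be updated to actually read them):

# hypothetical per-recipe .env (the Procfile already reads HUGGING_FACE_HUB_TOKEN from .env)
HUGGING_FACE_HUB_TOKEN=hf_xxxx
MAX_SERVE_HOST=127.0.0.1
MAX_SERVE_PORT=8001   # a free port for this recipe, instead of every recipe defaulting to 8000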

Relevant log output

(max-embeddings) rfellows@tag-965:~/Documents/Modular/max-recipes/max-serve-openai-embeddings$ magic run app
21:23:29 system | llm.1 started (pid=2584200)
21:23:29 system | main.1 started (pid=2584202)
21:23:29 llm.1  | Global environments as specified in '/home/rfellows/.modular/manifests/pixi-global.toml'
21:23:29 llm.1  | └── max-pipelines: 25.2.0.dev2025022005 (already installed)
21:23:29 llm.1  |     └─ exposes: max-serve, max-pipelines
21:23:30 main.1 | 2025-02-20 21:23:30,161 - __main__ - INFO - Waiting for server at http://0.0.0.0:8001/v1 to start (attempt 1/20)...
21:23:36 llm.1  | ✔ Environment max-pipelines was already up-to-date.
21:23:36 llm.1  | cat: .env: No such file or directory
21:23:38 llm.1  | /home/rfellows/.modular/envs/max-pipelines/lib/python3.12/site-packages/transformers/utils/hub.py:106: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
21:23:38 llm.1  |   warnings.warn(
21:23:39 llm.1  | Traceback (most recent call last):
21:23:39 llm.1  |   File "/home/rfellows/.modular/envs/max-pipelines/bin/max-pipelines", line 6, in <module>
21:23:39 llm.1  |     from max.entrypoints.pipelines import main
21:23:39 llm.1  |   File "/home/rfellows/.modular/envs/max-pipelines/lib/python3.12/site-packages/max/entrypoints/__init__.py", line 17, in <module>
21:23:39 llm.1  |     from .llm import LLM
21:23:39 llm.1  |   File "/home/rfellows/.modular/envs/max-pipelines/lib/python3.12/site-packages/max/entrypoints/llm.py", line 34, in <module>
21:23:39 llm.1  |     from max.serve.pipelines.model_worker import start_model_worker
21:23:39 llm.1  |   File "/home/rfellows/.modular/envs/max-pipelines/lib/python3.12/site-packages/max/serve/pipelines/model_worker.py", line 9, in <module>
21:23:39 llm.1  |     configure_metrics(Settings())
21:23:39 llm.1  |                       ^^^^^^^^^^
21:23:39 llm.1  |   File "/home/rfellows/.modular/envs/max-pipelines/lib/python3.12/site-packages/pydantic_settings/main.py", line 171, in __init__
21:23:39 llm.1  |     super().__init__(
21:23:39 llm.1  |   File "/home/rfellows/.modular/envs/max-pipelines/lib/python3.12/site-packages/pydantic/main.py", line 214, in __init__
21:23:39 llm.1  |     validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
21:23:39 llm.1  |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21:23:39 llm.1  | pydantic_core._pydantic_core.ValidationError: 1 validation error for Settings
21:23:39 llm.1  | port
21:23:39 llm.1  |   Value error, port 8000 is already in use [type=value_error, input_value=8000, input_type=int]
21:23:39 llm.1  |     For further information visit https://errors.pydantic.dev/2.10/v/value_error
21:23:40 llm.1  | Attempt 1 failed, retrying...
^C21:23:41 system | SIGINT received
21:23:41 system | sending SIGTERM to llm.1 (pid 2584200)
21:23:41 system | sending SIGTERM to main.1 (pid 2584202)
21:23:41 system | llm.1 stopped (rc=-15)
21:23:41 system | main.1 stopped (rc=-15)

Environment

Notice that I specifically set port 8001 in the "main.py" file; however, the underlying code still tries to run on port 8000.
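For anyone hitting the same error, a quick way to confirm that something else is already bound to port 8000 (standard Linux tooling, nothing recipe-specific):

ss -ltnp | grep ':8000'    # or: lsof -i :8000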

Update:

Note: The problem seems to arise in part because the process tries to create yet another magic environment. Since I already had a project and was already running the magic shell, no further invocation was needed, and the extra one only created problems.

Perhaps update the example and code to enable running just the 'max-pipelines' server with specific environment variables.
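For example, something along these lines (these are the MAX_SERVE_* variables that worked for me in the session below; I'm assuming they are the intended way to override the host and port):

MAX_SERVE_HOST=127.0.0.1 MAX_SERVE_PORT=8001 max-pipelines serve --huggingface-repo-id sentence-transformers/all-mpnet-base-v2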

I was able to get this working by pulling apart the overly ambitious "Procfile". The following example shows how I got it to work properly:


(max-embeddings) rfellows@tag-965:~/Documents/Modular/max-recipes/max-serve-openai-embeddings$ env | grep TOKEN
HUGGING_FACE_HUB_TOKEN=hf_UEAT****************rEKMenJZH
(max-embeddings) rfellows@tag-965:~/Documents/Modular/max-recipes/max-serve-openai-embeddings$ export MAX_SERVE_PORT=8001 ; export MAX_SERVE_HOST=127.0.0.1
(max-embeddings) rfellows@tag-965:~/Documents/Modular/max-recipes/max-serve-openai-embeddings$ cat Procfile
llm: for i in $(seq 1 3); do MAX_SERVE_PORT=8001 MAX_SERVE_HOST=127.0.0.1 HUGGING_FACE_HUB_TOKEN=$(cat .env | grep HUGGING_FACE_HUB_TOKEN | cut -d '=' -f2) && max-pipelines serve --huggingface-repo-id sentence-transformers/all-mpnet-base-v2 && break || (echo "Attempt $i failed, retrying..." && sleep 5); done
main: magic run python main.py && kill -2 $(pgrep -f "max-pipelines serve")
(max-embeddings) rfellows@tag-965:~/Documents/Modular/max-recipes/max-serve-openai-embeddings$ max-pipelines serve --huggingface-repo-id sentence-transformers/all-mpnet-base-v2
/home/rfellows/.modular/envs/max-pipelines/lib/python3.12/site-packages/transformers/utils/hub.py:106: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
warnings.warn(
21:51:19.454 INFO: 2597952 MainThread: root: Logging initialized: Console: INFO, File: None, Telemetry: None
21:51:19.455 WARNING: 2597952 MainThread: opentelemetry.metrics._internal: Overriding of current MeterProvider is not allowed
21:51:20.146 WARNING: 2597952 MainThread: max.pipelines: --huggingface-repo-id is deprecated, use --model-path instead. This setting will stop working in a future release.
21:51:20.434 INFO: 2597952 MainThread: max.entrypoints.cli.serve: Starting server using sentence-transformers/all-mpnet-base-v2
config.json: 100%|██████████| 571/571 [00:00<00:00, 6.77MB/s]
21:51:22.209 WARNING: 2597952 MainThread: max.pipelines: torch_dtype not available, cant infer encoding from config.json
21:51:22.924 INFO: 2597952 MainThread: max.pipelines:

Estimated memory consumption:
    Weights:                418 MiB
    KVCache allocation:     0 MiB
    Total estimated:        418 MiB used / 5575 MiB free
Auto-inferred max sequence length: 514
Auto-inferred max batch size: 1

21:51:22.924 INFO: 2597952 MainThread: max.pipelines:

    Loading TextTokenizer and EmbeddingsPipeline(MPNetPipelineModel) factory for:
        engine:                 PipelineEngine.MAX
        architecture:           MPNetForMaskedLM
        devices:                gpu[0]
        model_path:             sentence-transformers/all-mpnet-base-v2
        quantization_encoding:  float32
        cache_strategy:         model_default
        weight_path:            [
                                   model.safetensors
                                ]

tokenizer_config.json: 100%|██████████| 363/363 [00:00<00:00, 4.06MB/s]
vocab.txt: 100%|██████████| 232k/232k [00:00<00:00, 5.02MB/s]
tokenizer.json: 100%|██████████| 466k/466k [00:00<00:00, 7.36MB/s]
special_tokens_map.json: 100%|██████████| 239/239 [00:00<00:00, 4.14MB/s]
21:51:24.157 INFO: 2597952 MainThread: max.serve: Server configured with no cache and batch size 1
21:51:24.157 INFO: 2597952 MainThread: max.serve: Settings: api_types=[<APIType.OPENAI: 'openai'>] host='127.0.0.1' port=8001 logs_console_level='INFO' logs_otlp_level=None logs_file_level=None logs_file_path=None disable_telemetry=False use_heartbeat=False mw_timeout_s=1200.0 mw_health_fail_s=60.0 telemetry_worker_spawn_timeout=60.0 runner_type=<RunnerType.PYTORCH: 'pytorch'>
21:51:24.168 INFO: 2597952 MainThread: max.serve: Launching server on http://127.0.0.1:8001
/home/rfellows/.modular/envs/max-pipelines/lib/python3.12/site-packages/transformers/utils/hub.py:106: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
warnings.warn(
21:51:26.973 INFO: 2597974 MainThread: root: Logging initialized: Console: INFO, File: None, Telemetry: None
21:51:26.973 WARNING: 2597974 MainThread: opentelemetry.metrics._internal: Overriding of current MeterProvider is not allowed
/home/rfellows/.modular/envs/max-pipelines/lib/python3.12/site-packages/transformers/utils/hub.py:106: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
warnings.warn(
21:51:30.471 INFO: 2597987 MainThread: root: Logging initialized: Console: INFO, File: None, Telemetry: None
21:51:30.471 WARNING: 2597987 MainThread: opentelemetry.metrics._internal: Overriding of current MeterProvider is not allowed
21:51:33.564 INFO: 2597987 MainThread: max.pipelines: Starting download of model: sentence-transformers/all-mpnet-base-v2
model.safetensors: 100%|█████████▉| 438M/438M [00:04<00:00, 87.7MB/s]
100%|██████████| 1/1 [00:05<00:00, 5.36s/it]
21:51:38.925 INFO: 2597987 MainThread: max.pipelines: Finished download of model: sentence-transformers/all-mpnet-base-v2 in 5.360789 seconds.
21:51:38.925 INFO: 2597987 MainThread: max.pipelines: Building and compiling model...
21:51:54.057 INFO: 2597987 MainThread: max.pipelines: Building and compiling model took 15.131542 seconds
21:51:54.072 INFO: 2597952 MainThread: max.serve: Server ready on http://127.0.0.1:8001 (Press CTRL+C to quit)

russfellows added the bug label Feb 20, 2025
ehsanmok (Collaborator) commented
Thanks! Please make sure to run magic run clean when you're done; it cleans up the services left running on the previously occupied ports. I'll change the ports to mostly unused ones in a PR.
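A minimal sketch of that cleanup, using only commands already shown in this thread (magic run clean is the recipe task mentioned above; the kill line comes from the Procfile):

magic run clean                               # stop services left on the previously occupied ports
kill -2 $(pgrep -f "max-pipelines serve")     # or send SIGINT to a stray server still holding the port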
