I have 4 GPU servers (A, B, C, D), each with 4 NVIDIA A800 80GB PCIe GPUs. I started rpc-server on B, C, and D. Here is the output of the rpc-server command: it finds 4 CUDA devices, but only device 0 is used on each server. So the question is: how many rpc-server instances should I start on a remote server that has 4 CUDA devices?
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
WARNING: Host ('0.0.0.0') is != '127.0.0.1'
Never expose the RPC server to an open network!
This is an experimental feature and is not secure!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
create_backend: using CUDA backend
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 4 CUDA devices:
Device 0: NVIDIA A800 80GB PCIe, compute capability 8.0, VMM: yes
Device 1: NVIDIA A800 80GB PCIe, compute capability 8.0, VMM: yes
Device 2: NVIDIA A800 80GB PCIe, compute capability 8.0, VMM: yes
Device 3: NVIDIA A800 80GB PCIe, compute capability 8.0, VMM: yes
Starting RPC server on 0.0.0.0:50052, backend memory: 80614 MB
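Note that although rpc-server enumerates all 4 CUDA devices, the reported backend memory (80614 MB) corresponds to a single A800, so only device 0 is actually being served. A common pattern (an assumption on my part, not confirmed in this thread) is to run one rpc-server process per GPU, pinning each to a device with CUDA_VISIBLE_DEVICES and giving each its own port. A minimal sketch, assuming the build's rpc-server accepts the -H and -p flags (the log above suggests host and port are configurable); the ports are placeholders:

# one rpc-server per GPU, each pinned to a single device and its own port
# (run on each of B, C, and D; ports are placeholders)
CUDA_VISIBLE_DEVICES=0 ./rpc-server -H 0.0.0.0 -p 50052 &
CUDA_VISIBLE_DEVICES=1 ./rpc-server -H 0.0.0.0 -p 50053 &
CUDA_VISIBLE_DEVICES=2 ./rpc-server -H 0.0.0.0 -p 50054 &
CUDA_VISIBLE_DEVICES=3 ./rpc-server -H 0.0.0.0 -p 50055 &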
Reply: I tried to start 4 rpc-server instances on each GPU server, all with CUDA_VISIBLE_DEVICES=0, and ran llama-server on server A; it failed with an error.
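For the client side, every rpc-server endpoint would then be listed when launching llama-server. A minimal sketch, assuming a build with the --rpc option; the hostnames, ports, and model path are placeholders:

# on server A: point llama-server at all remote backends, comma-separated
# (endpoints on C and D would be appended to the same list)
./llama-server -m ./model.gguf -ngl 99 \
    --rpc serverB:50052,serverB:50053,serverB:50054,serverB:50055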