
bug: openllm build creates 3 copies of the model weights in different places #4268

Open
jhostetler opened this issue Sep 25, 2023 · 1 comment

Comments


jhostetler commented Sep 25, 2023

Describe the bug

When running openllm build with BENTOML_HOME=/foobar (for example):

  1. First, the model weights are downloaded to a directory under $HOME (in my case, under /root, because this is running in a Docker container in a Kubernetes pod).
  2. Second, the weights are copied to a directory under /tmp.
  3. Finally, the weights are copied again to a directory under BENTOML_HOME (which is where we wanted them).

I'm guessing at least one of these copies is unnecessary. Ideally, the files would end up under BENTOML_HOME directly without any intermediate copies, but I'm not sure if that's feasible.

In any case, it would be helpful to document that the build process requires enough storage for the full model at all three locations. When building inside a Kubernetes pod, for example, one must mount volumes at both /root and /tmp that are large enough to hold the model; otherwise the pod fails with an error saying it has exhausted its ephemeral-storage. A possible workaround is sketched below.
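One untested workaround is to point every stage at the same large volume, so the three copies at least share one disk. This assumes openllm honors the standard Hugging Face cache variable HF_HOME and the standard TMPDIR variable used by Python's tempfile module; neither is confirmed behavior here.

import os
import subprocess

# Hypothetical mitigation: redirect all three stages onto one large volume.
# HF_HOME controls the Hugging Face download cache; TMPDIR controls where
# Python's tempfile module stages files. Whether openllm honors both is an
# assumption, not confirmed behavior.
big_volume = "/mnt/models"  # illustrative mount point
env = os.environ.copy()
env.update({
    "BENTOML_HOME": os.path.join(big_volume, "bentoml"),
    "HF_HOME": os.path.join(big_volume, "hf-cache"),
    "TMPDIR": os.path.join(big_volume, "tmp"),
})
for key in ("BENTOML_HOME", "HF_HOME", "TMPDIR"):
    os.makedirs(env[key], exist_ok=True)

subprocess.run(["openllm", "build", "falcon", "--model-id", "tiiuae/falcon-7b"],
               env=env, check=True)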

To reproduce

Example Python code:

import os
import subprocess

# Build a Bento for Falcon-7B, redirecting the BentoML model store via BENTOML_HOME.
cmd = ["openllm", "build", "falcon", "--model-id", "tiiuae/falcon-7b"]
env = os.environ.copy()
env["BENTOML_HOME"] = "/somewhere"
subprocess.run(cmd, env=env, check=True)

I monitored disk usage with a background process that ran the following shell command every second:

for d in /*; do du -sh "$d"; done
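For reference, a rough Python equivalent of that monitor; note that shutil.disk_usage reports whole-filesystem usage rather than per-directory totals like du -sh, and the paths are the three locations observed above:

import shutil
import time

# Poll usage of the filesystems backing each copy location once per second.
paths = ["/root", "/tmp", "/somewhere"]
while True:
    for p in paths:
        usage = shutil.disk_usage(p)
        print(f"{p}: {usage.used / 2**30:.1f} GiB used, "
              f"{usage.free / 2**30:.1f} GiB free")
    time.sleep(1)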

Logs

No response

Environment

bentoml: 1.1.6

System information (Optional)

Running inside Docker container in Kubernetes pod

aarnphm (Contributor) commented Nov 7, 2023

This is intended, as we want the build from bentoml to be atomic. I will probably transfer this to BentoML and we can track it there.
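For readers unfamiliar with the pattern: an "atomic" install typically stages files somewhere temporary and renames them into the store only once they are complete, so the store never exposes a half-written model. A minimal illustrative sketch of that idiom (not BentoML's actual code; the function and paths are hypothetical):

import os
import shutil
import tempfile

def atomic_install(src_dir: str, store_dir: str) -> str:
    # Stage on the same filesystem as the store so the final rename is atomic;
    # staging on a different filesystem (e.g. /tmp) would force a full copy.
    os.makedirs(store_dir, exist_ok=True)
    staging = tempfile.mkdtemp(dir=store_dir)
    dst = os.path.join(staging, "model")
    shutil.copytree(src_dir, dst)   # assemble the copy out of sight
    final = os.path.join(store_dir, "model")
    os.replace(dst, final)          # atomic rename within one filesystem
    os.rmdir(staging)               # staging dir is empty now
    return final

Since a rename is only atomic within a single filesystem, staging under /tmp and then moving into a BENTOML_HOME on a different volume would force the extra copy observed above; that may be the connection to the maintainer's comment, though this is inference, not confirmed.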

aarnphm transferred this issue from bentoml/OpenLLM Nov 7, 2023