bug: aiohttp.client_exceptions.ServerDisconnectedError: Server disconnected #4238

nadimintikrish · 2023-10-16T02:46:21Z

Describe the bug

This keeps happening with mainly transformer based models. However, this error keeps prevailing only after containerizing like using
bentoml containerize bento_svc:latest and running the container like docker run --rm -p 3000:3000 bento_svc:t5mqzwdlzoxbjdx2

I have worked on multiple models like the one in documentation
https://docs.bentoml.com/en/latest/quickstarts/deploy-a-transformer-model-with-bentoml.html
even distil bert models and also sentence embedding.

I cloned the source code of this sentence embedding and tried using containerize and run. it still fails but works like charm if used docker container out of the box.

https://github.com/bentoml/sentence-embedding-bento/blob/main/requirements.txt

This happened with bentoml 1.1.6 and updated version 1.1.7 as well.

Also, I am using Mac OS with Apple Silicon chip if that matters.

I am able to run with scikit learn models and even torch based models that I trained.

Wonder what am I doing wrong here.

strack-trace for further introspection:

2023-10-16T02:32:33+0000 [ERROR] [api_server:4] Exception on /encode [POST] (trace=f755596b7d185db8d620262d333fa9ae,span=2c59e1cd9a90f9cb,sampled=0,service.name=sentence-embedding-svc) Traceback (most recent call last): File "/usr/local/lib/python3.9/site-packages/bentoml/_internal/server/http_app.py", line 341, in api_func output = await api.func(*args) File "/home/bentoml/bento/src/service.py", line 38, in encode return await embed_runner.encode.async_run(docs.dict()) File "/usr/local/lib/python3.9/site-packages/bentoml/_internal/runner/runner.py", line 55, in async_run return await self.runner._runner_handle.async_run_method(self, *args, **kwargs) File "/usr/local/lib/python3.9/site-packages/bentoml/_internal/runner/runner_handle/remote.py", line 216, in async_run_method async with self._client.post( File "/usr/local/lib/python3.9/site-packages/aiohttp/client.py", line 1167, in __aenter__ self._resp = await self._coro File "/usr/local/lib/python3.9/site-packages/aiohttp/client.py", line 586, in _request await resp.start(conn) File "/usr/local/lib/python3.9/site-packages/aiohttp/client_reqrep.py", line 905, in start message, payload = await protocol.read() # type: ignore[union-attr] File "/usr/local/lib/python3.9/site-packages/aiohttp/streams.py", line 616, in read await self._waiter aiohttp.client_exceptions.ServerDisconnectedError: Server disconnected

To reproduce

No response

Expected behavior

No response

Environment

Environment variable

BENTOML_DEBUG=''
BENTOML_QUIET=''
BENTOML_BUNDLE_LOCAL_BUILD=''
BENTOML_DO_NOT_TRACK=''
BENTOML_CONFIG=''
BENTOML_CONFIG_OPTIONS=''
BENTOML_PORT=''
BENTOML_HOST=''
BENTOML_API_WORKERS=''

System information

bentoml: 1.1.7
python: 3.9.18
platform: macOS-13.6-arm64-i386-64bit
uid_gid: 502:20
conda: 23.1.0
in_conda_env: True

conda_packages

name: bento_39
channels:
  - defaults
dependencies:
  - ca-certificates=2023.08.22=hca03da5_0
  - libcxx=14.0.6=h848a8c0_0
  - libffi=3.4.4=hca03da5_0
  - ncurses=6.4=h313beb8_0
  - openssl=3.0.11=h1a28f6b_2
  - pip=23.2.1=py39hca03da5_0
  - python=3.9.18=hb885b13_0
  - readline=8.2=h1a28f6b_0
  - setuptools=68.0.0=py39hca03da5_0
  - sqlite=3.41.2=h80987f9_0
  - tk=8.6.12=hb8d0fd4_0
  - tzdata=2023c=h04d1e81_0
  - wheel=0.41.2=py39hca03da5_0
  - xz=5.4.2=h80987f9_0
  - zlib=1.2.13=h5a0b063_0
  - pip:
      - absl-py==2.0.0
      - aiohttp==3.8.6
      - aiosignal==1.3.1
      - annotated-types==0.6.0
      - anyio==4.0.0
      - appdirs==1.4.4
      - asgiref==3.7.2
      - astunparse==1.6.3
      - async-timeout==4.0.3
      - attrs==23.1.0
      - bentoml==1.1.7
      - build==1.0.3
      - cachetools==5.3.1
      - cattrs==23.1.2
      - certifi==2023.7.22
      - charset-normalizer==3.3.0
      - circus==0.18.0
      - click==8.1.7
      - click-option-group==0.5.6
      - cloudpickle==2.2.1
      - contextlib2==21.6.0
      - deepmerge==1.1.0
      - deprecated==1.2.14
      - exceptiongroup==1.1.3
      - filelock==3.12.4
      - flatbuffers==23.5.26
      - frozenlist==1.4.0
      - fs==2.4.16
      - fsspec==2023.9.2
      - gast==0.5.4
      - google-auth==2.23.3
      - google-auth-oauthlib==1.0.0
      - google-pasta==0.2.0
      - grpcio==1.59.0
      - h11==0.14.0
      - h5py==3.10.0
      - httpcore==0.18.0
      - httpx==0.25.0
      - huggingface-hub==0.17.3
      - idna==3.4
      - importlib-metadata==6.0.1
      - inflection==0.5.1
      - jinja2==3.1.2
      - joblib==1.3.2
      - keras==2.14.0
      - libclang==16.0.6
      - markdown==3.5
      - markdown-it-py==3.0.0
      - markupsafe==2.1.3
      - mdurl==0.1.2
      - ml-dtypes==0.2.0
      - mpmath==1.3.0
      - multidict==6.0.4
      - networkx==3.1
      - numpy==1.26.0
      - oauthlib==3.2.2
      - opentelemetry-api==1.20.0
      - opentelemetry-instrumentation==0.41b0
      - opentelemetry-instrumentation-aiohttp-client==0.41b0
      - opentelemetry-instrumentation-asgi==0.41b0
      - opentelemetry-sdk==1.20.0
      - opentelemetry-semantic-conventions==0.41b0
      - opentelemetry-util-http==0.41b0
      - opt-einsum==3.3.0
      - packaging==23.2
      - pathspec==0.11.2
      - pillow==10.0.1
      - pip-requirements-parser==32.0.1
      - pip-tools==7.3.0
      - prometheus-client==0.17.1
      - protobuf==4.24.4
      - psutil==5.9.5
      - pyasn1==0.5.0
      - pyasn1-modules==0.3.0
      - pydantic==2.4.2
      - pydantic-core==2.10.1
      - pygments==2.16.1
      - pynvml==11.5.0
      - pyparsing==3.1.1
      - pyproject-hooks==1.0.0
      - python-dateutil==2.8.2
      - python-json-logger==2.0.7
      - python-multipart==0.0.6
      - pyyaml==6.0.1
      - pyzmq==25.1.1
      - regex==2023.10.3
      - requests==2.31.0
      - requests-oauthlib==1.3.1
      - rich==13.6.0
      - rsa==4.9
      - safetensors==0.4.0
      - schema==0.7.5
      - scikit-learn==1.3.1
      - scipy==1.11.3
      - simple-di==0.1.5
      - six==1.16.0
      - sniffio==1.3.0
      - starlette==0.31.1
      - sympy==1.12
      - tensorboard==2.14.1
      - tensorboard-data-server==0.7.1
      - tensorflow==2.14.0
      - tensorflow-estimator==2.14.0
      - tensorflow-io-gcs-filesystem==0.34.0
      - tensorflow-macos==2.14.0
      - termcolor==2.3.0
      - threadpoolctl==3.2.0
      - tokenizers==0.14.1
      - tomli==2.0.1
      - torch==2.1.0
      - torchvision==0.16.0
      - tornado==6.3.3
      - tqdm==4.66.1
      - transformers==4.34.0
      - typing-extensions==4.8.0
      - urllib3==2.0.6
      - uvicorn==0.23.2
      - watchfiles==0.20.0
      - werkzeug==3.0.0
      - wrapt==1.14.1
      - xgboost==2.0.0
      - yarl==1.9.2
      - zipp==3.17.0
prefix: /Users/user/miniconda3/envs/bento_39

pip_packages

absl-py==2.0.0
aiohttp==3.8.6
aiosignal==1.3.1
annotated-types==0.6.0
anyio==4.0.0
appdirs==1.4.4
asgiref==3.7.2
astunparse==1.6.3
async-timeout==4.0.3
attrs==23.1.0
bentoml==1.1.7
build==1.0.3
cachetools==5.3.1
cattrs==23.1.2
certifi==2023.7.22
charset-normalizer==3.3.0
circus==0.18.0
click==8.1.7
click-option-group==0.5.6
cloudpickle==2.2.1
contextlib2==21.6.0
deepmerge==1.1.0
Deprecated==1.2.14
exceptiongroup==1.1.3
filelock==3.12.4
flatbuffers==23.5.26
frozenlist==1.4.0
fs==2.4.16
fsspec==2023.9.2
gast==0.5.4
google-auth==2.23.3
google-auth-oauthlib==1.0.0
google-pasta==0.2.0
grpcio==1.59.0
h11==0.14.0
h5py==3.10.0
httpcore==0.18.0
httpx==0.25.0
huggingface-hub==0.17.3
idna==3.4
importlib-metadata==6.0.1
inflection==0.5.1
Jinja2==3.1.2
joblib==1.3.2
keras==2.14.0
libclang==16.0.6
Markdown==3.5
markdown-it-py==3.0.0
MarkupSafe==2.1.3
mdurl==0.1.2
ml-dtypes==0.2.0
mpmath==1.3.0
multidict==6.0.4
networkx==3.1
numpy==1.26.0
oauthlib==3.2.2
opentelemetry-api==1.20.0
opentelemetry-instrumentation==0.41b0
opentelemetry-instrumentation-aiohttp-client==0.41b0
opentelemetry-instrumentation-asgi==0.41b0
opentelemetry-sdk==1.20.0
opentelemetry-semantic-conventions==0.41b0
opentelemetry-util-http==0.41b0
opt-einsum==3.3.0
packaging==23.2
pathspec==0.11.2
Pillow==10.0.1
pip-requirements-parser==32.0.1
pip-tools==7.3.0
prometheus-client==0.17.1
protobuf==4.24.4
psutil==5.9.5
pyasn1==0.5.0
pyasn1-modules==0.3.0
pydantic==2.4.2
pydantic_core==2.10.1
Pygments==2.16.1
pynvml==11.5.0
pyparsing==3.1.1
pyproject_hooks==1.0.0
python-dateutil==2.8.2
python-json-logger==2.0.7
python-multipart==0.0.6
PyYAML==6.0.1
pyzmq==25.1.1
regex==2023.10.3
requests==2.31.0
requests-oauthlib==1.3.1
rich==13.6.0
rsa==4.9
safetensors==0.4.0
schema==0.7.5
scikit-learn==1.3.1
scipy==1.11.3
simple-di==0.1.5
six==1.16.0
sniffio==1.3.0
starlette==0.31.1
sympy==1.12
tensorboard==2.14.1
tensorboard-data-server==0.7.1
tensorflow==2.14.0
tensorflow-estimator==2.14.0
tensorflow-io-gcs-filesystem==0.34.0
tensorflow-macos==2.14.0
termcolor==2.3.0
threadpoolctl==3.2.0
tokenizers==0.14.1
tomli==2.0.1
torch==2.1.0
torchvision==0.16.0
tornado==6.3.3
tqdm==4.66.1
transformers==4.34.0
typing_extensions==4.8.0
urllib3==2.0.6
uvicorn==0.23.2
watchfiles==0.20.0
Werkzeug==3.0.0
wrapt==1.14.1
xgboost==2.0.0
yarl==1.9.2
zipp==3.17.0

The text was updated successfully, but these errors were encountered:

sudeepg545 · 2023-10-27T19:49:04Z

As some additional notes to the above issue, it definitely seems to be memory issue, we were also facing the same issue and could resolve it for now by adding more memory to the container. In our case, we are deploying in ECS and running multiple models (i.e. multiple runners) in each EC2 instance.

jianshen92 · 2023-10-27T20:03:46Z

As some additional notes to the above issue, it definitely seems to be memory issue, we were also facing the same issue and could resolve it for now by adding more memory to the container. In our case, we are deploying in ECS and running multiple models (i.e. multiple runners) in each EC2 instance.

Thanks for your finding! Do you find the memory slowly building up over time? Can you share a graph of memory consumption with respect to request volume?

nadimintikrish · 2023-10-31T06:50:59Z

Thanks @sudeepg545 ! I will look into this POV.
Will update on this @jianshen92 .

frostming · 2024-07-12T07:47:38Z

The memory issue has been in recent releases, can you please have a try? @nadimintikrish @sudeepg545

nadimintikrish added the bug Something isn't working label Oct 16, 2023

nadimintikrish mentioned this issue Oct 17, 2023

bug: aiohttp.client_exceptions.ServerDisconnectedError: Server disconnected #3669

Open

jianshen92 self-assigned this Oct 18, 2023

frostming added the feedback-wanted Request for feedback label Jul 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

bug: aiohttp.client_exceptions.ServerDisconnectedError: Server disconnected #4238

bug: aiohttp.client_exceptions.ServerDisconnectedError: Server disconnected #4238

nadimintikrish commented Oct 16, 2023

sudeepg545 commented Oct 27, 2023

jianshen92 commented Oct 27, 2023 •

edited

Loading

nadimintikrish commented Oct 31, 2023

frostming commented Jul 12, 2024

bug: aiohttp.client_exceptions.ServerDisconnectedError: Server disconnected #4238

bug: aiohttp.client_exceptions.ServerDisconnectedError: Server disconnected #4238

Comments

nadimintikrish commented Oct 16, 2023

Describe the bug

To reproduce

Expected behavior

Environment

Environment variable

System information

sudeepg545 commented Oct 27, 2023

jianshen92 commented Oct 27, 2023 • edited Loading

nadimintikrish commented Oct 31, 2023

frostming commented Jul 12, 2024

jianshen92 commented Oct 27, 2023 •

edited

Loading