segmentation fault at converting llama model #470

Open
mhyeonsoo opened this issue Jan 17, 2025 · 2 comments
mhyeonsoo commented Jan 17, 2025

Description of the bug:

I downloaded the llama3.2-1b-instruct model in order to convert it to TFLite. From the ai_edge_torch/generative/examples/llama directory, I ran:

python convert_to_tflite.py --checkpoint_path=model.safetensors
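
For context, convert_to_tflite.py is essentially a wrapper around the core ai_edge_torch conversion API. A minimal sketch of that flow on a toy module (SmallNet and its input shape are hypothetical; the real script re-authors the llama model from the checkpoint and exports several prefill/decode signatures):

import ai_edge_torch
import torch
import torch.nn as nn

# Hypothetical stand-in for the re-authored llama module.
class SmallNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 8)

    def forward(self, x):
        return self.fc(x)

model = SmallNet().eval()  # eval() avoids the training-mode warnings seen below
sample_inputs = (torch.randn(1, 8),)

# Trace the module with torch.export and lower it to a TFLite flatbuffer.
edge_model = ai_edge_torch.convert(model, sample_inputs)
edge_model.export("small_net.tflite")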

Running the script produced the output below:

2025-01-17 11:23:48.510034: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-01-17 11:23:49.101093: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/home/gea-ai/.local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:335: UserWarning: Device capability of jax unspecified, assuming `cpu` and `cuda`. Please specify it via the `devices` argument of `register_backend`.
  warnings.warn(
WARNING:2025-01-17 11:23:50,990:jax._src.xla_bridge:969: An NVIDIA GPU may be present on this machine, but a CUDA-enabled jaxlib is not installed. Falling back to cpu.
2025-01-17 11:23:51.533181: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-01-17 11:23:51.534009: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-01-17 11:23:51.534179: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
W0117 11:23:59.496164 140590083385152 conversion.py:77] Your model "prefill_8" is converted in training mode. Please set the module in evaluation mode with `module.eval()` for better on-device performance and compatibility.
W0117 11:23:59.496322 140590083385152 conversion.py:77] Your model "prefill_64" is converted in training mode. Please set the module in evaluation mode with `module.eval()` for better on-device performance and compatibility.
W0117 11:23:59.496353 140590083385152 conversion.py:77] Your model "prefill_128" is converted in training mode. Please set the module in evaluation mode with `module.eval()` for better on-device performance and compatibility.
W0117 11:23:59.496378 140590083385152 conversion.py:77] Your model "prefill_256" is converted in training mode. Please set the module in evaluation mode with `module.eval()` for better on-device performance and compatibility.
W0117 11:23:59.496402 140590083385152 conversion.py:77] Your model "prefill_512" is converted in training mode. Please set the module in evaluation mode with `module.eval()` for better on-device performance and compatibility.
W0117 11:23:59.496423 140590083385152 conversion.py:77] Your model "prefill_1024" is converted in training mode. Please set the module in evaluation mode with `module.eval()` for better on-device performance and compatibility.
W0117 11:23:59.496453 140590083385152 conversion.py:77] Your model "decode" is converted in training mode. Please set the module in evaluation mode with `module.eval()` for better on-device performance and compatibility.
Fatal Python error: Segmentation fault

Current thread 0x00007fddadfaa740 (most recent call first):
  File "/home/user/.local/lib/python3.11/site-packages/torch/cuda/__init__.py", line 319 in _lazy_init
  File "/home/user/.local/lib/python3.11/site-packages/torch/cuda/random.py", line 33 in get_rng_state
  File "/home/user/.local/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 208 in _fn
  File "/home/user/.local/lib/python3.11/site-packages/torch/_dynamo/bytecode_transformation.py", line 1322 in transform_code_object
  File "/home/user/.local/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 699 in _compile_inner
  File "/home/user/.local/lib/python3.11/site-packages/torch/_utils_internal.py", line 87 in wrapper_function
  File "/home/user/.local/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 666 in compile_inner
  File "/home/user/.local/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 924 in _compile
  File "/home/user/.local/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 526 in __call__
  File "/home/user/.local/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 1269 in __call__
  File "/home/user/.local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747 in _call_impl
  File "/home/user/.local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736 in _wrapped_call_impl
  File "/home/user/.local/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 465 in _fn
  File "/home/user/.local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747 in _call_impl
  File "/home/user/.local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736 in _wrapped_call_impl
  File "/home/user/.local/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 1432 in inner
  File "/home/user/.local/lib/python3.11/site-packages/torch/export/_trace.py", line 560 in _export_to_torch_ir
  File "/home/user/.local/lib/python3.11/site-packages/torch/export/_trace.py", line 1252 in _strict_export_lower_to_aten_ir
  File "/home/user.local/lib/python3.11/site-packages/torch/export/_trace.py", line 1224 in _strict_export
  File "/home/user/.local/lib/python3.11/site-packages/torch/export/_trace.py", line 1880 in _export
  File "/home/user/.local/lib/python3.11/site-packages/torch/export/exported_program.py", line 114 in wrapper
  File "/home/user/.local/lib/python3.11/site-packages/torch/export/_trace.py", line 990 in wrapper
  File "/home/user/.local/lib/python3.11/site-packages/torch/export/__init__.py", line 270 in export
  File "/home/user/.local/lib/python3.11/site-packages/ai_edge_torch/_convert/conversion.py", line 126 in export
  File "/home/user/.local/lib/python3.11/site-packages/ai_edge_torch/_convert/conversion.py", line 139 in <listcomp>
  File "/home/user/.local/lib/python3.11/site-packages/ai_edge_torch/_convert/conversion.py", line 138 in convert_signatures
  File "/home/user/.local/lib/python3.11/site-packages/ai_edge_torch/_convert/converter.py", line 172 in convert
  File "/home/user/.local/lib/python3.11/site-packages/ai_edge_torch/generative/utilities/converter.py", line 214 in _export_helper
  File "/home/user/.local/lib/python3.11/site-packages/ai_edge_torch/generative/utilities/converter.py", line 119 in convert_to_tflite
  File "/mount/workspace/09.LLM/ai-edge-torch/ai_edge_torch/generative/examples/llama/convert_to_tflite.py", line 79 in main
  File "/home/user/.local/lib/python3.11/site-packages/absl/app.py", line 254 in _run_main
  File "/home/user/.local/lib/python3.11/site-packages/absl/app.py", line 308 in run
  File "/mount/workspace/09.LLM/ai-edge-torch/ai_edge_torch/generative/examples/llama/convert_to_tflite.py", line 91 in <module>

Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, tensorflow.python.framework.fast_tensor_util, h5py._errors, h5py.defs, h5py._objects, h5py.h5, h5py.utils, h5py.h5t, h5py.h5s, h5py.h5ac, h5py.h5p, h5py.h5r, h5py._proxy, h5py._conv, h5py.h5z, h5py.h5a, h5py.h5d, h5py.h5ds, h5py.h5g, h5py.h5i, h5py.h5f, h5py.h5fd, h5py.h5pl, h5py.h5o, h5py.h5l, h5py._selector, scipy._lib._ccallback_c, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg.cython_lapack, scipy.linalg._cythonized_array_utils, scipy.linalg._solve_toeplitz, scipy.linalg._decomp_lu_cython, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.linalg._propack._spropack, scipy.sparse.linalg._propack._dpropack, scipy.sparse.linalg._propack._cpropack, scipy.sparse.linalg._propack._zpropack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, jaxlib.cpu_feature_guard, PIL._imaging, torch._C, torch._C._dynamo.autograd_compiler, torch._C._dynamo.eval_frame, torch._C._dynamo.guards, torch._C._dynamo.utils, torch._C._fft, torch._C._linalg, torch._C._nested, torch._C._nn, torch._C._sparse, torch._C._special (total: 78)
Segmentation fault (core dumped)

Actual vs expected behavior:

Expected to get a converted TFLite model, but the conversion crashed with a segmentation fault instead.
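
As an aside, the conversion log above warns that every signature is exported in training mode. A minimal sketch of the fix that warning suggests, shown on a generic module (the toy model here is hypothetical; the llama example may already handle this internally):

import torch.nn as nn

# Toy stand-in for the module being converted (hypothetical).
model = nn.Sequential(nn.Linear(16, 16), nn.Dropout(0.1))

model.eval()           # disables dropout etc. before export, as the warning suggests
print(model.training)  # -> False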

Any other information you'd like to share?

This is my environment:

  • python: 3.11.11
  • tensorflow: 2.13.0
  • torch: 2.5.0+cu118
  • cuda: 11.8
  • cudnn: 8.7.0
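
A quick way to double-check these version numbers (a minimal sketch; both packages expose standard version attributes):

import torch
import tensorflow as tf

print("torch:", torch.__version__)                       # e.g. 2.5.0+cu118
print("tensorflow:", tf.__version__)                     # e.g. 2.13.0
print("cuda used by torch build:", torch.version.cuda)   # e.g. 11.8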

gaikwadrahul8 commented Jan 22, 2025

Hi, @mhyeonsoo
I apologize for the delay in my response. I have been able to replicate similar behavior on my end while using a GPU; please refer to this gist-file. We'll have to dig more into this issue and will update you.

I also attempted to use the CPU instead of the GPU within Google Colab; however, I encountered a different error: RuntimeError: Cannot set version_counter for inference tensor. This error is also documented in this gist-file.
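
For what it's worth, since the original traceback dies inside torch.cuda._lazy_init, one way to rule the GPU in or out is to hide all CUDA devices before torch initializes (this uses the standard CUDA_VISIBLE_DEVICES mechanism; it is a diagnostic sketch, not a fix):

import os

# Must run before torch (or anything that imports it) touches CUDA,
# so torch.cuda._lazy_init never talks to the driver.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

import torch
print(torch.cuda.is_available())  # -> False when devices are hidden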

EDIT: I would appreciate it if you could attempt using commit 5a93316; another user reported success with that commit, please refer to this comment: #447 (comment)

Thank you for your understanding and patience.

mhyeonsoo (Author) commented

Hi @gaikwadrahul8 ,
Thanks for the response.
I have reviewed 5a93316 and confirmed that I am already using those code changes.

I look forward to hearing an update from you :)
Thanks,

pkgoogle assigned gaikwadrahul8 and unassigned pkgoogle on Jan 24, 2025