Is deploying Qwen2-VL on Jetson Xavier NX supported? #53

Open

zjl2000 opened this issue Jan 16, 2025 · 1 comment

zjl2000 commented Jan 16, 2025

No description provided.

Collaborator

kzjeef commented Jan 16, 2025

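If I remember correctly, Xavier's compute capability is SM70?

In that case, Xavier is currently a build combination of an ARMv8 host plus an SM70 GPU.

First, the ARMv8 host is supported, but only as a CPU-only build configuration. We also haven't tried cross-compiling yet, so for now the build has to be done natively on the Xavier itself.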

SM70 is also supported, although some features such as FP8, BF16, and flash attention cannot be used (SM70 does not support them). FP16, A16W8, and A16W4 are all supported, and attention uses the xformers implementation.
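To make the feature split above concrete, here is a rough sketch of a capability lookup. It is not this project's code: `GpuFeatures` and `features_for` are hypothetical names, and the thresholds for newer GPUs (BF16 and flash attention from SM80, FP8 from SM89) are my assumptions based on general CUDA hardware support. Only the SM70 row comes from the comment above. NVIDIA lists Jetson Xavier parts as compute capability 7.2, so the sketch treats every 7.x device the same way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuFeatures:
    """Hypothetical summary of what a given compute capability can use."""
    fp16: bool
    bf16: bool
    fp8: bool
    flash_attention: bool
    weight_only_quant: bool  # A16W8 / A16W4
    attention_backend: str


def features_for(major: int, minor: int) -> GpuFeatures:
    """Map a CUDA compute capability to the features described in the thread.

    The SM7x behaviour follows the comment above; the SM80 / SM89 cutoffs
    are assumptions, not something the maintainers stated.
    """
    sm = major * 10 + minor
    if sm < 70:
        # The thread only discusses SM70 and newer; anything older is out of scope here.
        raise ValueError(f"SM{sm} is not covered by this sketch")

    has_flash = sm >= 80  # assumption: flash attention needs Ampere or newer
    return GpuFeatures(
        fp16=True,
        bf16=sm >= 80,       # assumption: BF16 needs SM80+
        fp8=sm >= 89,        # assumption: FP8 needs SM89+
        flash_attention=has_flash,
        weight_only_quant=True,
        attention_backend="flash-attention" if has_flash else "xformers",
    )


# Xavier NX reports compute capability 7.2.
xavier = features_for(7, 2)
print(xavier)
```

On a device with PyTorch installed, the `(major, minor)` pair could be read from `torch.cuda.get_device_capability()` instead of being hard-coded.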

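In theory this should work, but there will probably be some build issues that need compile fixes on our side.

Feel free to try it first. Now that we have this request, we will also schedule time to look into it.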
