-
Notifications
You must be signed in to change notification settings - Fork 95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX: Build succeeds and features succeed without lean server crash #106
Conversation
Thanks @Adarsh321123 for the PR! Just adding a note in this PR: I think the Ubuntu we are talking about is one with CUDA-enabled GPU. In the PR two groups of flags are deleted. I gave it a try and indeed it worked on Ubuntu with GPU, despite being fairly slow. I also found that the first line of change seems not necessary -- the build succeeds with or without cutting two of the four flags on that line, i.e. from So then the part of change that really functions here is the second, which is to delete the entire
It seems to me that this completely disables the use of CUDA on a machine with a CUDA-enabled GPU, instead of choosing whether to use CUDA based on the machine's specs. The fact that this works indicates that there might be issues with how the program is being built when |
The first change seems functionally unnecessary so far, i will add it back if there is a reason for that. The second change seems suboptimal but does work, despite being slow. I think it indicates that we should look into how CUDA in used during the build. For now, I will incorporate the second part of change (deleting the CUDA-related flags) while we look more into actual reasons behind that. |
Hi Peiyang! Thank you for your comments. I think the "Ubuntu" I mentioned may not necessarily be one with a CUDA-enabled GPU. This is because the same build and run errors exist on my M1 Mac, which does not have a CUDA-enabled GPU. As such, the bug may be related to CUDA rather than being because of CUDA. I agree that the first set of flags is unnecessary on Ubuntu. However, it is necessary for a successful build on my M1 Mac. Please let me know if you notice the same on your end. I agree that the second change is suboptimal. However, given that we have found an alternative for a while, I agree that it is acceptable to incorporate this change. Also, although this removes CUDA capabilities, I noticed that the 3 features are very fast on an Ubuntu desktop with a 4090, but very slow on a DGX server with 4 A100s. This again suggests that CUDA may not be the direct issue, but may be related. |
I see, I said it's probably related to CUDA because I was able to build everything on a Ubuntu machine without CUDA-enabled GPUs. Anyways, I think we can incorporate the temporary suboptimal fix for now. |
For the M1 Mac, that's interesting. So you were unable to successfully |
It's interesting that you were able to build on an Ubuntu machine without CUDA-enabled GPUs. |
I actually don't have a M1 Mac myself, so if you could confirm that the first line is necessary for M1 Mac, and if I validate that the change in the first line does not make any other architecture not work (I'll handle that testing), we can just incorporate the change. |
Yes, I was unable to successfully run |
Sounds great! Let me know of the result after you double check that. Then I'll merge. Thanks! |
I can confirm that I cannot |
Sounds great, thanks for the contribution! |
Previously on Ubuntu,
lake build
failed. Now,lake build
succeeds, and features like premise selection and tactic suggestion also succeed rather than cause the Lean server to crash.