Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] - Default settings cause a 10x performance decrease relative to OMP_NUM_THREADS=1 #115

Open
coltonbh opened this issue Jul 13, 2024 · 3 comments
Labels
unconfirmed This report has not yet been confirmed by the developers

Comments

@coltonbh
Copy link
Contributor

Describe the bug

I understand this library is no longer maintained; however, I haven't tested this on tblite but wanted to add my results here for your reference.

When I run a "typical" calculation on a molecule with ~40-60 atoms or so the calculation is 10x slower than if I set OMP_NUM_THREADS=1. If I profile the code most of the time is spent spawning new process and not actually doing calcultions.

For such lightweight calculations I'd recommend threads over processes as the overhead of spawning new processes is actually higher than the calculation itself. Not sure if you've already updated the tblite implementation to use threads over processes, but I'd certainly recommend this ;P

The performance hit is impressive. If OpenMP is here to stay in the implementation I'd suggest an easy API for passing in the OMP_NUM_THREADS variable and probably set it to 1 by default. 🚗💨 . The current implementation does not offer this possibility and since new processes are spawned outside of the python interpreter unfortunately xtb-python does not respect setting os.environ['OMP_NUM_THREADS': 1] so programatically controlling this important variable is rather challenging (still looking for a solution).

Thx!

@coltonbh coltonbh added the unconfirmed This report has not yet been confirmed by the developers label Jul 13, 2024
@coltonbh
Copy link
Contributor Author

coltonbh commented Jul 13, 2024

Oops! Misspoke but leaving this here for context! I was thinking you were using OpenMPI and my thinking above was in the context of starting new MPI processes as ranks, but you are using OpenMP. Regardless the facts of the setup I shared are still the correct! For some reason setting OMP_NUM_THREADS=1 creates a 10x performance increase and xtb-python spends the majority of times spawning processes rather than doing calculations unless you set this variable. Not sure why process spawning is related to the number of threads, but it is!

Also, this variable gets set inside xtb somehow the minute any bit of the xtb library is imported. So one needs to set this with os.environ['OMP_NUM_THREADS'] = 1 before importing any of the xtb library. Then things work as expected.

Thanks again for your time working out the kinks in a very important package where speed matters! :P

@coltonbh
Copy link
Contributor Author

More details. It appears to be spawning threads within threads. So if I set this variable to 16 (I have a 16 more machine) I actually get 47 (randomly?!) threads spawned on my machine. The resource contention kills the performance. If I set this variable to 1, I get 16 threads spawned. So there are threads spawning threads and this is causing the issue.

@coltonbh
Copy link
Contributor Author

coltonbh commented Jul 13, 2024

This fixes the perf issue, but of course isn't optimal and has to be used wherever you first import anything from the xtb library:

from contextlib import contextmanager
import os

@contextmanager
def set_env_variable(var_name, value):
    """Context manager to set an environment variable temporarily.

    Args:
        var_name: The name of the environment variable.
        value: The value to set the environment variable to.
    """
    original_value = os.environ.get(var_name)
    try:
        os.environ[var_name] = str(value)
        yield
    finally:
        if original_value:
            os.environ[var_name] = original_value
        else:
            del os.environ[var_name]

Then wherever you first import xtb:

with set_env_variable("OMP_NUM_THREADS", "1"):
    import xtb
    import xtb.interface
    import xtb.libxtb
    from xtb.utils import Solvent

@awvwgk these are really all issues with the underlying xtb library. Should I file issues there? Or is this all built anew with the tblite library?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
unconfirmed This report has not yet been confirmed by the developers
Projects
None yet
Development

No branches or pull requests

1 participant