Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added an alternative way to generate first-order multiplets with better scaling #27

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

RaphaelRobidas
Copy link

The current method of generating multiplets (splitting multiple times into doublets) scales exponentially with the multiplicity. An alternative way to generate multiplets was implemented using binomial coefficients. It's faster for high multiplicities (although slower for simple cases) while remaining pretty elegant. The code is not currently used by the other functions, I let you judge whether it would be useful to use elsewhere in the codebase.

Benchmark results:

Couplings: [(8, 2), (4, 2)]
Current: 0.0466 s for 10000 iterations
New: 0.1685 s for 10000 iterations

Couplings: [(8, 2), (4, 2), (7, 3), (1.5, 3)]
Current: 2.5646 s for 10000 iterations
New: 2.5093 s for 10000 iterations

Couplings: [(8, 8), (4, 5)]
Current: 21.5207 s for 10000 iterations
New: 0.8021 s for 10000 iterations

Couplings: [(13, 8), (7, 5)]
Current: 22.7828 s for 10000 iterations
New: 0.8319 s for 10000 iterations

Verification and benchmarking code:

import time

from nmrsim.firstorder import multiplet, binomial_multiplet

def equivalent(peak1, peak2):
    if len(peak1) != len(peak2):
        return False
    for p1, p2 in zip(peak1, peak2):
        for c1, c2 in zip(p1, p2):
            if abs(c1 - c2) > 1e-9:
                return False
    return True

for i in range(1, 10):
    for j in range(1, 10):
        ref = multiplet((300, 1), [(13, i), (7, j)])
        comp = binomial_multiplet((300, 1), [(13, i), (7, j)])
        assert equivalent(ref, comp)

N = 10000

for case in [
    [(8, 2), (4, 2)],
    [(8, 2), (4, 2), (7, 3), (1.5, 3)],
    [(8, 8), (4, 5)],
    [(13, 8), (7, 5)],
]:
    t1 = time.time()
    for i in range(N):
        multiplet((300, 1), case)
    t2 = time.time()
    print(case)

    print(f"Current: {t2-t1:.4f} s for {N} iterations")
    t1 = time.time()
    for i in range(N):
        binomial_multiplet((300, 1), case)
    t2 = time.time()

    print(f"New: {t2-t1:.4f} s for {N} iterations")
    print("")

I also noticed a slight confusion regarding the notation of the multiplicities. Currently, a coupling of the form (8, 2) will produce a triplet with J = 8 Hz, as it causes the signal to split twice. The docstring of the multiplet function previously indicated that (8, 2) would produce a doublet.

@sametz
Copy link
Owner

sametz commented Dec 23, 2023

Brief response for now (traveling for the holidays): Good catch on the documentation error. When I get back from vacation I'll review your new version of the function.

@sametz
Copy link
Owner

sametz commented Jan 10, 2024

Two things:

  1. The next version release will be Python 3.8+. That means math.comb from the standard library could be used instead of scipy. I'd like to minimize reliance on external libraries where convenient. Could you try implementing your solution with math.comb? See this link on Stack Overflow.

  2. How many splittings need there be before your code is faster than the current code? Is there a real-world example that can be used as a benchmark? For hydrocarbons, anything over 8 nuclei would be unusual, but my understanding is that for other spin-half nuclei such as phosphorous more complexity can be seen.

@RaphaelRobidas
Copy link
Author

I didn't know about math.comb, very nice.

As for the timing, I ran the benchmarks again and the new code is now as fast or faster as the current code:

[(8, 2)]
Current: 0.1899 s for 100000 iterations
New: 0.1985 s for 100000 iterations

[(8, 3)]
Current: 0.3185 s for 100000 iterations
New: 0.2371 s for 100000 iterations

[(8, 4)]
Current: 0.5507 s for 100000 iterations
New: 0.2804 s for 100000 iterations

[(8, 5)]
Current: 1.0068 s for 100000 iterations
New: 0.3223 s for 100000 iterations

[(8, 2), (4, 2)]
Current: 0.5759 s for 100000 iterations
New: 0.5650 s for 100000 iterations

[(8, 2), (4, 2), (7, 3), (1.5, 3)]
Current: 32.0979 s for 100000 iterations
New: 7.9310 s for 100000 iterations

It seems that the implementation in the standard library is a bit faster than the one in Scipy. Overall, the new code does not seems to have drawbacks now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants