TSK_ERR_MUTATION_TIME_OLDER_THAN_PARENT_NODE #352

Closed
steinrue opened this issue Oct 1, 2024 · 9 comments

@steinrue

steinrue commented Oct 1, 2024

Hi there,

I am using SLiM to simulate selection on a de novo mutation and then pyslim to recapitate the genomic background. The versions I am using are python-3.12.4, pyslim-1.0.4, slim-4.3, msprime-1.3.3, and tskit-0.5.8.

The two files in the attached zip file, 'isolated_error.slim' and 'isolated_error.py', reproduce the error for me. I tried to remove as much fluff from the example as I could, but when I stripped it down any further, I no longer got the error. Even choosing a different value for random_seed=... in the recapitate() function (like random_seed=4711 instead of random_seed=77148825) makes it run without error.

When I run "slim isolated_error.slim && python isolated_error.py", the recapitation throws the error

    ts.load_tables(tables._ll_tables, build_indexes=build_indexes)
_tskit.LibraryError: A mutation's time must be < the parent node of the edge on which it occurs, or be marked as 'unknown'. (TSK_ERR_MUTATION_TIME_OLDER_THAN_PARENT_NODE)

Any help would be appreciated,
Matthias

isolated_error.zip

@bhaller
Collaborator

bhaller commented Oct 2, 2024

Hi Matthias,

OK, I can now see that the error is occurring on the Python side; the SLiM script runs without errors. @petrelharp any ideas as to what is going on here?

@petrelharp
Contributor

I get the error now as well. It looks to be a pyslim error, since the initial .trees file is okay; the problem only shows up in recapitate.

@petrelharp
Contributor

petrelharp commented Oct 2, 2024

Okay - looking into this, the first thing I notice is that if I change the recapitate line to

recapTs = pyslim.recapitate(sampleTs, ancestral_Ne=10)

I get

ValueError: Not all roots of the provided tree sequence are at the time expected by recapitate().
 This could happen if you've simplified in python before recapitating (fix: don't simplify first). 
It could also happen in other situations, e.g., you added new individuals without parents in SLiM
 during the course of the simulation with sim.addSubPop(), in which case you will probably need 
to recapitate with msprime.sim_ancestry(initial_state=ts, ...). 

Indeed, you should not simplify before recapitation, or if you do, use keep_input_roots=True; see the documentation, with more discussion here.

Indeed, adding keep_input_roots=True makes the error go away, in this case. However, we shouldn't be producing illegal tree sequences either way.
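
To make the root-time point concrete, here is a toy, pure-Python model of a single tree. It is not tskit's simplify algorithm: the node names, the times, and the helper `root_after_simplify` are all made up for illustration. The idea it sketches is that, without keeping input roots, the reduced tree is rooted at the most recent common ancestor of the retained samples, which is generally younger than the original root that recapitation expects to start from.

```python
# Toy single-tree model; NOT tskit's simplify algorithm.
# Hypothetical tree: child -> parent, with node times in generations ago.
PARENT = {"a": "m", "b": "m", "m": "r", "c": "r"}
TIME = {"a": 0, "b": 0, "c": 0, "m": 40, "r": 1000}


def ancestors(node):
    """Return the path from node up to the root, inclusive."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def root_after_simplify(samples, keep_input_roots=False):
    """Root of the toy tree after reducing it to the given samples."""
    paths = [ancestors(s) for s in samples]
    if keep_input_roots:
        # The original root is retained, even if it ends up unary.
        return paths[0][-1]
    # Otherwise the tree is cut at the most recent common ancestor.
    shared = set(paths[0]).intersection(*paths[1:])
    return min(shared, key=TIME.get)


for keep in (False, True):
    root = root_after_simplify(["a", "b"], keep_input_roots=keep)
    print(f"keep_input_roots={keep}: root {root} at time {TIME[root]}")
```

In this toy, only the keep_input_roots=True case leaves the root at time 1000; with the default, the root sits at time 40, which is the kind of mismatch the ValueError above complains about.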

@petrelharp
Contributor

(There's also the question of whether we should be throwing the "too many root times" error in the non-constant-Ne case that Matthias uses; but there are use cases where different roots are at different times legally, I think.)

@petrelharp
Contributor

Oh wait - this is actually happening in msprime. Changing the pyslim.recapitate line to

msprime.sim_ancestry(initial_state=sampleTs, recombination_rate=1.25e-8, random_seed=77148825, demography=stepStoneDemo)

produces the same error. I'll move this to msprime, but will investigate a bit more.

Anyhow - I'll bet including keep_input_roots=True will fix the problem, @steinrue - whatever is happening is probably because of a weird starting condition. (but if that's not true, let us know?)

@petrelharp
Contributor

Okay - turns out this is the same issue as tskit-dev/msprime#2319 . Here's the code I used to verify this:

# Find the tree covering the only site in slimTs.
site = slimTs.site(0)
t = slimTs.at(site.position)
# The single mutation, the node it sits above, and that node's parent in the tree.
mut = slimTs.mutation(0)
child = mut.node
parent = t.parent(child)
print(f"Mutation is above node {child}, at position {site.position}, "
      f"and has time {mut.time}. \n"
      f"Node {child} is at time {slimTs.node(child).time}, "
      f"and the parent is node {parent}, at time {slimTs.node(parent).time}.\n")

# The same lookup on sampleTs, the tree sequence that was passed to msprime.
o_t = sampleTs.at(sampleTs.site(0).position)
o_mut = sampleTs.mutation(0)
o_parent = o_t.parent(o_mut.node)
print(f"In the original trees given to msprime, the mutation was at time {o_mut.time}, "
      f"was above node {o_mut.node}, at time {sampleTs.node(o_mut.node).time}, "
      f"and has parent node {o_parent}."
)

(Note I also hacked into msprime to save out the tables before converting them to tree sequences to verify this was the issue - but there's only one mutation so it's unambiguous.)
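
The comparison printed by the snippet above boils down to the rule stated in the error message: a mutation's time must be strictly less than the time of the parent node of the edge it sits on, unless the time is unknown. A minimal pure-Python sketch of that rule follows. It is not tskit's implementation; `None` stands in for tskit's unknown-time marker, the helper name `mutation_time_ok` is hypothetical, and the numbers are invented rather than taken from the attached files. The second call shows one hypothetical way the rule could end up broken; whether that is exactly what happens in tskit-dev/msprime#2319 is not established here.

```python
# Simplified model of the rule behind
# TSK_ERR_MUTATION_TIME_OLDER_THAN_PARENT_NODE; not tskit's own code.
def mutation_time_ok(mut_time, parent_time):
    """Check a mutation time against the parent node of its edge.

    mut_time is None when the time is unknown (an assumption of this
    sketch); parent_time is None when the mutation sits above a root.
    """
    if mut_time is None or parent_time is None:
        return True
    return mut_time < parent_time


# Hypothetical numbers: a mutation at time 500 above a root passes...
print(mutation_time_ok(500, None))
# ...but breaks the rule if that root later gains a parent at time 300.
print(mutation_time_ok(500, 300))
```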

I'm going to close this.

@steinrue
Author

steinrue commented Oct 2, 2024

Thank you very much for the quick help, Ben and Peter. Adding keep_input_roots=True to the simplify command does indeed seem to fix the problem, including in the more general case where I am using it.

I guess the moral of the story is RTFM =). Thanks.

@petrelharp
Contributor

Thanks for the report!

@bhaller
Collaborator

bhaller commented Oct 2, 2024

> I guess the moral of the story is RTFM =). Thanks.

Well, maybe a tiny bit of RTFM lol, but there is also a bug here, which is very useful to get a report on. No worries, it's great to get the feedback.
