Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Less constrained ordered task prep #1335

Merged
merged 3 commits into from
Oct 1, 2018

Conversation

carver
Copy link
Contributor

@carver carver commented Sep 28, 2018

What was wrong?

While sketching out a skeleton header sync, I found myself wanting OrderedTaskPreparation but with some constraints removed:

  1. The ability to register tasks out of order (eg~ register a header when we don't have its parent yet)
  2. The ability to register tasks without any prereqs (eg~ once we have a header registered, we don't need anything else besides all its ancestors)

How was it fixed?

  • Reworked how pruning was done, which didn't support out-of-order tasks registration.
  • Fairly simple to support tasks without prereqs, just had to check at register time if the task was already done
  • More tests to cover these cases (and the combination of these cases)

I decided that often we don't want to permit out-of-order tasks. It's useful to get the warnings in the downstream sync steps. So permitting out-of-order tasks is an opt-in flag.

This comes at some cost to pruning time, which could be reduced with some more effort, if it seems to be a bottleneck.

Cute Animal Picture

put a cute animal picture link inside the parentheses

@carver carver force-pushed the less-constrained-ordered-task-prep branch from b0be735 to 80700b2 Compare September 29, 2018 00:10
Copy link
Member

@pipermerriam pipermerriam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I honestly still have trouble digesting this API you've created. That isn't to say there is anything wrong with it, more that I haven't actually worked directly with it in any way.

What I don't see in this PR is an update to the docstrings. Can you do a quick audit to see if they need to be updated for these changes?

if task_id not in self._tasks:
raise ValidationError(f"No task {task_id} is present")
else:
return task_id
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning the task_id here seems non-standard. Most of our validation functions either raise exceptions or return None.


def _find_oldest_unpruned_task_id(self, finished_task_id: TTaskID) -> TTaskID:
get_dependency_of_id = compose(
self._validate_has_task,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is the reason you're returning the task_id from the validate function. You can use cytoolz.do to accomplish that.

from cytoolz.curried import do
get_dependency_of_id = compose(
    do(self._validate_has_task),
    ...
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hah, I just ran across that last week actually. Pretty handy. Until ethereum/eth-utils#136 is done, this is my workaround:

curry(do)(self._validate_has_task)

self._tasks.get,
)
ancestry_pipeline = repeat(get_dependency_of_id, self._max_depth)
return pipe(finished_task_id, *ancestry_pipeline)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just functional programming nitpicking. I think you can do the same thing in a less readable manner as follows. Code may have an off-by-one error.

from cytoolz import iterate, nth
return nth(self._max_depth, iterate(get_dependency_of_id, finished_task_id))

Fancy!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hah, nice, yeah I like it better! I was actually looking for something like iterate the other day when I found do. 👍

"""
root_candidate = task_id
get_dependency_of_id = compose(self._dependency_of, attrgetter('task'), self._tasks.get)
for depth in count():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can define an upper bound on this loop? The infiniteness of it seems unnecessary and it seems like it would be easier to debug in that case if it were to explicitly break and throw an exception at some reasonable upper bound.

@carver carver force-pushed the less-constrained-ordered-task-prep branch from 80700b2 to 7b5601d Compare October 1, 2018 21:40
@carver
Copy link
Contributor Author

carver commented Oct 1, 2018

I honestly still have trouble digesting this API you've created. That isn't to say there is anything wrong with it, more that I haven't actually worked directly with it in any way.

Yeah, the API is completely designed around being handy from the point of view of the syncer. But still, it's worth iterating on if it's not pretty immediately grok-able. Just maybe not right now: I don't have any better ideas, and it's doing its job of making the syncer easier to write/comprehend.

@carver carver merged commit 4a2cadc into ethereum:master Oct 1, 2018
@carver carver deleted the less-constrained-ordered-task-prep branch October 1, 2018 23:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants