Test order affecting reproducibility #20

Open
FedericoCeratto opened this issue Jan 27, 2016 · 1 comment

@FedericoCeratto

partition_tests() assigns tests to parallel workers based on previous timing data.
It would be useful to be able to assign the tests pseudorandomly:

  1. Occasionally tests succeed or fail depending on their execution order because of imperfect test isolation. Being able to run a test suite multiple times in different orders would help to spot this problem.

  2. To ensure reproducibility, it should be possible to run the tests in the same order every time by setting a fixed random.seed(). Currently testr does not guarantee a constant order.

  3. The test order should not depend on the host the tests are run from. Tests are assigned based on previous timing on the same host, and this can lead to false negatives. With randomization, the order should be a function of only a random seed value and the number of workers.
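To make the request concrete, here is a minimal sketch of such a partitioner. The name partition_pseudorandomly and its parameters are hypothetical and are not part of testr; the sketch also assumes the same Python version on every host, since the shuffle sequence for a given seed is not guaranteed to be identical across Python versions.

```python
import random


def partition_pseudorandomly(test_ids, worker_count, seed):
    """Hypothetical helper (not a testr API): split tests across workers.

    The result depends only on the set of test ids, the seed and the
    number of workers, not on timing data or on discovery order.
    """
    if worker_count < 1:
        raise ValueError("worker_count must be at least 1")
    # Sort first so the input order (which may differ per host) is irrelevant.
    ordered = sorted(test_ids)
    # A private Random instance avoids touching the global random state.
    rng = random.Random(seed)
    rng.shuffle(ordered)
    # Deal the shuffled tests out round-robin, one slice per worker.
    return [ordered[i::worker_count] for i in range(worker_count)]
```

Running the suite with several different seeds would exercise different orders (point 1), while re-running with a recorded seed and worker count would reproduce the same assignment (points 2 and 3).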

@rbtcollins
Member

Reproducibility within a single worker is up to that worker - e.g. subunit.run ordering depends on the unittest2 loader behaviour.

To reproduce a specific worker, something like testr run --load-list <(testr last --subunit | subunit-filter --tags worker-1 | subunit-ls) should do it.

testr run --analyze-isolation presumes that backend test ordering is deterministic, which isn't always true, so it can indeed be fooled here - but it's not something testr can affect today.

The online scheduler will be able to do somewhat better, at which point we can consider the things you're asking for here - though I disagree on point 3. Scheduling can usefully be based on a bunch of things, and what makes sense for different users may be, well, different. So I think it's fine to say you don't want local timing data to influence scheduling, but that doesn't imply global rejection of the approach.
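As a rough illustration of timing data being an optional input rather than a fixed policy, here is a sketch of a greedy longest-first partitioner. It is not testr's actual partition_tests() implementation; the function name, the timings mapping and the tie-breaking rules are all assumptions made for this example.

```python
import heapq


def partition_by_timing(test_ids, worker_count, timings=None):
    """Hypothetical sketch: give each test to the least-loaded worker.

    timings maps test id -> seconds from a previous run. Passing None
    ignores timing data, which degrades to a deterministic round-robin
    over the sorted test ids.
    """
    timings = timings or {}
    # Longest tests first; ties and unknown tests are ordered by id.
    ordered = sorted(test_ids, key=lambda t: (-timings.get(t, 0.0), t))
    # Heap entries are (total seconds, test count, worker index).
    heap = [(0.0, 0, i) for i in range(worker_count)]
    partitions = [[] for _ in range(worker_count)]
    for test_id in ordered:
        load, count, idx = heapq.heappop(heap)
        partitions[idx].append(test_id)
        new_load = load + timings.get(test_id, 0.0)
        heapq.heappush(heap, (new_load, count + 1, idx))
    return partitions
```

A user who does not want local timing data to influence scheduling would simply omit the timings argument, while others could keep using it.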
