Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more scenarios to system tests #2738

Merged
merged 18 commits into from
Apr 6, 2023
Merged

Conversation

lloeki
Copy link
Member

@lloeki lloeki commented Apr 3, 2023

What does this PR do?

Enable more system test scenarios to be run in CI

Motivation

The default scenario covers only a limited number of tests

  • Validate new developments
  • Guarantee no regression

Additional Notes

Probably the system tests repo should provide an cross-repo reusable GitHub workflow so that we don't have to update this one manually.

How to test the change?

CI

@lloeki lloeki requested a review from a team April 3, 2023 15:27
@github-actions github-actions bot added the dev/github Github repository maintenance and automation label Apr 3, 2023
@lloeki lloeki force-pushed the enable-more-system-test-scenarios branch 13 times, most recently from 4efbd97 to b50f2b2 Compare April 4, 2023 10:14
@GustavoCaso
Copy link
Member

The total time for running all the system tests is ~30 minutes.

Can we reduce the CI time for running the system tests? We are currently running each scenario for each weblog variant.

Variants:

- rack
- sinatra14
- sinatra20
- sinatra21
- rails32
- rails40
- rails41
- rails42
- rails50
- rails51
- rails52
- rails60
- rails61
- rails70

Scenarios:

- DEFAULT
- REMOTE_CONFIG_MOCKED_BACKEND_ASM_DD
- REMOTE_CONFIG_MOCKED_BACKEND_ASM_FEATURES
- REMOTE_CONFIG_MOCKED_BACKEND_ASM_FEATURES_NOCACHE
- REMOTE_CONFIG_MOCKED_BACKEND_ASM_DD_NOCACHE
- APPSEC_CUSTOM_RULES
- APPSEC_MISSING_RULES
- APPSEC_CORRUPTED_RULES
- APPSEC_DISABLED
- APPSEC_LOW_WAF_TIMEOUT
- APPSEC_CUSTOM_OBFUSCATION
- APPSEC_RATE_LIMITER
- APPSEC_IP_BLOCKING
- APPSEC_REQUEST_BLOCKING
- SAMPLING
- PROFILING

That's 224 (14 * 16) jobs per commit. Should we try not running all the permutations?

@lloeki lloeki force-pushed the enable-more-system-test-scenarios branch 4 times, most recently from 36a32ed to 9fdc833 Compare April 4, 2023 12:46
@lloeki
Copy link
Member Author

lloeki commented Apr 4, 2023

The total time for running all the system tests is ~30 minutes.

Yeah I'm trying to reduce that by altering the matrix (e.g some tests don't depend on the framework). Also using docker image caching will help by saving a lot of rebuilds.

This is a first shot to make sure things work.

@lloeki lloeki force-pushed the enable-more-system-test-scenarios branch from e4ed8ea to eafa5d8 Compare April 4, 2023 13:19
@codecov-commenter
Copy link

codecov-commenter commented Apr 4, 2023

Codecov Report

Merging #2738 (0628ee5) into master (fd50833) will decrease coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #2738      +/-   ##
==========================================
- Coverage   98.06%   98.06%   -0.01%     
==========================================
  Files        1222     1222              
  Lines       67271    67269       -2     
  Branches     3013     3013              
==========================================
- Hits        65972    65970       -2     
  Misses       1299     1299              
Impacted Files Coverage Δ
spec/ddtrace/release_gem_spec.rb 100.00% <ø> (ø)

... and 2 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@lloeki lloeki force-pushed the enable-more-system-test-scenarios branch from 9e2577a to 6386a70 Compare April 4, 2023 13:59
@lloeki lloeki force-pushed the enable-more-system-test-scenarios branch from 561bebd to 8c8623c Compare April 5, 2023 13:42
@lloeki lloeki force-pushed the enable-more-system-test-scenarios branch from a82c759 to 8ebd4f4 Compare April 5, 2023 14:21
@lloeki
Copy link
Member Author

lloeki commented Apr 5, 2023

PROFILING and SAMPLING scenarios are a bit on the slow side.

Unfortunately we can't access matrix at the job level:

    if: ${{ github.ref == 'refs/heads/master' || matrix.scenario != 'PROFILING' || matrix.scenario != 'SAMPLING' }}

@lloeki
Copy link
Member Author

lloeki commented Apr 5, 2023

Using artifacts to transfer images is a bit slow too.

A better option would be to move to using ghcr.io since it would only transfer unknown layers, but this would pollute packages a bit. This can be mitigated with a retention policy, implementable using actions/delete-package-versions.

@lloeki
Copy link
Member Author

lloeki commented Apr 5, 2023

Using --cache-from= was successful for runner and agent, making builds fly, but for some reason the runner is unable to reach the agent (/info health check fails to respond):

runner log:

runner    |!!!!!!!!!!!!!!!!! _pytest.outcomes.Exit: http://agent:8126/info never answered to healthcheck request !!!!!!!!!!!!!!!!!!

agent log (this is the only line in the log):

Agent 7.43.1 - Commit: 9e9c790 - Serialization version: v5.0.67 - Go version: go1.19.6

pytest log:

07:37:59.771 DEBUG    Starting new HTTP connection (1): agent:8126
07:37:59.778 DEBUG    Healthcheck #60 on http://agent:8126/info: HTTPConnectionPool(host='agent', port=8126): Max retries exceeded with url: /info (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f8de6ed1400>: Failed to establish a new connection: [Errno -2] Name or service not known'))

Building the app images failed to use the cache, but that may be due to how the Dockerfiles and docker-wrapping shell scripts are written.

@lloeki
Copy link
Member Author

lloeki commented Apr 5, 2023

An improvement, both in UI, in code clarity, and in architecture would be to leverage reusable workflows.

Ideally these reusable workflow YAML files could even be hosted on the system tests repository (if it makes sense) or referenced from the system tests repository from this one (or the system-tests–apps-ruby repository), potentially reducing duplication.

@lloeki lloeki force-pushed the enable-more-system-test-scenarios branch 4 times, most recently from 38d42ed to 018dac4 Compare April 5, 2023 15:19
@lloeki lloeki force-pushed the enable-more-system-test-scenarios branch from 018dac4 to 853f567 Compare April 5, 2023 15:22
@lloeki lloeki force-pushed the enable-more-system-test-scenarios branch from 0628ee5 to 0cd950e Compare April 5, 2023 16:12
@lloeki
Copy link
Member Author

lloeki commented Apr 5, 2023

For caching, this inconsistency is annoying: DataDog/system-tests#1046

@lloeki lloeki force-pushed the enable-more-system-test-scenarios branch from 3d94762 to b5ee9cf Compare April 5, 2023 20:27
@lloeki
Copy link
Member Author

lloeki commented Apr 5, 2023

Cleanup works, but I get this strange issue, which I reported upstream:

actions/delete-package-versions#101

@lloeki
Copy link
Member Author

lloeki commented Apr 5, 2023

The total time for running all the system tests is ~30 minutes.

Last run was ~16min. That's without the full benefit of docker build caching though (which should drive the build part down to mere seconds in most cases), until this comment is addressed.

@lloeki
Copy link
Member Author

lloeki commented Apr 5, 2023

Improvements and fixes delayed to later PRs:

Them not being in here has little impact compared to the value added so I'll be merging this PR so that it doesn't linger on.

@lloeki lloeki merged commit 87dc8f2 into master Apr 6, 2023
@lloeki lloeki deleted the enable-more-system-test-scenarios branch April 6, 2023 11:25
@github-actions github-actions bot added this to the 1.11.0 milestone Apr 6, 2023
@lloeki lloeki modified the milestones: 1.11.0, 1.11.0.beta1 Apr 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dev/github Github repository maintenance and automation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants