[ci] AzureDevops jobs failing: NoSpaceLeftError: No space left on devices. #6635
Comments
I put up #6416 to push some commits and get debugging information... that PR now contains a proposed fix.
I think the primary issue was that container images on the self-hosted runners (the pool introduced in #6407) were taking up too much space.
Via commits pushed to #6416, ran the following:

    # check disk usage
    df
    # check docker's disk usage
    docker system df
    # list docker images
    docker images

Saw that the main disk was 85% full (roughly 26 / 31 GB used).
It's very possible that a CI run here could require another 5 GB of data written to disk, summed across the following:
Saw that 20.3 GB of that 26.0 GB was devoted to docker images... many of which were old and unused.
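The headroom concern above (roughly 5 GB free on a 31 GB disk, against a CI run that may write about 5 GB) could be checked up front in a job. This is only a minimal sketch: `has_free_gb` is a hypothetical helper name, and the 5 GB threshold and the `/` mount point are assumptions taken from the numbers in this comment, not anything in the LightGBM CI configs.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the filesystem holding path $1
# has at least $2 GB available (as reported by POSIX-format df).
has_free_gb() {
    path="$1"
    need_gb="$2"
    # -P forces one line per filesystem, -k reports 1024-byte blocks;
    # column 4 of the second line is the "Available" figure.
    avail_kb=$(df -Pk "$path" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# Assumed threshold of 5 GB, matching the estimate above.
if has_free_gb / 5; then
    echo "at least 5 GB free on /"
else
    echo "less than 5 GB free on /, a CI run may hit NoSpaceLeftError"
fi
```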
Ran the following:

    docker system prune \
        --all \
        --force \
        --filter until=720h

And saw 16.5 GB of disk space cleared 😁
Ran those commands again and saw disk usage had fallen to just 30% used.
I think it'd help to routinely clean out old docker images. That should be done via CI configs instead of something like a manually-configured cron job on the images, so that all maintainers (most of whom don't have direct access to the runners) can modify the details.
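One way the routine cleanup could look, as a sketch rather than the actual change in #6416: a small shell function that a CI step calls before the main job. The function name `prune_old_docker_data` and the `DRY_RUN` variable are hypothetical; the sketch uses `docker system prune`, which accepts the `--all`, `--force`, and `--filter until=...` flags quoted above, and defaults to the same 720h (30 day) cutoff.

```shell
#!/bin/sh
# Hypothetical cleanup step for a self-hosted runner.
# $1: maximum age of docker data to keep (defaults to 720h, i.e. 30 days).
# DRY_RUN=1 (an assumed knob) prints the command instead of running it.
prune_old_docker_data() {
    max_age="${1:-720h}"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "docker system prune --all --force --filter until=${max_age}"
        return 0
    fi
    # Skip quietly on runners that do not have docker installed.
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker not found, skipping cleanup" >&2
        return 0
    fi
    # Remove unused images, stopped containers, networks, and build cache
    # that are older than max_age.
    docker system prune --all --force --filter "until=${max_age}"
}
```

A CI step would then run `prune_old_docker_data 720h`. Because the cutoff lives in the repository's CI config, any maintainer can adjust it in a pull request without access to the runners.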
Description
See, for example, https://dev.azure.com/lightgbm-ci/lightgbm-ci/_build/results?buildId=16953&view=logs&j=c90eab90-013e-596e-d874-a7254853d76e
I also see many warnings about disk space on the jobs that run on our custom hosted Linux runner:
Reproducible example
This is happening on all CI jobs.
Environment info
N/A
Additional Comments
Related discussion about files being left behind: #6416