Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: optimize file syncing with s3 #381

Merged
merged 1 commit into from
Jan 17, 2025

Conversation

f-hollow
Copy link
Collaborator

@f-hollow f-hollow commented Jan 9, 2025

Description

This PR adds a workflow that compares checksums for the files described below and syncs the diffs:

  • Hugo public/ files generated in CI for the latest commit
  • Hugo public/ directory already deployed to the S3 bucket

The first preview deployment takes up the same about of time. The subsequent deployments reduced from current 3 minutes to roughly 1 minute.

Issue to be solved

The aws s3 sync command compares files according to their timestamp and size. This syncing approach doesn't work well with CI which generates new files during each and every build. The AWS issue #9074 explains it all. There are numerous threads where users discuss this and suggest that the sync should be based on the file checksums.

Apparently, AWS has known about this issue for quite some time, but there haven't been any visible actions form their side. One common workaround suggested by users is to use the sync option --size-only, however, others warn that it might be a bad idea.

Resources

Follow-up actions

The initial deploy time can also be reduced if we implement object redirects for the files that are not usually changed, which means it is most of the files in our case: previous articles, their media, etc.

Related

Testing

Thorough testing has been done here.


Checklist

Before submitting a Pull Request, please ensure the following:

  • 🚨 This PR does not introduce breaking changes.
  • All CI checks (GH Actions) pass.
  • Documentation is updated as needed.
  • Tests are updated or added as necessary.
  • Code is well-commented, especially in complex areas.
  • Git history is clean — commits are squashed to the minimum necessary.

@f-hollow
Copy link
Collaborator Author

f-hollow commented Jan 9, 2025

@pedrominatel PTAL

@f-hollow f-hollow force-pushed the ci/optimize_file_syncing_with_s3 branch from 3e932a0 to 4ae4436 Compare January 9, 2025 09:54
@f-hollow f-hollow requested a review from pedrominatel January 9, 2025 10:44
@f-hollow f-hollow force-pushed the ci/optimize_file_syncing_with_s3 branch from 4ae4436 to ef3655c Compare January 9, 2025 11:13
AWS_S3_BUCKET: ${{ secrets.PREVIEW_AWS_BUCKET_NAME }}
AWS_REGION: ${{ secrets.AWS_REGION }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

jobs:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add the check to run the jobs only for PRs created under Espressif namespace?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking, there is no need for this limitation. The workflow won't work anyway, because the secrets with our values won't be available in the context of the forked repo.

The absence of this limitation also makes it easier to test changes to this workflow in the testing environment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, but we will always have failed jobs on forked repos, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this workflow won't be even triggered, so, technically, there shouldn't be any failed jobs reported in forks.

@horw
Copy link
Member

horw commented Jan 17, 2025

Thank you for your replies, LGTM!

Copy link
Member

@pedrominatel pedrominatel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @f-hollow.

@f-hollow f-hollow merged commit 48433ab into espressif:main Jan 17, 2025
4 checks passed
@f-hollow f-hollow deleted the ci/optimize_file_syncing_with_s3 branch January 17, 2025 11:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants