
[FEAT]: Reduce API throttling by reading all env variables per env at once. #2526

Open
1 task done
DavidWHarvey opened this issue Dec 30, 2024 · 1 comment
Labels: Status: Triage (This is being looked at and prioritized), Type: Feature (New feature or request)

Comments

@DavidWHarvey

Describe the need

When running plans against 64 repos, we hit API throttling roughly an order of magnitude sooner because env variables are read one at a time rather than with a single API call that fetches all of them (the data source can read them all in one call). With respect to the number of resources per repo, env variables dominate, since there may be 10 to 50 per environment.

We are re-planning repos and Azure resources (using a matrix per repo, so one plan per repo in parallel). The process takes about 8 minutes across 64 repos and consumes about 60% of the rate-limit budget, i.e., we cannot run it twice in the same hour.

I have identified three possible solutions:

  • On a read, fetch all values for the environment and cache them in memory. This is simple, but I'm unclear how it might affect Terraform.
  • Create a new resource that holds the env-variable map for the entire environment, similar to the existing data source.
  • Work around this with a provisioner, using the data source to trigger the provisioner to run. The provisioner would sync a per-environment map with the contents of the environment.
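The first option (read all, cache in memory) could look something like the following sketch. This is not the provider's actual code; `EnvVariableCache` and `fetch_all` are hypothetical names, and the real implementation would sit inside the provider's read path, calling the bulk "list environment variables" API endpoint once per (repo, environment) pair.

```python
from typing import Callable, Dict, Tuple

class EnvVariableCache:
    """Hypothetical in-memory cache: one bulk API call per (repo, env),
    then individual variable reads are served from memory."""

    def __init__(self, fetch_all: Callable[[str, str], Dict[str, str]]):
        # fetch_all performs the single API call that returns every
        # variable for the given repo/environment pair.
        self._fetch_all = fetch_all
        self._cache: Dict[Tuple[str, str], Dict[str, str]] = {}

    def get(self, repo: str, env: str, name: str) -> str:
        key = (repo, env)
        if key not in self._cache:
            # First read for this environment triggers the one bulk call.
            self._cache[key] = self._fetch_all(repo, env)
        return self._cache[key][name]
```

With 10 to 50 variables per environment, this collapses 10 to 50 API calls into one, which is where the order-of-magnitude improvement would come from. The open question noted above (cache staleness within a Terraform run) is untouched by this sketch.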

SDK Version

No response

API Version

No response

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@DavidWHarvey DavidWHarvey added Status: Triage This is being looked at and prioritized Type: Feature New feature or request labels Dec 30, 2024
@DavidWHarvey
Author

We implemented the provisioner workaround with a single terraform_data. The "used" metric is now <400, versus ~9000 before this change, for one run on 137 repos. I don't fully understand the metric, but the limit is 15,000.

The provisioner calls a simple Python script that is passed the old and new maps for an environment and syncs them by calling the gh command. The tedious parts (after remembering to pass GH_TOKEN, but declaring it nonsensitive since it is not logged) involve the data source for reading env vars not tolerating a non-existent repo or environment. You cannot use other data members to filter the keys used for a for_each. We had to make an API call before running Terraform to generate the list of environments for each repo, to avoid errors when enumerating env vars.
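The sync script itself is not shown in this thread; a minimal sketch of the diff-and-sync logic it describes might look like this. Function and variable names are hypothetical, and the `gh variable set`/`gh variable delete` invocations assume the gh CLI's variable subcommands with `--repo` and `--env` flags.

```python
import subprocess
from typing import Dict, List, Tuple

def plan_sync(old: Dict[str, str], new: Dict[str, str]) -> List[Tuple[str, str]]:
    """Diff the old and new env-variable maps into (action, name) steps."""
    steps: List[Tuple[str, str]] = []
    for name, value in new.items():
        # Create missing variables and update changed ones.
        if name not in old or old[name] != value:
            steps.append(("set", name))
    for name in old:
        # Remove variables that are no longer desired.
        if name not in new:
            steps.append(("delete", name))
    return steps

def apply_sync(repo: str, env: str, old: Dict[str, str], new: Dict[str, str]) -> None:
    """Apply the planned steps via the gh CLI (requires GH_TOKEN in the env)."""
    for action, name in plan_sync(old, new):
        if action == "set":
            subprocess.run(
                ["gh", "variable", "set", name,
                 "--repo", repo, "--env", env, "--body", new[name]],
                check=True,
            )
        else:
            subprocess.run(
                ["gh", "variable", "delete", name, "--repo", repo, "--env", env],
                check=True,
            )
```

Separating `plan_sync` from `apply_sync` keeps the diff logic testable without touching the API, which matters here given how tight the rate budget is.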

Because Terraform discourages acting on prior state, drift detection was also tedious. We want to trigger replacement if current != desired, to handle the case where current changed but desired didn't. But we need some record of the prior desired state to avoid needing a second plan/apply to clear the fact that we triggered due to a mismatch. Adding a variable LAST_CHANGED, set to a timestamp, provided a means to force a trigger to replace (resync) while recording state that would be correct on the next run.
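The terraform_data arrangement described above might be sketched roughly as the following config fragment. This is not the author's actual configuration; the variable and resource names are illustrative, and it only shows the shape of the trigger/provisioner wiring.

```hcl
# Hypothetical sketch of the workaround, not the author's code.
variable "desired_env_vars" {
  type = map(string)
}

variable "LAST_CHANGED" {
  # Timestamp bumped to force a resync while recording state
  # that will be correct on the next run.
  type = string
}

resource "terraform_data" "env_sync" {
  # Replace (and thus resync) when the desired map or the marker changes.
  triggers_replace = {
    desired      = sha1(jsonencode(var.desired_env_vars))
    last_changed = var.LAST_CHANGED
  }

  provisioner "local-exec" {
    command = "python sync_env_vars.py"
    environment = {
      GH_TOKEN = nonsensitive(var.gh_token)
    }
  }
}
```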
