Override BQ load job location when necessary #31986
Assigning reviewers. If you would like to opt out of this review, comment R: @shunping for label python. Available commands:
The PR bot will only process comments in the main thread (not review comments).
Looks like there are some presubmit errors caused by the code change.
I think we need to mock
waiting on author
@shunping can you PTAL?
LGTM. Thanks!
There is a race condition here: the retry job could run while the conflicting job is still running, in which case get_table_location fails with a 404 and the Dataflow job fails.
This reverts commit ea98212.
Found a case where retrying a successful load job (e.g. due to a bundle failure) returns a 409 ALREADY_EXISTS error along with a job reference that does not contain a location.
In finish bundle, we perform wait operations on job references: we get each job and poll until it is finished. If we attempt to get a job without a location, the API defaults to the US multi-region. This is problematic when we're writing to a different region, because BQ won't be able to find the job and we encounter a 404 NOT_FOUND error.
In such a case, where the job reference has a null location, we should override it with the table's location.
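A minimal sketch of the override described above, not the code in this PR. It assumes a hypothetical `JobReference` dataclass standing in for the API's job reference and a hypothetical `lookup_table_location` callable that returns the destination table's region. The project, job, and region values are illustrative only.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class JobReference:
    # Hypothetical stand-in for the job reference returned by the BigQuery API.
    project_id: str
    job_id: str
    location: Optional[str] = None


def with_location_fallback(
    job_ref: JobReference,
    lookup_table_location: Callable[[], str],
) -> JobReference:
    """Return job_ref as-is if it has a location, else fill it from the table."""
    if job_ref.location:
        return job_ref
    # Look up the table only when the location is missing, so the common
    # path makes no extra call.
    return replace(job_ref, location=lookup_table_location())


# A reference returned without a location (as in the 409 case) gets the
# table's region before any get/poll call is made.
ref = JobReference("my-project", "load_job_1")
fixed = with_location_fallback(ref, lambda: "europe-west1")
```

Because the lookup is passed as a callable, a reference that already carries a location never triggers a table fetch. The lookup can itself fail, for example with a 404 if the table does not exist yet.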