This tutorial walks you through the creation of the required infrastructure and the BigQuery Utilization Alerts service.
For more information on the general solution architecture and the required tools, please read the README.
Check the README.md
Google recommends creating a dedicated project for managing BigQuery slot commitments, reservations, and assignments. This makes it easier to manage your reservations in a centralized place and also helps with controlling IAM-based access to these resources.
You will be setting up such an admin project over the course of this tutorial, including all of the BigQuery reservation resources, to showcase how the service interacts with these resources. Additionally, you'll be setting up two more projects, which will be assigned to use the reserved BigQuery slot capacity.
Disclaimer: This tutorial will instruct you to also create BigQuery Flex Slots commitments in the admin project. These flex slots can incur considerable costs, so make sure that you decommission them as soon as you no longer need them.
Let's begin by creating a new (admin) project for this tutorial.
Configure the gcloud CLI to use the same (admin) project:
gcloud config set project <PROJECT_ID>
Once the admin project has been selected, you can create reservations in it. We will purchase dedicated slots (called commitments), create pools of slots (called reservations), and assign BigQuery compute projects (resources) to those reservations.
You might use the different projects to represent different workloads or business units within your organization.
Next, create two additional projects and make note of their Project-IDs. You will need these later on. These projects will be home to the workloads or business units using BigQuery.
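If you prefer the command line, you can create them with gcloud (the Project-IDs below are placeholders, and you may also need to link a billing account to each project):
gcloud projects create <PROVIDE-WORKLOAD-PROJECT-A-ID>
gcloud projects create <PROVIDE-WORKLOAD-PROJECT-B-ID>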
The BigQuery Reservation API needs to be enabled.
Alternatively, you can enable the BigQuery Reservation API directly from the CLI:
gcloud services enable bigqueryreservation.googleapis.com
First, you will need to assign a couple of variables that will be used during this part of the tutorial.
Set to the current (admin) project:
export ADMIN_PROJECT_ID="$(gcloud config get-value project)"
Replace these with the Project-IDs of the previously created additional projects:
export PROJECT_A_SLOTS=<PROVIDE-WORKLOAD-PROJECT-A-ID>
export PROJECT_B_SLOTS=<PROVIDE-WORKLOAD-PROJECT-B-ID>
Commitments and reservations will be created in the US multi-region:
export LOCATION=US
export RESERVATION_NAME_1=r1
export RESERVATION_NAME_2=r2
Now it's time to purchase Flex Slots. You can learn more about them [here].
Purchase 100 Flex Slots on the current (admin) project:
bq mk \
--location=$LOCATION \
--capacity_commitment \
--plan=FLEX \
--slots=100
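To confirm that the commitment was created, you can list the capacity commitments (the same listing is used again during cleanup):
bq ls --location=$LOCATION --capacity_commitment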
Next, you'll split up the commitment into two reservations of equal size:
bq mk \
--location=$LOCATION \
--reservation \
--slots=50 \
--ignore_idle_slots=false \
$RESERVATION_NAME_1
bq mk \
--location=$LOCATION \
--reservation \
--slots=50 \
--ignore_idle_slots=false \
$RESERVATION_NAME_2
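As a quick sanity check, you can list the reservations in the admin project; mirroring the commitment listing above, bq ls also accepts a --reservation flag:
bq ls --location=$LOCATION --reservation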
You can now assign a reservation to the workload projects.
Let's map the first reservation to the first workload project...
bq mk \
--location=$LOCATION \
--reservation_assignment \
--reservation_id=$RESERVATION_NAME_1 \
--job_type=QUERY \
--assignee_id=$PROJECT_A_SLOTS \
--assignee_type=PROJECT
... and the other to the second workload project:
bq mk \
--location=$LOCATION \
--reservation_assignment \
--reservation_id=$RESERVATION_NAME_2 \
--job_type=QUERY \
--assignee_id=$PROJECT_B_SLOTS \
--assignee_type=PROJECT
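Optionally, confirm both assignments; this mirrors the listing used during cleanup later in this tutorial:
bq ls --location=$LOCATION --reservation_assignment $ADMIN_PROJECT_ID:US.$RESERVATION_NAME_1
bq ls --location=$LOCATION --reservation_assignment $ADMIN_PROJECT_ID:US.$RESERVATION_NAME_2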
At this point, the two workload projects are set up to consume slot capacity from the provisioned commitment. If you'd like, you can use the Google Cloud Console and run some BigQuery jobs from one of the workload projects.
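For example, a simple query from the first workload project will consume slots from its assigned reservation (the public dataset below is just an illustration; any BigQuery job will do):
bq query \
--project_id=$PROJECT_A_SLOTS \
--use_legacy_sql=false \
'SELECT COUNT(*) FROM `bigquery-public-data.samples.shakespeare`'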
The service you are about to deploy acts like a bot and will occasionally post messages to a chat service like Google Chat or Slack. To do this, you will need to set up webhooks in your preferred chat application. Please refer to the external documentation to do this.
Instructions for setting up webhooks in Slack
Instructions for setting up webhooks in Google Chat
Copy the created pre-signed webhook URLs as you will need them later on.
The utilization analysis service is written in Go, and you can test and run it locally.
Have a look at the code in src/main.go
Export the current project to the environment:
export GOOGLE_CLOUD_PROJECT="$(gcloud config get-value project)"
The service can be configured using environment variables. SLOT_USAGE_THRESHOLD sets the minimum slot utilization at which the service will actually send notifications to Slack or Google Chat:
export SLOT_USAGE_THRESHOLD="0.8"
(Optional) If you'd like to have the service dump its analysis to blob storage, you can specify a GCS bucket:
export STATE_BUCKET=<PROVIDE-BUCKET-NAME>
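If that bucket doesn't exist yet, you can create it first (the bucket name is a placeholder):
gsutil mb gs://<PROVIDE-BUCKET-NAME>/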
Next, replace the placeholder values and export the Slack or Google Chat webhook URLs (or both) to the environment:
export SLACK_WEBHOOK_URL="https://hooks.slack.com/services/xxxxxxxxxx/xxxxxxxxxx/xxxxxxxxxxxxxxxxxxxxx"
export GCHAT_WEBHOOK_URL="https://chat.googleapis.com/v1/spaces/xxxxxxxxxxx/messages?key=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx&token=xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
Finally, we can execute the service locally:
cd src
go run main.go
cd ..
To test the service locally, you need to invoke its endpoint with an HTTP POST request. You can do this with curl:
curl -X POST \
http://localhost:8080
Enable Cloud Build to build the service container image remotely.
Alternatively, you can enable the Cloud Build API directly:
gcloud services enable cloudbuild.googleapis.com
Upload the build context in ./src and run the container image build on Cloud Build:
gcloud builds submit --tag gcr.io/$(gcloud config get-value project)/bq-utilization-alerts ./src
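Once the build finishes, you can verify that the image has been pushed to the registry:
gcloud container images list --repository=gcr.io/$(gcloud config get-value project)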
Now that the container image for the Cloud Run service has been built, it's time to deploy the central infrastructure and the service to the admin project.
HashiCorp's Terraform is used to set up the infrastructure as code. To track the Terraform state remotely, we first need to create a GCS bucket:
gsutil mb gs://$(gcloud config get-value project)-terraform-state/
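Optionally, enable object versioning on the state bucket so that previous state files are retained:
gsutil versioning set on gs://$(gcloud config get-value project)-terraform-state/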
Update terraform/config.tf with the name of the newly created Terraform state bucket.
Edit terraform/config.tf
Update terraform/config.tf to match the configuration of your environment.
Edit terraform/config.tf
After completing the configuration, Terraform needs to be initialized with the Google Cloud providers:
terraform -chdir=terraform init
Finally, you can plan the terraform deployment...
terraform -chdir=terraform plan
... and apply it:
terraform -chdir=terraform apply
Insert your pre-signed Slack and/or Google Chat webhook URLs into new versions of the created secrets in Google Cloud Secret Manager:
echo -n "https://chat.googleapis.com/v1/spaces/....." | gcloud secrets versions add "$(terraform -chdir=terraform output -raw gchat_secret)" --data-file=-
echo -n "https://hooks.slack.com/services/....." | gcloud secrets versions add "$(terraform -chdir=terraform output -raw slack_secret)" --data-file=-
The infrastructure has been set up, and the scheduler should already be invoking the service in the central (admin) project periodically.
Note: By default, the scheduler is set to trigger the service every 5 minutes. You might have to wait a little to see the first invocations.
Check the execution logs of the Cloud Run service.
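If you prefer the CLI, you can pull recent Cloud Run logs with a generic filter; since the service name depends on your Terraform configuration, the sketch below narrows only by resource type:
gcloud logging read 'resource.type="cloud_run_revision"' --limit=20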
Finally, in order to be able to query for BigQuery jobs in the workload projects, the service's service account needs the necessary permissions on those projects.
To do this, you need to add configuration to the Terraform manifest. Update terraform/config.tf and add the Project-IDs of the two previously created workload projects to the set local.scan_projects:
Edit terraform/config.tf
Alternatively, you can add the unique number of your organization and give the service access to all projects in your organization by modifying the set local.scan_organizations:
Edit terraform/config.tf
Afterwards, you can re-apply the changed configuration with Terraform:
terraform -chdir=terraform apply
The service in the admin project should now also be able to query for BigQuery jobs in the configured workload projects.
To clean up, remove all of the created resources in reverse order.
Use Terraform to destroy all created resources:
terraform -chdir=terraform destroy
Next, you have to clean up the created BigQuery resources.
Remove assignments from the workload projects:
export ASSIGNMENT_ID_1=$(bq ls --location=$LOCATION --reservation_assignment $ADMIN_PROJECT_ID:US.r1 | awk '{if(NR>2)print}' | awk '{print $1}')
export ASSIGNMENT_ID_2=$(bq ls --location=$LOCATION --reservation_assignment $ADMIN_PROJECT_ID:US.r2 | awk '{if(NR>2)print}' | awk '{print $1}')
bq rm \
--location=$LOCATION \
--reservation_assignment $ASSIGNMENT_ID_1
bq rm \
--location=$LOCATION \
--reservation_assignment $ASSIGNMENT_ID_2
Delete reservations in the admin project:
bq rm \
--location=$LOCATION \
--reservation $RESERVATION_NAME_1
bq rm \
--location=$LOCATION \
--reservation $RESERVATION_NAME_2
Delete the Flex Slots commitment:
export LATEST_COMMITMENT_ID=$(bq ls --location=$LOCATION --capacity_commitment | awk '{if(NR>2)print}' | awk '{print $1}')
bq rm \
--location=$LOCATION \
--capacity_commitment $LATEST_COMMITMENT_ID
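To confirm that no commitments remain, list the capacity commitments one more time; the output should be empty:
bq ls --location=$LOCATION --capacity_commitment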
After deleting the Flex Slots commitment, you will no longer be charged for it.