Add workflow to calculate schema changes (#517)
* Add workflow to calculate schema changes

* use the actual table

* slightly better format

* Check if it would create a new PR for no diff

* Push branch only when there are changes

* Refactor

* use github output

* Test no diff works as expected

* Update workflow to run on schedule as testing is done

* lint

* test again

* fix script

* Add empty file

* test lints

fix

* add generated changelog

* do not prettify md

* clean up

* feedback

update

* Update changelog

* update dbt images too
amishas157 authored Oct 29, 2024
1 parent d4aaecb commit 56f581c
Showing 7 changed files with 126 additions and 2 deletions.
72 changes: 72 additions & 0 deletions .github/workflows/update_dbt_marts_schema_changelog.yml
@@ -0,0 +1,72 @@
name: Update changelog for DBT marts

on:
  schedule:
    - cron: "0 23 * * 1-5" # Runs Monday to Friday at 11 PM UTC, which is 5 PM CST

concurrency:
  group: ${{ github.workflow }}-${{ github.ref_protected == 'true' && github.sha || github.ref }}-${{ github.event_name }}
  cancel-in-progress: true

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v2

      - name: Authenticate to crypto-stellar GCP
        uses: "google-github-actions/auth@v2"
        with:
          project_id: hubble-261722
          credentials_json: "${{ secrets.CREDS_PROD_HUBBLE }}"

      - name: Set up Google Cloud SDK
        run: |
          echo "Installing Google Cloud SDK..."
          echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] http://packages.cloud.google.com/apt cloud-sdk main" | sudo tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
          curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key --keyring /usr/share/keyrings/cloud.google.gpg add -
          sudo apt-get update && sudo apt-get install -y google-cloud-sdk
      - name: Create new branch
        id: create_branch
        run: |
          git config --local user.email "[email protected]"
          git config --local user.name "GitHub Action"
          BRANCH_NAME="update-data-schema-changelog-${{ github.run_id }}"
          git checkout -b "$BRANCH_NAME"
          # Write the step output through GITHUB_OUTPUT; the ::set-output command is deprecated.
          echo "branch=$BRANCH_NAME" >> "$GITHUB_OUTPUT"
      - name: Run Bash Script
        run: |
          cd "$GITHUB_WORKSPACE"
          PROJECT=hubble-261722
          export PROJECT
          output=$(. scripts/update_dbt_marts_schema_changelog.sh)
          echo "$output" > changelog/dbt_marts.md
      - name: Commit changes
        id: commit_changes
        run: |
          git add changelog/dbt_marts.md
          # git commit exits non-zero when nothing is staged, which signals "no diff".
          if git commit -m "Update changelog for DBT marts"; then
            echo "Changes committed."
            echo "changes_committed=true" >> "$GITHUB_OUTPUT"
          else
            echo "No changes to commit."
            echo "changes_committed=false" >> "$GITHUB_OUTPUT"
          fi
      - name: Push branch
        if: steps.commit_changes.outputs.changes_committed == 'true'
        run: |
          git push origin "${{ steps.create_branch.outputs.branch }}"
      - name: Create Pull Request
        if: steps.commit_changes.outputs.changes_committed == 'true'
        run: |
          gh pr create -B master -H "${{ steps.create_branch.outputs.branch }}" \
            --title 'Update schema changelog for DBT marts' \
            --body 'This is an autogenerated PR to update schema changelog for marts.'
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
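
Reviewer note: the "Commit changes" step gates the push and the PR on a `changes_committed` step output. The sketch below shows that gating pattern in isolation so it can be tried outside a runner. It is an illustration under assumptions, not the workflow's code: `set_step_output` and `changelog_changed` are hypothetical helpers, `cmp -s` stands in for the `git commit` exit status, and `GITHUB_OUTPUT` falls back to a temp file when the Actions runner has not set it.

```shell
#!/bin/bash
# GITHUB_OUTPUT is provided by the Actions runner; fall back to a temp file
# so this sketch also runs locally (assumption for illustration only).
GITHUB_OUTPUT="${GITHUB_OUTPUT:-$(mktemp)}"

# Hypothetical helper: append a key=value step output.
set_step_output() {
  local key="$1" value="$2"
  echo "${key}=${value}" >> "$GITHUB_OUTPUT"
}

# Hypothetical stand-in for the `git commit` check in the workflow:
# succeeds (exit 0) only when the two files differ.
changelog_changed() {
  ! cmp -s "$1" "$2"
}

# Made-up "previous" and "regenerated" changelog files with identical content.
old=$(mktemp)
new=$(mktemp)
echo "| 2024-09-12 | DIM_DATES | column_added | airflow_start_ts |" > "$old"
cp "$old" "$new"

if changelog_changed "$old" "$new"; then
  set_step_output changes_committed true
else
  set_step_output changes_committed false
fi

# Show what later steps would read via steps.<id>.outputs.changes_committed.
cat "$GITHUB_OUTPUT"
```

Because the flag is written on both branches, the later `if:` conditions always compare against an explicit `'true'` or `'false'` rather than an empty string.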
1 change: 1 addition & 0 deletions .prettierignore
@@ -0,0 +1 @@
**/*.md
1 change: 1 addition & 0 deletions .sqlfluffignore
@@ -1 +1,2 @@
dags/ddls/queries
+changelog/dbt_marts.md
2 changes: 1 addition & 1 deletion airflow_variables_dev.json
@@ -124,7 +124,7 @@
"partnership_assets__account_holders_activity_fact": false,
"partnership_assets__asset_activity_fact": false
},
-"dbt_image_name": "stellar/stellar-dbt:f04ff57d9",
+"dbt_image_name": "stellar/stellar-dbt:53375b5f9",
"dbt_internal_source_db": "test-hubble-319619",
"dbt_internal_source_schema": "test_crypto_stellar_internal",
"dbt_job_execution_timeout_seconds": 300,
2 changes: 1 addition & 1 deletion airflow_variables_prod.json
@@ -125,7 +125,7 @@
"partnership_assets__asset_activity_fact": false,
"trade_agg": false
},
-"dbt_image_name": "stellar/stellar-dbt:f04ff57d9",
+"dbt_image_name": "stellar/stellar-dbt:53375b5f9",
"dbt_internal_source_db": "hubble-261722",
"dbt_internal_source_schema": "crypto_stellar_internal_2",
"dbt_job_execution_timeout_seconds": 2400,
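
Reviewer note: this commit bumps `dbt_image_name` to the same tag in both `airflow_variables_dev.json` and `airflow_variables_prod.json`. A small check like the one below could confirm the two files stay in sync. It is a sketch only: `same_dbt_image` is a hypothetical helper, and the temp files are trimmed stand-ins for the real variable files, which contain many more keys.

```shell
#!/bin/bash
# Trimmed stand-ins for airflow_variables_dev.json and
# airflow_variables_prod.json (assumption for illustration only).
dev=$(mktemp)
prod=$(mktemp)
echo '{"dbt_image_name": "stellar/stellar-dbt:53375b5f9"}' > "$dev"
echo '{"dbt_image_name": "stellar/stellar-dbt:53375b5f9"}' > "$prod"

# Hypothetical helper: succeeds when both files pin the same dbt image.
same_dbt_image() {
  [ "$(jq -r '.dbt_image_name' "$1")" = "$(jq -r '.dbt_image_name' "$2")" ]
}

if same_dbt_image "$dev" "$prod"; then
  echo "dbt image tags match"
else
  echo "dbt image tags differ" >&2
fi
```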
25 changes: 25 additions & 0 deletions changelog/dbt_marts.md
@@ -0,0 +1,25 @@
# Changes in DBT marts schema

| Date | Table Name | Operation | Columns |
|------------|---------------------------------|---------------|--------------------------|
| 2024-10-29 | ENRICHED_HISTORY_OPERATIONS_MEANINGFUL | type_changed | min_price_r, claimants, path, asset_balance_changes, parameters_decoded, parameters, max_price_r |
| 2024-10-29 | ENRICHED_HISTORY_OPERATIONS_MEANINGFUL | column_added | airflow_start_ts, details_json |
| 2024-10-29 | ENRICHED_HISTORY_OPERATIONS_MGI | type_changed | claimants, asset_balance_changes, parameters_decoded, max_price_r, min_price_r, parameters, path |
| 2024-10-29 | ENRICHED_HISTORY_OPERATIONS_MGI | column_added | details_json, airflow_start_ts |
| 2024-10-29 | ENRICHED_HISTORY_OPERATIONS_XLM | column_added | details_json, airflow_start_ts |
| 2024-10-29 | ENRICHED_HISTORY_OPERATIONS_XLM | type_changed | parameters, max_price_r, min_price_r, path, asset_balance_changes, parameters_decoded, claimants |
| 2024-09-12 | ASSET_STATS_AGG | column_added | airflow_start_ts |
| 2024-09-12 | DIM_DATES | column_added | airflow_start_ts |
| 2024-09-12 | DIM_MGI_WALLETS | column_added | airflow_start_ts |
| 2024-09-12 | FCT_MGI_CASHFLOW | column_added | airflow_start_ts |
| 2024-09-12 | LIQUIDITY_POOLS_VALUE | column_added | airflow_start_ts |
| 2024-09-12 | LIQUIDITY_POOLS_VALUE_HISTORY | column_added | airflow_start_ts |
| 2024-09-12 | LIQUIDITY_POOL_TRADE_VOLUME | column_added | airflow_start_ts |
| 2024-09-12 | LIQUIDITY_PROVIDERS | column_added | airflow_start_ts |
| 2024-09-12 | MGI_MONTHLY_USD_BALANCE | column_added | airflow_start_ts |
| 2024-09-12 | MGI_NETWORK_STATS_AGG | column_added | airflow_start_ts |
| 2024-09-12 | NETWORK_STATS_AGG | column_added | airflow_start_ts |
| 2024-09-12 | OHLC_EXCHANGE_FACT | column_added | airflow_start_ts |
| 2024-09-12 | PARTNERSHIP_ASSETS__ACCOUNT_HOLDERS_ACTIVITY_FACT | column_added | airflow_start_ts |
| 2024-09-12 | PARTNERSHIP_ASSETS__ASSET_ACTIVITY_FACT | column_added | airflow_start_ts |
| 2024-09-12 | PARTNERSHIP_ASSETS__MOST_ACTIVE_FACT | column_added | airflow_start_ts |
25 changes: 25 additions & 0 deletions scripts/update_dbt_marts_schema_changelog.sh
@@ -0,0 +1,25 @@
#!/bin/bash

project=$PROJECT
result=$(bq query --format=prettyjson --nouse_legacy_sql \
"SELECT
date(detected_at) as date
, table_name
, sub_type as operation
, ARRAY_AGG(column_name) as columns
FROM
${project}.elementary.alerts_schema_changes
GROUP BY
1, 2, 3
ORDER BY 1 DESC, 2 ASC
")

echo "# Changes in DBT marts schema"

echo ""

echo "| Date | Table Name | Operation | Columns |"
echo "|------------|---------------------------------|---------------|--------------------------|"

echo "$result" | jq -r '.[] | "| \(.date) | \(.table_name ) | \(.operation) | \(.columns | join(", ")) |"'
echo ""
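
Reviewer note: the generated rows in `changelog/dbt_marts.md` list the same columns in different orders for different tables (for example the three `type_changed` rows dated 2024-10-29). BigQuery does not guarantee element order for `ARRAY_AGG` without an `ORDER BY`, so the file could differ between runs even when the schema has not changed. One possible mitigation, shown here as a sketch rather than a change to the script, is to sort the array in the `jq` filter before joining. The sample JSON is made up, shaped like the `bq query --format=prettyjson` result the script consumes, and `format_rows` is a hypothetical helper name.

```shell
#!/bin/bash
# Made-up sample in the shape the script pipes into jq
# (assumption: columns arrive as a JSON array of strings).
sample='[
  {"date": "2024-10-29", "table_name": "EXAMPLE_TABLE", "operation": "column_added",
   "columns": ["details_json", "airflow_start_ts"]}
]'

# Hypothetical helper: same row format as the script, but with `sort`
# applied to the columns so the joined list is deterministic.
format_rows() {
  jq -r '.[] | "| \(.date) | \(.table_name) | \(.operation) | \(.columns | sort | join(", ")) |"'
}

echo "$sample" | format_rows
```

An alternative with the same effect would be ordering inside the query itself, using `ARRAY_AGG(column_name ORDER BY column_name)`.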
