Merge dev into main for 2023-12-13 #3153

Merged 66 commits on Dec 13, 2023

Commits
3946f71
WIP move to year-quarter partitions
e-belfer Nov 24, 2023
e5d3ab4
WIP change extraction partitions and etl settings, fix unit tests
e-belfer Nov 29, 2023
f0d3d03
Resolve merge conflict and update settings_test
e-belfer Nov 30, 2023
78f1992
Merge branch 'dev' into cems-quarterly
e-belfer Nov 30, 2023
ec00457
Update conda-lock.yml and rendered conda environment files.
e-belfer Nov 30, 2023
77db673
Update integration tests to use quarter
e-belfer Nov 30, 2023
a0f729c
Update DOI to production
e-belfer Dec 4, 2023
9320d8a
Merge branch 'dev' into cems-quarterly
e-belfer Dec 4, 2023
748c09b
Merge branch 'cems-quarterly' of https://github.com/catalyst-cooperat…
e-belfer Dec 4, 2023
8e01373
Fix EPACEMS integration test
e-belfer Dec 4, 2023
58bffe3
Update conda-lock.yml and rendered conda environment files.
e-belfer Dec 4, 2023
610ef4c
Repartition row groups in monolith parquet, update integration test, …
e-belfer Dec 5, 2023
70c720a
Merge branch 'cems-quarterly' of https://github.com/catalyst-cooperat…
e-belfer Dec 5, 2023
64560bb
Drop year from fast ETL and add concurrency limiting
e-belfer Dec 6, 2023
e8c5542
Merge branch 'dev' into cems-quarterly
e-belfer Dec 6, 2023
2f567cf
Update conda-lock.yml and rendered conda environment files.
e-belfer Dec 6, 2023
9607248
Drop concurrency further and update integration test to use 2022 data
e-belfer Dec 6, 2023
441a9b5
Update conda-lock.yml and rendered conda environment files.
e-belfer Dec 6, 2023
7c895af
Merge branch 'dev' into cems-quarterly
cmgosnell Dec 7, 2023
f1cd9d9
point cems to a new (draft!) archive w/ year_quarter partitions
cmgosnell Dec 8, 2023
ca1b2e7
Update conda-lock.yml and rendered conda environment files.
cmgosnell Dec 8, 2023
e2f60b5
Merge branch 'dev' into cems-quarterly
cmgosnell Dec 8, 2023
078ae5b
Merge branch 'cems-quarterly' of github.com:catalyst-cooperative/pudl…
cmgosnell Dec 8, 2023
f01aef9
Merge branch 'cems-quarterly' into cems-year_quarters
cmgosnell Dec 8, 2023
99c3c37
Merge branch 'cems-year_quarters' of github.com:catalyst-cooperative/…
cmgosnell Dec 8, 2023
a0ddb5b
Update conda-lock.yml and rendered conda environment files.
cmgosnell Dec 8, 2023
e4188e2
Update conda-lock.yml and rendered conda environment files.
cmgosnell Dec 8, 2023
bef3965
Bump docker/metadata-action from 4.4.0 to 5.3.0
dependabot[bot] Dec 11, 2023
22c4a92
Bump actions/setup-python from 4 to 5
dependabot[bot] Dec 11, 2023
00fe4b1
Bump google-github-actions/setup-gcloud from 1 to 2
dependabot[bot] Dec 11, 2023
96e3189
Bump tibdex/github-app-token from 1 to 2
dependabot[bot] Dec 11, 2023
254c772
Update conda-lock.yml and rendered conda environment files.
zaneselvans Dec 11, 2023
6007f54
Merge pull request #3142 from catalyst-cooperative/dependabot/github_…
zaneselvans Dec 11, 2023
12b359e
Merge pull request #3144 from catalyst-cooperative/dependabot/github_…
zaneselvans Dec 11, 2023
d97ef6d
Merge pull request #3145 from catalyst-cooperative/dependabot/github_…
zaneselvans Dec 11, 2023
10591d4
Merge pull request #3143 from catalyst-cooperative/dependabot/github_…
zaneselvans Dec 11, 2023
63613f3
Replace ferc714 @multi_asset with asset factory (#3123) for better s…
rousik Dec 11, 2023
a3847c0
Update conda-lock.yml and rendered conda environment files.
rousik Dec 11, 2023
7197e64
Merge branch 'cems-quarterly' into cems-year_quarters
cmgosnell Dec 11, 2023
e13c6cf
[pre-commit.ci] pre-commit autoupdate
pre-commit-ci[bot] Dec 11, 2023
bf4a6df
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 11, 2023
40c799c
Update .pre-commit-config.yaml
zaneselvans Dec 11, 2023
e4cc807
Merge pull request #3146 from catalyst-cooperative/update-conda-lockfile
zaneselvans Dec 11, 2023
1815560
address pr concerns
cmgosnell Dec 11, 2023
5c7eaba
Merge branch 'cems-year_quarters' of github.com:catalyst-cooperative/…
cmgosnell Dec 11, 2023
1ac9e48
Merge branch 'dev' into cems-quarterly
cmgosnell Dec 11, 2023
51d9a03
Merge branch 'cems-quarterly' of github.com:catalyst-cooperative/pudl…
cmgosnell Dec 11, 2023
27c415d
Update conda-lock.yml and rendered conda environment files.
cmgosnell Dec 11, 2023
f33db7b
Merge branch 'cems-quarterly' into cems-year_quarters
cmgosnell Dec 11, 2023
55828de
Update conda-lock.yml and rendered conda environment files.
cmgosnell Dec 11, 2023
0db0d4a
Merge branch 'cems-quarterly' into cems-year_quarters
cmgosnell Dec 11, 2023
15c6069
Merge branch 'cems-year_quarters' of github.com:catalyst-cooperative/…
cmgosnell Dec 11, 2023
98861eb
Merge pull request #3139 from catalyst-cooperative/cems-year_quarters
cmgosnell Dec 11, 2023
cf09bda
Merge branch 'dev' into pre-commit-ci-update-config
zaneselvans Dec 11, 2023
fbd4689
add release notes for quarterly cems
cmgosnell Dec 11, 2023
84c5330
Fix some comments/docstrings; clarify Zenodo RECID regex
zaneselvans Dec 11, 2023
1c3e47a
Remove comment about epacems DOI being draft archive.
zaneselvans Dec 11, 2023
4babaaf
Merge pull request #3148 from catalyst-cooperative/pre-commit-ci-upda…
zaneselvans Dec 11, 2023
003f1d7
Adjust epacems output tests to reflect quarterly partitions.
zaneselvans Dec 12, 2023
bbef1e2
Merge branch 'dev' into cems-quarterly
zaneselvans Dec 12, 2023
bfe6203
add tests to cover a few uncovered lines
cmgosnell Dec 12, 2023
f020a07
Merge pull request #3096 from catalyst-cooperative/cems-quarterly
cmgosnell Dec 12, 2023
4f6b8dc
Table diff tools (#3128)
jdangerx Dec 12, 2023
694032d
Refactor calculation of annualized_respondents_ferc714 (#3024)
rousik Dec 12, 2023
71e77f9
Update conda-lock.yml and rendered conda environment files.
zaneselvans Dec 12, 2023
0009827
Merge branch 'main' into dev
zaneselvans Dec 13, 2023
2 changes: 1 addition & 1 deletion .github/workflows/bot-auto-merge.yml
@@ -12,7 +12,7 @@ jobs:
  runs-on: ubuntu-latest
  steps:
  - name: Impersonate auto merge PR bot
- uses: tibdex/github-app-token@v1
+ uses: tibdex/github-app-token@v2
  id: generate-token
  with:
  app_id: ${{ secrets.BOT_AUTO_MERGE_PRS_APP_ID }}
4 changes: 2 additions & 2 deletions .github/workflows/build-deploy-pudl.yml
@@ -45,7 +45,7 @@ jobs:

  - name: Docker Metadata
  id: docker_metadata
- uses: docker/metadata-action@v4.4.0
+ uses: docker/metadata-action@v5.3.0
  with:
  images: catalystcoop/pudl-etl
  flavor: |
@@ -83,7 +83,7 @@ jobs:

  # Setup gcloud CLI
  - name: Set up Cloud SDK
- uses: google-github-actions/setup-gcloud@v1
+ uses: google-github-actions/setup-gcloud@v2

  - name: Determine commit information
  run: |-
2 changes: 1 addition & 1 deletion .github/workflows/docker-build-test.yml
@@ -17,7 +17,7 @@ jobs:

  - name: Docker Metadata
  id: docker_metadata
- uses: docker/metadata-action@v4.4.0
+ uses: docker/metadata-action@v5.3.0
  with:
  images: catalystcoop/pudl-etl
  flavor: |
2 changes: 1 addition & 1 deletion .github/workflows/release.yml
@@ -12,7 +12,7 @@ jobs:
  with:
  fetch-depth: 2
  - name: Set up Python
- uses: actions/setup-python@v4
+ uses: actions/setup-python@v5
  with:
  python-version: "3.11"
  - name: Build source and wheel distributions
2 changes: 1 addition & 1 deletion .github/workflows/run-etl.yml
@@ -16,7 +16,7 @@ jobs:
  uses: actions/checkout@v4
  - name: Docker Metadata
  id: docker_metadata
- uses: docker/metadata-action@v4.4.0
+ uses: docker/metadata-action@v5.3.0
  # TODO(rousik): we could consider YYYY-MM-DD-HHMM-branch-sha
  with:
  images: catalystcoop/pudl-etl-ci
2 changes: 1 addition & 1 deletion .github/workflows/zenodo-cache-sync.yml
@@ -69,7 +69,7 @@ jobs:
  service_account: "zenodo-cache-manager@catalyst-cooperative-pudl.iam.gserviceaccount.com"

  - name: Set up Cloud SDK
- uses: google-github-actions/setup-gcloud@v1
+ uses: google-github-actions/setup-gcloud@v2

  - name: Update GCS cache with any new Zenodo archives
  run: |
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -29,7 +29,7 @@ repos:
  # Formatters: hooks that re-write Python & documentation files
  ####################################################################################
  - repo: https://github.com/astral-sh/ruff-pre-commit
- rev: v0.1.6
+ rev: v0.1.7
  hooks:
  - id: ruff
  args: [--fix, --exit-non-zero-on-fix]
102 changes: 102 additions & 0 deletions devtools/sqlite-table-diff.ipynb
@@ -0,0 +1,102 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"\"\"\"Example of diffing tables across multiple different SQLite DBs.\n",
"\n",
"The tables must have the same name/schema. This is intended for use in\n",
"investigating validation test errors.\n",
"\"\"\"\n",
"import sqlite3\n",
"from pathlib import Path\n",
"from typing import Iterable\n",
"\n",
"import pandas as pd\n",
"\n",
"from pudl.helpers import diff_wide_tables, TableDiff\n",
"from pudl.metadata.classes import Resource\n",
"from pudl.metadata.fields import apply_pudl_dtypes\n",
"\n",
"\n",
"def table_diff(\n",
" table_name: str,\n",
" old_conn,\n",
" new_conn,\n",
" ignore_cols: Iterable[str] = (\"plant_id_ferc1\",),\n",
" addl_key_cols: Iterable[str] = (),\n",
" ) -> TableDiff:\n",
" \"\"\"Diff two versions of the same table that live in SQL databases.\n",
"\n",
" The table has to have the same name + columns in both DBs.\n",
"\n",
" Args:\n",
" table_name: the name, in the SQL database, of the table you want to compare.\n",
" old_conn: SQLite connection to the old version of the database.\n",
" new_conn: SQLite connection to the new version of the database.\n",
" ignore_cols: a list of columns that you would like to ignore diffs in.\n",
" addl_key_cols: \n",
" columns that aren't necessarily in the primary key, but that you'd\n",
" like to use as key columns for the diff - for example, if your\n",
" table only uses `record_id` as primary_key, but you want to group\n",
" the rows by `record_year` and `utility_id` as well, you would pass\n",
" those in.\n",
" \"\"\"\n",
" query = f\"SELECT * FROM {table_name}\" # noqa: S608\n",
" old_table = apply_pudl_dtypes(pd.read_sql(query, old_conn))\n",
" new_table = apply_pudl_dtypes(pd.read_sql(query, new_conn))\n",
"\n",
" cols = list(set(old_table.columns) - set(ignore_cols))\n",
"\n",
" primary_key = list(set(Resource.from_id(table_name).schema.primary_key).union(set(addl_key_cols)))\n",
" return diff_wide_tables(primary_key, old_table[cols], new_table[cols])\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"new_db = sqlite3.connect(Path(\"~/Downloads/pudl.sqlite\").expanduser())\n",
"old_db = sqlite3.connect(Path(\"~/Downloads/pudl (2).sqlite\").expanduser())\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"table_name = \"denorm_plants_steam_ferc1\"\n",
"diff = table_diff(table_name, old_db, new_db, ignore_cols=(\"plant_id_ferc1\", \"plant_id_pudl\"), addl_key_cols=(\"report_year\", \"utility_id_pudl\"))\n",
"diff.changed"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "pudl-dev",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
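The notebook above relies on `diff_wide_tables` and `TableDiff` from `pudl.helpers`, whose internals are not shown in this diff. As a rough, self-contained illustration of the same idea (comparing two versions of a table by key columns), the sketch below uses only pandas. The function name `diff_tables_by_key`, the sample columns, and the returned dictionary layout are assumptions made for this example and are not the PUDL API.

```python
import pandas as pd


def diff_tables_by_key(
    old: pd.DataFrame, new: pd.DataFrame, key_cols: list[str]
) -> dict[str, pd.DataFrame]:
    """Split two versions of a table into added, deleted, and changed rows.

    Hypothetical helper for illustration only. Assumes ``key_cols`` uniquely
    identify rows and that both frames have the same columns.
    """
    old_idx = old.set_index(key_cols).sort_index()
    new_idx = new.set_index(key_cols).sort_index()

    # Rows whose keys exist in only one of the two versions.
    added = new_idx[~new_idx.index.isin(old_idx.index)]
    deleted = old_idx[~old_idx.index.isin(new_idx.index)]

    # Rows whose keys exist in both versions, aligned on index and columns.
    old_common = old_idx[old_idx.index.isin(new_idx.index)]
    new_common = new_idx[new_idx.index.isin(old_idx.index)][old_common.columns]

    # DataFrame.compare keeps only differing cells; "self" is old, "other" is new.
    changed = old_common.compare(new_common)
    return {"added": added, "deleted": deleted, "changed": changed}


# Made-up sample data, not real PUDL records.
old_table = pd.DataFrame(
    {
        "plant_id": [1, 2, 3],
        "report_year": [2020, 2020, 2020],
        "capacity_mw": [10.0, 20.0, 30.0],
    }
)
new_table = pd.DataFrame(
    {
        "plant_id": [1, 2, 4],
        "report_year": [2020, 2020, 2020],
        "capacity_mw": [10.0, 25.0, 40.0],
    }
)
result = diff_tables_by_key(old_table, new_table, ["plant_id", "report_year"])
print(result["changed"])
```

Returning the three pieces separately keeps added and deleted keys out of the cell-by-cell comparison, which only makes sense for rows present in both versions.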
6 changes: 6 additions & 0 deletions docs/release_notes.rst
@@ -18,6 +18,12 @@ v2023.12.XX
outputs describing historical utility and balancing authority service territories. See
:issue:`1174` and :pr:`3086`.

Data Coverage
^^^^^^^^^^^^^
* Updated :doc:`data_sources/epacems` to switch to pulling the quarterly updates of
CEMS instead of the annual files. Integrates CEMS through 2023q3. See issue
:issue:`2973` & PR :pr:`3096`.

---------------------------------------------------------------------------------------
v2023.12.01
---------------------------------------------------------------------------------------
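The release note above mentions CEMS coverage "through 2023q3", and the commits in this PR refer to year-quarter partitions. A minimal sketch of enumerating such labels follows. The helper name `year_quarters` and the exact label format (`<year>q<quarter>`) are assumptions for illustration and are not taken from the PUDL settings code.

```python
def year_quarters(first: tuple[int, int], last: tuple[int, int]) -> list[str]:
    """List year-quarter labels from ``first`` to ``last``, inclusive.

    Each argument is a (year, quarter) pair with quarter in 1..4.
    Hypothetical helper; the label format is an assumption.
    """
    (first_year, first_q), (last_year, last_q) = first, last
    # Map each quarter to a running integer so the range is easy to walk.
    start = first_year * 4 + (first_q - 1)
    stop = last_year * 4 + (last_q - 1)
    return [f"{n // 4}q{n % 4 + 1}" for n in range(start, stop + 1)]


partitions = year_quarters((2022, 1), (2023, 3))
print(partitions)
```

Working in quarter units rather than years is what lets a partial year such as 2023 be included before its fourth quarter is published.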