From 312d9f0d94cd37cccce8353e44430e7dfd5f4193 Mon Sep 17 00:00:00 2001 From: Nancy Amandi <103530451+Nancy9ice@users.noreply.github.com> Date: Sat, 28 Sep 2024 00:03:42 +0100 Subject: [PATCH] added descriptions of the EIA861 data (#3808) * added descriptions of the EIA861 data * edited 861 data source description * edited eia861_child.rst.jinja * Add more notable irregularities * edited eia86a data source * made some recommended changes --------- Co-authored-by: Austen Sharpe --- docs/templates/eia861_child.rst.jinja | 98 +++++++++++++++++++++++---- 1 file changed, 83 insertions(+), 15 deletions(-) diff --git a/docs/templates/eia861_child.rst.jinja b/docs/templates/eia861_child.rst.jinja index 28108d510e..8007874d36 100644 --- a/docs/templates/eia861_child.rst.jinja +++ b/docs/templates/eia861_child.rst.jinja @@ -1,14 +1,24 @@ {% extends "data_source_parent.rst.jinja" %} {% block background %} -EIA Form EIA-861 (and the short form EIA-861S) make up the Annual Electric Power -Industry Report, collecting data from distribution utilities and power marketers. -It is a census of all United States electric utilities. As of 2023 PUDL -only incorporates the annual data, but there's also a less detailed monthly `Form -EIA-861M `__. +EIA Form EIA-861 (and the short form EIA-861S) make up the Annual Electric Power Industry Report. +This survey is a census of all U.S. electric utilities and collects information about retail +sales of electricity and associated revenue on an annual basis. + + +As of 2023, the EIA-861 Form is organized into the following schedules: + +* **Schedule 1:** Identification +* **Schedule 2:** Energy Sources Data +* **Schedule 3:** Distribution Information +* **Schedule 4:** Sales Data +* **Schedule 5:** Mergers and/or Acquisition Information +* **Schedule 6:** Demand-Side Management Information +* **Schedule 7:** Net-Metering Programs Information +* **Schedule 8:** Service Territory Information There are more than 20 different spreadsheets included in the annual zipfiles. For -details, see the `official EIA-861 page <{{ source.path }}>`__ +details, see the `official EIA-861 page <{{ source.path }}>`__. {% endblock %} {% block download_docs %} @@ -18,22 +28,80 @@ details, see the `official EIA-861 page <{{ source.path }}>`__ {% endblock %} {% block availability %} +PUDL incorporates annual EIA-861 data starting from 2001 onward. +There is also a less detailed monthly +`Form EIA-861M `__ +that is not incorporated into PUDL. {% endblock %} {% block respondents %} +The Form EIA-861 must be completed by electric power industry entities such as, but not limited to, +electric utilities, all Demand Side Management (DSM) Program Managers (entities responsible for +conducting or administering a DSM program), wholesale power marketers, energy service providers, +electric power producers, transmission owners, transmission operators, and Third Party Owners of +solar PV (TPO). Responses are collected at the operating company level (not at the holding company level). + +EIA-861S is intended for smaller bundled-service utilities and requires less detailed responses. +Respondents respond to either Form EIA-861 or Form EIA-861S. +However, to maintain data quality, respondents reporting to EIA-861S are required to complete +the full form every eight years. {% endblock %} {% block original_data %} +EIA typically publishes 861 data from the prior year in two rounds: an early release in the summer and a final release in the fall. +The data are published on the EIA website and distributed as a collection of spreadsheets. +The content of the spreadsheets varies from year to year as the questions in the form are updated. +EIA also periodically changes the naming and structure of the spreadsheets without warning. +Older “final” data may also be revised several years after it was published. +To ensure reproducible analyses, we archive `versioned snapshots of the EIA-861 data on Zenodo `__. +These archives are periodically refreshed with new data from the `EIA website <{{ source.path }}>`__. + +To understand the details of how the form and data have evolved over time, +we recommend reading the Form Instructions from different years, linked above. {% endblock %} {% block notable_irregularities %} -* In 2012-2013 the EIA-861 was overhauled. Several tables were discontinued, and many - more were added. Similar kinds of information reported in the two regimes may not be - directly comparable. -* Most of the EIA-861 data that's available in the PUDL DB has not yet been fully - normalized, meaning it contains many duplicate copies of the same information, which - may not always be internally consistent. -* We also have not yet integrated the Balancing Authority and Utility IDs reported in - the EIA-861 into our entity tables, so for the moment we don't have any foreign key - constraints enabled on the EIA-861 tables. + +Non-unique utility IDs +---------------------- +The Freedom of Information Act (FOIA), 5 U.S.C. §552, the Department of Energy (DOE) regulations, +10 C.F.R. §1004.11, implementing the FOIA, and the Trade Secrets Act, 18 U.S.C. §1905 allows +qualifying respondents to restrict access to their data. According to sources at the EIA, +approximately 3 respondents have used this as a means to keep their utility-level data proprietary. +These entries appear under the utility id ``88888`` in the data. + +The EIA also performs state-level imputations and adjustments for more accurate state-level analysis. +These entries appear under the utility id ``99999`` and are not intended for use in utility-level +aggregations. + +Form changes and table discrepancies +------------------------------------ +Not all of the EIA-861 tables exist for all of the reporting years. Both the Form and its output files +have changed over time; most notably between 2012 and 2013. Beginning in 2013, the EIA split its +Demand Side Management spreadsheet into two: Demand Response and Energy Efficiency in order to reduce +ambiguity and ensure data reliability. While there are many similarities between the two years, EIA +officials recommend against combining data from these three files, citing subtle differences in Form +questions. For these files, the post-2012 DR and EE spreadsheets are considered more accurate. + +Cost data rounded to the thousands +---------------------------------- +It’s important to note that all costs are reported to EIA-861 in thousands of dollars. PUDL transforms +these values to dollars, but they will still reflect the values rounded to the thousands. + +Not yet fully normalized +------------------------ +Most of the EIA-861 data that's available in the PUDL DB has not yet been fully +normalized, meaning it contains many duplicate copies of the same information, which +may not always be internally consistent. + +We also have not yet integrated the Balancing Authority and Utility IDs reported in +the EIA-861 into our entity tables, so for the moment we don't have any foreign key +constraints enabled on the EIA-861 tables. + +2019 short form +--------------- +In 2019 the data collected from the short form was incorporated into the other published +tables rather than as a standalone Short_Form_####.xlsx (where #### is the year) as in +other years. We haven't addressed this discrepancy yet; see details in issue :issue:`3654`. + {% endblock %}