Transform and harvesting the eia860 Energy Storage table #3526

Merged
merged 14 commits into from
Apr 3, 2024

aesharpe commented Mar 30, 2024

Overview

Closes #3506 and #3536

What problem does this address?

  • Adds a transformer for the EIA860 energy storage table
  • Puts this table through the harvesting process

What did you change?

  • Added a transformer for the storage table
  • Added new fields for the storage table
  • Added coding metadata for core_eia__codes_storage_technology_types and core_eia__codes_storage_enclosure_types
  • Renamed some columns to fix typos and to better match their contents, the naming protocols, and existing column names
  • Added core_eia860__yearly_generators_energy_storage to the list of finished EIA assets and to the resource metadata

Notes

  • I noticed that some of the columns that get dropped from the specific generator tables have values that aren't in the general generators table. We should probably do a comparison before we drop them, but that is OOS.
  • Looks like the boolean columns are coming out as 1/0 values in the core tables, need to figure out where to add the dtype fix to stop this.

Testing

How did you make sure this worked? How can a reviewer verify this?

  • Still need to run all the tests

…ils tweaking some of the column names and skipfooters, and passing it through the harvesting process. Adds in new codes, fields, and a new alembic migration.
@aesharpe aesharpe self-assigned this Mar 30, 2024
Review comment on src/pudl/metadata/fields.py (outdated, resolved)
@aesharpe aesharpe requested a review from cmgosnell April 1, 2024 16:21
@cmgosnell cmgosnell added labels Apr 1, 2024: eia860 (Anything having to do with EIA Form 860), new-data (Requests for integration of new data.), gridlab (Work related to open modeling input data integration funded/coordinated by GridLab)

@cmgosnell cmgosnell left a comment

It looks like you missed one code step, which is to define those tables as db tables. There are many examples of that in metadata.resources.eia. This could be the cause of your pydantic error.

You also need to add the _core table to the tables to be harvested (see comment in transform.eia).

> I noticed that some of the columns that get dropped from the specific generator tables have values that aren't in the general generators table. We should probably do a comparison before we drop them, but that is OOS.

I'm not sure what you mean here. This could be related to the previous harvest comment.

> Looks like the boolean columns are coming out as 1/0 values in the core tables, need to figure out where to add the dtype fix to stop this.

Are you seeing this in the _core table or in the core table? The core table is definitely getting dtyped with the dtypes defined in the schema.

(I haven't tried to build these assets yet.)

Review comment on src/pudl/transform/eia.py (outdated, resolved)
Review comment on src/pudl/transform/eia860.py (outdated, resolved)
Review comment on src/pudl/transform/eia860.py (outdated, resolved)
Review comment on src/pudl/metadata/fields.py (outdated, resolved)
…e_power_rating from the energy table schema because it's in the generators generic table. Also rename that column to match the other harvested column with mvar.

aesharpe commented Apr 2, 2024

> Are you seeing this in the _core table or in the core table? The core table is definitely getting dtyped with the dtypes defined in the schema.

Yeah, both tables just have 0 or 1 values instead of True/False

@aesharpe aesharpe marked this pull request as ready for review April 2, 2024 20:26

aesharpe commented Apr 2, 2024

> Yeah, both tables just have 0 or 1 values instead of True/False

Realized that this is because I was viewing them in SQL, where the dtypes were dropped.

@cmgosnell cmgosnell left a comment

I second zane's suggestion to remove the _application suffix but that is not blocking for me. also one docs comment that is also not blocking.

Review comment on src/pudl/transform/eia860.py (outdated, resolved)
zaneselvans commented

Yeah, SQLite stores booleans as integers. It's generally kinda lazy about dtypes.

If you want to see rich dtypes, you can always read from the parquet files, which is what you'll get if you use something like

raw_weather = defs.load_asset_value(AssetKey("raw_gridpathratoolkit__daily_weather"))
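
The SQLite behavior described here can be reproduced with a minimal sketch that uses only Python's standard sqlite3 module. The table and column names (storage_demo, is_demo_flag) are hypothetical and are not part of PUDL; the point is only that a Python bool written to a column declared BOOLEAN is stored and read back as the integer 1 or 0.

```python
import sqlite3

# Hypothetical in-memory table for illustration; not a PUDL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE storage_demo (generator_id TEXT, is_demo_flag BOOLEAN)"
)

# Python booleans are written to SQLite as integers.
conn.executemany(
    "INSERT INTO storage_demo VALUES (?, ?)",
    [("A1", True), ("B2", False)],
)

# typeof() reports the storage class SQLite actually used for each value.
rows = conn.execute(
    "SELECT generator_id, is_demo_flag, typeof(is_demo_flag) "
    "FROM storage_demo ORDER BY generator_id"
).fetchall()
conn.close()

for generator_id, flag, storage_class in rows:
    print(generator_id, flag, storage_class)
# prints:
# A1 1 integer
# B2 0 integer
```

So seeing 1/0 in a SQL viewer does not by itself mean the schema dtypes were lost upstream; it is how SQLite represents the values.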

aesharpe commented Apr 2, 2024

> I second zane's suggestion to remove the _application suffix but that is not blocking for me. also one docs comment that is also not blocking.

Happy to change if it's 2:1

@aesharpe aesharpe added this pull request to the merge queue Apr 2, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to a conflict with the base branch Apr 2, 2024
@zaneselvans zaneselvans added this pull request to the merge queue Apr 2, 2024
Merged via the queue into main with commit c28abb2 Apr 3, 2024
12 checks passed
@zaneselvans zaneselvans deleted the eia860-energy-storage-transform branch April 3, 2024 00:32
Projects
Archived in project
3 participants