Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix all plants ferc1 #1656

Merged
merged 12 commits into from
Jun 16, 2022
Merged

Fix all plants ferc1 #1656

merged 12 commits into from
Jun 16, 2022

Conversation

aesharpe
Copy link
Member

@aesharpe aesharpe commented May 24, 2022

This PR changes a few things related to the ferc1 output table with all the plants:

  • Changes all references to all_plants_ferc1 to plants_all_ferc1 to match the syntax of the other ferc1 plant tables
  • Adds/calculates an opex_nonfuel column for all of the plant tables so that the aggregate plants_all_ferc1 table has an accurate Total O&M metric. Compared these values against RMI official data for validation.

We will need to update any references to the all_plants_ferc1 table in other repos -- I already did this for the rmi repo in a branch called fix-all-plants-ferc-ref.

… table, change opex_total - opex_fuel to opex_total - opex_fuel.fillna(0). A lot of the opex_fuel rows are NA, but they really mean zero because they are renewable. When you try and subtract NA from a float you get NA even if the original value is not NA. This means that all the renewables have NA opex_nonfuel (O&M Cost) which is not true! This fixes that.
…ble because opex_total apparently doesn't actually include opex_maintenance. Adding this makes it match the official BHE data we're working with with RMI.
@aesharpe aesharpe added the rmi label May 24, 2022
@aesharpe aesharpe requested a review from cmgosnell May 24, 2022 17:12
@aesharpe aesharpe self-assigned this May 24, 2022
@codecov
Copy link

codecov bot commented May 24, 2022

Codecov Report

Merging #1656 (0486ae9) into dev (7cd597c) will increase coverage by 0.9%.
The diff coverage is 100.0%.

❗ Current head 0486ae9 differs from pull request most recent head 9ab2800. Consider uploading reports for the commit 9ab2800 to get more accurate results

@@           Coverage Diff           @@
##             dev   #1656     +/-   ##
=======================================
+ Coverage   83.8%   84.8%   +0.9%     
=======================================
  Files         65      65             
  Lines       7270    7849    +579     
=======================================
+ Hits        6099    6663    +564     
- Misses      1171    1186     +15     
Impacted Files Coverage Δ
src/pudl/metadata/resources/ferc1.py 100.0% <ø> (ø)
src/pudl/transform/ferc1.py 92.0% <ø> (+<0.1%) ⬆️
src/pudl/output/ferc1.py 100.0% <100.0%> (ø)
src/pudl/output/pudltabl.py 89.4% <100.0%> (+0.4%) ⬆️
src/pudl/analysis/state_demand.py 24.5% <0.0%> (-0.4%) ⬇️
src/pudl/output/eia923.py 96.6% <0.0%> (-0.2%) ⬇️
src/pudl/workspace/datastore.py 69.4% <0.0%> (-0.2%) ⬇️
src/pudl/analysis/spatial.py 93.6% <0.0%> (-0.1%) ⬇️
src/pudl/analysis/timeseries_cleaning.py 88.6% <0.0%> (-0.1%) ⬇️
src/pudl/metadata/classes.py 82.1% <0.0%> (-0.1%) ⬇️
... and 17 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7cd597c...9ab2800. Read the comment docs.

@@ -59,7 +59,7 @@ def plants_steam_ferc1(pudl_engine):
.assign(
capacity_factor=lambda x: x.net_generation_mwh / (8760 * x.capacity_mw),
opex_fuel_per_mwh=lambda x: x.opex_fuel / x.net_generation_mwh,
opex_nonfuel=lambda x: x.opex_production_total - x.opex_fuel,
opex_nonfuel=lambda x: x.opex_production_total - x.opex_fuel.fillna(0),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are there instances in which we think the fuel cost could actually be non-zero and the opex_production_total does actually include it even though opex_fuel is null?

It might be nice to do a little checking of this opex_production_total before we make this assumption.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, for the steam table only 0.12% of the opex_production_total values don't add up to the sum of all other opex_CATEGORY columns. Most of these look like errors where the opex_production_total value is very low, in some cases single digits when it obviously shouldn't be.....this is probably something we should flag or fix down the road....but anyways these 36 bad values should probably not use the production_total value and rather the actual sum total (calculated). All the bad values' production_total values are lower than the sum total of opex_ columns so it's unlikely that they would include additional, "hidden" fuel values where opex_fuel is null. (Also there are only 4 bad rows where opex_fuel is null)

Given how exactly most of the opex values sum to equal the opex_production_total value regardless of whether there is a value for opex_fuel, I think that it's unlikely/other values would have to be reported wrong in order for opex_fuel to be null but a non-zero fuel amount was also included in the production_total value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't we supposed to remove calculable columns in one of the ETL steps?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is great news! thanks for investigating! that # of records are definitely ignore-able imho!

src/pudl/output/ferc1.py Outdated Show resolved Hide resolved
…is change in the transform module as well as the output module where there are references to the original opex_total column
…es for RMI purposes to opex_total_nonfuel so it isn't confused with the opex components that sum up to opex_total.
@aesharpe
Copy link
Member Author

aesharpe commented Jun 9, 2022

I just pushed some code that changes the erroneously labeled opex_total field in the steam table to opex_operations. I also changed the name opex_nonfuel created for all ferc plants tables in the output layer to opex_total_nonfuel so as not to confuse it with the opex_ component columns that add up to the total.

Copy link
Member

@zaneselvans zaneselvans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one minor comment where there is a pd.NA that I think should be an np.nan

src/pudl/output/ferc1.py Outdated Show resolved Hide resolved
Copy link
Member

@cmgosnell cmgosnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks great!

@aesharpe aesharpe merged commit ef53be1 into dev Jun 16, 2022
@zaneselvans zaneselvans deleted the fix-all-plants-ferc branch November 18, 2022 00:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants