Correct CEMS for net vs. gross generation #245

Open
zaneselvans opened this issue Dec 6, 2018 · 14 comments
Labels
data-cleaning Tasks related to cleaning & regularizing data during ETL. epacems Integration and analysis of the EPA CEMS dataset. output Exporting data from PUDL into other platforms or interchange formats.

Comments

@zaneselvans
Member

The generation numbers which can be calculated from the EPA CEMS data need to be clearly identified as either net generation or gross generation, and potentially standardized. The EIA 923 generation table (not generation fuel) has information about net vs. gross electricity generation that should be helpful. This is reportedly a big pain in the ass, according to other people who have worked on it.

@zaneselvans zaneselvans added data-cleaning Tasks related to cleaning & regularizing data during ETL. epacems Integration and analysis of the EPA CEMS dataset. labels Dec 6, 2018
@zaneselvans zaneselvans added this to the future release milestone Dec 6, 2018
@gschivley
Contributor

I've seen a working paper that adjusted gross generation to an estimate of net and even estimated hourly hydro generation from stream flow data. Can't find it at the moment but I'll check around. It's also the paper that pointed out issues with generation from some combined cycle units (no generation reported to CEMS for the steam turbine IIRC).

@karldw karldw mentioned this issue Mar 28, 2019
@cmgosnell cmgosnell removed this from the future_release milestone Oct 4, 2019
@grgmiller
Collaborator

This has probably been resolved at this point, but based on a conversation with the EPA CAMD folks yesterday, the data in EPA CEMS is gross generation, including all generation before subtracting out house loads.

One of my research goals is to estimate hourly data for generators not included in CEMS (<25 MW) by deriving a relationship between gross and net generation, then converting net generation data from EIA-923 and EIA-930 to gross generation and emissions.

@zaneselvans
Member Author

Definitely not yet resolved! That's why the issue is still open :)

But also definitely something we want to get done. IIRC the 923 data has both net generation (in the generation_eia923 table) and gross generation (in the generation_fuel_eia923 table) on a monthly basis. Though we may have mislabeled the gross generation as net generation, now that I think about it. Need to look at that table definition more closely. The net generation in the generation table is by generator, but the generation fuel table is only plant-level data. But using the ratio between the net and gross generation and the per-generation-unit heat rates that the MCOE routines use, it ought to be possible to estimate the net-to-gross ratios. But it might be pretty dependent on the capacity factor or duty cycle of the generators. We could make the estimate on a monthly basis maybe. Or we could try to regress out the relationship between capacity factor (or number of startup/shutdown events) and the net-to-gross ratio.
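For what it's worth, the regression idea could look roughly like the sketch below. Everything here is an assumption for illustration: the monthly numbers are made up rather than taken from EIA-923 or CEMS, `fit_line` is a hypothetical helper, and a straight line in capacity factor is only one possible functional form.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up monthly observations for a single plant (NOT real data).
capacity_factor = [0.2, 0.4, 0.6, 0.8]
gross_mwh = [1000.0, 2000.0, 3000.0, 4000.0]  # e.g. summed from hourly CEMS
net_mwh = [910.0, 1840.0, 2790.0, 3760.0]     # e.g. monthly net from EIA-923

# Monthly net-to-gross ratio, then a linear fit against capacity factor.
net_to_gross = [net / gross for net, gross in zip(net_mwh, gross_mwh)]
intercept, slope = fit_line(capacity_factor, net_to_gross)


def predict_ratio(cf):
    """Predicted net-to-gross ratio at a given capacity factor."""
    return intercept + slope * cf
```

Startup/shutdown counts could be added as a second predictor, but that would need a proper multiple regression rather than this one-variable fit.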

@grgmiller
Collaborator

From what I've seen in the raw 923 files, it looks like all of the reported MWh data is net generation (or is at least labeled as such in the column headers). The only plant level gross generation data I've been able to find is in CEMS.

@zaneselvans
Member Author

Hmm, okay if that's the case then probably you'll need to use the fuel heat content -- it's reported in both CEMS, and in the generation_fuel_eia923 table, broken down into "fuel for electricity" and... other fuel. Which I assume (?) indicates how much fuel is going to parasitic loads, if the generation listed there really is net.

@gschivley
Contributor

The EIA923 Schedules_6_7 file has annual gross generation, station use, direct use, incoming electricity, etc. Might be helpful for estimating average plant-level ratios of gross to net generation.

@gschivley
Contributor

This NBER paper is also a good read for method ideas https://www.nber.org/papers/w23053.pdf

@zaneselvans
Member Author

Aaaaah, that's right, there are other files! We're generalizing the spreadsheet extraction process and will map all of the files to get at this data.

@grgmiller
Collaborator

Reading the user manual for EPA's AVERT tool, they state:
"Gross generation [from CAMD] is converted to net generation within the preprocessing engine using unit-specific parasitic loss factors. These factors were calculated based on a comparison of by-plant gross generation [as reported in CAMD] and by-plant net generation [from EIA-923] using 2015 data. Different loss factors are used for coal-fired steam units with and without sulfur controls (8.3% and 6.9%, respectively); natural gas-fired combined cycle units (3.3%) and combustion turbines (2.2%); and natural gas- or oil-fired steam units (7.7%). For example, a sulfur-controlled coal steam unit with an annual gross generation of 100 GWh is assumed to export a total of 91.7 GWh to the grid, while a natural gas-fired combined cycle unit with the same gross generation is assumed to export 96.7 GWh."

It seems like EPA may simply be calculating a ratio between the two numbers here, but it would probably make sense to perform a regression that takes the weighted capacity factor into account. That said, it might be hard to apply any such regression to estimate an hour-specific parasitic loss factor if the hourly capacity factor falls outside the range of monthly weighted capacity factors used in the regression.
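As a quick sanity check on the quoted numbers, here is a minimal sketch of applying flat loss factors the way the AVERT manual describes. The percentages come from the quote above; the dictionary keys and the `gross_to_net_mwh` function are illustrative names of mine, not AVERT or PUDL identifiers.

```python
# Parasitic loss factors as quoted from the AVERT user manual (2015 data).
# Keys are hypothetical labels chosen for this example.
AVERT_LOSS_FACTORS = {
    "coal_steam_sulfur_controlled": 0.083,
    "coal_steam_uncontrolled": 0.069,
    "gas_combined_cycle": 0.033,
    "gas_combustion_turbine": 0.022,
    "gas_or_oil_steam": 0.077,
}


def gross_to_net_mwh(gross_mwh, unit_type):
    """Scale gross generation down by a flat loss factor for the unit type."""
    return gross_mwh * (1.0 - AVERT_LOSS_FACTORS[unit_type])


# The manual's worked example: 100 GWh (100,000 MWh) of gross generation.
coal_net_gwh = gross_to_net_mwh(100_000.0, "coal_steam_sulfur_controlled") / 1000
ngcc_net_gwh = gross_to_net_mwh(100_000.0, "gas_combined_cycle") / 1000
print(round(coal_net_gwh, 1), round(ngcc_net_gwh, 1))
```

This reproduces the 91.7 GWh and 96.7 GWh figures in the manual's example, which is consistent with a simple ratio rather than anything capacity-factor dependent.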

@karldw
Contributor

karldw commented Feb 7, 2020

A non-paywalled version of the Cicala paper is here, with appendix.

@grgmiller
Collaborator

Interesting. It looks like the data in EIA Schedules 6 and 7 would be quite useful, although the set of plants in Schedules 6/7 (n=5215) is smaller than the set of plants in the other schedules (n=8714). Still, this would be a good starting point. I'll take a look at the Cicala paper and his method for what he calls "net-to-gross ratios", and report back about conversion factors.

@grgmiller
Collaborator

While working on this, I just wanted to highlight an observation about the data included in CEMS (which perhaps was already obvious to others looking at this data, but just wanted to be sure to post):

The gross_load_mw data is not generation (MWh) but load (MW), so to get the gross generation numbers to convert to net generation, you first have to multiply gross_load_mw by operating_time_hours to create a new column, which I call gross_generation_mwh. If you look at the data, a generator that has the same gross load in two hours but operates for only 0.5 hours in one of them will have about half the heat input in that hour.
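A minimal sketch of that conversion, using plain dicts in place of the real CEMS table. The column names gross_load_mw and operating_time_hours are the ones mentioned above; the two example rows are made up, and treating a missing value as missing output is my own assumption about how gaps should be handled.

```python
def gross_generation_mwh(gross_load_mw, operating_time_hours):
    """Convert average load (MW) over the operating time (hours) to energy (MWh).

    Returns None if either input is missing, rather than guessing a value.
    """
    if gross_load_mw is None or operating_time_hours is None:
        return None
    return gross_load_mw * operating_time_hours


# Two made-up hourly records with the same gross load but different run times.
hourly_records = [
    {"gross_load_mw": 200.0, "operating_time_hours": 1.0},
    {"gross_load_mw": 200.0, "operating_time_hours": 0.5},
]

for record in hourly_records:
    record["gross_generation_mwh"] = gross_generation_mwh(
        record["gross_load_mw"], record["operating_time_hours"]
    )
```

The half-hour record ends up with half the MWh of the full-hour record, which lines up with the roughly halved heat input seen in the data.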

@cmgosnell cmgosnell added the output Exporting data from PUDL into other platforms or interchange formats. label Sep 2, 2021
@karldw
Contributor

karldw commented Sep 6, 2022

Congrats on implementing this! It's been a while since I've thought about gross-to-net conversion, so I'm not sure I have anything helpful to add, but your write-up is great.


5 participants