Finding different missing FERC Respondent IDs in different contexts #1304

Open
zaneselvans opened this issue Oct 26, 2021 · 0 comments
Labels
bug: Things that are just plain broken.
ferc1: Anything having to do with FERC Form 1.
glue: PUDL specific structures & metadata. Stuff that connects datasets together.

Comments

@zaneselvans
Member

The results of running

pudl.glue.ferc1_eia.get_unmapped_utils_ferc1(ferc1_engine)

differ depending on whether the FERC 1 DB contains just the 2020 data (as it does in the tests) or all of the years of data (as it does when we are trying to identify new unmapped utilities).

When run against a FERC 1 DB that contains only 2020 data, it finds a utility with `respondent_id=542` that has no data associated with it. That ID exists only in the `f1_respondent_id` table, and all of its fields there are NA, including `respondent_name`, which ought to be "Missing Respondent 542" if we were filling in missing respondents correctly.

Now that the missing utility has been added to the mapping spreadsheet, it's not causing the tests to fail, but there's something wrong with how we are filling in missing utilities that we should probably fix.
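
To make the symptom concrete, here is a minimal sketch of how one might look for respondents like 542 and give them placeholder names. It runs against a toy in-memory SQLite database, not the real FERC 1 DB. The two-column `f1_respondent_id` schema is a simplification, `f1_steam` is just a stand-in for any data table, and `find_empty_respondents` and `fill_missing_names` are hypothetical helpers, not functions that exist in PUDL.

```python
import sqlite3


def find_empty_respondents(conn, data_tables):
    """Return respondent IDs with a NULL name and no rows in any data table.

    `data_tables` must be a trusted list of table names, since the names are
    interpolated directly into the SQL.
    """
    null_name_ids = [
        row[0]
        for row in conn.execute(
            "SELECT respondent_id FROM f1_respondent_id "
            "WHERE respondent_name IS NULL ORDER BY respondent_id"
        )
    ]
    empty = []
    for rid in null_name_ids:
        has_data = any(
            conn.execute(
                f"SELECT 1 FROM {table} WHERE respondent_id = ? LIMIT 1", (rid,)
            ).fetchone()
            for table in data_tables
        )
        if not has_data:
            empty.append(rid)
    return empty


def fill_missing_names(conn):
    """Give every NULL-named respondent a 'Missing Respondent <id>' name."""
    conn.execute(
        "UPDATE f1_respondent_id "
        "SET respondent_name = 'Missing Respondent ' || respondent_id "
        "WHERE respondent_name IS NULL"
    )


# Toy database: simplified, assumed schema with one real and one empty respondent.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE f1_respondent_id (
        respondent_id INTEGER PRIMARY KEY,
        respondent_name TEXT
    );
    CREATE TABLE f1_steam (respondent_id INTEGER, report_year INTEGER);
    INSERT INTO f1_respondent_id VALUES (1, 'Example Utility A'), (542, NULL);
    INSERT INTO f1_steam VALUES (1, 2020);
    """
)

empty_ids = find_empty_respondents(conn, ["f1_steam"])
fill_missing_names(conn)
names = dict(
    conn.execute("SELECT respondent_id, respondent_name FROM f1_respondent_id")
)
```

Running the same kind of check against both the 2020-only DB and the all-years DB might help narrow down where the two contexts diverge.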

@zaneselvans added the bug and ferc1 labels Oct 26, 2021
zaneselvans added a commit that referenced this issue Oct 26, 2021
The full ETL with all FERC1 and EIA 860/923 data will run without
obvious errors. There are still tests and validations that fail, but at
least you can load the DB.

This does *not* include eia860m or EPA CEMS data yet. FERC-714 and
EIA-861 also remain to be updated for 2020.

Issues that remain:
* Something screwy is going on with FERC respondent 542 -- it shows up
  only in the `f1_respondent_id` table, and has all Null data there...
  and our unmapped utility finder script failed to identify it.  See
  #1304
* A defensive assertion aimed at identifying human errors in the ID
  mapping sheet is failing because (probably?) we have a fair number of
  plants and utilities with IDs but no names in there now. See #1305 and
  also #1232
@cmgosnell added the glue label Jan 31, 2022