NickCrews/fec-dumps

fec-dumps

A cron job that publishes .parquet versions of the Schedule A and Schedule B tables, extracted from the weekly .dump backups of the Federal Election Commission's database.

See fecgov/FEC#13168 for motivation.

Every week, a GitHub Action publishes all of the tables as parquet files to the Hugging Face dataset https://huggingface.co/datasets/NickCrews/fec-dumps. In DuckDB, you can get the 1985-1986 Schedule A data with simply SELECT * FROM 'https://huggingface.co/datasets/NickCrews/fec-dumps/resolve/main/disclosure.fec_fitem_sched_a_1985_1986.parquet'. The action overwrites the files on main, so the URLs stay stable over time, but the data behind them changes every week.
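As a rough sketch of querying these files from Python: the only file name confirmed above is the 1985-1986 Schedule A one, so the `parquet_url` helper below ASSUMES that other two-year cycles and Schedule B follow the same naming pattern. The helper names are hypothetical, and the `preview` function assumes the third-party `duckdb` package is installed and can read over HTTPS.

```python
BASE_URL = "https://huggingface.co/datasets/NickCrews/fec-dumps/resolve/main"


def parquet_url(schedule: str, start_year: int) -> str:
    """Build the URL for one two-year file.

    ASSUMPTION: every file follows the naming pattern of the documented
    1985-1986 Schedule A file. Check the dataset page for the real file list.
    """
    if schedule not in ("a", "b"):
        raise ValueError("schedule must be 'a' or 'b'")
    name = f"disclosure.fec_fitem_sched_{schedule}_{start_year}_{start_year + 1}.parquet"
    return f"{BASE_URL}/{name}"


def preview_query(url: str, limit: int = 5) -> str:
    """Return a DuckDB query that reads the first few rows of a remote parquet file."""
    return f"SELECT * FROM '{url}' LIMIT {int(limit)}"


def preview(url: str, limit: int = 5):
    """Run the preview query. Requires network access and `pip install duckdb`."""
    import duckdb  # third-party, imported lazily so the helpers above work without it

    return duckdb.sql(preview_query(url, limit)).fetchall()
```

Calling `preview(parquet_url("a", 1985))` should return a handful of rows from the file named above. Because the files are overwritten weekly, results from the same URL can differ from one week to the next.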

Note that every column in the parquet files is currently a string. I may change this later, so any code that consumes these files should be defensive against future type changes.
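One way to be defensive, sketched below, is to route values through small converters that accept either today's strings or already-typed values, so the same code keeps working if the column types change. The function names are hypothetical, and the ISO date format is an assumption about how dates appear in the files, not something confirmed here.

```python
from datetime import date, datetime
from decimal import Decimal, InvalidOperation


def to_decimal(value):
    """Coerce a string or numeric value to Decimal; return None if that is not possible."""
    if value is None:
        return None
    if isinstance(value, Decimal):
        return value
    if isinstance(value, (int, float)):
        # Already numeric (possible after a future schema change).
        return Decimal(str(value))
    text = str(value).strip()
    if not text:
        return None
    try:
        return Decimal(text)
    except InvalidOperation:
        return None


def to_date(value):
    """Coerce a string or date-like value to a date; return None if that is not possible.

    ASSUMPTION: string dates are ISO formatted (e.g. 1985-03-02).
    """
    if value is None:
        return None
    if isinstance(value, datetime):
        return value.date()
    if isinstance(value, date):
        return value
    text = str(value).strip()
    if not text:
        return None
    try:
        return datetime.fromisoformat(text).date()
    except ValueError:
        return None
```

Returning None for unparseable input keeps a single bad row from aborting a whole load; raising instead may be the better choice if silent gaps would be worse for your use case.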

License

I just repackaged the existing FEC data in an easier format, so the normal regulations on FEC data still apply.

All of my code is MIT-licensed, so you are free to modify and use it as you wish. With the data, though, don't do anything that the FEC doesn't like.

Methodology

This uses the https://github.com/NickCrews/pg_dumpster CLI tool to extract the table data from the Postgres .dump files that the FEC publishes each week at https://cg-519a459a-0ea3-42c2-b7bc-fa1143481f74.s3-us-gov-west-1.amazonaws.com/bulk-downloads/index.html?prefix=bulk-downloads/data-dump/schedules/
