Unpacking the U.S. Senate Lobbying Disclosure databases

datahoarder/senate_lobbying

Senate Lobbying stuff

Test

(just want to try out xmltodict)

from shutil import unpack_archive
from os.path import basename
import requests
import xmltodict
import json
from glob import glob

url = 'http://soprweb.senate.gov/downloads/2015_1.zip'
fname = basename(url)
resp = requests.get(url)
resp.raise_for_status()  # stop here if the download failed
with open(fname, 'wb') as f:
    f.write(resp.content)
unpack_archive(fname)

xmlfiles = glob("*.xml")
xname = xmlfiles[0]
# the XML files in the archive are UTF-16 encoded
with open(xname, 'r', encoding='utf-16') as x:
    xmltxt = x.read()
# namespacedata = xmltodict.parse(xmltxt, process_namespaces=True)

data = xmltodict.parse(xmltxt)
jname = xname + '.json'
with open(jname, 'w') as j:
    json.dump(data, j, indent=2)
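
To get a feel for the structure xmltodict returns, here is a rough sketch on a tiny inline document. The element and attribute names (`PublicFilings`, `Filing`, `ID`, `RegistrantName`) and the helper `list_filings` are assumptions made up for illustration; check them against the real files. It shows that attributes come back as keys prefixed with `@`, and that `force_list` keeps a repeated element a list even when only one is present.

```python
import xmltodict

# Hypothetical sample; the real Senate files may use different names.
SAMPLE_XML = """<PublicFilings>
  <Filing ID="a1" Year="2015">
    <Registrant RegistrantName="Example Firm"/>
  </Filing>
  <Filing ID="b2" Year="2015"/>
</PublicFilings>"""


def list_filings(xml_text):
    """Parse xml_text and return the Filing entries as a list of dicts."""
    # force_list makes "Filing" a list even if the document has only one
    data = xmltodict.parse(xml_text, force_list=("Filing",))
    return data["PublicFilings"]["Filing"]


filings = list_filings(SAMPLE_XML)
# XML attributes are exposed as dict keys with an "@" prefix
ids = [f["@ID"] for f in filings]
print(ids)
```

Without `force_list`, a file containing a single `Filing` would give a dict instead of a list, which tends to break loops written against the multi-filing case.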
