Description
Dear Congress.gov API Team,
We are currently using the Congress.gov API to ingest structured metadata for CRS (Congressional Research Service) reports. The API endpoints for structured data function as expected and meet our requirements.
However, we are encountering issues when attempting to programmatically retrieve the associated PDF and HTML documents referenced in the API responses. The URLs provided point to the congress.gov domain, which appears to be protected by Cloudflare. Automated requests to download these files are blocked or challenged, preventing reliable programmatic access.
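For anyone trying to reproduce this, the following is a minimal sketch of how a client might tell a Cloudflare challenge apart from an ordinary error. It assumes the block surfaces as an HTTP 403, 429, or 503 response carrying either a `cf-mitigated: challenge` header or a `Server: cloudflare` header; the function name `looks_like_cloudflare_challenge` is hypothetical, and the heuristic has not been verified against actual congress.gov responses.

```python
from typing import Mapping


def looks_like_cloudflare_challenge(status: int, headers: Mapping[str, str]) -> bool:
    """Heuristically decide whether a response is a Cloudflare block or challenge.

    Assumption: a challenge carries `cf-mitigated: challenge`, or is a
    403/429/503 served with `Server: cloudflare`. This is a heuristic,
    not a guarantee.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value.lower() for name, value in headers.items()}

    # Explicit signal that Cloudflare issued a challenge page.
    if lowered.get("cf-mitigated") == "challenge":
        return True

    # Fallback: an error status served by Cloudflare's edge.
    return status in (403, 429, 503) and "cloudflare" in lowered.get("server", "")
```

A downloader could call this on each response's status code and headers, then log or skip challenged documents instead of saving the challenge page as if it were the PDF or HTML file.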
We would appreciate clarification on the following:
1. Is there an officially supported mechanism for programmatic retrieval of CRS PDF and HTML documents?
2. Are alternative endpoints, bulk download options, or authenticated access methods available for accessing these materials?
3. If access to congress.gov-hosted documents is intentionally restricted, is there a recommended approach for compliant automated retrieval?
Our use case involves systematic ingestion of publicly available CRS materials for research and indexing purposes. We are prepared to adhere to any applicable rate limits, authentication requirements, or usage policies.
Thank you in advance for any guidance on how best to proceed.