Professional Python SDK for the DataQuery API - High-performance data access with parallel downloads, time series queries, and seamless OAuth 2.0 authentication.
- High-Performance Downloads: Parallel file downloads with automatic retry and progress tracking
- Time Series Queries: Query data by expressions, instruments, or groups with flexible filtering
- OAuth 2.0 Authentication: Automatic token management and refresh
- Connection Pooling: Optimized HTTP connections with configurable rate limiting
- Pandas Integration: Direct conversion to DataFrames for analysis
- Async & Sync APIs: Use async/await or synchronous methods based on your needs
```bash
pip install dataquery-sdk
```

Set your API credentials as environment variables:

```bash
export DATAQUERY_CLIENT_ID="your_client_id"
export DATAQUERY_CLIENT_SECRET="your_client_secret"
```

Or create a `.env` file in your project directory:

```
DATAQUERY_CLIENT_ID=your_client_id
DATAQUERY_CLIENT_SECRET=your_client_secret
```

**Synchronous (Python Scripts)**
```python
from dataquery import DataQuery

# Download all files for a date range
with DataQuery() as dq:
    results = dq.run_group_download(
        group_id="JPMAQS_GENERIC_RETURNS",
        start_date="20250101",
        end_date="20250131",
        destination_dir="./data"
    )
    print(f"Downloaded {results['successful_downloads']} files")
```

**Asynchronous (Jupyter Notebooks)**

```python
from dataquery import DataQuery

# Download all files for a date range
async with DataQuery() as dq:
    results = await dq.run_group_download_async(
        group_id="JPMAQS_GENERIC_RETURNS",
        start_date="20250101",
        end_date="20250131",
        destination_dir="./data"
    )
    print(f"Downloaded {results['successful_downloads']} files")
```

Query time series by expression:

```python
from dataquery import DataQuery

async with DataQuery() as dq:
    # Query by expression
    result = await dq.get_expressions_time_series_async(
        expressions=["DB(MTE,IRISH EUR 1.100 15-May-2029 LON,,IE00BH3SQ895,MIDPRC)"],
        start_date="20240101",
        end_date="20240131"
    )

    # Convert to pandas DataFrame
    df = dq.to_dataframe(result)
    print(df.head())
```

List available groups:

```python
from dataquery import DataQuery

async with DataQuery() as dq:
    # List all available groups
    groups = await dq.list_groups_async(limit=100)

    # Convert to DataFrame for easy viewing
    groups_df = dq.to_dataframe(groups)
    print(groups_df[['group_id', 'group_name', 'description']])
```

Download a specific file:

```python
from dataquery import DataQuery
from pathlib import Path

async with DataQuery() as dq:
    result = await dq.download_file_async(
        file_group_id="JPMAQS_GENERIC_RETURNS",
        file_datetime="20250115",
        destination_path=Path("./downloads")
    )
    print(f"Downloaded: {result.local_path}")
```

Query a group with filters:

```python
async with DataQuery() as dq:
    # Get time series for Ireland bonds only
    result = await dq.get_group_time_series_async(
        group_id="FI_GO_BO_EA",
        attributes=["MIDPRC", "REPO_1M"],
        filter="country(IRL)",
        start_date="20240101",
        end_date="20240131"
    )
    df = dq.to_dataframe(result)
```

Search for instruments, then query their time series:

```python
async with DataQuery() as dq:
    # Search for instruments by keywords
    results = await dq.search_instruments_async(
        group_id="FI_GO_BO_EA",
        keywords="irish"
    )

    # Use the results to query time series
    instrument_ids = [inst.instrument_id for inst in results.instruments[:5]]
    data = await dq.get_instrument_time_series_async(
        instruments=instrument_ids,
        attributes=["MIDPRC"],
        start_date="20240101",
        end_date="20240131"
    )
```

Download with higher concurrency and parallel chunks:

```python
async with DataQuery() as dq:
    # Download multiple files concurrently with parallel chunks
    results = await dq.run_group_download_async(
        group_id="JPMAQS_GENERIC_RETURNS",
        start_date="20250101",
        end_date="20250131",
        destination_dir="./data",
        max_concurrent=5,  # Download 5 files simultaneously
        num_parts=4        # Split each file into 4 parallel chunks
    )
```

Recommended settings:

- `max_concurrent`: 3-5 (concurrent file downloads)
- `num_parts`: 2-8 (parallel chunks per file)
Configure rate limits to avoid API throttling:
```python
from dataquery import DataQuery, ClientConfig

config = ClientConfig(
    client_id="your_client_id",
    client_secret="your_client_secret",
    rate_limit_rpm=300,  # Requests per minute
    max_retries=3,
    timeout=60.0
)

async with DataQuery(config=config) as dq:
    # Your code here
    pass
```

Supported environment variables:

```
# Required
DATAQUERY_CLIENT_ID=your_client_id
DATAQUERY_CLIENT_SECRET=your_client_secret

# Optional - API Endpoints
DATAQUERY_BASE_URL=https://api-developer.jpmorgan.com
DATAQUERY_FILES_BASE_URL=https://api-strm-gw01.jpmchase.com

# Optional - Performance
DATAQUERY_MAX_RETRIES=3
DATAQUERY_TIMEOUT=60
DATAQUERY_RATE_LIMIT_RPM=300
```

Or configure the client programmatically:

```python
from dataquery import DataQuery, ClientConfig

config = ClientConfig(
    client_id="your_client_id",
    client_secret="your_client_secret",
    base_url="https://api-developer.jpmorgan.com",
    max_retries=3,
    timeout=60.0,
    rate_limit_rpm=300
)

async with DataQuery(config=config) as dq:
    # Your code here
    pass
```

The SDK raises typed exceptions that can be handled individually:

```python
from dataquery import DataQuery
from dataquery.exceptions import (
    DataQueryError,
    AuthenticationError,
    NotFoundError,
    RateLimitError
)

async def safe_query():
    try:
        async with DataQuery() as dq:
            result = await dq.get_expressions_time_series_async(
                expressions=["DB(...)"],
                start_date="20240101",
                end_date="20240131"
            )
            return result
    except AuthenticationError as e:
        print(f"Authentication failed: {e}")
    except NotFoundError as e:
        print(f"Resource not found: {e}")
    except RateLimitError as e:
        print(f"Rate limit exceeded: {e}")
    except DataQueryError as e:
        print(f"API error: {e}")
    except Exception as e:
        print(f"Unexpected error: {e}")
```

Dates can be passed in YYYYMMDD format:

```python
start_date="20240101"  # YYYYMMDD format
end_date="20241231"
```

Relative dates are also supported:

```python
start_date="TODAY"     # Today
start_date="TODAY-1D"  # Yesterday
start_date="TODAY-1W"  # 1 week ago
start_date="TODAY-1M"  # 1 month ago
start_date="TODAY-1Y"  # 1 year ago
```

Supported calendar conventions:

| Calendar | Description | Use Case |
|---|---|---|
| `CAL_WEEKDAYS` | Monday-Friday | International data (recommended) |
| `CAL_USBANK` | US banking days | US-only data (default) |
| `CAL_WEEKDAY_NOHOLIDAY` | All weekdays | Generic business days |
| `CAL_DEFAULT` | Calendar day | Include weekends |
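As a sketch of how the two conventions combine, a group query could request the last month of weekday data. The `calendar` keyword below is an assumption for illustration only; check the API Reference for the exact parameter name:

```python
from dataquery import DataQuery

async with DataQuery() as dq:
    # Hypothetical combination of a relative date range and a calendar
    result = await dq.get_group_time_series_async(
        group_id="FI_GO_BO_EA",
        attributes=["MIDPRC"],
        start_date="TODAY-1M",   # relative date, one month back
        end_date="TODAY",
        calendar="CAL_WEEKDAYS"  # assumed keyword; see API Reference
    )
    df = dq.to_dataframe(result)
```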
The examples/ directory contains comprehensive examples:
- File Downloads: Single file, batch downloads, availability checks
- Time Series: Expressions, instruments, groups with filters
- Discovery: Search instruments, list groups, get attributes
- Advanced: Grid data, auto-download, custom progress tracking
Run an example:
```bash
python examples/files/download_file.py
python examples/expressions/get_expressions_time_series.py
```

The SDK includes a command-line interface:

```bash
# Download files
dataquery download --group-id JPMAQS_GENERIC_RETURNS \
    --start-date 20250101 \
    --end-date 20250131 \
    --destination ./data

# List groups
dataquery list-groups --limit 100

# Check file availability
dataquery check-availability --file-group-id JPMAQS_GENERIC_RETURNS \
    --date 20250115
```

**File Downloads**
- `download_file_async()` - Download a single file
- `run_group_download_async()` - Download all files in a date range
- `list_available_files_async()` - Check file availability
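A minimal sketch of checking availability before a bulk download; the keyword arguments to `list_available_files_async()` are assumptions here, so see the API Reference for the exact signature:

```python
from dataquery import DataQuery

async with DataQuery() as dq:
    # Sketch: list which files exist for January before downloading them
    files = await dq.list_available_files_async(
        group_id="JPMAQS_GENERIC_RETURNS",  # assumed parameter name
        start_date="20250101",
        end_date="20250131"
    )
    print(dq.to_dataframe(files))
```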
**Time Series Queries**
- `get_expressions_time_series_async()` - Query by expression
- `get_instrument_time_series_async()` - Query by instrument ID
- `get_group_time_series_async()` - Query an entire group with filters
**Discovery**
- `list_groups_async()` - List available data groups
- `search_instruments_async()` - Search for instruments
- `list_instruments_async()` - List all instruments in a group
- `get_group_attributes_async()` - Get available attributes
- `get_group_filters_async()` - Get available filters
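For instance, a discovery pass can list a group's attributes and filters before building a query. This sketch assumes the same `group_id` keyword used elsewhere in this README and that the responses convert via `to_dataframe()`:

```python
from dataquery import DataQuery

async with DataQuery() as dq:
    # Sketch: inspect what a group exposes before querying it
    attributes = await dq.get_group_attributes_async(group_id="FI_GO_BO_EA")
    filters = await dq.get_group_filters_async(group_id="FI_GO_BO_EA")
    print(dq.to_dataframe(attributes))
    print(dq.to_dataframe(filters))
```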
**Utilities**
- `to_dataframe()` - Convert any response to a pandas DataFrame
- `health_check_async()` - Check API health
- `get_stats()` - Get connection and rate limit statistics
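A short sketch of using the utility calls for basic monitoring; the shape of the returned values is an assumption:

```python
from dataquery import DataQuery

async with DataQuery() as dq:
    # Sketch: verify connectivity, then inspect connection/rate-limit stats
    health = await dq.health_check_async()
    print(f"API health: {health}")
    print(dq.get_stats())
```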
For detailed API documentation, see the API Reference.
- Python 3.10 or higher
- Dependencies:
  - `aiohttp>=3.8.0` - Async HTTP client
  - `pydantic>=2.0.0` - Data validation
  - `structlog>=23.0.0` - Structured logging
  - `python-dotenv>=1.0.0` - Environment variable management

Optional:

- `pandas>=2.0.0` - For DataFrame conversion
To set up a development environment:

```bash
# Clone the repository
git clone https://github.com/dataquery/dataquery-sdk.git
cd dataquery-sdk

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install development dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install
```

Run the tests:

```bash
# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=dataquery --cov-report=html

# Run specific test file
pytest tests/test_client.py -v
```

Format, lint, and type-check the code:

```bash
# Format code
black dataquery/ tests/

# Check linting
flake8 dataquery/ tests/ examples/

# Type checking
mypy dataquery/
```

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
This project is licensed under the MIT License - see the LICENSE file for details.
For issues and questions:
- GitHub Issues: Report a bug
- Documentation: Read the docs
- Email: support@dataquery.com
See CHANGELOG.md for version history and release notes.