A unified Python library for inference across multiple Large Language Model providers. Built for research workflows, this library provides a consistent interface for OpenAI, AWS Bedrock, and Azure OpenAI (Versa) models with built-in caching, batch processing, and error tracking.
- Multi-provider support: Unified API for OpenAI, AWS Bedrock (Claude, Llama, Cohere, Qwen), and Azure OpenAI
- Response caching: DuckDB-based caching to avoid redundant API calls and reduce costs (see the caching sketch after this list)
- Batch processing: Async batch inference with configurable concurrency
- Structured output: Pydantic model validation for enforcing response schemas
- Error tracking: Comprehensive error classification and JSONL logging for debugging
- Reasoning model support: Special handling for GPT-5 reasoning models with effort/verbosity parameters
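Caching is meant to be transparent: a repeated identical call should be answered from the local DuckDB file rather than the provider. A minimal sketch under that assumption, using only the classes introduced in the Quick Start below:

```python
from dotenv import load_dotenv

from lab_llm.llm_api import LLMApi
from lab_llm.llm_cache import LLMCache
from lab_llm.duckdb_handler import DuckDBHandler
from lab_llm.constants import OpenAi, LLMModel

load_dotenv()

# Responses are persisted in a local DuckDB file between runs.
cache = LLMCache(DuckDBHandler("./cache.db"))
api = LLMApi(cache=cache, seed=42, model_type=LLMModel(name=OpenAi.GPT4_O_MINI))

first = api.get_output("What is the capital of France?")   # calls the API
second = api.get_output("What is the capital of France?")  # served from the cache
```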
```bash
pip install lab-llm
```

Or install from source:

```bash
git clone https://github.com/jjfenglab/llm-api.git
cd llm-api
pip install -e .
```

```python
import asyncio
from dotenv import load_dotenv
from lab_llm.llm_api import LLMApi
from lab_llm.constants import OpenAi, LLMModel
from lab_llm.dataset import TextDataset
from lab_llm.llm_cache import LLMCache
from lab_llm.error_callback_handler import ErrorCallbackHandler
from lab_llm.duckdb_handler import DuckDBHandler
load_dotenv()
# Initialize components
db_handle = DuckDBHandler("./cache.db")
cache = LLMCache(db_handle)
model = LLMModel(name=OpenAi.GPT4_O_MINI)
# Create API instance
api = LLMApi(cache=cache, seed=42, model_type=model)
# Single prompt
response = api.get_output("What is the capital of France?")
print(response)
# Batch processing
dataset = TextDataset(["What is 2+2?", "What is the speed of light?"])
responses = asyncio.run(api.get_outputs(dataset))
print(responses)
```

Create a `.env` file in your project directory with the required credentials:

```bash
# For OpenAI models
OPENAI_ACCESS_TOKEN=your_openai_api_key
# For AWS Bedrock models (Claude, Llama, Cohere, Qwen)
BEDROCK_ACCESS_KEY=your_aws_access_key
BEDROCK_ACCESS_KEY_SECRET=your_aws_secret_key
# For Azure OpenAI / Versa models
VERSA_API_KEY=your_versa_api_key
VERSA_ENDPOINT=https://your-endpoint.openai.azure.com/openai/deployments/<model_name>/chat/completions?api-version=2024-10-21
```

Load the environment variables in your code:

```python
from dotenv import load_dotenv
load_dotenv()
```

OpenAI models:

- `OpenAi.GPT4_O` - GPT-4o
- `OpenAi.GPT4_O_MINI` - GPT-4o Mini
- `OpenAi.GPT5` - GPT-5 (reasoning model)
- `OpenAi.GPT5_MINI` - GPT-5 Mini (reasoning model)
- `OpenAi.GPT5_NANO` - GPT-5 Nano (reasoning model)
Azure OpenAI (Versa) models:

- `VersaOpenAi.GPT4_O_2024_08` - GPT-4o (August 2024)
- `VersaOpenAi.GPT4_O_MINI_2024_07` - GPT-4o Mini (July 2024)
- `VersaOpenAi.GPT5_2025_08` - GPT-5 (August 2025)
- And more...
AWS Bedrock models:

- `Claude.HAIKU_3` - Claude 3 Haiku
- `Claude.HAIKU_3_5` - Claude 3.5 Haiku
- `Claude.SONNET_4_5` - Claude Sonnet 4.5
- `Meta.LLAMA_3_3_70B` - Llama 3.3 70B
- `Meta.LLAMA_3_2_11B` - Llama 3.2 11B
- `Cohere.COMMAND_R` - Command R
- `Qwen.QWEN_3_235` - Qwen 3 235B
For the complete list, see `lab_llm/constants.py`.
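Switching providers should only require a different model constant. A minimal sketch, assuming `Claude` is exported from `lab_llm.constants` alongside `OpenAi`, and that the `BEDROCK_*` credentials from the configuration section are set:

```python
from dotenv import load_dotenv

from lab_llm.llm_api import LLMApi
from lab_llm.llm_cache import LLMCache
from lab_llm.duckdb_handler import DuckDBHandler
from lab_llm.constants import Claude, LLMModel  # Claude assumed exported here

load_dotenv()  # loads BEDROCK_ACCESS_KEY / BEDROCK_ACCESS_KEY_SECRET

cache = LLMCache(DuckDBHandler("./cache.db"))
model = LLMModel(name=Claude.HAIKU_3_5)  # Claude 3.5 Haiku via AWS Bedrock
api = LLMApi(cache=cache, seed=42, model_type=model)

print(api.get_output("What is the capital of France?"))
```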
To enforce a response schema, pass a Pydantic model via `response_model`:

```python
from pydantic import BaseModel

class Answer(BaseModel):
    answer: str
    confidence: float
response = api.get_output(
    "What is 2+2?",
    response_model=Answer,
)
print(response.answer, response.confidence)
```

Responses that fail schema validation are logged as permanent errors (see the error tracking section below).

GPT-5 reasoning models take additional `reasoning_effort` and `verbosity` parameters:

```python
from lab_llm.constants import OpenAi, LLMModel
model = LLMModel(name=OpenAi.GPT5)
api = LLMApi(
    cache=cache,
    model_type=model,
    reasoning_effort="medium",  # low, medium, high
    verbosity="concise",        # concise, detailed
)
```

A system prompt can be supplied per request:

```python
response = api.get_output(
"Analyze this data",
system_prompt="You are a data scientist specializing in statistical analysis."
)lab_llm provides error tracking to help debug failures during research workflows.
```python
import logging

from lab_llm.error_tracker import ErrorTracker
from lab_llm.error_callback_handler import ErrorCallbackHandler

# A standard logger for the error handler (any logging.Logger works)
logger = logging.getLogger(__name__)
# Create error tracker (logs to JSONL file)
error_tracker = ErrorTracker("study_errors.jsonl")
# Pass to error handler
error_handler = ErrorCallbackHandler(logger, error_tracker=error_tracker)
# Use with LLMApi
llm_api = LLMApi(
    cache=cache,
    error_handler=error_handler,
    # ... other params
)
```

Inspect the logged errors after a run:

```python
import pandas as pd
from lab_llm.error_tracker import ErrorTracker
tracker = ErrorTracker("study_errors.jsonl")
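# A possible use of the pandas import above: load the raw JSONL log into a
# DataFrame for ad-hoc analysis (column names depend on the log schema).
df = pd.read_json("study_errors.jsonl", lines=True)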
# Get error summary
summary = tracker.get_summary()
print(summary)
# Analyze transient errors (should retry)
transient = tracker.get_transient_errors()
# Analyze permanent errors (need fixes)
permanent = tracker.get_permanent_errors()
# Investigate specific prompt
errors = tracker.get_errors_by_prompt(prompt_hash)  # prompt_hash: hash identifying the prompt in the log
```

Error Categories:
- `transient`: Timeouts, rate limits, network errors (retried automatically)
- `permanent`: Validation errors, serialization errors (need prompt/code fixes)
- `user_interrupt`: Keyboard interrupts (stops execution)
- `unknown`: Unclassified errors
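Transient failures are natural candidates for a re-run: cached successes are untouched, so only the failed prompts cost another API call. A minimal sketch, assuming each record from `get_transient_errors()` exposes the original prompt text under a `prompt` key (a hypothetical field name; inspect a record to confirm your log schema):

```python
from dotenv import load_dotenv

from lab_llm.llm_api import LLMApi
from lab_llm.llm_cache import LLMCache
from lab_llm.duckdb_handler import DuckDBHandler
from lab_llm.constants import OpenAi, LLMModel
from lab_llm.error_tracker import ErrorTracker

load_dotenv()
api = LLMApi(
    cache=LLMCache(DuckDBHandler("./cache.db")),
    seed=42,
    model_type=LLMModel(name=OpenAi.GPT4_O_MINI),
)

tracker = ErrorTracker("study_errors.jsonl")

# "prompt" is a hypothetical field name; adjust to match your log schema.
failed_prompts = {record["prompt"] for record in tracker.get_transient_errors()}

for prompt in failed_prompts:
    print(api.get_output(prompt))
```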
For a complete example, see `examples/analyze_failures.ipynb`.
Run all tests:
```bash
pytest tests/ -v
```

Run integration tests (requires API credentials in `.env`):

```bash
pytest tests/test_integration.py -v
```

To publish a new release:

- Update version in `pyproject.toml`
- Add changes to `CHANGELOG.md`
- Test installation: `pip install -e .`
- Create a new release tag
MIT License - see `LICENSE` for details.
If you use this library in your research, please cite:
```bibtex
@software{feng_lab_llm,
  author = {Feng Lab, UCSF},
  title = {Feng Lab LLM API},
  url = {https://github.com/jjfenglab/llm-api}
}
```