Skip to content

feat(image-generation): add BFL (Black Forest Labs) provider#54

Merged
Kamilbenkirane merged 3 commits intomainfrom
providers/bfl
Dec 1, 2025
Merged

feat(image-generation): add BFL (Black Forest Labs) provider#54
Kamilbenkirane merged 3 commits intomainfrom
providers/bfl

Conversation

@Kamilbenkirane
Copy link
Member

Summary

Add support for BFL (Black Forest Labs) FLUX.2 image generation models:

  • flux-2-pro: Professional model with automatic prompt upsampling
  • flux-2-flex: Flexible model with configurable steps and guidance

Changes

  • Add BFL provider with polling-based async generation
  • Add BFL to Provider enum and credential system
  • Add new ImageGenerationParameter values: guidance, steps, seed, prompt_upsampling, safety_tolerance, output_format
  • Add billed_units field to ImageGenerationUsage for BFL cost tracking
  • Add BFL to integration tests (uses flux-2-pro with 1024x1024 for minimum cost)
  • Add BFL_API_KEY to CI workflow for publish integration tests

Files Changed

Area Files
Core src/celeste/core.py, src/celeste/credentials.py
Provider providers/bfl/ (client, config, models, parameters)
Shared io.py, parameters.py, models.py, providers/__init__.py
Tests test_generate.py
CI .github/workflows/publish.yml, .env.example

Test Plan

  • All 4 integration tests pass locally (BFL, OpenAI, Google, ByteDance)
  • Ruff, mypy, and bandit checks pass
  • Add BFL_API_KEY secret to GitHub repository for CI

Add support for BFL's FLUX.2 image generation models:
- flux-2-pro: Professional model with automatic prompt upsampling
- flux-2-flex: Flexible model with configurable steps and guidance

Changes:
- Add BFL provider with polling-based async generation
- Add BFL to Provider enum and credential system
- Add new ImageGenerationParameter values (guidance, steps, seed, etc.)
- Add billed_units field to ImageGenerationUsage
- Add BFL to integration tests and CI workflow
- Support BFL_API_KEY environment variable
@claude
Copy link

claude bot commented Dec 1, 2025

PR Review: BFL Provider Implementation

Overview

Comprehensive BFL FLUX.2 implementation. Well-structured code following existing patterns.

Critical Issues

1. Response Construction Bug (client.py:131-135)

httpx.Response does not accept json parameter. Current code will fail at runtime.

Fix: Use content parameter with JSON-encoded bytes

import json
content = json.dumps(final_data).encode('utf-8')
return httpx.Response(status_code=200, content=content, headers={'content-type': 'application/json'}, request=httpx.Request('GET', polling_url))

2. Missing Input Validation (parameters.py:26-32)

AspectRatioMapper lacks validation before indexing split result.

Fix: Add length check before indexing parts array

Should Fix

3. API Key Pattern (client.py:80, 105)

Use consistent pattern with AUTH_HEADER_PREFIX even when empty for maintainability

4. Timeout Configuration (config.py:12)

120s polling timeout may conflict with httpx default 60s timeout. Needs documentation or fix.

5. Test Coverage

Add unit tests for parameter mappers, polling timeout, error handling

Security ✅

  • Proper SecretStr usage
  • No injection risks
  • Type-safe structures

Performance ✅

  • Async/await correctly used
  • Connection pooling inherited
  • Reasonable 0.5s polling interval

Summary

Solid implementation with good architecture. Fix critical bugs (#1, #2) before merge. Address validation and consistency issues (#3, #4, #5) for production quality.

Great work overall!

@claude
Copy link

claude bot commented Dec 1, 2025

Additional Review Details

Code Quality Observations

Strengths:

  • Clean separation: client, config, models, parameters files
  • Comprehensive parameter support: guidance, steps, seed, safety_tolerance, output_format, prompt_upsampling
  • Good model constraints with aspect ratio presets (1024x1024, 1920x1080, etc.)
  • Proper async polling implementation with configurable intervals

Detailed Issues:

Issue #1: Response Construction (CRITICAL)

File: packages/image-generation/src/celeste_image_generation/providers/bfl/client.py:131-135, 137-141

The httpx.Response() constructor signature is:
Response(status_code, headers=None, content=b'', ...)

Your code passes json=final_data which is not a valid parameter. This will raise TypeError when the polling completes.

Working fix:

final_data = {**poll_data, "_submit_metadata": submit_data}
content = json.dumps(final_data).encode('utf-8')
return httpx.Response(
    status_code=200,
    content=content,
    headers={"content-type": "application/json"},
    request=httpx.Request("GET", polling_url),
)

Issue #2: Aspect Ratio Validation (HIGH)

File: packages/image-generation/src/celeste_image_generation/providers/bfl/parameters.py:26-28

Current code assumes split always returns 2 parts:

parts = validated_value.split("x")
width = int(parts[0])  # IndexError if only 1 part
height = int(parts[1])  # IndexError if only 1 part

Add validation:

parts = validated_value.split("x")
if len(parts) != 2:
    raise ValueError(f"Invalid aspect_ratio: {validated_value}. Expected WIDTHxHEIGHT")
try:
    width = int(parts[0])
    height = int(parts[1])
except ValueError as e:
    raise ValueError(f"Invalid dimensions in aspect_ratio: {validated_value}") from e

Issue #3: Polling Timeout vs HTTP Timeout

File: packages/image-generation/src/celeste_image_generation/providers/bfl/config.py:12

POLLING_TIMEOUT = 120.0 but http_client default timeout is 60.0 seconds. Individual GET requests in polling loop may timeout before the 120s polling timeout.

Consider passing timeout parameter to http_client.get() calls or document this behavior.

Test Coverage Gaps

  • No unit tests for new code (0% coverage for bfl/ directory)
  • Integration test only covers flux-2-pro, not flux-2-flex
  • No tests for model-specific parameters (steps, guidance)
  • No error case testing (timeout, malformed responses)
  • No validation testing for AspectRatioMapper rounding logic

Minor: API Key Header Pattern

While functionally correct, using consistent pattern improves maintainability:

# Current (works but inconsistent)
config.AUTH_HEADER_NAME: self.api_key.get_secret_value()

# Recommended (consistent with other providers)
config.AUTH_HEADER_NAME: f"{config.AUTH_HEADER_PREFIX}{self.api_key.get_secret_value()}"

All other providers use this pattern even when prefix is empty.

@claude
Copy link

claude bot commented Dec 1, 2025

Positive Highlights & Final Recommendations

What Works Really Well ✅

  1. Architecture alignment: Perfectly follows the Celeste provider pattern
  2. Type safety: Full type annotations, will pass mypy strict checks
  3. Error messages: Clear, descriptive error messages with provider context
  4. Polling design: Smart two-phase approach (submit + poll) handles BFL async API well
  5. Parameter completeness: All BFL-specific parameters properly mapped
  6. Dimension constraints: Excellent preset definitions for common use cases
  7. Clean imports: Proper all exports, no circular dependencies

Integration Quality

The provider integrates seamlessly:

  • ✅ Added to Provider enum (core.py)
  • ✅ Credentials system updated (credentials.py)
  • ✅ Models exported properly (models.py)
  • ✅ Client registered in providers/init.py
  • ✅ CI workflow configured
  • ✅ .env.example updated

Recommended Action Plan

Before Merge (Required):

  1. Fix httpx.Response construction (add json import, use content parameter)
  2. Add validation to AspectRatioMapper (check len(parts) == 2)
  3. Add try/except for int() conversions in AspectRatioMapper

Before Merge (Strongly Recommended):
4. Document timeout interaction or pass timeout to http_client.get()
5. Standardize API key header construction for consistency
6. Add at least one unit test for AspectRatioMapper rounding logic

Post-Merge (Nice to Have):
7. Add integration test for flux-2-flex model
8. Add unit tests for error cases and polling timeout
9. Consider exponential backoff for polling (e.g., 0.5s, 1s, 2s, 2s...)
10. Document parameter defaults behavior

Estimated Fix Time

Final Verdict

Approve with changes required. The implementation is architecturally sound and shows strong understanding of the codebase patterns. The critical issues are minor bugs that are easy to fix. Once addressed, this will be a high-quality addition to Celeste.

Great work @Kamilbenkirane! 🎉

- celeste-ai: 0.2.11 → 0.2.12
- celeste-image-generation: 0.2.9 → 0.2.10
@claude
Copy link

claude bot commented Dec 1, 2025

Code Review: BFL (Black Forest Labs) Provider Implementation

Thank you for this well-structured contribution! This is a solid implementation that follows the existing codebase patterns. Below is my detailed review:


Strengths

Code Quality

  • Excellent consistency: Follows existing provider patterns (OpenAI, Google, ByteDance) closely
  • Clean separation of concerns: Well-organized into client, config, models, and parameters modules
  • Type safety: Proper use of type hints throughout
  • Documentation: Good docstrings explaining the purpose of each component

Architecture

  • Polling pattern: The async polling implementation in client.py:102-142 is well-designed for BFL's API
  • Parameter mapping: Comprehensive parameter mappers for all BFL-specific parameters
  • Model constraints: Excellent use of constraint system with clear dimension limits and aspect ratio presets

🔍 Issues & Recommendations

Critical: httpx.Response Construction (client.py:129-134)

Issue: The manual construction of httpx.Response objects may be incompatible with httpx's internal requirements.

return httpx.Response(
    status_code=200,
    json=final_data,  # ❌ 'json' is not a valid constructor parameter
    request=httpx.Request("GET", polling_url),
)

Problem:

  • httpx.Response constructor doesn't accept a json parameter
  • This will cause a runtime error when the code executes
  • The pattern used differs from test mocking patterns elsewhere in the codebase

Recommended fix:

import json as json_module

# For successful response
return httpx.Response(
    status_code=200,
    content=json_module.dumps(final_data).encode('utf-8'),
    headers={"Content-Type": "application/json"},
    request=httpx.Request("GET", polling_url),
)

# For error response  
return httpx.Response(
    status_code=400,
    content=json_module.dumps(poll_data).encode('utf-8'),
    headers={"Content-Type": "application/json"},
    request=httpx.Request("GET", polling_url),
)

References:

  • See tests/unit_tests/test_client.py:199-202 for the correct pattern
  • httpx.Response expects content: bytes, not json: dict

High: Error Handling Consistency

Issue: Mixed exception types not aligned with codebase conventions (client.py:57, 101, 113)

raise ValueError(msg)  # Line 57, 101
raise TimeoutError(msg)  # Line 113

Recommendation: Use exceptions from celeste.exceptions for consistency:

  • ValueErrorValidationError (imported from celeste.exceptions)
  • Consider if TimeoutError should be a custom Celeste exception

Example from ByteDance provider (bytedance/client.py:79):

from celeste.exceptions import ValidationError

if not sample_url:
    msg = f"No image URL in {self.provider} response"
    raise ValidationError(msg)

Medium: Missing Error Response Handling

Issue: Non-200 responses from submit endpoint return immediately without parsing (client.py:92-93)

if submit_response.status_code != 200:
    return submit_response  # ❌ Error not handled by caller

Problem: The base client's _handle_error_response expects proper response formatting, but early returns may bypass this.

Recommendation: Either:

  1. Parse BFL-specific error format before returning, or
  2. Ensure the response format is compatible with base client error handling

Medium: Missing NULL Safety Check

Issue: polling_url is accessed without checking if submit_data parse succeeded (client.py:96-98)

submit_data = submit_response.json()  # Could raise JSONDecodeError
polling_url = submit_data.get("polling_url")

if not polling_url:
    msg = f"No polling_url in {self.provider} response"
    raise ValueError(msg)

Recommendation: Wrap in try-except or check if response is valid JSON first:

try:
    submit_data = submit_response.json()
except json.JSONDecodeError as e:
    msg = f"{self.provider} returned invalid JSON: {submit_response.text}"
    raise ValidationError(msg) from e

Low: Magic Numbers

Issue: Hardcoded configuration values in models.py could be constants

min_pixels=64 * 64,  # 4,096
max_pixels=2048 * 2048,  # 4,194,304 (4MP)
min_aspect_ratio=9 / 21,  # ~0.429
max_aspect_ratio=21 / 9,  # ~2.333

Recommendation: Extract to config.py for maintainability:

# config.py
MIN_PIXELS = 64 * 64  # 4,096
MAX_PIXELS = 2048 * 2048  # 4,194,304 (4MP)
MIN_ASPECT_RATIO = 9 / 21  # ~0.429
MAX_ASPECT_RATIO = 21 / 9  # ~2.333

Low: Potential Race Condition

Issue: Polling loop doesn't handle connection errors (client.py:114-120)

poll_response = await self.http_client.get(
    polling_url,
    headers=poll_headers,
)

Recommendation: Consider adding retry logic for transient network errors:

try:
    poll_response = await self.http_client.get(...)
except httpx.RequestError as e:
    # Log and retry or fail based on elapsed time
    if elapsed >= config.POLLING_TIMEOUT - 5:
        raise
    await asyncio.sleep(config.POLLING_INTERVAL)
    continue

🔒 Security Review

Secure Practices Confirmed

  • API keys properly handled via SecretStr (credentials.py:35)
  • No hardcoded secrets or credentials
  • Proper use of environment variables
  • Custom auth header (x-key) correctly configured

⚠️ Consideration

  • Polling timeout (120s) seems reasonable but consider if this could be DOS vector if many requests are polling simultaneously. Current implementation is acceptable.

🧪 Test Coverage

Good

  • Integration test added (test_generate.py:11)
  • Uses cost-effective test case (1024x1024)
  • Follows existing test patterns

📝 Suggestions for Future Work

Consider adding unit tests for:

  • Polling timeout behavior
  • Error response parsing
  • Parameter validation edge cases
  • Aspect ratio rounding logic (width/height to nearest 16)

📊 Performance Considerations

Polling Efficiency

  • 0.5s interval is reasonable (config.py:12)
  • 120s timeout appropriate for image generation
  • Consider: Add exponential backoff if BFL recommends it

Memory

  • No obvious memory leaks
  • Polling loop properly awaits asyncio.sleep()

🎯 Best Practices Alignment

Follows Celeste Patterns

  • Parameter mapper architecture matches existing providers
  • Consistent with OpenAI/Google/ByteDance implementations
  • Proper use of Provider enum and credentials system
  • Dimension constraints follow Celeste convention

Python Best Practices

  • Type hints throughout
  • Proper use of async/await
  • Clean imports and module organization

📝 Minor Suggestions

  1. parameters.py:31-32: Rounding logic could have a comment explaining BFL's 16-pixel requirement

    # BFL API requires dimensions to be multiples of 16
    width = ((width + 8) // 16) * 16
  2. client.py:41-44: Consider extracting metadata merge logic into helper method

  3. models.py: Consider if preset names should be more concise (e.g., "Square 1K" → "1K Square")


🎉 Summary

This is a high-quality implementation that demonstrates good understanding of the Celeste architecture. The main blocking issue is the httpx.Response construction bug which will cause runtime failures.

Required Changes:

  1. ✅ Fix httpx.Response construction (use content instead of json)

Strongly Recommended:

  1. ⚠️ Use ValidationError instead of ValueError for consistency
  2. ⚠️ Add error handling for JSON parsing failures

Nice to Have:

  1. 💡 Extract magic numbers to config constants
  2. 💡 Add network error handling in polling loop
  3. 💡 Add inline comments for non-obvious logic

Once the critical httpx.Response issue is resolved, this will be ready to merge! Great work overall. 🚀


Additional Context:

  • This review was based on comparison with existing providers (OpenAI, Google, ByteDance)
  • All findings were verified against the Celeste codebase patterns
  • Test coverage appears minimal but consistent with existing provider tests

httpx.Response doesn't accept json parameter - use content with JSON-encoded bytes instead.
@claude
Copy link

claude bot commented Dec 1, 2025

PR Review: BFL (Black Forest Labs) Provider Implementation

Overview

This PR adds support for BFL's FLUX.2 image generation models with a polling-based async implementation. The code is well-structured and follows the existing patterns in the codebase.

✅ Strengths

Code Quality

  • Consistent Architecture: Follows the established provider pattern perfectly, mirroring OpenAI/ByteDance implementations
  • Clean Separation of Concerns: Config, models, parameters, and client are properly separated
  • Type Safety: Good use of type hints throughout (dict[str, Any], Unpack, proper return types)
  • Documentation: Clear docstrings for all classes and methods
  • Error Messages: Informative error messages with context (e.g., includes provider name)

Implementation Details

  • Polling Logic: Well-implemented async polling with proper timeout handling
  • Response Handling: Clever use of _submit_metadata to preserve cost information across polling lifecycle
  • Parameter Mapping: Comprehensive mappers for all 7 BFL-specific parameters with proper validation
  • Model Constraints: Detailed dimension constraints with helpful presets for common aspect ratios

Testing & Integration

  • Integration Test: Added BFL to parametrized test suite, maintaining consistency
  • CI/CD: Properly configured with BFL_API_KEY secret
  • Version Bumps: Appropriately bumped package versions

🔍 Potential Issues & Suggestions

1. Polling Loop - Potential Resource Issue (Medium Priority)

Location: packages/image-generation/src/celeste_image_generation/providers/bfl/client.py:110-146

The polling loop creates a new HTTP request every 0.5 seconds without any backoff strategy or rate limiting consideration.

Issues:

  • Fixed 0.5s interval may be too aggressive for BFL's API
  • No exponential backoff for failed polls
  • Could hit rate limits during high usage

Suggestion:

# Consider exponential backoff or jittered polling
import random

poll_interval = config.POLLING_INTERVAL
while True:
    # ... existing timeout check ...
    
    poll_response = await self.http_client.get(polling_url, headers=poll_headers)
    
    # ... existing status checks ...
    
    # Add jitter to avoid thundering herd
    await asyncio.sleep(poll_interval + random.uniform(0, 0.1))
    # Or exponential backoff: poll_interval = min(poll_interval * 1.5, 5.0)

2. Missing Error Handling for Non-200 Poll Responses (Medium Priority)

Location: packages/image-generation/src/celeste_image_generation/providers/bfl/client.py:121-122

When poll_response.status_code != 200, the code returns the error response immediately. However, this might not distinguish between retryable errors (503, 429) and permanent failures (400, 404).

Suggestion:
Consider adding retry logic for transient errors:

if poll_response.status_code in (429, 503):  # Rate limit or service unavailable
    await asyncio.sleep(poll_interval * 2)  # Backoff
    continue
elif poll_response.status_code != 200:
    return poll_response  # Permanent failure

3. Aspect Ratio Rounding Edge Case (Low Priority)

Location: packages/image-generation/src/celeste_image_generation/providers/bfl/parameters.py:30-32

The rounding logic ((width + 8) // 16) * 16 always rounds up. For a value like 1023, this becomes 1024, which might unexpectedly change user intent.

Current behavior:

  • 1023x10231024x1024 (changes aspect ratio slightly)
  • 1009x10091024x1024 (significant change)

Suggestion:
Document this behavior or consider validating that input dimensions are already multiples of 16, raising a clear error if not.

4. Missing Unit Tests (Medium Priority)

The PR only includes integration tests. Consider adding unit tests for:

  • AspectRatioMapper rounding logic (especially edge cases)
  • Polling timeout behavior
  • Error response handling
  • Status transitions (Pending → Ready, Pending → Error)

Example:

# tests/unit_tests/providers/bfl/test_aspect_ratio.py
def test_aspect_ratio_rounds_to_multiple_of_16():
    mapper = AspectRatioMapper()
    result = mapper.map({}, "1023x1023", model)
    assert result["width"] == 1024
    assert result["height"] == 1024

5. Hardcoded Status Strings (Low Priority)

Location: packages/image-generation/src/celeste_image_generation/providers/bfl/client.py:67-71, 127-144

Status strings like "Ready", "Error", "Failed" are hardcoded in multiple places. Consider moving to config:

# config.py
STATUS_READY = "Ready"
STATUS_ERROR = "Error"
STATUS_FAILED = "Failed"

This improves maintainability if BFL changes their API.

🔒 Security Review

✅ Good Practices

  • API key properly handled via SecretStr and .get_secret_value()
  • No credentials logged or exposed in error messages
  • Proper use of environment variables

⚠️ Minor Consideration

The polling URL comes from the BFL API response. While this is expected, ensure the polling_url is validated to be from a trusted domain to prevent potential SSRF attacks:

polling_url = submit_data.get("polling_url")
if not polling_url or not polling_url.startswith(config.BASE_URL):
    msg = f"Invalid polling_url from {self.provider}"
    raise ValueError(msg)

🚀 Performance Considerations

✅ Good Choices

  • Async/await properly used throughout
  • time.monotonic() used for timeout (not affected by system clock changes)
  • Minimal memory overhead with streaming approach

💡 Optimization Opportunity

The synthetic httpx.Response object construction (lines 132-137) creates new Request/Response objects. This is fine but slightly verbose. Consider if the framework supports passing through the modified response data more directly.

📋 Test Coverage

Current Coverage

  • ✅ Integration test with flux-2-pro at 1024x1024
  • ✅ Tests the full generation flow end-to-end

Missing Coverage

  • ❌ No unit tests for parameter mappers
  • ❌ No tests for polling timeout scenario
  • ❌ No tests for error states (API errors, invalid responses)
  • ❌ No tests for flux-2-flex model-specific parameters (steps, guidance)

Recommendation: Add at least unit tests for the parameter mappers to prevent regressions.

📝 Documentation

✅ Well Documented

  • Clear PR description with summary and test plan
  • Inline comments for non-obvious logic (e.g., rounding to multiple of 16)
  • Model parameter constraints clearly defined

💡 Could Add

  • Example usage in docstring for BFLImageGenerationClient
  • Explanation of billed_units in the ImageGenerationUsage docstring
  • BFL-specific README in providers/bfl/ (following ByteDance pattern)

🎯 Summary

This is a high-quality implementation that integrates cleanly with the existing architecture. The code is well-structured, type-safe, and follows established patterns.

Recommended Actions Before Merge

  1. High Priority: Add validation for polling_url domain (security)
  2. Medium Priority: Consider adding exponential backoff to polling
  3. Medium Priority: Add unit tests for parameter mappers
  4. Nice to Have: Add retry logic for transient HTTP errors

Approval Status

Approved with minor suggestions - The code is production-ready. The suggestions above are optimizations and defensive measures that can be addressed in follow-up PRs if needed.

Great work on maintaining code consistency and quality! 🎉

@Kamilbenkirane Kamilbenkirane merged commit e6ee2f9 into main Dec 1, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant