@codeflash-ai codeflash-ai bot commented Oct 29, 2025

📄 7% (0.07x) speedup for BookStackDataSource.export_book_pdf in backend/python/app/sources/external/bookstack/bookstack.py

⏱️ Runtime: 2.71 milliseconds → 2.52 milliseconds (best of 269 runs)
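
As a quick sanity check, the headline percentage can be reproduced from the two timings. This is a minimal sketch that assumes the speedup is defined as original runtime divided by optimized runtime, minus one; that formula is an assumption, chosen because it is consistent with the numbers reported here.

```python
# Timings reported in this PR, in milliseconds.
original_ms = 2.71
optimized_ms = 2.52

# Assumed definition: relative speedup = original / optimized - 1.
speedup = original_ms / optimized_ms - 1

print(f"speedup: {speedup:.1%}")  # prints "speedup: 7.5%"
```

Under that definition the reported 7% (0.07x) corresponds to roughly 7.5% before truncation.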

📝 Explanation and details

Explanation of Optimizations:

  • app/sources/client/http/http_client.py
    • Avoided rebuilding the merged headers dictionary on every request when request.headers is empty.
    • Moved the body type checks to avoid unnecessary operations.
    • Skipped the URL formatting step when the URL has no path params.
    • Slightly streamlined the selection of data/json/content for the request.
  • app/sources/client/http/http_response.py
    • Cached the .content value on first access so that repeated bytes() calls do not repeat the property lookup; this matters most for large response bodies.
  • app/sources/external/bookstack/bookstack.py
    • Combined redundant assignments and inlined variables where possible.
    • Avoided copying HTTP headers unless needed.
    • Minimized repeated function lookups by localizing method references.
    • Used f-string instead of string concatenation and .format() for URL for better speed and readability.
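
The request-side items in the list above (skipping the header merge, skipping URL formatting, and building the URL with an f-string) can be sketched roughly as follows. The helper names `merge_headers`, `build_url`, and `export_pdf_url` are hypothetical and do not appear in the PR, and the `/api/books/{id}/export/pdf` path is an assumption about the endpoint the data source calls.

```python
from typing import Dict, Optional


def merge_headers(
    default_headers: Dict[str, str],
    request_headers: Optional[Dict[str, str]],
) -> Dict[str, str]:
    """Return the headers for a request, copying only when a merge is needed."""
    if not request_headers:
        # Nothing to merge: reuse the client's defaults instead of building a new dict.
        return default_headers
    return {**default_headers, **request_headers}


def build_url(url_template: str, path_params: Optional[Dict[str, str]]) -> str:
    """Apply str.format only when there are path params to substitute."""
    if not path_params:
        return url_template
    return url_template.format(**path_params)


def export_pdf_url(base_url: str, book_id: int) -> str:
    """Build the export URL with an f-string (endpoint path is assumed)."""
    return f"{base_url}/api/books/{book_id}/export/pdf"
```

One trade-off of the no-copy path is that the caller receives the same dictionary object as the client's defaults, so it must not mutate the result.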

These changes collectively provide minor runtime and memory gains and reduce per-request overhead.
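
The response-side caching described for http_response.py might look something like the sketch below. Both classes are hypothetical stand-ins written for illustration: `FakeRawResponse` imitates an underlying HTTP response and counts how often `.content` is read, and `CachedResponse` is not the project's actual `HTTPResponse` class.

```python
from typing import Optional


class FakeRawResponse:
    """Stand-in for the underlying HTTP response; counts reads of .content."""

    def __init__(self, payload: bytes) -> None:
        self._payload = payload
        self.reads = 0

    @property
    def content(self) -> bytes:
        self.reads += 1
        return self._payload


class CachedResponse:
    """Wrapper that looks up .content once and serves later calls from a cache."""

    def __init__(self, response: FakeRawResponse) -> None:
        self._response = response
        self._content: Optional[bytes] = None

    def bytes(self) -> bytes:
        if self._content is None:
            # First access: read the property once and remember the result.
            self._content = self._response.content
        return self._content


raw = FakeRawResponse(b"%PDF-1.4 dummy pdf content")
wrapped = CachedResponse(raw)
first = wrapped.bytes()
second = wrapped.bytes()
```

After the two calls, `raw.reads` is 1, since the second `bytes()` call never touches the underlying response.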


Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 741 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import asyncio  # used to run async functions
# The function under test (EXACTLY as provided)
import base64
from typing import Dict, Union

import pytest  # used for our unit tests
from app.sources.external.bookstack.bookstack import BookStackDataSource

# Mocks and minimal stubs for dependencies

class DummyHTTPResponse:
    """A dummy HTTPResponse for testing purposes."""
    def __init__(self, content: bytes, content_type: str = "application/pdf"):
        self._content = content
        self.content_type = content_type

    def bytes(self) -> bytes:
        return self._content

class DummyHTTPClient:
    """A dummy HTTP client to simulate async HTTP requests."""
    def __init__(self, headers=None, should_raise=False, content=b"", content_type="application/pdf"):
        self.headers = headers or {"Authorization": "Token dummy:dummy"}
        self.should_raise = should_raise
        self._content = content
        self._content_type = content_type

    async def execute(self, request, **kwargs):
        if self.should_raise:
            raise Exception("Simulated HTTP error")
        return DummyHTTPResponse(self._content, self._content_type)

    def get_base_url(self):
        return "https://dummy.bookstack.com"

class DummyBookStackClient:
    """A dummy BookStackClient for testing."""
    def __init__(self, http_client):
        self._client = http_client

    def get_client(self):
        return self._client

# Minimal HTTPRequest stub
class HTTPRequest:
    def __init__(self, method, url, headers, query_params, body, path_params=None):
        self.method = method
        self.url = url
        self.headers = headers
        self.query_params = query_params
        self.body = body
        self.path_params = path_params or {}

# Minimal BookStackResponse stub
class BookStackResponse:
    def __init__(self, success, data=None, error=None):
        self.success = success
        self.data = data
        self.error = error

# -------------------- UNIT TESTS --------------------

# 1. Basic Test Cases

@pytest.mark.asyncio
async def test_export_book_pdf_basic_success():
    """Test basic successful PDF export with valid book ID."""
    dummy_pdf = b"%PDF-1.4 dummy pdf content"
    http_client = DummyHTTPClient(content=dummy_pdf)
    client = DummyBookStackClient(http_client)
    datasource = BookStackDataSource(client)

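    response = await datasource.export_book_pdf(42)
    decoded = base64.b64decode(response.data["content"].encode("utf-8"))
    # The payload is base64 text; decoding should round-trip to the original bytes.
    assert decoded == dummy_pdf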

@pytest.mark.asyncio
async def test_export_book_pdf_basic_empty_pdf():
    """Test exporting a book that returns empty PDF bytes."""
    dummy_pdf = b""
    http_client = DummyHTTPClient(content=dummy_pdf)
    client = DummyBookStackClient(http_client)
    datasource = BookStackDataSource(client)

    response = await datasource.export_book_pdf(1)
    decoded = base64.b64decode(response.data["content"].encode("utf-8"))
    # An empty body should decode back to empty bytes.
    assert decoded == dummy_pdf

@pytest.mark.asyncio
async def test_export_book_pdf_basic_non_pdf_content_type():
    """Test PDF export with a different content type."""
    dummy_pdf = b"dummy content"
    http_client = DummyHTTPClient(content=dummy_pdf, content_type="application/octet-stream")
    client = DummyBookStackClient(http_client)
    datasource = BookStackDataSource(client)

    response = await datasource.export_book_pdf(100)
    decoded = base64.b64decode(response.data["content"].encode("utf-8"))
    # The bytes should round-trip regardless of the reported content type.
    assert decoded == dummy_pdf

# 2. Edge Test Cases

@pytest.mark.asyncio
async def test_export_book_pdf_error_handling():
    """Test error handling when HTTP client raises an exception."""
    http_client = DummyHTTPClient(should_raise=True)
    client = DummyBookStackClient(http_client)
    datasource = BookStackDataSource(client)

    response = await datasource.export_book_pdf(999)

@pytest.mark.asyncio
async def test_export_book_pdf_invalid_http_client_none():
    """Test error raised if HTTP client is None."""
    class BrokenBookStackClient:
        def get_client(self):
            return None
    with pytest.raises(ValueError) as excinfo:
        BookStackDataSource(BrokenBookStackClient())

@pytest.mark.asyncio
async def test_export_book_pdf_invalid_http_client_no_base_url():
    """Test error raised if HTTP client lacks get_base_url method."""
    class BrokenHTTPClient:
        pass
    class BrokenBookStackClient:
        def get_client(self):
            return BrokenHTTPClient()
    with pytest.raises(ValueError) as excinfo:
        BookStackDataSource(BrokenBookStackClient())

@pytest.mark.asyncio
async def test_export_book_pdf_concurrent_execution():
    """Test concurrent execution of multiple PDF exports."""
    dummy_pdf1 = b"PDF-1"
    dummy_pdf2 = b"PDF-2"
    http_client1 = DummyHTTPClient(content=dummy_pdf1)
    http_client2 = DummyHTTPClient(content=dummy_pdf2)
    client1 = DummyBookStackClient(http_client1)
    client2 = DummyBookStackClient(http_client2)
    datasource1 = BookStackDataSource(client1)
    datasource2 = BookStackDataSource(client2)

    results = await asyncio.gather(
        datasource1.export_book_pdf(1),
        datasource2.export_book_pdf(2)
    )

@pytest.mark.asyncio
async def test_export_book_pdf_concurrent_error_and_success():
    """Test concurrent execution with one success and one error."""
    dummy_pdf = b"PDF-OK"
    http_client_ok = DummyHTTPClient(content=dummy_pdf)
    http_client_err = DummyHTTPClient(should_raise=True)
    client_ok = DummyBookStackClient(http_client_ok)
    client_err = DummyBookStackClient(http_client_err)
    datasource_ok = BookStackDataSource(client_ok)
    datasource_err = BookStackDataSource(client_err)

    results = await asyncio.gather(
        datasource_ok.export_book_pdf(1),
        datasource_err.export_book_pdf(2)
    )

# 3. Large Scale Test Cases

@pytest.mark.asyncio
async def test_export_book_pdf_large_scale_concurrent():
    """Test large scale concurrent PDF exports (up to 50)."""
    num_requests = 50
    pdf_content = b"PDF-LARGE"
    datasources = [
        BookStackDataSource(DummyBookStackClient(DummyHTTPClient(content=pdf_content)))
        for _ in range(num_requests)
    ]
    tasks = [ds.export_book_pdf(i) for i, ds in enumerate(datasources)]
    results = await asyncio.gather(*tasks)
    for resp in results:
        pass

@pytest.mark.asyncio
async def test_export_book_pdf_large_scale_mixed_success_error():
    """Test large scale concurrent PDF exports with some errors."""
    num_requests = 30
    datasources = []
    for i in range(num_requests):
        if i % 5 == 0:
            # Every 5th request should error
            ds = BookStackDataSource(DummyBookStackClient(DummyHTTPClient(should_raise=True)))
        else:
            ds = BookStackDataSource(DummyBookStackClient(DummyHTTPClient(content=b"PDF-MIXED")))
        datasources.append(ds)
    tasks = [ds.export_book_pdf(i) for i, ds in enumerate(datasources)]
    results = await asyncio.gather(*tasks)
    for i, resp in enumerate(results):
        if i % 5 == 0:
            pass
        else:
            pass

# 4. Throughput Test Cases

@pytest.mark.asyncio
async def test_export_book_pdf_throughput_small_load():
    """Test throughput under small load (5 concurrent requests)."""
    datasources = [
        BookStackDataSource(DummyBookStackClient(DummyHTTPClient(content=b"PDF-SMALL")))
        for _ in range(5)
    ]
    tasks = [ds.export_book_pdf(i) for i, ds in enumerate(datasources)]
    results = await asyncio.gather(*tasks)
    for resp in results:
        pass

@pytest.mark.asyncio
async def test_export_book_pdf_throughput_medium_load():
    """Test throughput under medium load (20 concurrent requests)."""
    datasources = [
        BookStackDataSource(DummyBookStackClient(DummyHTTPClient(content=b"PDF-MEDIUM")))
        for _ in range(20)
    ]
    tasks = [ds.export_book_pdf(i) for i, ds in enumerate(datasources)]
    results = await asyncio.gather(*tasks)
    for resp in results:
        pass

@pytest.mark.asyncio
async def test_export_book_pdf_throughput_high_volume():
    """Test throughput under high volume load (80 concurrent requests)."""
    datasources = [
        BookStackDataSource(DummyBookStackClient(DummyHTTPClient(content=b"PDF-HIGH")))
        for _ in range(80)
    ]
    tasks = [ds.export_book_pdf(i) for i, ds in enumerate(datasources)]
    results = await asyncio.gather(*tasks)
    for resp in results:
        pass

@pytest.mark.asyncio
async def test_export_book_pdf_throughput_mixed_load():
    """Test throughput under mixed load (success and error)."""
    datasources = []
    for i in range(40):
        if i % 7 == 0:
            ds = BookStackDataSource(DummyBookStackClient(DummyHTTPClient(should_raise=True)))
        else:
            ds = BookStackDataSource(DummyBookStackClient(DummyHTTPClient(content=b"PDF-MIXED-THROUGHPUT")))
        datasources.append(ds)
    tasks = [ds.export_book_pdf(i) for i, ds in enumerate(datasources)]
    results = await asyncio.gather(*tasks)
    for i, resp in enumerate(results):
        if i % 7 == 0:
            pass
        else:
            pass
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import asyncio  # used to run async functions
# The function to test (EXACT COPY)
import base64
from typing import Dict, Union

import pytest  # used for our unit tests
from app.sources.external.bookstack.bookstack import BookStackDataSource


# Mocks and stubs for dependencies
class DummyAsyncClient:
    """A dummy async HTTP client that simulates httpx.AsyncClient"""
    def __init__(self, *, timeout=30.0, follow_redirects=True):
        self.timeout = timeout
        self.follow_redirects = follow_redirects
        self.requests = []

    async def request(self, method, url, **kwargs):
        self.requests.append((method, url, kwargs))
        # Simulate different responses based on URL or method
        if "fail" in url:
            # Simulate an HTTP error
            raise Exception("Simulated HTTP error")
        # Simulate a PDF response
        class DummyResponse:
            def __init__(self):
                self.content = b"%PDF-1.4 binary pdf data"
                self.headers = {"Content-Type": "application/pdf"}
            @property
            def content_type(self):
                return self.headers.get("Content-Type", "application/octet-stream")
        return DummyResponse()

    async def aclose(self):
        pass

class DummyHTTPResponse:
    """A dummy HTTPResponse that wraps a simulated response"""
    def __init__(self, response):
        self.response = response
        self.content_type = getattr(response, "content_type", "application/pdf")
    def bytes(self):
        return self.response.content

class DummyHTTPClient:
    """A dummy HTTP client that mimics the BookStackRESTClientViaToken interface"""
    def __init__(self, base_url, simulate_fail=False):
        self.base_url = base_url.rstrip('/')
        self.headers = {
            "Authorization": "Token dummy:dummy",
            "Content-Type": "application/json",
            "Accept": "application/json"
        }
        self.simulate_fail = simulate_fail
        self._client = DummyAsyncClient()
    def get_base_url(self):
        return self.base_url
    async def execute(self, request, **kwargs):
        # Simulate a failure if requested
        if self.simulate_fail:
            raise Exception("Simulated HTTP error")
        # Return a simulated PDF response
        class DummyResponse:
            def __init__(self):
                self.content = b"%PDF-1.4 binary pdf data"
                self.headers = {"Content-Type": "application/pdf"}
            @property
            def content_type(self):
                return self.headers.get("Content-Type", "application/pdf")
        return DummyHTTPResponse(DummyResponse())

class DummyBookStackClient:
    """A dummy BookStackClient that returns a DummyHTTPClient"""
    def __init__(self, base_url, simulate_fail=False):
        self._client = DummyHTTPClient(base_url, simulate_fail=simulate_fail)
    def get_client(self):
        return self._client

# BookStackResponse for test assertions
class BookStackResponse:
    def __init__(self, success, data=None, error=None):
        self.success = success
        self.data = data
        self.error = error

# HTTPRequest for test
class HTTPRequest:
    def __init__(self, method, url, headers, query_params, body):
        self.method = method
        self.url = url
        self.headers = headers
        self.query_params = query_params
        self.body = body
        self.path_params = {}

# ------------------ UNIT TESTS BELOW ------------------

# 1. BASIC TEST CASES

@pytest.mark.asyncio
async def test_export_book_pdf_basic_success():
    """Test exporting a book as PDF with a valid ID returns success and correct data."""
    client = DummyBookStackClient("https://example.com")
    datasource = BookStackDataSource(client)
    resp = await datasource.export_book_pdf(123)
    # Content should be correct base64 encoding of the dummy PDF
    expected_b64 = base64.b64encode(b"%PDF-1.4 binary pdf data").decode("utf-8")
    assert resp.data["content"] == expected_b64

@pytest.mark.asyncio
async def test_export_book_pdf_basic_invalid_id_type():
    """Test exporting with an invalid type for id (should still succeed as function doesn't validate id type)."""
    client = DummyBookStackClient("https://example.com")
    datasource = BookStackDataSource(client)
    # The function does not type-check id, so passing a string will just put it in the URL
    resp = await datasource.export_book_pdf("not_an_int")

@pytest.mark.asyncio
async def test_export_book_pdf_basic_minimal_id():
    """Test exporting with minimal valid id (e.g., 0)."""
    client = DummyBookStackClient("https://example.com")
    datasource = BookStackDataSource(client)
    resp = await datasource.export_book_pdf(0)

# 2. EDGE TEST CASES

@pytest.mark.asyncio
async def test_export_book_pdf_edge_http_error():
    """Test export_book_pdf returns error response on HTTP error."""
    client = DummyBookStackClient("https://fail.example.com", simulate_fail=True)
    datasource = BookStackDataSource(client)
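    resp = await datasource.export_book_pdf(999)
    # Per the docstring, the failure is reported in the response rather than raised.
    # Assumption: an error response carries success=False.
    assert resp.success is False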

@pytest.mark.asyncio
async def test_export_book_pdf_edge_missing_http_client():
    """Test that BookStackDataSource raises ValueError if HTTP client is missing."""
    class NoClient:
        def get_client(self):
            return None
    with pytest.raises(ValueError) as excinfo:
        BookStackDataSource(NoClient())

@pytest.mark.asyncio
async def test_export_book_pdf_edge_missing_base_url_method():
    """Test that BookStackDataSource raises ValueError if get_base_url is missing."""
    class NoBaseUrl:
        def get_client(self):
            class Dummy:
                pass
            return Dummy()
    with pytest.raises(ValueError) as excinfo:
        BookStackDataSource(NoBaseUrl())

@pytest.mark.asyncio
async def test_export_book_pdf_concurrent_execution():
    """Test concurrent execution of export_book_pdf for different IDs."""
    client = DummyBookStackClient("https://example.com")
    datasource = BookStackDataSource(client)
    # Run 5 concurrent requests
    ids = [1, 2, 3, 4, 5]
    results = await asyncio.gather(*(datasource.export_book_pdf(i) for i in ids))
    for resp in results:
        pass

@pytest.mark.asyncio
async def test_export_book_pdf_concurrent_failure_and_success():
    """Test concurrent execution where some requests fail and some succeed."""
    # One client will fail, one will succeed
    client1 = DummyBookStackClient("https://example.com")
    client2 = DummyBookStackClient("https://fail.example.com", simulate_fail=True)
    datasource1 = BookStackDataSource(client1)
    datasource2 = BookStackDataSource(client2)
    results = await asyncio.gather(
        datasource1.export_book_pdf(10),
        datasource2.export_book_pdf(20)
    )

# 3. LARGE SCALE TEST CASES

@pytest.mark.asyncio
async def test_export_book_pdf_large_scale_concurrent():
    """Test export_book_pdf with a large number of concurrent requests."""
    client = DummyBookStackClient("https://example.com")
    datasource = BookStackDataSource(client)
    ids = list(range(50))  # 50 concurrent requests
    results = await asyncio.gather(*(datasource.export_book_pdf(i) for i in ids))

@pytest.mark.asyncio
async def test_export_book_pdf_large_scale_concurrent_with_failures():
    """Test export_book_pdf with a mix of success and failure in concurrent requests."""
    # 25 succeed, 25 fail
    client_good = DummyBookStackClient("https://example.com")
    client_bad = DummyBookStackClient("https://fail.example.com", simulate_fail=True)
    datasource_good = BookStackDataSource(client_good)
    datasource_bad = BookStackDataSource(client_bad)
    tasks = [datasource_good.export_book_pdf(i) for i in range(25)] + \
            [datasource_bad.export_book_pdf(i) for i in range(25, 50)]
    results = await asyncio.gather(*tasks)
    # First 25 succeed, last 25 fail
    for i, resp in enumerate(results):
        if i < 25:
            pass
        else:
            pass

# 4. THROUGHPUT TEST CASES

@pytest.mark.asyncio
async def test_export_book_pdf_throughput_small_load():
    """Throughput test: small load (5 concurrent requests)."""
    client = DummyBookStackClient("https://example.com")
    datasource = BookStackDataSource(client)
    ids = [100, 101, 102, 103, 104]
    results = await asyncio.gather(*(datasource.export_book_pdf(i) for i in ids))

@pytest.mark.asyncio
async def test_export_book_pdf_throughput_medium_load():
    """Throughput test: medium load (20 concurrent requests)."""
    client = DummyBookStackClient("https://example.com")
    datasource = BookStackDataSource(client)
    ids = list(range(200, 220))
    results = await asyncio.gather(*(datasource.export_book_pdf(i) for i in ids))

@pytest.mark.asyncio
async def test_export_book_pdf_throughput_mixed_load():
    """Throughput test: mix of success and failure under load."""
    client_good = DummyBookStackClient("https://example.com")
    client_bad = DummyBookStackClient("https://fail.example.com", simulate_fail=True)
    datasource_good = BookStackDataSource(client_good)
    datasource_bad = BookStackDataSource(client_bad)
    # 10 good, 10 fail
    tasks = [datasource_good.export_book_pdf(i) for i in range(10)] + \
            [datasource_bad.export_book_pdf(i) for i in range(10, 20)]
    results = await asyncio.gather(*tasks)
    for i, resp in enumerate(results):
        if i < 10:
            pass
        else:
            pass

@pytest.mark.asyncio
async def test_export_book_pdf_throughput_high_volume():
    """Throughput test: high volume (100 concurrent requests, all succeed)."""
    client = DummyBookStackClient("https://example.com")
    datasource = BookStackDataSource(client)
    ids = list(range(1000, 1100))
    results = await asyncio.gather(*(datasource.export_book_pdf(i) for i in ids))
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from app.sources.external.bookstack.bookstack import BookStackDataSource

To edit these changes, run git checkout codeflash/optimize-BookStackDataSource.export_book_pdf-mhbhxrdd, commit your edits, and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 29, 2025 04:29
@codeflash-ai codeflash-ai bot added the labels "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: High" (Optimization Quality according to Codeflash) Oct 29, 2025