
Conversation


@jimnoneill jimnoneill commented Dec 13, 2025

  • Add ollama_service/ with Flask API using qwen3-vl:4b model
  • No PyTorch/Transformers required - just flask + requests
  • Model runs via Ollama (3.3GB vs 10GB for Transformers version)
  • Produces concise, natural alt text captions
  • Update .gitignore for Python environments

Summary by Sourcery

Introduce a lightweight Ollama-backed alt text generation microservice exposed via a Flask API.

New Features:

  • Add a Flask-based HTTP API for generating image alt text using an Ollama-hosted qwen3-vl:4b vision-language model.
  • Serve a simple landing page describing the alt text service and its API endpoints.
  • Expose health and caption-generation endpoints for checking Ollama status and generating captions from image URLs.

Enhancements:

  • Document setup, configuration, and usage of the Ollama-based alt text service in a dedicated README.
  • Define a minimal Python requirements file for running the Ollama alt text server without ML framework dependencies.

Build:

  • Update gitignore to better support Python development environments.

Documentation:

  • Add user-facing documentation for installing Ollama, running the alt text server, and calling its HTTP API.


vercel bot commented Dec 13, 2025

Someone is attempting to deploy a commit to the fairdataihub Team on Vercel.

A member of the Team first needs to authorize it.

@fairdataihub-bot

Thank you for submitting this pull request! We appreciate your contribution to the project. Before we can merge it, we need to review the changes you've made to ensure they align with our code standards and meet the requirements of the project. We'll get back to you as soon as we can with feedback. Thanks again!


sourcery-ai bot commented Dec 13, 2025

Reviewer's Guide

Adds a new lightweight Flask-based alt text generation microservice that delegates image captioning to an Ollama-hosted qwen3-vl:4b model, along with minimal Python dependencies and documentation, without introducing any heavy ML libraries into the codebase.

Sequence diagram for Ollama alt text generation via /generate

```mermaid
sequenceDiagram
    actor Client
    participant FlaskServer as FlaskServer_generate
    participant ImageHost
    participant OllamaAPI

    Client->>FlaskServer: HTTP GET/POST /generate (imageUrl, prompt?)
    alt POST
        FlaskServer->>FlaskServer: Parse JSON body
    else GET
        FlaskServer->>FlaskServer: Read query parameters
    end
    FlaskServer->>FlaskServer: Validate imageUrl
    FlaskServer->>ImageHost: HTTP GET imageUrl
    ImageHost-->>FlaskServer: Image bytes
    FlaskServer->>FlaskServer: Base64 encode image
    FlaskServer->>OllamaAPI: POST /api/generate (model, prompt, images[], stream=False)
    OllamaAPI-->>FlaskServer: JSON { response, error? }
    alt Error in Ollama response
        FlaskServer-->>Client: 500 JSON { error }
    else Success
        FlaskServer-->>Client: 200 JSON { alt_text, model, runtime }
    end
    opt Ollama connection failure
        FlaskServer-->>Client: 503 JSON { error: Cannot connect to Ollama }
    end
```
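The flow above can be sketched in a short Flask handler. This is an illustrative reconstruction based on the diagram and the code excerpts quoted later in this review (fetch_image_as_base64, OLLAMA_HOST, MODEL_NAME, DEFAULT_PROMPT), not the PR's exact server.py; in particular, the generic 500 message follows the reviewer's later suggestion rather than the original str(e):

```python
# Sketch of the /generate flow, simplified relative to the actual server.py in this PR.
import base64

import requests
from flask import Flask, jsonify, request

OLLAMA_HOST = "http://localhost:11434"
MODEL_NAME = "qwen3-vl:4b"
DEFAULT_PROMPT = "Describe this image in one concise sentence for alt text."

app = Flask(__name__)


def fetch_image_as_base64(image_url: str) -> str:
    """Download an image and return it as a base64-encoded string."""
    response = requests.get(image_url, timeout=30)
    response.raise_for_status()
    return base64.b64encode(response.content).decode("utf-8")


@app.route("/generate", methods=["GET", "POST"])
def generate():
    # Accept either a JSON body (POST) or query parameters (GET)
    if request.method == "POST":
        data = request.get_json() or {}
        image_url = data.get("imageUrl") or data.get("image_url")
        prompt = data.get("prompt", DEFAULT_PROMPT)
    else:
        image_url = request.args.get("imageUrl")
        prompt = request.args.get("prompt", DEFAULT_PROMPT)

    if not image_url:
        return jsonify({"error": "imageUrl is required"}), 400

    try:
        image_b64 = fetch_image_as_base64(image_url)
        ollama_resp = requests.post(
            f"{OLLAMA_HOST}/api/generate",
            json={
                "model": MODEL_NAME,
                "prompt": prompt,
                "images": [image_b64],
                "stream": False,
            },
            timeout=120,
        )
        result = ollama_resp.json()
        if "error" in result:
            return jsonify({"error": result["error"]}), 500
        return jsonify({
            "alt_text": result.get("response", "").strip(),
            "model": MODEL_NAME,
            "runtime": "ollama",  # field names from the diagram; the value here is a placeholder
        }), 200
    except requests.exceptions.ConnectionError:
        return jsonify({
            "error": "Cannot connect to Ollama. Is it running? Start with: ollama serve"
        }), 503
    except Exception:
        # Generic message per the review suggestion; full details stay in the server logs
        return jsonify({"error": "Failed to generate caption"}), 500
```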

Sequence diagram for /health endpoint checking Ollama and model

```mermaid
sequenceDiagram
    actor Client
    participant FlaskServer as FlaskServer_health
    participant OllamaAPI

    Client->>FlaskServer: HTTP GET /health
    FlaskServer->>OllamaAPI: GET /api/tags
    alt Ollama reachable
        OllamaAPI-->>FlaskServer: JSON { models[] }
        FlaskServer->>FlaskServer: Check MODEL_NAME in models
        alt Model available
            FlaskServer-->>Client: 200 { status: healthy, ollama_reachable: true, model_available: true, available_models }
        else Model missing
            FlaskServer-->>Client: 200 { status: degraded, ollama_reachable: true, model_available: false, available_models }
        end
    else Ollama unreachable or error
        FlaskServer-->>Client: 200 { status: degraded, ollama_reachable: false, model_available: false, available_models: [] }
    end
```
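Continuing the same sketch, the /health flow maps to a handler like the one below. It is again an approximation of the diagram rather than the PR's code, and it uses the exact-match model check suggested in the review comments instead of the substring match shown in the diff:

```python
# Sketch of the /health flow; assumes app, requests, jsonify, OLLAMA_HOST and
# MODEL_NAME from the /generate sketch above.
@app.route("/health", methods=["GET"])
def health():
    try:
        response = requests.get(f"{OLLAMA_HOST}/api/tags", timeout=5)
        ollama_ok = response.status_code == 200
        models = [m["name"] for m in response.json().get("models", [])]
        model_available = any(m == MODEL_NAME for m in models)
    except Exception:
        ollama_ok = False
        model_available = False
        models = []

    # Always answer 200; "status" signals whether captioning can actually work
    return jsonify({
        "status": "healthy" if ollama_ok and model_available else "degraded",
        "ollama_reachable": ollama_ok,
        "model": MODEL_NAME,
        "model_available": model_available,
        "available_models": models,
    }), 200
```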

File-Level Changes

Change: Introduce a standalone Flask API service that generates image alt text by calling an Ollama-hosted qwen3-vl:4b VLM instead of running local ML models.
Details:
  • Configure global environment-driven settings for Ollama host, model name, and default prompt, and initialize a Flask app with structured logging.
  • Implement helper functions to download an image from a given URL, encode it as base64, and call Ollama's /api/generate endpoint with the encoded image and prompt, including error handling for API-level errors.
  • Expose an HTML landing page at the root route that documents the API, model characteristics, and example usage, styled as a simple single-page UI.
  • Add a /health endpoint that probes Ollama's /api/tags endpoint to verify connectivity and the availability of the configured model, returning a JSON health summary.
  • Add a /generate endpoint supporting both GET query params and POST JSON bodies, validating input, orchestrating image fetch plus caption generation, and returning standardized JSON responses and errors, including special handling for Ollama connection failures.
  • Wire up a main entrypoint that logs startup configuration and runs the Flask app on a configurable port without debug mode.
Files: ollama_service/server.py
Change: Document how to run and integrate the Ollama-based alt text service, emphasizing its minimal dependency footprint and comparing it with the Transformers-based version.
Details:
  • Describe the rationale for using Ollama versus a Transformers-based stack, including dependency size, setup complexity, and portability tradeoffs.
  • Detail model information for qwen3-vl:4b and provide step-by-step prerequisites for installing Ollama, pulling the model, and starting the Ollama server.
  • Provide setup instructions for installing Python dependencies directly or via a requirements file and describe all available API endpoints with curl examples, including GET/POST generate flows (a Python equivalent is sketched after this file list).
  • Document environment variables controlling server port, Ollama host, and model name, and add a comparison table between the Ollama service and the Transformers-based implementation.
  • Include troubleshooting tips for common issues like inability to connect to Ollama or missing models, with relevant CLI commands.
Files: ollama_service/README.md
Change: Define a minimal Python dependency set for the Ollama alt text microservice.
Details:
  • Create a requirements file that lists only Flask and Requests with modern minimum versions, with comments clarifying that Ollama handles model inference and that no ML libraries are needed.
  • Document prerequisite Ollama installation and model setup inline in the requirements file comments.
Files: ollama_service/requirements.txt
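As a rough Python counterpart to the curl examples mentioned in the README entry above, the snippet below exercises the documented endpoints with requests. The base URL and port (localhost:5000) and the image URL are placeholders, not values taken from the PR:

```python
# Hypothetical client calls against a locally running instance of the service.
import requests

BASE_URL = "http://localhost:5000"  # adjust to the port the server is configured with

# Check that Ollama is reachable and the model has been pulled
health = requests.get(f"{BASE_URL}/health", timeout=10).json()
print(health["status"], health.get("model_available"))

# Generate alt text via GET query parameters
resp = requests.get(
    f"{BASE_URL}/generate",
    params={"imageUrl": "https://example.com/cat.jpg"},
    timeout=120,
)
print(resp.json().get("alt_text"))

# Or via POST with a JSON body and a custom prompt
resp = requests.post(
    f"{BASE_URL}/generate",
    json={
        "imageUrl": "https://example.com/cat.jpg",
        "prompt": "Describe this image in one concise sentence for alt text.",
    },
    timeout=120,
)
print(resp.json().get("alt_text"))
```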

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@fairdataihub-bot

Thanks for making updates to your pull request. Our team will take a look and provide feedback as soon as possible. Please wait for any GitHub Actions to complete before editing your pull request. If you have any additional questions or concerns, feel free to let us know. Thank you for your contributions!


@sourcery-ai sourcery-ai bot left a comment


Hey there - I've reviewed your changes and found some issues that need to be addressed.

  • Consider adding basic validation and safeguards to fetch_image_as_base64 (e.g., content-type check for images, maximum response size, and restricting or sanitizing URLs) to reduce the risk of SSRF and large payload downloads from arbitrary user-provided imageUrl values.
  • In the /generate endpoint, you might want to distinguish between failures fetching the image and failures calling Ollama (e.g., separate except blocks or error messages) so clients can more easily understand whether the image URL or the model backend caused the error.
  • The health check currently calls /api/tags and scans all models on every request; if this endpoint will be hit frequently, consider caching the model list for a short TTL or switching to a lighter-weight availability check to avoid unnecessary load on the Ollama API.
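A minimal sketch of the short-TTL cache idea from the last point, assuming the same requests dependency and OLLAMA_HOST default used by the service; the helper name and the 30-second window are illustrative only:

```python
# Cache Ollama's model list so /health doesn't hit /api/tags on every request.
import time

import requests

OLLAMA_HOST = "http://localhost:11434"  # same default as in the service

_model_cache = {"expires": 0.0, "models": []}


def get_available_models(ttl_seconds: float = 30.0) -> list:
    """Return Ollama's model list, refreshing it at most once per TTL window."""
    now = time.monotonic()
    if now >= _model_cache["expires"]:
        response = requests.get(f"{OLLAMA_HOST}/api/tags", timeout=5)
        response.raise_for_status()
        _model_cache["models"] = [m["name"] for m in response.json().get("models", [])]
        _model_cache["expires"] = now + ttl_seconds
    return _model_cache["models"]
```

With multiple worker processes each keeps its own copy of the cache, which is usually acceptable for a health check.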
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Consider adding basic validation and safeguards to `fetch_image_as_base64` (e.g., content-type check for images, maximum response size, and restricting or sanitizing URLs) to reduce the risk of SSRF and large payload downloads from arbitrary user-provided `imageUrl` values.
- In the `/generate` endpoint, you might want to distinguish between failures fetching the image and failures calling Ollama (e.g., separate except blocks or error messages) so clients can more easily understand whether the image URL or the model backend caused the error.
- The health check currently calls `/api/tags` and scans all models on every request; if this endpoint will be hit frequently, consider caching the model list for a short TTL or switching to a lighter-weight availability check to avoid unnecessary load on the Ollama API.

## Individual Comments

### Comment 1
<location> `ollama_service/server.py:44-46` </location>
<code_context>
+app = Flask(__name__)
+
+
+def fetch_image_as_base64(image_url: str) -> str:
+    """Download an image from URL and return as base64 string."""
+    response = requests.get(image_url, timeout=30)
+    response.raise_for_status()
+    return base64.b64encode(response.content).decode('utf-8')
</code_context>

<issue_to_address>
**🚨 suggestion (security):** Add basic validation/constraints on fetched images (content type, size, and redirects).

This function will eagerly download any URL into memory with no checks. Please add safeguards: enforce a reasonable max content length (e.g., via `stream=True` + `iter_content` and abort when the threshold is reached), restrict `Content-Type` to expected image MIME types, and cap the number of redirects to reduce abuse risk.

Suggested implementation:

```python
# Configuration
OLLAMA_HOST = os.environ.get("OLLAMA_HOST", "http://localhost:11434")
MODEL_NAME = os.environ.get("OLLAMA_MODEL", "qwen3-vl:4b")
DEFAULT_PROMPT = "Describe this image in one concise sentence for alt text."

# Image fetch constraints
MAX_IMAGE_BYTES = 5 * 1024 * 1024  # 5 MiB
ALLOWED_IMAGE_MIME_TYPES = {
    "image/jpeg",
    "image/png",
    "image/gif",
    "image/webp",
}
MAX_REDIRECTS = 5

# Initialize Flask app
app = Flask(__name__)

```

```python
def fetch_image_as_base64(image_url: str) -> str:
    """Download an image from URL and return as base64 string with basic validation."""
    with requests.Session() as session:
        session.max_redirects = MAX_REDIRECTS

        response = session.get(
            image_url,
            timeout=30,
            stream=True,
        )
        response.raise_for_status()

        # Validate content type
        content_type = response.headers.get("Content-Type", "")
        mime_type = content_type.split(";", 1)[0].strip().lower()
        if mime_type not in ALLOWED_IMAGE_MIME_TYPES:
            raise ValueError(f"Unsupported content type for image: {content_type!r}")

        # Check declared content length if available
        content_length = response.headers.get("Content-Length")
        if content_length is not None:
            try:
                length = int(content_length)
            except ValueError:
                length = None
            else:
                if length > MAX_IMAGE_BYTES:
                    raise ValueError(
                        f"Image is too large (declared {length} bytes, "
                        f"max {MAX_IMAGE_BYTES} bytes)"
                    )

        # Stream content and enforce max size
        chunks = bytearray()
        for chunk in response.iter_content(chunk_size=8192):
            if not chunk:
                continue
            chunks.extend(chunk)
            if len(chunks) > MAX_IMAGE_BYTES:
                raise ValueError(
                    f"Image exceeded maximum allowed size of {MAX_IMAGE_BYTES} bytes"
                )

        return base64.b64encode(bytes(chunks)).decode("utf-8")

```
</issue_to_address>

### Comment 2
<location> `ollama_service/server.py:238-239` </location>
<code_context>
+    try:
+        # Get parameters
+        if request.method == "POST":
+            data = request.get_json() or {}
+            image_url = data.get("imageUrl") or data.get("image_url")
+            prompt = data.get("prompt", DEFAULT_PROMPT)
+        else:
</code_context>

<issue_to_address>
**🚨 issue (security):** Add URL validation / allowlist to reduce SSRF risk from user-controlled image URLs.

This endpoint currently fetches arbitrary client-supplied URLs, which exposes you to SSRF and access to internal network resources. Please validate `image_url` (e.g., enforce http/https, reject private/loopback IPs, and ideally restrict to an allowlisted set of domains) before calling `fetch_image_as_base64`.
</issue_to_address>
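A rough sketch of the kind of validation this comment asks for, shown here for illustration only; the helper name and the allowlist contents are hypothetical and not part of the PR:

```python
# Illustrative URL validation: http(s) only, optional domain allowlist, and
# rejection of URLs that resolve to private, loopback, or link-local addresses.
import ipaddress
import socket
from urllib.parse import urlparse

ALLOWED_HOSTS = {"images.example.org"}  # hypothetical allowlist; empty set disables it


def validate_image_url(image_url: str) -> None:
    """Raise ValueError if the URL is not a safe, public http(s) image URL."""
    parsed = urlparse(image_url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError("Only http(s) image URLs are allowed")
    if not parsed.hostname:
        raise ValueError("Image URL has no host")
    if ALLOWED_HOSTS and parsed.hostname not in ALLOWED_HOSTS:
        raise ValueError("Image host is not on the allowlist")
    # Resolve the host and reject non-public addresses
    for _family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(parsed.hostname, None):
        ip = ipaddress.ip_address(sockaddr[0])
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
            raise ValueError("Image URL resolves to a non-public address")
```

Note that checking the resolved addresses at validation time still leaves a DNS-rebinding window, so pinning the resolved IP for the actual fetch (or proxying image downloads) is worth considering as well.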

### Comment 3
<location> `ollama_service/server.py:207-208` </location>
<code_context>
+    try:
+        response = requests.get(f"{OLLAMA_HOST}/api/tags", timeout=5)
+        ollama_ok = response.status_code == 200
+        models = [m["name"] for m in response.json().get("models", [])]
+        model_available = any(MODEL_NAME in m for m in models)
+    except Exception as e:
+        ollama_ok = False
</code_context>

<issue_to_address>
**suggestion (bug_risk):** Use stricter matching for model availability in the health check.

Using substring matching (`MODEL_NAME in m`) can yield false positives when one model name is a substring of another (e.g. `qwen3-vl:4b` vs `qwen3-vl:4b-instruct`). Prefer exact comparison (e.g. `any(m == MODEL_NAME for m in models)`) or matching against the specific tag/name field to ensure the health check only passes when the intended model is present.

```suggestion
        models = [m["name"] for m in response.json().get("models", [])]
        # Require exact model name match to avoid false positives from substrings
        model_available = any(m == MODEL_NAME for m in models)
```
</issue_to_address>

### Comment 4
<location> `ollama_service/server.py:209-218` </location>
<code_context>
+        return jsonify({
+            "error": "Cannot connect to Ollama. Is it running? Start with: ollama serve"
+        }), 503
+    except Exception as e:
+        logger.error(f"Error generating caption: {e}", exc_info=True)
+        return jsonify({"error": str(e)}), 500
+
+
</code_context>

<issue_to_address>
**🚨 issue (security):** Avoid returning raw exception messages directly to clients in 500 responses.

Using `str(e)` in the JSON response can expose internal details to clients. Keep logging the full exception with `exc_info=True`, but return a generic, stable error message instead of the raw exception string.
</issue_to_address>

### Comment 5
<location> `ollama_service/README.md:37` </location>
<code_context>
+ollama pull qwen3-vl:4b
+```
+
+### 3. Start Ollama (if not running as service)
+
+```bash
</code_context>

<issue_to_address>
**issue (typo):** Consider changing to "if not running as a service" for correct grammar.

This keeps the heading grammatically correct and consistent with the rest of the documentation.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
