feat: add Ollama-based alt text service (zero ML dependencies) #50
base: main
Conversation
- Add ollama_service/ with Flask API using qwen3-vl:4b model
- No PyTorch/Transformers required - just flask + requests
- Model runs via Ollama (3.3GB vs 10GB for Transformers version)
- Produces concise, natural alt text captions
- Update .gitignore for Python environments
Someone is attempting to deploy a commit to the fairdataihub Team on Vercel. A member of the Team first needs to authorize it.

Thank you for submitting this pull request! We appreciate your contribution to the project. Before we can merge it, we need to review the changes you've made to ensure they align with our code standards and meet the requirements of the project. We'll get back to you as soon as we can with feedback. Thanks again!
Reviewer's Guide

Adds a new lightweight Flask-based alt text generation microservice that delegates image captioning to an Ollama-hosted qwen3-vl:4b model, along with minimal Python dependencies and documentation, without introducing any heavy ML libraries into the codebase.

Sequence diagram for Ollama alt text generation via /generate

```mermaid
sequenceDiagram
    actor Client
    participant FlaskServer as FlaskServer_generate
    participant ImageHost
    participant OllamaAPI
    Client->>FlaskServer: HTTP GET/POST /generate (imageUrl, prompt?)
    alt POST
        FlaskServer->>FlaskServer: Parse JSON body
    else GET
        FlaskServer->>FlaskServer: Read query parameters
    end
    FlaskServer->>FlaskServer: Validate imageUrl
    FlaskServer->>ImageHost: HTTP GET imageUrl
    ImageHost-->>FlaskServer: Image bytes
    FlaskServer->>FlaskServer: Base64 encode image
    FlaskServer->>OllamaAPI: POST /api/generate (model, prompt, images[], stream=False)
    OllamaAPI-->>FlaskServer: JSON { response, error? }
    alt Error in Ollama response
        FlaskServer-->>Client: 500 JSON { error }
    else Success
        FlaskServer-->>Client: 200 JSON { alt_text, model, runtime }
    end
    opt Ollama connection failure
        FlaskServer-->>Client: 503 JSON { error: Cannot connect to Ollama }
    end
```
Sequence diagram for /health endpoint checking Ollama and model

```mermaid
sequenceDiagram
    actor Client
    participant FlaskServer as FlaskServer_health
    participant OllamaAPI
    Client->>FlaskServer: HTTP GET /health
    FlaskServer->>OllamaAPI: GET /api/tags
    alt Ollama reachable
        OllamaAPI-->>FlaskServer: JSON { models[] }
        FlaskServer->>FlaskServer: Check MODEL_NAME in models
        alt Model available
            FlaskServer-->>Client: 200 { status: healthy, ollama_reachable: true, model_available: true, available_models }
        else Model missing
            FlaskServer-->>Client: 200 { status: degraded, ollama_reachable: true, model_available: false, available_models }
        end
    else Ollama unreachable or error
        FlaskServer-->>Client: 200 { status: degraded, ollama_reachable: false, model_available: false, available_models: [] }
    end
```
Thanks for making updates to your pull request. Our team will take a look and provide feedback as soon as possible. Please wait for any GitHub Actions to complete before editing your pull request. If you have any additional questions or concerns, feel free to let us know. Thank you for your contributions!
Hey there - I've reviewed your changes and found some issues that need to be addressed.
- Consider adding basic validation and safeguards to `fetch_image_as_base64` (e.g., content-type check for images, maximum response size, and restricting or sanitizing URLs) to reduce the risk of SSRF and large payload downloads from arbitrary user-provided `imageUrl` values.
- In the `/generate` endpoint, you might want to distinguish between failures fetching the image and failures calling Ollama (e.g., separate except blocks or error messages) so clients can more easily understand whether the image URL or the model backend caused the error.
- The health check currently calls `/api/tags` and scans all models on every request; if this endpoint will be hit frequently, consider caching the model list for a short TTL or switching to a lighter-weight availability check to avoid unnecessary load on the Ollama API.
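The TTL-caching idea for the health check could look roughly like the sketch below. This is an illustration only; `MODEL_LIST_TTL` and the `fetch_models` callable are hypothetical names, with `fetch_models` standing in for the existing `GET /api/tags` call:

```python
import time

MODEL_LIST_TTL = 30.0  # seconds; hypothetical value, tune to the expected health-check rate
_model_cache = {"models": [], "fetched_at": float("-inf")}


def get_models_cached(fetch_models, now=time.monotonic):
    """Return the cached model list, calling fetch_models() at most once per TTL window.

    fetch_models is expected to hit GET /api/tags and return a list of model names.
    """
    if now() - _model_cache["fetched_at"] >= MODEL_LIST_TTL:
        _model_cache["models"] = fetch_models()
        _model_cache["fetched_at"] = now()
    return _model_cache["models"]
```

The `now` parameter is injectable only to make the window testable; the default monotonic clock is what the endpoint would use.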
## Individual Comments
### Comment 1
<location> `ollama_service/server.py:44-46` </location>
<code_context>
+app = Flask(__name__)
+
+
+def fetch_image_as_base64(image_url: str) -> str:
+ """Download an image from URL and return as base64 string."""
+ response = requests.get(image_url, timeout=30)
+ response.raise_for_status()
+ return base64.b64encode(response.content).decode('utf-8')
</code_context>
<issue_to_address>
**🚨 suggestion (security):** Add basic validation/constraints on fetched images (content type, size, and redirects).
This function will eagerly download any URL into memory with no checks. Please add safeguards: enforce a reasonable max content length (e.g., via `stream=True` + `iter_content` and abort when the threshold is reached), restrict `Content-Type` to expected image MIME types, and cap the number of redirects to reduce abuse risk.
Suggested implementation:
```python
# Configuration
OLLAMA_HOST = os.environ.get("OLLAMA_HOST", "http://localhost:11434")
MODEL_NAME = os.environ.get("OLLAMA_MODEL", "qwen3-vl:4b")
DEFAULT_PROMPT = "Describe this image in one concise sentence for alt text."
# Image fetch constraints
MAX_IMAGE_BYTES = 5 * 1024 * 1024 # 5 MiB
ALLOWED_IMAGE_MIME_TYPES = {
"image/jpeg",
"image/png",
"image/gif",
"image/webp",
}
MAX_REDIRECTS = 5
# Initialize Flask app
app = Flask(__name__)
```
```python
def fetch_image_as_base64(image_url: str) -> str:
"""Download an image from URL and return as base64 string with basic validation."""
with requests.Session() as session:
session.max_redirects = MAX_REDIRECTS
response = session.get(
image_url,
timeout=30,
stream=True,
)
response.raise_for_status()
# Validate content type
content_type = response.headers.get("Content-Type", "")
mime_type = content_type.split(";", 1)[0].strip().lower()
if mime_type not in ALLOWED_IMAGE_MIME_TYPES:
raise ValueError(f"Unsupported content type for image: {content_type!r}")
# Check declared content length if available
content_length = response.headers.get("Content-Length")
if content_length is not None:
try:
length = int(content_length)
except ValueError:
length = None
else:
if length > MAX_IMAGE_BYTES:
raise ValueError(
f"Image is too large (declared {length} bytes, "
f"max {MAX_IMAGE_BYTES} bytes)"
)
# Stream content and enforce max size
chunks = bytearray()
for chunk in response.iter_content(chunk_size=8192):
if not chunk:
continue
chunks.extend(chunk)
if len(chunks) > MAX_IMAGE_BYTES:
raise ValueError(
f"Image exceeded maximum allowed size of {MAX_IMAGE_BYTES} bytes"
)
return base64.b64encode(bytes(chunks)).decode("utf-8")
```
</issue_to_address>
### Comment 2
<location> `ollama_service/server.py:238-239` </location>
<code_context>
+ try:
+ # Get parameters
+ if request.method == "POST":
+ data = request.get_json() or {}
+ image_url = data.get("imageUrl") or data.get("image_url")
+ prompt = data.get("prompt", DEFAULT_PROMPT)
+ else:
</code_context>
<issue_to_address>
**🚨 issue (security):** Add URL validation / allowlist to reduce SSRF risk from user-controlled image URLs.
This endpoint currently fetches arbitrary client-supplied URLs, which exposes you to SSRF and access to internal network resources. Please validate `image_url` (e.g., enforce http/https, reject private/loopback IPs, and ideally restrict to an allowlisted set of domains) before calling `fetch_image_as_base64`.
</issue_to_address>
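One possible shape for that validation, as a hedged sketch (`is_safe_image_url` is a hypothetical helper; note that resolve-then-fetch checks like this do not by themselves defend against DNS rebinding, and a domain allowlist would be stricter):

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_safe_image_url(url: str) -> bool:
    """Allow only http(s) URLs whose host resolves exclusively to public addresses."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False
    for _family, _type, _proto, _canon, sockaddr in infos:
        try:
            addr = ipaddress.ip_address(sockaddr[0])
        except ValueError:
            return False
        # is_global rejects loopback, RFC 1918 private, link-local, and reserved ranges.
        if not addr.is_global:
            return False
    return True
```

The endpoint would then return a 400 before ever calling `fetch_image_as_base64` when this check fails.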
### Comment 3
<location> `ollama_service/server.py:207-208` </location>
<code_context>
+ try:
+ response = requests.get(f"{OLLAMA_HOST}/api/tags", timeout=5)
+ ollama_ok = response.status_code == 200
+ models = [m["name"] for m in response.json().get("models", [])]
+ model_available = any(MODEL_NAME in m for m in models)
+ except Exception as e:
+ ollama_ok = False
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Use stricter matching for model availability in the health check.
Using substring matching (`MODEL_NAME in m`) can yield false positives when one model name is a substring of another (e.g. `qwen3-vl:4b` vs `qwen3-vl:4b-instruct`). Prefer exact comparison (e.g. `any(m == MODEL_NAME for m in models)`) or matching against the specific tag/name field to ensure the health check only passes when the intended model is present.
```suggestion
models = [m["name"] for m in response.json().get("models", [])]
# Require exact model name match to avoid false positives from substrings
model_available = any(m == MODEL_NAME for m in models)
```
</issue_to_address>
### Comment 4
<location> `ollama_service/server.py:209-218` </location>
<code_context>
+ return jsonify({
+ "error": "Cannot connect to Ollama. Is it running? Start with: ollama serve"
+ }), 503
+ except Exception as e:
+ logger.error(f"Error generating caption: {e}", exc_info=True)
+ return jsonify({"error": str(e)}), 500
+
+
</code_context>
<issue_to_address>
**🚨 issue (security):** Avoid returning raw exception messages directly to clients in 500 responses.
Using `str(e)` in the JSON response can expose internal details to clients. Keep logging the full exception with `exc_info=True`, but return a generic, stable error message instead of the raw exception string.
</issue_to_address>
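A small sketch of that pattern (the helper name is hypothetical): log the real exception with its traceback server-side, but hand the client a fixed, non-revealing message:

```python
import logging

logger = logging.getLogger("ollama_service")


def safe_error_response(exc: Exception):
    """Log the real exception server-side; return a generic body and 500 for the client."""
    logger.error("Error generating caption: %s", exc, exc_info=True)
    # str(exc) stays in the logs only; clients see a stable, generic message.
    return {"error": "Failed to generate alt text"}, 500
```

In the Flask handler the except branch would become `return jsonify(*body*), status` built from this helper, keeping the 503 "Cannot connect to Ollama" branch as-is since that message is already generic.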
### Comment 5
<location> `ollama_service/README.md:37` </location>
<code_context>
+ollama pull qwen3-vl:4b
+```
+
+### 3. Start Ollama (if not running as service)
+
+```bash
</code_context>
<issue_to_address>
**issue (typo):** Consider changing to "if not running as a service" for correct grammar.
This keeps the heading grammatically correct and consistent with the rest of the documentation.
</issue_to_address>
Summary by Sourcery
Introduce a lightweight Ollama-backed alt text generation microservice exposed via a Flask API.