YOLO-Ollama Object Detection System

Overview

This repository implements an image analysis system that integrates:

An Ollama LLM call that extracts the image URL or path from the user's query.
YOLO (You Only Look Once) for object detection on the extracted image.
A second Ollama LLM call that summarizes the detected objects.
A Flask API that handles user requests and drives the workflow.
CSV logging that records inference times and model performance.

Workflow

1. User Input

A user submits a prompt containing an image URL or a local file path via a POST request:

```
curl -X POST http://localhost:5000/detect \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Here is my image: /home/pi/Desktop/yoloollama/cat1.jpg. Please analyze it!"}'
```
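The same request can be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_request` and `send_prompt` are illustrative and not part of the repository, and the URL assumes the server is listening on `localhost:5000` as in the curl example.

```python
import json
import urllib.request

# Assumed endpoint, taken from the curl example.
DETECT_URL = "http://localhost:5000/detect"


def build_request(prompt: str, url: str = DETECT_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying the prompt."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_prompt(prompt: str, url: str = DETECT_URL, timeout: float = 120.0) -> dict:
    """Send the prompt and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(prompt, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `send_prompt("Here is my image: /home/pi/Desktop/yoloollama/cat1.jpg. Please analyze it!")` requires the Flask server to be running. The generous timeout is a guess, since two LLM calls plus detection may take a while on small hardware.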

2. LLM Extraction

The first Ollama LLM call extracts the image path or URL from the user's query.
If no valid path or URL is found, the API returns a response stating that no image was provided.
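One way the extraction step could look is sketched below. It assumes Ollama's local REST endpoint `http://localhost:11434/api/generate`; the model name `llama3.2`, the instruction wording, the `NONE` sentinel, and all function names are placeholders chosen for illustration, not taken from the repository.

```python
import json
import urllib.request
from typing import Optional

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "llama3.2"  # placeholder; use whichever model you have pulled

EXTRACT_INSTRUCTION = (
    "Extract the image URL or file path from the user's message. "
    "Reply with the URL or path only, or NONE if there is none.\n\nMessage: "
)


def ask_ollama(prompt: str, model: str = MODEL) -> str:
    """Send a non-streaming generate request to Ollama and return its text reply."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


def clean_extraction(reply: str) -> Optional[str]:
    """Normalize the model's reply; return None when no image reference was found."""
    candidate = reply.strip().strip("\"'`")
    if not candidate or candidate.upper() == "NONE":
        return None
    return candidate


def extract_image_ref(user_prompt: str, ask=ask_ollama) -> Optional[str]:
    """Run the extraction step; `ask` is injectable so the logic can be tested offline."""
    return clean_extraction(ask(EXTRACT_INSTRUCTION + user_prompt))
```

Keeping the reply cleanup separate from the network call makes the "no image provided" branch easy to exercise without a running Ollama instance.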

3. YOLO Object Detection

The extracted image path/URL is passed to the YOLO model for object detection.
The model returns detected objects along with their confidence scores.
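The detection step might be sketched as follows with the `ultralytics` package. The weights file name `yolo11n.pt` and the helper names `to_detections` and `detect` are assumptions made for this example; the output shape mirrors the `detections` list shown in the API response.

```python
from typing import Dict, List, Sequence


def to_detections(
    names: Dict[int, str], class_ids: Sequence[float], confidences: Sequence[float]
) -> List[dict]:
    """Pair each class id with its label and confidence, in the API's response shape."""
    return [
        {"class_name": names[int(cid)], "confidence": round(float(conf), 2)}
        for cid, conf in zip(class_ids, confidences)
    ]


def detect(image_ref: str, weights: str = "yolo11n.pt") -> List[dict]:
    """Run YOLO on a local path or URL and return the detected objects."""
    # Imported lazily so this module can be loaded without ultralytics installed.
    from ultralytics import YOLO

    result = YOLO(weights)(image_ref)[0]  # one image in, one Results object out
    return to_detections(
        result.names, result.boxes.cls.tolist(), result.boxes.conf.tolist()
    )
```

In a long-running server it is usually better to construct the `YOLO` object once at startup and reuse it, rather than reloading the weights on every request as this sketch does.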

4. LLM Summarization

The detected objects are sent to another Ollama LLM to generate a brief summary.
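How the detections are phrased for the second LLM call is not specified here; the sketch below shows one plausible way to turn them into a prompt. The function name and the prompt wording are hypothetical.

```python
from typing import List


def build_summary_prompt(detections: List[dict]) -> str:
    """Turn YOLO detections into a short instruction for the summarizing LLM."""
    if not detections:
        return "No objects were detected in the image. Say so in one sentence."
    # Render each detection as "label (NN%)" so the model sees the confidence too.
    listing = ", ".join(
        f"{d['class_name']} ({d['confidence']:.0%})" for d in detections
    )
    return (
        "An object detector found the following objects in an image: "
        f"{listing}. Describe the scene in one or two sentences."
    )
```

The resulting string would then be sent to Ollama in the same way as the extraction prompt.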

5. Response & Logging

The summarized response is returned to the user.
Metrics for the whole request (LLM inference time, YOLO inference time, detected objects) are logged to a CSV file.
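A minimal sketch of such logging with the standard `csv` module is shown below. The column names and the `log_metrics` helper are assumptions; the actual CSV layout used by the project may differ.

```python
import csv
import os
import time

# Hypothetical column layout for the metrics file.
FIELDS = ["timestamp", "extract_s", "yolo_s", "summary_s", "detections"]


def log_metrics(
    path: str, extract_s: float, yolo_s: float, summary_s: float, detections: list
) -> None:
    """Append one row of timings to the CSV, writing the header on first use."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
            "extract_s": f"{extract_s:.3f}",
            "yolo_s": f"{yolo_s:.3f}",
            "summary_s": f"{summary_s:.3f}",
            "detections": ";".join(d["class_name"] for d in detections),
        })
```

Each stage can be timed by taking `time.perf_counter()` before and after the call and passing the difference in seconds.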

Flowchart

```mermaid
graph TD;
    A[User sends prompt] -->|POST /detect| B[Flask API receives request]
    B --> C[Extract image URL/path using LLM]
    C -->|Extracted URL/path| D{Is URL/path found?}
    D -->|No| E[Return No valid image link extracted]
    D -->|Yes| F[YOLO model processes image]
    F -->|Detected objects| G[Summarize results using LLM]
    G -->|Summary generated| H[Return JSON response]
    H --> I[Log metrics to CSV]
    H --> J[Send summarized response to user]
```
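The branching in the flowchart can be expressed as a small orchestration function. This is only a sketch: the three steps are passed in as callables so the control flow can be read (and tested) on its own, and the `error` key and the 400 status code for the "No" branch are assumptions rather than documented behavior.

```python
from typing import Callable, List, Optional, Tuple


def handle_prompt(
    prompt: str,
    extract: Callable[[str], Optional[str]],
    detect: Callable[[str], List[dict]],
    summarize: Callable[[List[dict]], str],
) -> Tuple[dict, int]:
    """Follow the flowchart: extract, branch on the result, detect, then summarize."""
    image_ref = extract(prompt)
    if image_ref is None:
        # The "No" branch: nothing to analyze, so stop before running YOLO.
        return {"error": "No valid image link extracted"}, 400
    detections = detect(image_ref)
    return {
        "extracted_url": image_ref,
        "detections": detections,
        "summary_paragraph": summarize(detections),
    }, 200
```

A Flask view for `POST /detect` could call this function with the real extraction, detection, and summarization steps and return the resulting dictionary as JSON with the given status code.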

Technologies Used

Flask: Handles API requests.
Ollama LLM: Extracts image path & summarizes detected objects.
YOLO: Runs object detection.
CSV Logging: Stores performance metrics.

Installation

Clone the repository:

```
git clone https://github.com/ParthaPRay/yolo_ollama_raspberrypi.git
cd yolo_ollama_raspberrypi
```

Install dependencies:
```
pip install flask requests ultralytics
```
Start the Flask server:
```
python server.py
```
Send a request using curl or Postman to test the API.

API Endpoint

`POST /detect`

Request Body:

```
{
  "prompt": "Analyze this image: /home/user/image.jpg"
}
```

Response:

```
{
  "extracted_url": "/home/user/image.jpg",
  "detections": [
    {"class_name": "cat", "confidence": 0.97},
    {"class_name": "sofa", "confidence": 0.85}
  ],
  "summary_paragraph": "A cat is sitting on a sofa."
}
```
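On the client side, the response can be consumed like any other JSON object. As a small illustration, the hypothetical helper below keeps only the detections at or above a confidence threshold; the threshold value is arbitrary.

```python
from typing import List


def confident_labels(response: dict, threshold: float = 0.9) -> List[str]:
    """Return class names whose confidence meets the threshold, highest first."""
    hits = [d for d in response.get("detections", []) if d["confidence"] >= threshold]
    hits.sort(key=lambda d: d["confidence"], reverse=True)
    return [d["class_name"] for d in hits]


# The sample response from this section.
example = {
    "extracted_url": "/home/user/image.jpg",
    "detections": [
        {"class_name": "cat", "confidence": 0.97},
        {"class_name": "sofa", "confidence": 0.85},
    ],
    "summary_paragraph": "A cat is sitting on a sofa.",
}
print(confident_labels(example))       # ['cat']
print(confident_labels(example, 0.5))  # ['cat', 'sofa']
```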

Contributors

Partha Pratim Ray, 2025

License

This project is licensed under the MIT License.
