A from-scratch content-based image retrieval (CBIR) system in C++ that searches a 1,107-image database for the closest visual matches to a target query, using only classical features (color histograms, gradient magnitude, Laws texture energy) and explicit distance metrics. No neural networks, no pretrained embeddings, no shortcuts. Built for CS 5330: Pattern Recognition and Computer Vision at Northeastern (Spring 2023).
The point of the project was not to beat a deep model. It was to develop a working intuition for how images actually behave at the pixel level: how a choice of color space, histogram bin count, or texture filter changes which images come back as "similar."
Most retrieval today is a single API call against a CLIP embedding. That hides every interesting decision. I wanted to feel each tradeoff with my own hands:
- What does a 16-bin rg-chromaticity histogram lose that a 3D RGB histogram keeps?
- When does histogram intersection beat sum-of-squared-differences as a distance metric, and why?
- How much does spatial layout matter? (Top-half vs. bottom-half histograms catch sky-over-stone in a way a whole-image histogram never does.)
- Can a separable Laws filter (R5L5, L5E5) distinguish bricks from grass when color alone cannot?
The retrieval results in this repo are the answers I worked out for each.
The full set of query-to-top-N result grids is in CBIR Project Report.pdf. A few highlights:
| Task | Feature | Distance | Result grid |
|---|---|---|---|
| 1. Baseline | 9x9 center patch | Sum-of-squared-differences | Image*Task1.jpg |
| 2. Histogram | 2D rg-chromaticity, 16 bins | Histogram intersection | Task2Image*.jpg |
| 3. Multi-histogram | Two 3D RGB hists (top/bottom halves), 8 bins | Histogram intersection | Task3Image*.jpg |
| 4. Texture + color | 3D RGB hist + gradient-magnitude hist (Sobel), equal weight | Histogram intersection | Task4Image*.jpg |
| 5. Custom (Laws R5L5) | 3D RGB hist on original + Laws-filtered image | Histogram intersection | Task5Image*.jpg, Task55Image*.jpg |
| Ext 1. RGB-vs-rg | 3D RGB hist, 8 bins | Histogram intersection | E1Image*.jpg |
| Ext 2. L5E5 | 3D RGB hist + L5E5-filtered hist (0.3 / 0.7 weighting) | Histogram intersection | E2Image*.jpg |
| Ext 3. Gabor | Custom Gabor filter bank | Histogram intersection | GaborFIlterExtension.png |
Each row in the table corresponds to a result grid where the first image is the query and the rest are the top matches in increasing order of distance.
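
As a rough illustration of the Task 2 feature, the sketch below builds a normalized 2D rg-chromaticity histogram. It is not the repo's code: it works on a plain vector of pixels instead of a cv::Mat so that it compiles without OpenCV, and the names `Pixel` and `rgChromaticityHist`, as well as the handling of pure-black pixels, are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One 8-bit RGB pixel; a hypothetical stand-in for cv::Vec3b.
struct Pixel { std::uint8_t r, g, b; };

// Builds a bins x bins rg-chromaticity histogram, normalized to sum to 1.
// r = R / (R + G + B), g = G / (R + G + B).
// Assumption: pure-black pixels are counted as neutral (r = g = 1/3).
std::vector<float> rgChromaticityHist(const std::vector<Pixel>& pixels, int bins) {
    assert(bins > 0);
    std::vector<float> hist(static_cast<std::size_t>(bins) * bins, 0.0f);
    if (pixels.empty()) return hist;

    for (const Pixel& p : pixels) {
        const float sum = static_cast<float>(p.r) + p.g + p.b;
        const float r = sum > 0.0f ? p.r / sum : 1.0f / 3.0f;
        const float g = sum > 0.0f ? p.g / sum : 1.0f / 3.0f;
        // r and g lie in [0, 1]; clamp so a value of exactly 1 falls in the last bin.
        const int ri = std::min(bins - 1, static_cast<int>(r * bins));
        const int gi = std::min(bins - 1, static_cast<int>(g * bins));
        hist[static_cast<std::size_t>(ri) * bins + gi] += 1.0f;
    }
    for (float& v : hist) v /= static_cast<float>(pixels.size());
    return hist;
}
```

Two pixels with the same color ratio but different brightness, such as (200, 100, 100) and (100, 50, 50), land in the same bin. That is the property that makes this feature tolerant of illumination changes, and also what it gives up compared with a full 3D RGB histogram.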
Requirements: g++, OpenCV 4 (with pkg-config opencv4 configured), and a POSIX-y system (the file walker uses dirent.h).

    cd "Project Files"
    make             # builds ./main against OpenCV 4
    ./main olympus   # point it at the image directory

The program then prompts interactively for:
- Task (1 through 4, baseline through texture+color)
- Target image filename (e.g. pic.0164.jpg)
- Number of top matches to display
- Number of histogram bins

It opens an OpenCV window per match, sorted by distance.
The Task 5 custom design, the basic RGB extension, and the Laws / Gabor extensions each live in their own .cpp driver (Task5_CustomDesign.cpp, BasicExtension.cpp, LawsFilterExtension.cpp, GaborFilterExtension.cpp) and can be built by swapping SRC in the Makefile.

    main.cpp                        // interactive CLI: prompts for task, query, N, bins
     |__ myfunctions.hpp            // feature + distance interface
     |__ myfunctions.cpp
     |    |-- Sobel X / Y / magnitude (hand-rolled separable kernels)
     |    |-- 9x9 patch extraction      -> sumSquaredDiff
     |    |-- 2D rg-chromaticity hist   -> histogramIntersection2D
     |    |-- 3D RGB histogram (n bins) -> histogramIntersection3D
     |    |-- splitImage(top / bottom)  -> multiHistogramMatching3D
     |__ readFiles()                // walks dir, ranks all images

readFiles() opens the dataset directory with opendir(), filters for
.jpg / .png / .ppm / .tif, computes the feature vector for every image,
ranks all images by distance to the query, and renders the top N in OpenCV
windows. Everything happens in a single process: no pre-built index, no
on-disk feature cache. On the 1,107-image Olympus dataset this finishes in a
few seconds per query on a laptop.
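
The filter-then-rank part of that flow can be sketched without OpenCV or the filesystem. The snippet below is an illustration under stated assumptions, not the body of readFiles(): `hasImageExtension`, `Match`, and `topMatches` are hypothetical names, the extension check is case-sensitive, and the distances are assumed to have been computed already by one of the metrics above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// True if the filename ends in one of the extensions the README lists.
// Assumption: the comparison is case-sensitive.
bool hasImageExtension(const std::string& name) {
    static const char* const kExts[] = {".jpg", ".png", ".ppm", ".tif"};
    for (const char* ext : kExts) {
        const std::string e(ext);
        if (name.size() >= e.size() &&
            name.compare(name.size() - e.size(), e.size(), e) == 0) {
            return true;
        }
    }
    return false;
}

// A database image paired with its distance to the query (hypothetical type).
struct Match {
    std::string filename;
    float distance;
};

// Sorts candidates by increasing distance and keeps the first topN.
std::vector<Match> topMatches(std::vector<Match> all, std::size_t topN) {
    std::sort(all.begin(), all.end(),
              [](const Match& a, const Match& b) { return a.distance < b.distance; });
    if (all.size() > topN) all.resize(topN);
    return all;
}
```

Because every query re-scores the whole directory, the cost is linear in the number of images, which is acceptable at this dataset size and is the reason no index or cache is needed.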
| Feature | Vector size (8 bins) | Captures |
|---|---|---|
| 9x9 center patch | 243 ints | Raw local color, ignores everything else |
| 2D rg-chromaticity | 256 floats (16 x 16 bins, as in Task 2) | Hue-ish info, illumination-invariant-ish |
| 3D RGB (whole image) | 512 floats | Color distribution, no spatial layout |
| 3D RGB (top + bottom halves) | 1,024 floats | Coarse spatial layout |
| 3D RGB + gradient-magnitude hist | 1,024 floats | Color + texture energy |
| 3D RGB + Laws R5L5 hist | 1,024 floats | Color + oriented texture (ripple x level) |
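
To make the vector sizes in the table concrete, here is a minimal sketch of a 3D RGB histogram and of the top/bottom split feature. It assumes a plain row-major pixel buffer instead of a cv::Mat, a bin count that divides 256 evenly, and per-region normalization; `Pixel`, `Image`, `rgbHist3D`, and `topBottomFeature` are hypothetical names, not the repo's functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Pixel { std::uint8_t r, g, b; };

// Row-major image: pixels[y * width + x]. Hypothetical stand-in for cv::Mat.
struct Image {
    int width;
    int height;
    std::vector<Pixel> pixels;
};

// Normalized bins^3 RGB histogram over rows [rowBegin, rowEnd).
std::vector<float> rgbHist3D(const Image& img, int bins, int rowBegin, int rowEnd) {
    assert(bins > 0 && 256 % bins == 0);  // assumption: bins divides 256
    const int step = 256 / bins;
    std::vector<float> hist(static_cast<std::size_t>(bins) * bins * bins, 0.0f);
    const int rows = rowEnd - rowBegin;
    if (rows <= 0 || img.width <= 0) return hist;

    for (int y = rowBegin; y < rowEnd; ++y) {
        for (int x = 0; x < img.width; ++x) {
            const Pixel& p = img.pixels[static_cast<std::size_t>(y) * img.width + x];
            const int idx = ((p.r / step) * bins + p.g / step) * bins + p.b / step;
            hist[static_cast<std::size_t>(idx)] += 1.0f;
        }
    }
    const float total = static_cast<float>(rows) * static_cast<float>(img.width);
    for (float& v : hist) v /= total;
    return hist;
}

// Concatenates the top-half and bottom-half histograms: 2 * bins^3 floats.
std::vector<float> topBottomFeature(const Image& img, int bins) {
    const int mid = img.height / 2;
    std::vector<float> feature = rgbHist3D(img, bins, 0, mid);
    const std::vector<float> bottom = rgbHist3D(img, bins, mid, img.height);
    feature.insert(feature.end(), bottom.begin(), bottom.end());
    return feature;
}
```

With 8 bins each half contributes 8 x 8 x 8 = 512 floats, so the concatenated feature has 1,024. A blue-over-red image and a red-over-blue image get identical whole-image histograms but different split features, which is the coarse spatial layout the table refers to.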
R5 is the "ripple" 1D kernel, L5 is the "level" kernel, and E5 is the "edge" kernel. The outer product R5 x L5 gives a 5x5 separable filter that fires on ripple texture (think: reflected ripples on a glossy floor, bricks). Convolving with R5L5 and then taking the 3D RGB histogram of the response acts like a texture-aware fingerprint that needs no training at all. The L5E5 extension swaps in the level-by-edge filter; weighting the two histograms 0.3 (original) / 0.7 (Laws response) made it visibly better at chalk-on-pavement and arrow markings than color-only retrieval.
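
The separability mentioned above means the 5x5 filter never has to be built explicitly: one horizontal 1D pass followed by one vertical 1D pass gives the same response. The sketch below uses the standard Laws kernel values on a single-channel integer image. It assumes the first letter of the filter name is the vertical kernel and the second the horizontal one, applies the kernels as correlation (no flip), and leaves a two-pixel border at zero; `outer` and `filterSeparable` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Kernel1D = std::array<int, 5>;
constexpr Kernel1D L5 = {1, 4, 6, 4, 1};     // level
constexpr Kernel1D E5 = {-1, -2, 0, 2, 1};   // edge
constexpr Kernel1D R5 = {1, -4, 6, -4, 1};   // ripple

// Outer product: the explicit 5x5 kernel, k[i][j] = vertical[i] * horizontal[j].
std::array<std::array<int, 5>, 5> outer(const Kernel1D& vertical, const Kernel1D& horizontal) {
    std::array<std::array<int, 5>, 5> k{};
    for (std::size_t i = 0; i < 5; ++i)
        for (std::size_t j = 0; j < 5; ++j)
            k[i][j] = vertical[i] * horizontal[j];
    return k;
}

// Applies the separable filter to a row-major single-channel image.
// Assumption: pixels within two of the border are left at zero.
std::vector<int> filterSeparable(const std::vector<int>& src, int width, int height,
                                 const Kernel1D& vertical, const Kernel1D& horizontal) {
    assert(width >= 0 && height >= 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);
    std::vector<int> tmp(src.size(), 0), dst(src.size(), 0);

    // Horizontal pass over every row.
    for (int y = 0; y < height; ++y)
        for (int x = 2; x < width - 2; ++x) {
            int acc = 0;
            for (int k = -2; k <= 2; ++k)
                acc += horizontal[static_cast<std::size_t>(k + 2)] * src[y * width + x + k];
            tmp[y * width + x] = acc;
        }

    // Vertical pass over the horizontal result.
    for (int y = 2; y < height - 2; ++y)
        for (int x = 2; x < width - 2; ++x) {
            int acc = 0;
            for (int k = -2; k <= 2; ++k)
                acc += vertical[static_cast<std::size_t>(k + 2)] * tmp[(y + k) * width + x];
            dst[y * width + x] = acc;
        }
    return dst;
}
```

R5 sums to zero, so a flat region produces no R5L5 response regardless of its brightness; only intensity that oscillates along the R5 direction gets through. For a color image the same passes would be run per channel before histogramming the response.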
- C++17 with OpenCV 4 (opencv4 via pkg-config)
- dirent.h for directory walking
- Hand-rolled Sobel X / Y and magnitude (no cv::Sobel), to keep the pixel-level intuition honest
- Build: GNU Make (single-file targets per extension)
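
For readers who have not written Sobel by hand, a minimal separable version might look like the following. This is a sketch, not the code in myfunctions.cpp: it assumes a single-channel integer image, leaves a one-pixel border at zero, and uses hypothetical names (`filter3`, `sobelMagnitude`). Sobel X is the [1 2 1] smoothing kernel applied vertically and the [-1 0 1] difference applied horizontally; Sobel Y swaps the two.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Kernel3 = std::array<int, 3>;
constexpr Kernel3 kSmooth = {1, 2, 1};   // [1 2 1] smoothing
constexpr Kernel3 kDeriv = {-1, 0, 1};   // [-1 0 1] central difference

// 3-tap separable filter: horizontal pass, then vertical pass.
// Assumption: the one-pixel border is left at zero.
std::vector<int> filter3(const std::vector<int>& src, int width, int height,
                         const Kernel3& vertical, const Kernel3& horizontal) {
    assert(width >= 0 && height >= 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);
    std::vector<int> tmp(src.size(), 0), dst(src.size(), 0);
    for (int y = 0; y < height; ++y)
        for (int x = 1; x < width - 1; ++x) {
            int acc = 0;
            for (int k = -1; k <= 1; ++k)
                acc += horizontal[static_cast<std::size_t>(k + 1)] * src[y * width + x + k];
            tmp[y * width + x] = acc;
        }
    for (int y = 1; y < height - 1; ++y)
        for (int x = 1; x < width - 1; ++x) {
            int acc = 0;
            for (int k = -1; k <= 1; ++k)
                acc += vertical[static_cast<std::size_t>(k + 1)] * tmp[(y + k) * width + x];
            dst[y * width + x] = acc;
        }
    return dst;
}

// Gradient magnitude sqrt(gx^2 + gy^2) at every pixel.
std::vector<float> sobelMagnitude(const std::vector<int>& gray, int width, int height) {
    const std::vector<int> gx = filter3(gray, width, height, kSmooth, kDeriv);
    const std::vector<int> gy = filter3(gray, width, height, kDeriv, kSmooth);
    std::vector<float> mag(gray.size());
    for (std::size_t i = 0; i < mag.size(); ++i)
        mag[i] = std::sqrt(static_cast<float>(gx[i] * gx[i] + gy[i] * gy[i]));
    return mag;
}
```

On a 3x3 image whose rows are all 0, 0, 10, the center pixel has gx = 40 and gy = 0, so its magnitude is 40: a vertical edge shows up only in the X response.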
The full set of result grids and per-task commentary is in CBIR Project Report.pdf. The short version of what I learned:
- Color alone is brittle. Task 2 (rg-chromaticity, 16 bins) ranks a sky photo above a building photo for a building query, because they share blue. Task 4 (color + gradient magnitude) fixes this by giving brick walls a distinct texture signature.
- Spatial layout matters more than I expected. Task 3's top/bottom split reliably retrieves photos with the same sky/ground composition. Whole-image histograms cannot.
- Histogram intersection beats SSD for histograms. Intersection rewards shared mass; SSD penalizes shape differences that the eye does not care about.
- Reducing RGB bin count from 16 to 8 was net-positive for retrieval precision on this dataset: fewer bins, more shared mass, less sensitivity to exact shades.
- Laws R5L5 was the most interesting single change. It caught the ripple-on-wood-floor query in a way no color or gradient feature did.
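
The intersection-versus-SSD point in the list above is easier to see with numbers. The sketch below defines both metrics on normalized histograms, plus a weighted sum of the kind used to combine color and texture. The function names are hypothetical, intersection is turned into a distance as 1 minus the shared mass, and applying the weights to the two distances (rather than to the histograms themselves) is an assumption about how the combination is done.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Histogram intersection as a distance: 1 - sum_i min(a_i, b_i).
// Assumes both histograms are normalized to sum to 1.
float intersectionDistance(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float shared = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) shared += std::min(a[i], b[i]);
    return 1.0f - shared;
}

// Sum of squared differences: sum_i (a_i - b_i)^2.
float sumSquaredDiff(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float total = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        total += d * d;
    }
    return total;
}

// Weighted sum of a color distance and a texture distance,
// e.g. 0.5 / 0.5 for equal weighting or 0.3 / 0.7 to favor texture.
float weightedDistance(float dColor, float dTexture, float wColor, float wTexture) {
    return wColor * dColor + wTexture * dTexture;
}
```

Take the query {0.5, 0.5, 0, 0} and the candidates {0.5, 0, 0.5, 0} and {0.25, 0.25, 0.25, 0.25}. Both candidates share exactly half their mass with the query, so intersection ties them at 0.5. SSD scores them 0.5 and 0.25, because it also reacts to how the unshared mass happens to be spread across bins.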
The Project Files/olympus/ directory is the 1,107-image Olympus dataset
provided with the course. Credits to the original photographers are in
Project Files/olympus/credits.txt.
Images are kept in-repo so the retrieval results are exactly reproducible.
This makes the repo about 80 MB on disk; if you only want the code, you can
delete that directory and point ./main at any directory of images.
Built under Professor Bruce Maxwell (CS 5330, Northeastern, Spring 2023). Thanks to classmates Sumegha, Ravina, and Husain for brainstorming on the texture-feature experiments.