Library Design using C++ - Final Project
Using jpeglib.h (libjpeg), a commonly used JPEG library - http://libjpeg.sourceforge.net/
Using the library - https://github.com/md81544/libjpeg_cpp/ (https://www.martyndavis.com/?tag=libjpeg) - a C++ wrapper around the standard libjpeg/jpeglib.h that uses STL containers in place of the C-style image structures in jpeglib.h.
We use this wrapper instead of the raw jpeglib.h interface because it is much simpler, and writing our own wrapper would be tedious.
Before starting, please install libjpeg (it may already be present on some Unix-like systems) -
brew install libjpeg
or, on Debian/Ubuntu, sudo apt-get install libjpeg-dev
git clone https://github.com/Gouthamkreddy1234/Image-Dataset-Augmentor.git
- Copy the code from the cloned directory into your project directory
- Add Augmentor.cpp, jpeg.cpp and Operation.cpp to your makefile (follow the example below, assuming main.cpp is your main project file)
prod: main.cpp Augmentor.cpp jpeg.cpp Operation.cpp
g++ -O -std=c++17 -Wall -Wextra -Wpedantic -Werror -o prod main.cpp Augmentor.cpp jpeg.cpp Operation.cpp -ljpeg
test: unit_test.cpp Augmentor.cpp jpeg.cpp Operation.cpp
g++ -O -std=c++17 -Wall -Wextra -Wpedantic -Werror -o test unit_test.cpp Augmentor.cpp jpeg.cpp Operation.cpp -ljpeg -lgtest
debug: main.cpp Augmentor.cpp jpeg.cpp Operation.cpp
g++ -g -O -std=c++17 -Wall -Wextra -Wpedantic -Werror -o debug main.cpp Augmentor.cpp jpeg.cpp Operation.cpp -ljpeg
Command -
make prod
IMPORTANT: Use the prod target when you build your code; the test target builds only the unit tests.
Then add
#include "Augmentor.h"
to your main.cpp.
Usage example:
augmentorLib::Augmentor augmentor(argv[1], argv[2]); // input and output directory paths
augmentor
.rotate(45, 90, 0.5) // 45-90 degree of rotation randomness
.flip(HORIZONTAL, 0.5) // 0.5 probability of flip operation being applied to an image
.crop(300, 300, true) // (x, y) size of cropped image
.resize(120, 120, 1) // (x, y) size of resized image
.invert(1) // invert with probability 1
.sample(1000); // Output 1000 images
This will output 1000 augmented images to the provided destination directory (argv[2])
Documentation
LINK - http://image-augmentor.s3-website-us-east-1.amazonaws.com
PDF - https://image-augmentor-pdf.s3.amazonaws.com/Documentation.pdf
Augmentor is a C++ library focused on batch image manipulation and processing. We want to give users a simple interface and excellent performance, so we have built the library on the following design ideas.
Similar to other data processing tools (e.g. Spark, Flink), our library provides declarative APIs for users to build their image manipulation pipelines. The advantage of declarative APIs is that a pipeline can be checked before any actual processing takes place. The library evaluates the inputs of each operation and makes sure all of them are legal. For example, the input of the resize operation cannot be negative. If invalid parameters are found, the program is terminated in the building stage.
When users set up their pipelines, it is common to concatenate a set of operations to manipulate images. Thus, the pipeline builder of our library is designed to be chainable. For example,
augmentor
.rotate(45,90,1)
.flip(HORIZONTAL, 1)
.crop(300, 300, true)
.resize(120,120,1)
.rapid_blur(5)
.invert(1);
To implement this style, each builder method returns a reference to the Augmentor itself:
Augmentor& Augmentor::some_operation(parameter param...);
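The two ideas above, validating parameters while the pipeline is built and returning a reference for chaining, can be sketched together. This is a minimal illustration, not the library's code: PipelineSketch and resize_is_rejected are hypothetical names, the sketch throws std::invalid_argument where the library is described as terminating the program, and rejecting zero as well as negative sizes is an assumption.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the pipeline builder. It only records operation
// names so that the validation and chaining logic stays visible.
class PipelineSketch {
    std::vector<std::string> steps_;

public:
    // Rejects non-positive sizes while the pipeline is being built,
    // before any image is read (assumption: zero is invalid too).
    PipelineSketch& resize(int width, int height) {
        if (width <= 0 || height <= 0) {
            throw std::invalid_argument("resize: width and height must be positive");
        }
        steps_.push_back("resize");
        return *this;  // returning *this is what makes the calls chainable
    }

    // Rejects probabilities outside [0, 1].
    PipelineSketch& invert(double probability) {
        if (probability < 0.0 || probability > 1.0) {
            throw std::invalid_argument("invert: probability must be in [0, 1]");
        }
        steps_.push_back("invert");
        return *this;
    }

    std::size_t size() const { return steps_.size(); }
};

// Reports whether building a pipeline with these resize dimensions is rejected.
bool resize_is_rejected(int width, int height) {
    try {
        PipelineSketch().resize(width, height);
    } catch (const std::invalid_argument&) {
        return true;
    }
    return false;
}
```

With this sketch, PipelineSketch().resize(120, 120).invert(1.0) records two steps, while resize(-1, 120) fails before any image processing would start.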
The augmentor has a vector that stores the definition of every operation, which looks like
std::vector<std::unique_ptr< Operation<Image> >> operations;
The reason for using unique_ptr is discussed later, in the part on ownership.
After building a pipeline, users can run it simply by calling the sample method. The current implementation is straightforward: for each image, loop through every operation and apply its transform to the image. It looks something like
for (auto &operation : operations) {
image = operation->perform(image);
}
Since each image is processed independently of the others, we plan to use parallel programming in the future to speed up the processing.
There are many image formats. They share the same APIs (e.g. getPixel, setPixel) but have different implementations. To let the library manipulate images in different formats, the operations are written as templates. Here is an example:
template<typename Image>
class Operation {
//
// private content...
//
public:
virtual Image* perform(Image* image) = 0;
};
Currently, this library only supports JPEG images.
// Image actually means JpegImage
Operation<Image> operations;
In the future, we plan to support more image formats, such as PNG and BMP.
Operation<PngImage> PngOperations;
Operation<BmpImage> BmpOperations;
This library is built under the C++17 standard. In the future, we would like to upgrade to C++20, which introduces Concepts. Since images, whatever their format, share the same interface (e.g. getHeight, getWidth, getPixel, setPixel), it makes sense to constrain an image class with a concept. A concept can ensure that every image class provides the same interface.
This library uses polymorphism to implement the different operations. All operations inherit from the base class Operation. It provides a few methods: (1) a bool randomizer that decides whether an operation takes place this time; (2) a random number generator (between 0 and 1) that introduces randomness into each operation; and (3) a virtual function perform that gives all subclasses the same image processing interface.
It looks like:
template<typename Image>
class Operation {
private:
inline bool operate_this_time();
inline _precision_type uniform_random_number();
public:
virtual Image* perform(Image* image) = 0;
};
One subclass looks like
template<typename Image>
class ResizeOperation: public Operation<Image> {
public:
Image* perform(Image* image) override;
};
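To make the base-class description concrete, here is a small self-contained sketch of a probability-gated operation called through a base-class pointer. Several details are assumptions made for illustration and are not taken from the library: ToyImage and resized_width are hypothetical, the randomizer is protected so that the subclass can call it, the generator is seeded from std::chrono::steady_clock, and a virtual destructor is added.

```cpp
#include <cassert>
#include <chrono>
#include <memory>
#include <random>

// Toy image type, assumed for illustration only.
struct ToyImage {
    int width;
    int height;
};

template <typename Image>
class Operation {
    double probability_;
    std::mt19937 generator_;
    std::uniform_real_distribution<double> uniform_{0.0, 1.0};

protected:
    // Seeds the generator from the current clock reading.
    explicit Operation(double probability)
        : probability_(probability),
          generator_(static_cast<std::mt19937::result_type>(
              std::chrono::steady_clock::now().time_since_epoch().count())) {}

    // Draws from [0, 1) and compares the draw with the configured probability.
    bool operate_this_time() { return uniform_(generator_) < probability_; }

public:
    virtual ~Operation() = default;
    virtual Image* perform(Image* image) = 0;
};

template <typename Image>
class ResizeOperation : public Operation<Image> {
    int width_;
    int height_;

public:
    ResizeOperation(int width, int height, double probability)
        : Operation<Image>(probability), width_(width), height_(height) {}

    // Only changes the recorded size; a real resize would resample pixels.
    Image* perform(Image* image) override {
        assert(image != nullptr);
        if (this->operate_this_time()) {
            image->width = width_;
            image->height = height_;
        }
        return image;
    }
};

// Runs one resize through a base-class pointer and returns the resulting width.
int resized_width(double probability) {
    ToyImage image{640, 480};
    std::unique_ptr<Operation<ToyImage>> op =
        std::make_unique<ResizeOperation<ToyImage>>(120, 120, probability);
    return op->perform(&image)->width;
}
```

A probability of 1 always applies the operation and a probability of 0 never does, because the draw lies in [0, 1).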
A pipeline is made up of operations, and Augmentor stores the operations in a vector. We want to express the Augmentor's ownership of its operations. There are two safe ways to express ownership: (1) a vector of objects, or (2) a vector of unique pointers. Since a vector of base-class objects would slice every subclass object down to its base-class part, we use unique_ptr to express the ownership.
// Augmentor.h
class Augmentor {
std::vector<std::unique_ptr< Operation<Image> >> operations;
public:
Augmentor& some_operation(parameters param...) {
auto operation = std::make_unique<SomeOperation<Image>>(param);
operations.push_back(std::move(operation));
return *this;
}
};
The advantage of explicit ownership lies in memory management. When an Augmentor is destroyed, each unique_ptr in the vector destroys the operation it owns (this requires Operation to have a virtual destructor). Therefore, the Augmentor class cannot leak its operations.
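The slicing problem that motivates unique_ptr can be shown with a tiny example. BaseOp and FlipOp are hypothetical types made up for this sketch; BaseOp is deliberately non-abstract so that a vector of base objects compiles at all (with a pure virtual perform, as in Operation, it would not).

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical base and derived operation types.
struct BaseOp {
    virtual ~BaseOp() = default;
    virtual std::string name() const { return "base"; }
};

struct FlipOp : BaseOp {
    std::string name() const override { return "flip"; }
};

// Storing by value copies only the BaseOp part of the FlipOp (object slicing),
// so the override is lost.
std::string name_when_stored_by_value() {
    std::vector<BaseOp> operations;
    operations.push_back(FlipOp{});
    return operations.front().name();
}

// Storing unique_ptr keeps the dynamic type and still expresses sole ownership.
std::string name_when_stored_by_pointer() {
    std::vector<std::unique_ptr<BaseOp>> operations;
    operations.push_back(std::make_unique<FlipOp>());
    return operations.front()->name();
}
```

The by-value version reports "base" and the pointer version reports "flip", which is why only the second is usable for a pipeline of heterogeneous operations.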
This library uses a lot of compile-time programming to optimize the code and its performance. First of all, this library uses templates heavily. Since I have discussed templates in the previous section, I will skip their basic usage here. Instead, I will talk about how templates can be used to optimize code structure.
Here I will show how to use a template to make either a static or a dynamic filter when the GaussianBlur operation is built. The basic idea of blurring an image is to convolve the image with a filter. The values of a filter can be stored in either an array or a vector. An array offers fast access, but users must specify its size at compile time. A vector is more flexible because its size is set at run time, but access is relatively slower. Therefore, I decided to use a template to cover both situations. The design looks like
template<unsigned N=0, bool Static = (N > 0) >
class gaussian_blur_filter_1D;
template<unsigned N>
class gaussian_blur_filter_1D<N, true> {
double array[N];
public:
explicit gaussian_blur_filter_1D(double sigma);
};
template<unsigned N>
class gaussian_blur_filter_1D<N, false> {
std::vector<double> vector;
public:
explicit gaussian_blur_filter_1D(double sigma, size_t n);
};
Whether a static or a dynamic filter is created depends on the template argument and the constructor used. Here is an example that builds both.
// build a static filter with size of 5.
auto static_filter = gaussian_blur_filter_1D<5>(1.0);
// build a dynamic filter with size of 5.
auto dynamic_filter = gaussian_blur_filter_1D<>(1.0, 5);
The BlurOperation class wraps these two constructors in two member methods, and so does the Augmentor class. Thus, users can build a blur operation with either blur<5>(sigma) or blur(sigma, 5).
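The static/dynamic split can be fleshed out into a compilable sketch. The class name and template parameters follow the declarations above; everything else is an assumption for illustration: the fill_gaussian helper, the size() and operator[] accessors, and the choice to normalize the weights so they sum to 1.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: fills n taps with samples of a Gaussian centred on the
// middle tap, then normalizes them so the weights sum to 1.
inline void fill_gaussian(double* first, std::size_t n, double sigma) {
    assert(sigma > 0.0);
    const double centre = (static_cast<double>(n) - 1.0) / 2.0;
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double x = static_cast<double>(i) - centre;
        first[i] = std::exp(-(x * x) / (2.0 * sigma * sigma));
        sum += first[i];
    }
    for (std::size_t i = 0; i < n; ++i) first[i] /= sum;
}

template <unsigned N = 0, bool Static = (N > 0)>
class gaussian_blur_filter_1D;

// Static variant: size known at compile time, weights in a fixed-size array.
template <unsigned N>
class gaussian_blur_filter_1D<N, true> {
    std::array<double, N> weights_{};

public:
    explicit gaussian_blur_filter_1D(double sigma) {
        fill_gaussian(weights_.data(), N, sigma);
    }
    std::size_t size() const { return N; }
    double operator[](std::size_t i) const { return weights_[i]; }
};

// Dynamic variant: size chosen at run time, weights in a vector.
template <unsigned N>
class gaussian_blur_filter_1D<N, false> {
    std::vector<double> weights_;

public:
    gaussian_blur_filter_1D(double sigma, std::size_t n) : weights_(n) {
        fill_gaussian(weights_.data(), n, sigma);
    }
    std::size_t size() const { return weights_.size(); }
    double operator[](std::size_t i) const { return weights_[i]; }
};
```

Here gaussian_blur_filter_1D<5>(1.0) selects the array-backed specialization and gaussian_blur_filter_1D<>(1.0, 5) selects the vector-backed one; the empty angle brackets are needed because the primary template is only declared, so there is no constructor from which the compiler could deduce the arguments.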
Here is another use of templates: allowing a random number generator to output either integers or floating-point numbers. In modern C++, the <random> header is used to generate random numbers. It has two uniform distributions: std::uniform_real_distribution and std::uniform_int_distribution. Normally, we might want to build a generator like:
template <typename DataType>
class UniformDistributionGenerator {
public:
inline DataType operator()();
};
We could include both uniform distributions mentioned above as members and choose between them based on the data type. However, we can instead modify the template slightly and split it into two specializations:
template <typename DataType, bool IsReal = std::is_floating_point<DataType>::value>
class UniformDistributionGenerator;
template <typename DataType>
class UniformDistributionGenerator<DataType, true> {
std::uniform_real_distribution<DataType> distribution;
};
template <typename DataType>
class UniformDistributionGenerator<DataType, false> {
std::uniform_int_distribution<DataType> distribution;
};
In this way, there is no overhead from redundant members. The performance is also better, since no if-else evaluation is needed when calling operator(). The difference may not be significant for a function this simple, but the same idea can be applied to more complex designs.
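A complete version of the two specializations might look like the sketch below. The constructor signature (low, high, seed) and the use of std::mt19937 are assumptions made so the example can run on its own; the library's actual members may differ.

```cpp
#include <random>
#include <type_traits>

template <typename DataType, bool IsReal = std::is_floating_point<DataType>::value>
class UniformDistributionGenerator;

// Floating-point types: samples from the half-open range [low, high).
template <typename DataType>
class UniformDistributionGenerator<DataType, true> {
    std::mt19937 engine_;
    std::uniform_real_distribution<DataType> distribution_;

public:
    UniformDistributionGenerator(DataType low, DataType high, unsigned seed)
        : engine_(seed), distribution_(low, high) {}
    DataType operator()() { return distribution_(engine_); }
};

// Integer types: samples from the closed range [low, high].
template <typename DataType>
class UniformDistributionGenerator<DataType, false> {
    std::mt19937 engine_;
    std::uniform_int_distribution<DataType> distribution_;

public:
    UniformDistributionGenerator(DataType low, DataType high, unsigned seed)
        : engine_(seed), distribution_(low, high) {}
    DataType operator()() { return distribution_(engine_); }
};
```

The caller writes UniformDistributionGenerator<double> or UniformDistributionGenerator<int> and the second template argument is filled in by the type trait, so the choice of distribution is made entirely at compile time.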
There are other design features that distinguish our library from others.
Many image processing programs use rand() to generate random numbers, but this method may cause a few issues; please see this Q&A. In contrast, our library uses <random> to obtain better-quality randomness. We use the current timestamp as the seed to create a generator, and then use std::uniform_real_distribution or std::uniform_int_distribution to output random numbers. This avoids the limited range and modulo bias commonly associated with rand().
This library implements some optimized algorithms to increase performance. One example is the Fast Gaussian Blur. A Gaussian Blur is expensive: its complexity is at least O(N * r), where N is the area of an image and r is the size of the filter. However, research has found that several successive Box Blurs can approximate the result of a Gaussian Blur, and the complexity of a Box Blur can be as low as O(N). Therefore, this library implements the Fast Gaussian Blur to increase performance. The details can be found in the Operation.h file.
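The box-blur idea can be illustrated in one dimension. This is a simplified sketch, not the code from the library: the function names, the use of three passes with the same radius, and the edge handling (repeating the edge sample) are all assumptions. The running sum is what makes each pass O(n) regardless of the radius.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One box-blur pass over a 1D signal. A running sum is updated as the window
// slides, so the cost does not depend on the radius. Samples beyond either
// end repeat the edge value.
std::vector<double> box_blur_1d(const std::vector<double>& input, int radius) {
    assert(radius >= 0);
    const int n = static_cast<int>(input.size());
    std::vector<double> output(input.size());
    if (n == 0) return output;

    auto at = [&](int i) {
        return input[static_cast<std::size_t>(std::max(0, std::min(i, n - 1)))];
    };

    const double window = 2.0 * radius + 1.0;
    double sum = 0.0;
    for (int i = -radius; i <= radius; ++i) sum += at(i);  // window centred on index 0

    for (int i = 0; i < n; ++i) {
        output[static_cast<std::size_t>(i)] = sum / window;
        sum += at(i + radius + 1) - at(i - radius);  // slide the window by one sample
    }
    return output;
}

// Three passes of the same box blur; repeated box filters approach a Gaussian shape.
std::vector<double> approx_gaussian_blur_1d(std::vector<double> signal, int radius) {
    for (int pass = 0; pass < 3; ++pass) signal = box_blur_1d(signal, radius);
    return signal;
}
```

For a 2D image the same pass would be applied along rows and then along columns. Blurring a single impulse with this sketch produces a bell-shaped profile that peaks at the original position and keeps the total intensity unchanged.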