Image segmentation is one of the most important steps in finding and recognizing objects. Segmentation divides an image into non-overlapping regions, each of which is meaningful in terms of a property such as color, intensity, or texture. Equivalently, segmentation assigns a label to each pixel in the image: pixels with similar characteristics receive the same label and belong to the same region. Each region produced by segmentation differs in its characteristics from its neighboring regions.
A superpixel is a group of pixels that share common characteristics, and it carries more information than a single pixel. Like the regions obtained by segmentation, superpixels form meaningful fragments of the image, but they do not capture whole objects. Superpixels are the result of oversegmenting an image.
Region merge is used to merge small, homogeneous regions. Homogeneity is calculated for each fragmented region, and adjacent regions are joined if they are homogeneous enough. This process continues until no adjacent regions are similar enough to be merged. Here, the region merge algorithm was used to eliminate the oversegmentation caused by the SLIC algorithm, which divides images into superpixels.
An evaluation metric is required to measure the success of the algorithms used. Unlike in classification problems, segment labels cannot be compared directly with ground-truth labels. The Probabilistic Rand Index, which emphasizes cluster similarity, was therefore used to measure clustering success. To make the results comparable, a common set of images should be used. The human-segmented images in the Berkeley Segmentation Dataset 500 were used to calculate success.
The Berkeley Segmentation Dataset 500 is the dataset used to evaluate the SLIC - region merge algorithm. It consists of 500 images, each with 5 different ground-truth segmentations produced by people. The dataset is divided into a training set of 200 images, a validation set of 100 images, and a test set of 200 images. Since no training was performed, only the test images were used. The dataset contains two image sizes, 481 x 321 and 321 x 481.
SLIC is an algorithm that clusters pixels to generate compact, uniform superpixels. It clusters pixels in a 5-D space defined by the RGB color values and the x and y coordinates of the pixels. Superpixels are thus formed from pixels that are similar in color and close to each other in the image.
The number of superpixels K is given as an input parameter before the SLIC algorithm is executed. The approximate area of a superpixel is the total number of pixels in the image (N) divided by the number of superpixels, N / K. The edge length of a superpixel, S = √(N / K), was calculated to place the superpixel centers. The centers were placed on a regular grid, starting at S / 2 and advancing in steps of S. Each center C = [r, g, b, x, y] was then moved to the position with the lowest gradient in the 3 x 3 neighborhood of its grid point. This prevents edge pixels and noisy pixels from being chosen as centers.
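The initialization described above can be sketched as follows. This is a minimal illustration rather than the implementation used in this work: the function names are hypothetical, the image is assumed to be a grayscale 2-D list, and the gradient is assumed to be the sum of squared horizontal and vertical differences, which the text does not specify.

```python
import math

def slic_grid_centers(height, width, k):
    """Return the grid step S and the initial (row, col) centers for about k superpixels."""
    n = height * width
    step = math.sqrt(n / k)          # S = sqrt(N / K)
    centers = []
    y = step / 2                     # first center lies S / 2 from the border
    while y < height:
        x = step / 2
        while x < width:
            centers.append((int(y), int(x)))
            x += step                # advance in steps of S
        y += step
    return step, centers

def gradient(gray, r, c):
    """Assumed gradient measure: squared vertical plus squared horizontal difference."""
    h, w = len(gray), len(gray[0])
    up, down = gray[max(r - 1, 0)][c], gray[min(r + 1, h - 1)][c]
    left, right = gray[r][max(c - 1, 0)], gray[r][min(c + 1, w - 1)]
    return (down - up) ** 2 + (right - left) ** 2

def refine_center(gray, r, c):
    """Move a center to the lowest-gradient position in its 3 x 3 neighborhood."""
    h, w = len(gray), len(gray[0])
    candidates = [(rr, cc)
                  for rr in range(max(r - 1, 0), min(r + 2, h))
                  for cc in range(max(c - 1, 0), min(c + 2, w))]
    return min(candidates, key=lambda p: gradient(gray, p[0], p[1]))
```

For a 20 x 30 image and K = 6, the step is 10 and the grid holds six centers, from (5, 5) to (15, 25). A center that falls on an intensity edge is shifted to a neighboring flat position by refine_center.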
The distance between a cluster center and a pixel is not a plain Euclidean distance in the 5-D space. First, the color distance was found as the square root of the sum of squared differences between the color values of the cluster center and those of the pixel examined. Second, the coordinate distance was calculated in the same way, as the square root of the sum of squared differences between the coordinates of the cluster center and those of the pixel examined. In addition, a coefficient based on the compactness value of the superpixel (m) is multiplied by the coordinate distance. Finally, the color distance and the weighted coordinate distance were added to obtain the total distance.
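A minimal sketch of this distance is given below. It assumes that the color and coordinate terms are each Euclidean distances over their own components, and that the coordinate term is weighted by m / S as in the standard SLIC formulation, where S is the grid step. The function name and the argument layout are hypothetical.

```python
import math

def slic_distance(center, pixel, m, step):
    """Distance between a cluster center and a pixel, both given as (r, g, b, x, y)."""
    # Color distance over the first three components (r, g, b).
    d_color = math.sqrt(sum((a - b) ** 2 for a, b in zip(center[:3], pixel[:3])))
    # Coordinate distance over the last two components (x, y).
    d_xy = math.sqrt(sum((a - b) ** 2 for a, b in zip(center[3:], pixel[3:])))
    # Assumed weighting: m / S scales the coordinate term before the two are added.
    return d_color + (m / step) * d_xy
```

With center (10, 20, 30, 0, 0), pixel (13, 24, 30, 3, 4), m = 10 and S = 5, both partial distances are 5, so the total is 5 + 2 * 5 = 15. Increasing m increases the penalty for spatially distant pixels.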
Region merge is an algorithm used to merge regions with similar properties. The small, homogeneous regions generated by the SLIC algorithm are combined using the region merge algorithm. Two important questions in region merge are which criterion decides whether regions are merged and where merging should start.
Region merge was performed using two criteria: the difference in average grayscale pixel value and the difference between histograms. A threshold was determined for each method, and two regions were merged or kept separate according to that threshold. Merging began at the beginning of the image and proceeded to the adjacent regions, and then to the neighbors of those regions. For visualization, each merged region was displayed with the average of its pixel values.
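One way to realize this traversal is a breadth-first pass over the region adjacency graph, sketched below for the mean grayscale criterion. The sketch makes several assumptions that the text does not confirm: regions are visited in raster order of first appearance, adjacency is 4-connected, and each neighbor is compared with the original mean of the region it was reached from, not with the running mean of the merged region. The function name is hypothetical.

```python
from collections import deque

def merge_regions(labels, gray, threshold):
    """Merge adjacent regions whose mean gray values differ by less than threshold.

    labels and gray are 2-D lists of equal shape; returns a new 2-D label map.
    """
    h, w = len(labels), len(labels[0])
    total, count, adj = {}, {}, {}
    for r in range(h):
        for c in range(w):
            a = labels[r][c]
            total[a] = total.get(a, 0) + gray[r][c]
            count[a] = count.get(a, 0) + 1
            adj.setdefault(a, set())
            # Record 4-connected adjacency by looking down and right.
            for rr, cc in ((r + 1, c), (r, c + 1)):
                if rr < h and cc < w and labels[rr][cc] != a:
                    b = labels[rr][cc]
                    adj[a].add(b)
                    adj.setdefault(b, set()).add(a)
    mean = {a: total[a] / count[a] for a in total}

    merged = {}                       # original label -> label of its merged region
    for seed in mean:                 # raster order of first appearance
        if seed in merged:
            continue
        merged[seed] = seed
        queue = deque([seed])
        while queue:
            cur = queue.popleft()
            for nb in sorted(adj[cur]):
                if nb not in merged and abs(mean[nb] - mean[cur]) < threshold:
                    merged[nb] = seed
                    queue.append(nb)  # continue with the neighbors of this neighbor
    return [[merged[v] for v in row] for row in labels]
```

Because neighbors are compared pairwise along the traversal, a chain of gradually changing regions can end up in one merged region even if its two ends differ by more than the threshold. Comparing against the running mean of the merged region would avoid this, at the cost of order dependence.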
The Rand Index (RI) was used for clustering evaluation. It measures the similarity between two clusterings by checking, for every pair of elements, whether the two clusterings agree on placing the pair together or apart. Its value ranges from 0, when the two segmentations have no similarity, to 1, when the segmentations are identical. Suppose there are two segmentations S1 and S2 of an image I that has n pixels.
- a, the number of pixel pairs that belong to the same object in both S1 and S2
- b, the number of pixel pairs that belong to different objects in both S1 and S2
The Rand Index is the fraction of agreeing pairs among all pixel pairs, RI = (a + b) / (n(n - 1) / 2). The Probabilistic Rand Index (PRI) is the average of the RI values between the segmentation of an image and each of its ground-truth segmentations produced by different people.
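The pair counting behind the RI and its averaging into the PRI can be illustrated with a small sketch. It enumerates every pixel pair directly, which is only practical for tiny inputs; for full-size images a contingency-table formulation would be needed. The function names are hypothetical, and segmentations are assumed to be flattened label sequences of equal length.

```python
from itertools import combinations

def rand_index(seg1, seg2):
    """Fraction of element pairs on which two labelings agree (together or apart)."""
    agree = 0
    pairs = 0
    for i, j in combinations(range(len(seg1)), 2):
        same1 = seg1[i] == seg1[j]
        same2 = seg2[i] == seg2[j]
        if same1 == same2:           # counts both a (same/same) and b (different/different)
            agree += 1
        pairs += 1
    return agree / pairs

def probabilistic_rand_index(seg, ground_truths):
    """Average RI between one segmentation and several ground-truth labelings."""
    return sum(rand_index(seg, gt) for gt in ground_truths) / len(ground_truths)
```

For the labeling [0, 0, 1, 1], the ground truth [5, 5, 7, 7] gives RI = 1.0, because label names do not matter, and the ground truth [0, 1, 1, 1] gives RI = 0.5. The PRI over these two ground truths is 0.75.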
Various hyperparameters were used during the superpixel separation and merge processes. They were determined according to the success achieved on a small subset of the test set. The number of superpixels generated by the SLIC algorithm was set to 9000; the superpixel count was kept as large as possible. As the number of superpixels increases, the number of pixels in each superpixel decreases, which allows small areas in the image to be represented.
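To give a sense of scale, the following short calculation applies N / K and S = √(N / K) to a 481 x 321 image with K = 9000. The helper name is hypothetical, and the figures are approximate because the actual superpixels vary in size.

```python
import math

def superpixel_size(width, height, k):
    """Return the pixel count N, the approximate superpixel area N / K, and the edge length S."""
    n = width * height
    area = n / k
    step = math.sqrt(area)
    return n, area, step

n, area, step = superpixel_size(481, 321, 9000)
print(n, round(area, 2), round(step, 2))   # 154401 17.16 4.14
```

Each superpixel therefore starts with roughly 17 pixels, about a 4 x 4 patch.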
In the SLIC algorithm, each pixel is assigned to a cluster, and the decisive criterion for the assignment is the distance function. The distance is the sum of the color distance and the coordinate distance. The coordinate distance is multiplied by a coefficient (m / initial superpixel area) that controls its weight relative to the color distance in the total distance. When m is large, spatial proximity is more important and the resulting superpixels are more compact. When m is small, the resulting superpixels adhere more tightly to image boundaries but have less regular size and shape. The value m = 10, which is widely used, was chosen.
Two different methods were used for the region merge process. In the first method, a histogram was extracted for each superpixel region and compared with the histograms of neighboring superpixels. If the difference between two histograms was less than 0.5, the superpixel regions were combined. RGB was used as the color space, so each channel can take a value between 0 and 255. A histogram with 256 bins could be used, but because the superpixel area is small, many bins would remain empty; the number of bins was therefore chosen as 8. In the second method, the average grayscale value of the pixels in each superpixel was calculated. If the difference between the mean grayscale values of two superpixels was less than 20, the regions were combined.
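The histogram criterion can be sketched as follows. The text does not state which histogram comparison measure was used, so this sketch assumes normalized per-channel histograms with 8 bins each, concatenated, and half the L1 difference as the distance, which lies between 0 and 1. The function names are hypothetical.

```python
def rgb_histogram(pixels, bins=8):
    """Normalized, concatenated per-channel histogram of (r, g, b) values in 0..255."""
    bin_width = 256 // bins          # 32 intensity levels per bin when bins = 8
    hist = [0.0] * (3 * bins)
    n = 0
    for px in pixels:
        for ch, value in enumerate(px):
            idx = min(value // bin_width, bins - 1)
            hist[ch * bins + idx] += 1
        n += 1
    if n == 0:
        raise ValueError("region has no pixels")
    return [v / (3 * n) for v in hist]   # entries sum to 1

def histogram_distance(h1, h2):
    """Assumed measure: half the L1 difference, 0 for identical and 1 for disjoint histograms."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))

def should_merge(pixels_a, pixels_b, threshold=0.5):
    """Merge two regions when their histogram distance is below the threshold."""
    return histogram_distance(rgb_histogram(pixels_a), rgb_histogram(pixels_b)) < threshold
```

Under these assumptions, a dark region and a bright region have distance 1.0 and stay separate, while two regions that differ in only one channel have distance 1/3 and are merged at the threshold of 0.5.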
I worked on image segmentation with SLIC and region merge. Images are divided into superpixels using the SLIC algorithm. The number of superpixels, which is set at the beginning, determines the area of the superpixels created, and a large number of superpixels means oversegmentation. The region merge algorithm merges superpixels with similar properties, which removes unnecessarily separated superpixels and eliminates oversegmentation.
To improve the performance of the segmentation method, future work will try different color spaces and establish new merging criteria for deciding whether superpixels are merged. In addition, hyperparameter tuning will be used to search for a globally optimal result. Furthermore, success will be compared using other evaluation criteria, such as variation of information and segmentation covering.