Natural Image Segmentation via Lossy Compression


Contents

Abstract


We present a novel algorithm for unsupervised segmentation of natural images that harnesses the principle of minimum description length (MDL). Our method is based on observations that a homogeneously textured region of a natural image can be well modeled by a Gaussian distribution and the region boundary can be effectively coded by an adaptive chain code. The optimal segmentation of an image is the one that gives the shortest coding length for encoding all textures and boundaries in the image, and is obtained via an agglomerative clustering process applied to a hierarchy of decreasing window sizes. The optimal segmentation also provides an accurate estimate of the overall coding length and hence the true entropy of the image. We test our algorithm on two publicly available databases: Berkeley Segmentation Dataset and MSRC Object Recognition Database. It achieves state-of-the-art segmentation results compared to other popular methods. The source code for our algorithm and detailed results on both databases are available below for peer evaluation.

Adaptive Texture and Boundary Encoding


Constructing Texture Features

For each pixel, we apply a w × w cut-off window around the pixel across the three color channels, and stack the color values inside the window in a vector form. For ease of computation, we reduce the dimensionality of these features by projecting the set of all features X onto their first D principal components.


 

Adaptive Texture Encoding

The coding length of a region is uniquely determined by the mean and covariance (μ, Σ ) of the texture windows in the region. To estimate (μ, Σ ) empirically we need to exclude the windows that cross the boundary of R:

 

 

We now describe encoding these texture vectors based on the lossy MDL principle. For a region R containing N pixels, a good approximation for the optimal coding length of R up to distortion ε2 is:

 

Adaptive Boundary Encoding

In the Freeman chain code coding scheme, the orientation of an edge otis quantized along eight discrete directions. The difference chain code between two consecutive edges ot and ot+1 is Δot = mod(ot - ot+1, 8). Given the prior distribution Po] of difference chain codes in natural images, the optimal coding of the boundary B(R) is given by:

 


Segmentation by Minimizing Coding Length


The total coding length of an image I with respect to a segmentation with non-overlapping regions R1,...,Rk is:

The optimal segmentation of I minimizes this total coding length function, and can often be found through the following agglomerative process. Given an oversegmentation of the image, at each iteration, we find the pair of regions Ri and Rj that maximally decrease the total coding length if merged:

To model the spatial locality of textures, we further construct a region adjacency graph where each vertex vi corresponds to a region Ri and the edge eij is present if and only if regions Ri and Rj are adjacent in the image. To perform image segmentation, we simply apply a constrained version of the agglomerative procedure - only merging regions that are adjacent in the image.


Results on Public Image Databases


We report the results of our image segmentation with a fixed quantization parameter (epsilon=150) on all images in the Berkeley Segmentation Dataset and the MSRC Object Recognition Dataset. For each result, the top image is a segmentation image where each region is colored by its mean color (like the above image to the left), and the bottom image superimposes the segmentation lines on the original image (like the above image to the right). Please click on the links below to see results for each dataset.

The images for these results have been compressed to fit within the file limit. For the uncompressed originals for each dataset, please refer to the links below:

MATLAB Software


Texture and Boundary Encoding-based Segmentation 1.0

Installation

This texture segmentation algorithm uses the superpixel code of Mori et al., which includes some c files that need to be compiled. To do this open MATLAB and then run the following commands:
> cd superpixels/yu_imncut
> mex csparse.c
> mex ic.c
> mex imnb.c
> mex parmatV.c
> mex spmd_imncut.c

Testing

To test our texture segmentation algorithm, run the following command in MATLAB:
> test_texture_seg

This will run TBES on an example image from the Berkeley Segmentation Dataset, compute the probabilistic Rand index and variation of information segmentation metrics (in comparison with human results), and display the segmentation results. For information on computing the precision and recall curve, please refer to the Berkeley Segmentation Dataset.

Copyright

Authors: Shankar Rao, Hossein Mobahi, Allen Yang
Contact: Shankar R. Rao: srrao AT illinois DOT EDU


(c) Copyright. University of Illinois, University of California, 2009.

Notice: The packages should NOT be used for any commercial purposes without direct consent of the authors.

Publications



Our Work in the News



References


  1. Segmentation of Multivariate Mixed Data via Lossy Coding and Compression. Yi Ma, Harm Derksen, Wei Hong and John Wright. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007.
  2. Recovering human body configurations: combining segmentation and recognition. Greg Mori, Xiaofeng Ren, Alexei A. Efros and Jitendra Malik. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2004.
  3. A Database of Human Segmented Natural Images and its Application to Evaluating Segmentation Algorithms and Measuring Ecological Statistics. David Martin, Charless Fowlkes, Doron Tal and Jitendra Malik. Proceedings of the International Conference on Computer Vision, 2001.
  4. Texton-boost: Joint appearance, shape and context modeling for multi-class object recognition and segmentation. Jamie Shotton, John Winn, Carsten Rother and Antonio Crimi. Proceedings of the European Conference on Computer Vision, 2006.

Comments? Questions? Contact Shankar Rao at: srrao AT illinois DOT EDU