Given a single image of an arbitrary road, which may not be well paved or have clearly delineated edges or texture distribution, is it possible for a computer to find the road? This paper addresses the question by investigating several techniques and gives basic results on their effectiveness.

Color spaces

Images are mainly stored in one of the various RGB spaces, which all share the same number of dimensions: three, namely red, green and blue. In theory each component lies in the range 0-1, but in computing the color depth is expressed as a power of 2. The color depth gives the number of color variations: 1-bit color has 2^1 = 2 colors (monochrome), 2-bit color has 2^2 = 4 colors, and so on. 24-bit color depth (so-called True color) has 2^24 = 16,777,216 color variations and is currently the most widely used representation.
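
The arithmetic above can be checked with a few lines of MATLAB. This is only a small sketch; the variable names and the example pixel are chosen for illustration and are not part of the original work.

```matlab
% Number of representable colors for a few color depths (bits per pixel).
bits      = [1 2 8 24];
numColors = 2 .^ bits;            % [2 4 256 16777216]

% A 24-bit pixel stores 8 bits per channel, so each channel is an integer 0..255.
% Dividing by 255 maps it back to the theoretical 0-1 range.
pixel8bit = uint8([50 100 120]);  % hypothetical example pixel
pixelUnit = double(pixel8bit) / 255;
```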

RGB is an additive color model, so mixing all three colors at full intensity gives white. This can seem unusual, because in everyday life (mixing paints, for example) you encounter subtractive color mixing. The color space can be pictured as a function of three variables, which spans a 3D cube with three axes: red, green and blue, as shown in Fig.1.
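
Additive mixing can be illustrated with colors written as points of the RGB cube. The sketch below assumes components in the 0-1 range; the variable names are illustrative only.

```matlab
% Corners of the RGB cube that lie on the three axes.
red   = [1 0 0];
green = [0 1 0];
blue  = [0 0 1];

% Additive mixing is component-wise addition.
yellow = red + green;          % [1 1 0]
white  = red + green + blue;   % [1 1 1], the corner opposite to black [0 0 0]
```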

Fig.1 - RGB cube

Image processing in MATLAB

In MATLAB an image is represented as a matrix with y rows and x columns, where x is the width of the image and y is the height. If the image is in RGB color space, it is a 3D array (y-by-x-by-3) in which the three color channels are stored separately. If you work with grayscale images, it is a 2D matrix. Often, the most convenient method for expressing locations in an image is to use pixel indices. The image is treated as a grid of discrete elements, ordered from top to bottom and left to right, as illustrated by Fig.2.

Fig.2 - pixel indices
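% load image ('road.jpg' is a placeholder file name)
I = imread('road.jpg');
% height, width and size of the color dimension (3 for RGB)
[height, width, depth] = size(I);
% color of the pixel in position (x,y): x is the column, y is the row
x = 1; y = 1;
color = impixel(I, x, y);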
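
A pixel can also be read by indexing the image array directly, without any toolbox function. The sketch below uses a small synthetic image so that it runs without an image file; the image contents are made up for illustration.

```matlab
% Small synthetic 2-by-3 RGB image (2 rows high, 3 columns wide).
I = zeros(2, 3, 3, 'uint8');
I(2, 3, :) = [50 100 120];             % row 2 (y), column 3 (x)

[height, width, channels] = size(I);   % 2, 3, 3

% Note the index order: row (y) first, then column (x).
x = 3; y = 2;
rgb = squeeze(I(y, x, :))';            % 1-by-3 vector [R G B]
```

The index order is the main thing to watch: array indexing takes (row, column), while coordinate-style calls usually take (x, y).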

Color-based detection

Color-based detection is based on the mean of the three color components. For example, the color [100 100 100] is represented as 100, while the color [50 100 120] becomes (50+100+120)/3 = 90. This type of detection gives an uncertain result, because you know the exact color of neither the road nor the surroundings. The road can therefore be seen on the contour map in Fig. 5, but its edge is harder to detect. Using a 5x5 pixel average leads to the same result. This type of detection is better suited to boundary detection.
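
The mean map described above can be sketched as follows. The two pixels are the example colors from the text; the flat test map used for the 5x5 average is an assumption made only to keep the example small.

```matlab
% Synthetic 1-by-2 RGB image holding the two colors from the text.
I = zeros(1, 2, 3, 'uint8');
I(1, 1, :) = [100 100 100];
I(1, 2, :) = [50 100 120];

% Convert to double first so the sum does not saturate at 255 (uint8 arithmetic).
meanMap = mean(double(I), 3);          % one value per pixel: [100 90]

% Optional 5x5 pixel average of a mean map (here a flat 7-by-7 map of value 100).
flatMap   = 100 * ones(7);
smoothMap = conv2(flatMap, ones(5) / 25, 'valid');   % 3-by-3 result
```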

Fig.3 - Color-based detection
Fig.4 - Color 3d map
Fig.5 - Color contour map

Color difference-based detection

Color difference-based detection uses the RGB distance between two points. The distance is calculated as follows: RGBdistance = sqrt((r2-r1)^2+(g2-g1)^2+(b2-b1)^2), that is, in the same way as the distance between two points in 3D space. This method can be used for detection, but it is harder to find the boundary edges of the road, because the surroundings on the left and right sides of the road can differ, so the two edges produce different distance values. The contour map in Fig. 8 shows the result, which is better than that of color-based detection. The 3D map in Fig. 7 shows high spikes at the transition between the road and its surroundings. The surface is not continuous because I used a step size larger than 1 in both the x and y directions. With a step size of 1 the calculation takes much longer (I used a 20-pixel step size in the x direction, and 22 rows in the y direction with variable step sizes. The image size was 700px-by-260px. The calculation time was ~0.5s, versus ~1.5s for a 1px step size).
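
A minimal sketch of the distance calculation, first for two single colors and then along one image row sampled with a step size larger than 1. The pixel values and the step size of 2 are made up for illustration and are not the values used for the figures.

```matlab
% Two example colors, as doubles to avoid uint8 saturation.
c1 = double([100 100 100]);
c2 = double([50 100 120]);
rgbDistance = sqrt(sum((c2 - c1) .^ 2));      % sqrt(50^2 + 0^2 + 20^2)

% One synthetic image row (1-by-5-by-3): three dark pixels, then two brighter ones.
row  = double(cat(3, [90 92 91 200 205], [90 91 93 180 182], [90 90 92 60 61]));
step = 2;
samples = row(1, 1:step:end, :);              % columns 1, 3 and 5

% Distance between horizontally neighboring samples.
d = sqrt(sum(diff(samples, 1, 2) .^ 2, 3));   % 1-by-2 vector
```

In this row the second distance is far larger than the first, which corresponds to the kind of spike visible at a transition between two differently colored regions.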

Fig.6 - Color difference-based detection
Fig.7 - Color difference 3d map
Fig.8 - Color difference contour map

Neighborhood processing

Neighborhood processing is a simple yet very effective method. It uses an n-by-n (or n-by-m) kernel matrix and counts the pixels around the given pixel that have the same color. That count becomes the value of the given pixel. It is possible to use a pattern in the kernel: set to zero every matrix element where you don't want to search. The maximum value in the 3D plot is the number of elements in the kernel. I used horizontal tracking, but vertical or diagonal tracking is also possible.
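
One way to sketch this counting is a 2D convolution of a "same color" mask with the kernel. The fixed search color, the tolerance of 15 on the RGB distance, the 1-by-3 horizontal kernel and the synthetic image are all assumptions made for this example; the original implementation may differ.

```matlab
% Synthetic 5-by-9 RGB image: a bright "road" band in columns 4..6.
img = 40 * ones(5, 9, 3);
img(:, 4:6, :) = 120;

searchColor = [120 120 120];   % assumed fixed search color
tol = 15;                      % assumed tolerance on the RGB distance

% 1 where the pixel matches the search color, 0 elsewhere.
diffs = img - repmat(reshape(searchColor, 1, 1, 3), size(img, 1), size(img, 2));
match = sqrt(sum(diffs .^ 2, 3)) <= tol;

% Horizontal 1-by-3 kernel; zero entries would exclude positions from the search.
kernel = ones(1, 3);
counts = conv2(double(match), kernel, 'same');   % each row: [0 0 1 2 3 2 1 0 0]
```

Note that conv2 flips the kernel, which makes no difference for a symmetric kernel like this one but matters for an asymmetric pattern.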

Fig.9 - Neighborhood processing method
Fig.10 - Neighborhood processing result
Fig.11 - Neighborhood processing 3d map
Fig.12 - Neighborhood processing contour map

Adaptive neighborhood processing with centerline detection

The difference between adaptive and nonadaptive neighborhood processing is that the search color changes during the process. The initial condition is that the bottom-center pixel has the "road" color. Then you go from bottom to top, and every row can have a different search color: the center pixel of the current row becomes the new search color if the RGB distance between it and the previous search color is less than a limit you set (I used 15). Edge detection is then quite simple: you only have to find the first "high" value. I used 80% of the maximum value as the limit.
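
The row-by-row procedure can be sketched as below. The limit of 15 and the 80% threshold come from the text; the synthetic image, the 1-by-3 horizontal kernel, reusing the same limit for the color match, and taking the first and last "high" positions as the left and right edges are assumptions of this example.

```matlab
% Synthetic 4-by-11 RGB image: road in columns 4..8, slightly darker towards the top.
img = 40 * ones(4, 11, 3);
roadLevel = [105 110 115 120];            % rows 1 (top) .. 4 (bottom)
for r = 1:4
    img(r, 4:8, :) = roadLevel(r);
end

limit  = 15;               % RGB distance limit from the text
kernel = ones(1, 3);       % assumed horizontal kernel
[h, w, ~] = size(img);
center = ceil(w / 2);
searchColor = squeeze(img(h, center, :))';   % bottom-center pixel is "road"
leftEdge  = zeros(h, 1);
rightEdge = zeros(h, 1);

for r = h:-1:1                            % from bottom to top
    candidate = squeeze(img(r, center, :))';
    if norm(candidate - searchColor) < limit
        searchColor = candidate;          % adapt the search color to this row
    end
    rowColors = reshape(img(r, :, :), w, 3);
    dist   = sqrt(sum((rowColors - repmat(searchColor, w, 1)) .^ 2, 2))';
    counts = conv2(double(dist < limit), kernel, 'same');
    high   = find(counts >= 0.8 * max(counts));   % 80% of the maximum
    leftEdge(r)  = high(1);
    rightEdge(r) = high(end);
end
```

In this synthetic image the top row differs from the bottom-center color by more than the limit, so a fixed search color would miss it; updating the color row by row keeps the road matched all the way up.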

Centerline detection is the opposite of road detection: we search for a "low" value between the edges of the road. It is possible that no "low" value is found in a row. In that case we have to estimate it from the surrounding points. The last step is polyline fitting.
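
The three steps can be sketched as follows. All numbers are hypothetical, the 20% threshold for a "low" value is an assumption, and linear interpolation with a first-degree polyfit are only stand-ins for the guessing and fitting steps, which the text does not specify in detail.

```matlab
% Step 1: search for a "low" value between the road edges in one row.
counts = [0 1 3 3 3 1 0 1 3 3 3 1 0];   % hypothetical row with a dip at column 7
left = 3; right = 11;                   % hypothetical edge columns
lowLimit = 0.2 * max(counts);           % assumed threshold for "low"
low = find(counts(left:right) <= lowLimit) + left - 1;
if isempty(low)
    centerCol = NaN;                    % nothing found in this row
else
    centerCol = mean(low);
end

% Step 2: guess missing points from the surrounding rows.
rows       = (1:6)';
centerCols = [50; 52; NaN; 56; NaN; 60];   % hypothetical per-row results
known  = ~isnan(centerCols);
filled = centerCols;
filled(~known) = interp1(rows(known), centerCols(known), rows(~known), 'linear', 'extrap');

% Step 3: fit a line, column = p(1)*row + p(2), through the points.
p = polyfit(rows, filled, 1);
fitted = polyval(p, rows);
```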

Fig.13 - Adaptive neighborhood processing result
Fig.14 - Adaptive neighborhood processing 3d map
Fig.15 - Adaptive neighborhood processing contour map