Non-photorealistic rendering

By Yan Li (yanli)


Introduction

     The goal of my project is to generate non-photorealistic images. The field of NPR is well developed, and there are many NPR styles, such as cartoon-like images, bitonal images and other stylized images. In my final project I have implemented two different styles of NPR image: cartoon-like images, and gray-scale images with different types of texture mapped according to color. The two techniques are implemented separately, so my project is divided into two parts.

 

Cartoon-like Images

     The first part of my project is the implementation of cartoon-like images. A cartoon-like image is a simplified illustration of the original image: it has no fine textures or rich edges, and instead uses simple color regions and bold lines along the boundaries between objects. I divide my work on this style of NPR into two parts: the first is color quantization, which aims to reduce the color complexity of the original image; the second is to add edges, making the contrast between different color regions more apparent. We use such a style to abstract the original image, turning it into a non-photorealistic image. I follow the approach of Holger Winnemöller et al., and the results turn out well.

 

Results:


Steps:

1.   Convert the image from RGB space to CIE LAB space.

2.   Use a fast bilateral filter to successively filter the image.

3.   Do color quantization.

4.   Use DoG (difference of Gaussians) to detect edges.

5.   Warp the edges, making them sharper.

6.   Overlay the edges on the color image.

Step1: Space conversion

     We first need to convert the image from RGB space to LAB space. LAB space is designed to approximate human vision, so I can easily separate the luminance channel (L) from the chrominance channels (a, b). In my project, color quantization is done only on the luminance channel, while the fast bilateral filter is applied to all three channels (L, a, b).
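A minimal sketch of this conversion, assuming a scikit-image setup (the project's own conversion code may differ):

```python
from skimage import io, color

# Load an RGB image and convert it to CIE LAB.
rgb = io.imread("input.png")[..., :3] / 255.0   # drop an alpha channel if present
lab = color.rgb2lab(rgb)

# L is the luminance channel (roughly 0..100); a and b carry the chrominance.
L, a, b = lab[..., 0], lab[..., 1], lab[..., 2]
```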

 

Step2: Fast bilateral filter

     The second step is to use the fast bilateral filter to iteratively filter the image (in my project, three times). The reason for using a bilateral filter is simple: we want to blur away local contrast while preserving the contrast across strong edges; a regular Gaussian filter would destroy the edges, and the edges are essential because edge detection will be run on the blurred image. On the other hand, filtering only once does not make the image blurry enough: since color quantization follows, local contrast must be reduced sufficiently, so the filter is applied iteratively. A naïve bilateral filter costs too much time, so I use the fast bilateral filter I implemented in project 5. Here are the original image and the images after blurring once, twice and three times.

 


Note that the edges are preserved very well. After being filtered three or more times the image has become almost textureless, which is ideal for color quantization.
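The fast bilateral filter itself comes from project 5 and is not reproduced here; as a rough stand-in sketch of the iterated filtering, OpenCV's built-in bilateral filter can be applied repeatedly (the parameter values are illustrative assumptions):

```python
import cv2

def iterated_bilateral(img, iterations=3, d=9, sigma_color=30, sigma_space=7):
    """Apply a bilateral filter several times: local contrast is flattened
    while strong edges survive."""
    out = img
    for _ in range(iterations):
        out = cv2.bilateralFilter(out, d, sigma_color, sigma_space)
    return out
```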

 

Step3: Color quantization

The third step is color quantization. Regular color quantization divides the full range of a channel into several bins, assigns each pixel to its bin, and then sets the pixel's value in that channel to the value of the bin. This method is problematic because the bin boundaries are hard, which can leave sharp boundaries in the processed image. Instead of that approach, I add a tanh() function and a free parameter to make the quantization smoother. I use the following formula to quantize the luminance channel of the blurred image obtained in the step above.
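As a reconstruction of this formula, following the soft quantization described by Winnemöller et al. (the symbol names are taken from their paper and are my assumption for the exact form used here):

$$ Q(x) \;=\; q_{\text{nearest}} \;+\; \frac{\Delta q}{2}\,\tanh\!\big(\varphi_q\,(L(x) - q_{\text{nearest}})\big) $$

where $L(x)$ is the filtered luminance at pixel $x$, $q_{\text{nearest}}$ is the quantization level nearest to $L(x)$, $\Delta q$ is the width of a bin, and $\varphi_q$ is the free parameter that controls how soft the transition between bins is.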

 

The free parameter in the tanh() lets me adjust how soft the quantization boundaries are. Here is a comparison between the regular approach to color quantization and my approach.

Diagrams from Holger Winnemöller's slides

       A larger value of this parameter makes the boundaries sharper. The image below on the left is generated with a smaller value, and the one on the right with a larger value.

 


Step4: DoG

There are many operators that can be used to detect edges, such as Canny, Sobel and DoG. In my project I use the DoG (difference of Gaussians) operator: it is computationally efficient and, unlike Canny, not prone to producing disconnected edges. Instead of the regular DoG function, I use an extended version with several parameters; by adjusting them I can easily control the thickness of the edges and how much noise is picked up.

 

My DoG operator uses an extended (smoothed) difference-of-Gaussians formulation, built from two Gaussian-filtered versions of the luminance image at different scales.
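As a reconstruction of that formulation, again following Winnemöller et al. (the parameter names come from their paper and are my assumption for what was used here):

$$ D(x) \;=\; \begin{cases} 1 & \text{if } S_{\sigma_e}(x) - \tau\, S_{\sigma_r}(x) > 0,\\ 1 + \tanh\!\big(\varphi_e\,\big(S_{\sigma_e}(x) - \tau\, S_{\sigma_r}(x)\big)\big) & \text{otherwise,} \end{cases} $$

where $S_\sigma$ is the luminance image after Gaussian filtering with standard deviation $\sigma$ (typically $\sigma_r \approx 1.6\,\sigma_e$), $\tau$ controls how much weak contrast and noise is picked up, and $\varphi_e$ controls the sharpness of the edge falloff.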

 

Here is the image after applying DoG.

 


Step5: Warping edge (optional)

Indeed, having finished the four steps above, it is already enough to overlay the edge image on the color image. Here I additionally apply an image-based warp to make the edges less blurry and sharper. Image-based warping moves each pixel along the direction of its gradient; if the gradient is zero, the pixel is not moved. We can think of image-based warping as a displacement map. The method can be used either to sharpen edges or to attenuate them.

 

The approach is simple. First, extract the gradient of each pixel with horizontal and vertical Sobel operators. Then blur this gradient map to attenuate its range. Finally, move each pixel along the direction of its gradient.
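A sketch of this warp, assuming an OpenCV/NumPy setup; the strength and blur parameters are illustrative, not the project's values:

```python
import numpy as np
import cv2

def warp_edges(edge_img, strength=1.5, blur_ksize=7):
    """Sharpen a single-channel edge map by re-sampling each pixel along its gradient."""
    img = edge_img.astype(np.float32)
    # Gradients from horizontal and vertical Sobel operators.
    gx = cv2.Sobel(img, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(img, cv2.CV_32F, 0, 1, ksize=3)
    # Blur the gradient map to attenuate its range.
    gx = cv2.GaussianBlur(gx, (blur_ksize, blur_ksize), 0)
    gy = cv2.GaussianBlur(gy, (blur_ksize, blur_ksize), 0)
    # Build a displacement map that moves each pixel along its gradient direction.
    h, w = img.shape
    xs, ys = np.meshgrid(np.arange(w, dtype=np.float32),
                         np.arange(h, dtype=np.float32))
    return cv2.remap(img, xs + strength * gx, ys + strength * gy,
                     interpolation=cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_REPLICATE)
```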

 

Here is a comparison between the original edges and the edges after warping; the warped result is the rightmost one.

 


 

Here is another example from Holger Winnemöller's slides:


 

Step6: Overlay

In the last step, we simply overlay the edges on the color image.
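A minimal sketch of this overlay, assuming the edge map has been scaled to [0, 1] with 0 marking edge pixels:

```python
import numpy as np

def overlay_edges(quantized, edges):
    """Darken the quantized color image wherever the DoG edge map marks an edge.

    quantized: uint8 color image (H x W x 3).
    edges:     float edge map in [0, 1]; 0 = edge, 1 = no edge.
    """
    return (quantized.astype(np.float32) * edges[..., None]).astype(np.uint8)
```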

 

Failure cases

There are no 'absolute' failure cases. Due to the limitations of DoG, the results often suffer from noise. I cannot use erosion to remove the noise, since that would also destroy the genuine edges. A better edge detector would, I believe, give better results, but I have not found one; I tested Canny, and the results turned out to be much worse.

 

 

 

Summary

Using the method above, cartoon-like images can be generated nicely, and I believe this method could help artists working in the cartoon industry to some degree: certain scenes could be produced by running this automatic algorithm to turn ordinary images into cartoon-like ones.

 

As for the algorithm itself, I think many parts can still be improved. The color quantization gives good results in my project, while the edge detection often suffers from noise or loses important edges. The simple DoG operator cannot extract the important edges automatically: if I tune its parameters, for example to make the edges sharper and thicker, the noise becomes apparent. An ideal edge detector would preserve the important edges even when they are weaker than unimportant edges or noise.

 

 

Color-based Texture Mapping

The second part of my project generates a gray-scale image with different textures mapped to different regions. The approach is entirely different from the one described above. In order to map textures, we must first have a texture library; I generated the textures manually with the commercial software 'Manga Studio'. I follow the approach of Yingge Qu et al. The basic algorithm can be divided into several parts: segmentation, k-means clustering, and texture feature matching.

 

Results:


Steps:

1.   Build a texture library

2.   Build texture features for each style of texture

3.   Do segmentation

4.   Do kmeans to cluster colors

5.   Texture mapping

 

Step1: Build a texture library

     The first step is to build a texture library; we will select textures from this library to map onto the image. We need several different types of textures. In my project, I generated some common textures using the software 'Manga Studio'. The textures are all gray-scale images. In addition, we must generate each style at several densities, because luminance is taken into account when selecting an appropriate texture: a darker region should receive a denser texture, and a brighter region a sparser one.

 

     The library is therefore two-dimensional: along the horizontal axis one style appears at different luminance levels (densities), and along the vertical axis the style itself changes.
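An illustrative layout of such a library (the style names and file names below are made up for the example):

```python
# One row per texture style; columns go from densest (for dark regions)
# to sparsest (for bright regions).
texture_library = {
    "hatching":   ["hatch_dense.png", "hatch_mid.png", "hatch_light.png"],
    "dots":       ["dots_dense.png",  "dots_mid.png",  "dots_light.png"],
    "crosshatch": ["cross_dense.png", "cross_mid.png", "cross_light.png"],
}
```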

 


 

Step2: Build features for texture

     Segments exhibiting clear texture characteristics should be assigned a texture based on texture similarity. To quantify the texture characteristics, we compute texture features using Gabor wavelets, a well-established technique for texture identification. In my project I use Gabor wavelets at 8 orientations and 3 scales, and for each texture I build a 48-dimensional feature vector: each time the image is filtered with a Gabor of a given orientation and scale, I record the mean and standard deviation of the response. With 8 × 3 = 24 Gabor filters, the feature vector has 48 dimensions.
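A sketch of this feature computation using scikit-image's Gabor filter; the frequency values stand in for the three scales, and taking the mean/std of the response magnitude (rather than of the real part alone) is also an assumption:

```python
import numpy as np
from skimage.filters import gabor

def gabor_feature(gray, n_orientations=8, frequencies=(0.1, 0.2, 0.4)):
    """48-dim texture feature: mean and std of the Gabor response
    for 8 orientations x 3 scales."""
    feats = []
    for freq in frequencies:
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations
            real, imag = gabor(gray, frequency=freq, theta=theta)
            mag = np.hypot(real, imag)       # magnitude of the complex response
            feats.append(mag.mean())
            feats.append(mag.std())
    return np.asarray(feats)                 # length 2 * 8 * 3 = 48
```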

 

     Here is a visualization of the 24 Gabor wavelets (8 orientations × 3 scales). Each Gabor has a real part and an imaginary part; the real parts are shown:

 


 

Step3: Segmentation

     We must segment the image into different regions; after segmentation, we can select an appropriate texture for each region.

     In my project, I use Meanshift to segment the image. It is a highly color-based method: pixels with similar colors are clustered into the same segment.

     I didn't implement Meanshift myself. Instead, I used the EDISON implementation for better performance.
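As a rough stand-in for that segmentation step, scikit-learn's MeanShift can be run on per-pixel Lab colors (EDISON also uses spatial coordinates and is much faster; this sketch only shows the idea):

```python
from skimage import io, color
from sklearn.cluster import MeanShift, estimate_bandwidth

# Cluster pixels purely by their Lab color.
rgb = io.imread("input.png")[..., :3] / 255.0
X = color.rgb2lab(rgb).reshape(-1, 3)

bandwidth = estimate_bandwidth(X, quantile=0.1, n_samples=2000)
labels = MeanShift(bandwidth=bandwidth, bin_seeding=True).fit(X).labels_
segments = labels.reshape(rgb.shape[:2])     # one segment label per pixel
```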

     After segmentation, I compute the texture feature of each segment for later use, using the same method as above.

    

     Here is the visualization of each segment, with random color filled.


 

Step4: Kmeans clustering

     In this step, we cluster the average colors of all segments in the image using the k-means algorithm. Suppose we have n texture styles; then k equals n. I first convert the image from RGB space to CIE LAB space, and then run k-means on the (a, b) plane, ignoring the L (luminance) channel, because only the chrominance matters here. There are many possible distance metrics for k-means; I use the cosine distance, because the (a, b) values form a plane whose origin is achromatic, so it is the direction (hue) rather than the magnitude that should be compared.

     After this step, each segment belongs to exactly one cluster.
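A sketch of this clustering step; scikit-learn's KMeans only supports the Euclidean metric, so the cosine metric is approximated by normalizing the (a, b) vectors first (all names are illustrative):

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_segment_colors(mean_ab, n_styles):
    """Cluster segments by the direction (hue) of their mean (a, b) color.

    mean_ab: (n_segments, 2) array with each segment's average (a, b) value.
    Nearly achromatic segments (tiny norm) are left at the origin.
    """
    norms = np.linalg.norm(mean_ab, axis=1, keepdims=True)
    directions = np.where(norms > 1e-3, mean_ab / np.maximum(norms, 1e-6), 0.0)
    km = KMeans(n_clusters=n_styles, n_init=10, random_state=0).fit(directions)
    return km.labels_
```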

 


 

Step5: Texture mapping

     In my project, I use two strategies for texture mapping. First, I compute the distance between the texture feature of each segment and that of each texture; a segment that is similar enough to a library texture (controlled by a threshold) is assigned that texture style. If a segment cannot find a similar texture style, we look at which cluster the segment belongs to and randomly assign one texture style to all segments of that cluster.
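A sketch of this two-step assignment rule, under assumed data layouts (all variable names are illustrative):

```python
import numpy as np

def pick_styles(seg_feats, tex_feats, seg_cluster, threshold, rng=np.random):
    """Choose a texture style index for every segment.

    seg_feats:   (n_segments, 48) Gabor feature of each segment.
    tex_feats:   (n_styles, 48)   Gabor feature of each texture style.
    seg_cluster: (n_segments,)    k-means cluster index of each segment.
    threshold:   maximum feature distance that still counts as 'similar'.
    """
    style = np.full(len(seg_feats), -1, dtype=int)

    # 1) Feature matching: use the closest texture if it is close enough.
    for i, f in enumerate(seg_feats):
        dists = np.linalg.norm(tex_feats - f, axis=1)
        if dists.min() < threshold:
            style[i] = dists.argmin()

    # 2) Fallback: unmatched segments get one random style per color cluster.
    for c in np.unique(seg_cluster):
        unmatched = (style == -1) & (seg_cluster == c)
        if unmatched.any():
            style[unmatched] = rng.randint(len(tex_feats))
    return style
```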

 


 

 

Note that we should preserve the luminance of each segment. As mentioned above, the texture library is two-dimensional. If two texture styles are different, their feature vectors differ accordingly; if two textures share the same style and differ only in luminance (density), their feature vectors are similar.

 

Once a style has been chosen for a segment, we compute the segment's average luminance and map the density variant of that style whose luminance matches. If the average luminance of a segment is too high or too low, we simply fill that segment with white or black.
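A sketch of the luminance-to-density mapping; the cut-off values and the 'level 0 = densest' ordering are assumptions:

```python
def pick_density(mean_L, n_levels, white_cut=95.0, black_cut=5.0):
    """Map a segment's average luminance (Lab L, roughly 0..100) to one of the
    n_levels density variants of its chosen style.  Returns 'white' or 'black'
    when the segment is too bright or too dark to receive a texture."""
    if mean_L >= white_cut:
        return "white"                      # too bright: leave the segment white
    if mean_L <= black_cut:
        return "black"                      # too dark: fill the segment with black
    frac = (mean_L - black_cut) / (white_cut - black_cut)
    return int(frac * n_levels)             # darker segment -> denser texture

# Example: mean_L = 40 with 3 density levels gives int(0.39 * 3) = 1 (a mid density).
```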

 

After texture mapping, we simply overlay the detected edges on the resulting image.

 

Limitation

This method is highly color-based. If the colors in an image are spread too widely, k-means will fail to cluster similar colors and the mean-shift segmentation will be poor. Furthermore, mean shift cannot segment intelligently, since it only groups regions of similar color; a single object is sometimes split into many segments.

 

Summary

     Both of the algorithms I used have limitations and need to be optimized in future work.

In addition, I have only worked on single images; I have not tried to extend the methods to video. More problems would clearly appear there, since temporal coherence must be preserved.