OpenCV-Python

Saturday, December 29, 2012

K-Means Clustering - 2 : Working with Scipy

Hi,

In the previous article, 'K-Means Clustering - 1 : Basic Understanding', we understood what is K-Means clustering, how it works etc. In this article, we will use k-means functionality in Scipy for data clustering. OpenCV will be covered in another article.

Scipy's cluster module provides routines for clustering. The vq module in it provides k-means functionality. You will need Scipy version 0.11 to get this feature.

We also use Matplotlib to visualize the data.

Note : All the data arrays used in this article are stored in github repo for you to check. It would be nice to check it for a better understanding. It is optional. Or you can create your own data and check it.

So we start by importing all the necessary libraries.

>>> import numpy as np
>>> from scipy.cluster import vq
>>> from matplotlib import pyplot as plt

Here I would like to show three examples.

1 - Data with only one feature :

Consider, you have a set of data with only one feature, ie one-dimensional. For eg, we can take our t-shirt problem where you use only height of people to decide the size of t-shirt.

Or, from an image processing point of view, you have a grayscale image with pixel values ranges from 0 to 255. You need to group it into just two colors, may be black and white only. ( That is another version of thresholding. I don't think someone will use k-means for thresholding. So just take this as a demo of k-means.)

So we start by creating data.

>>> x = np.random.randint(25,100,25)
>>> y = np.random.randint(175,255,25)
>>> z = np.hstack((x,y))
>>> z = z.reshape((50,1))

So we have 'z' which is an array of size 50, and values ranging from 0 to 255. I have reshaped 'z' to a column vector. It is not necessary here, but it is a good practice. Reason, I will explain in coming sections. Now we can plot this using Matplotlib's histogram plot.

>>> plt.hist(z,256,[0,256]),plt.show()

We get following image :

Test Data

Now we use our k-means functions.

First function, vq.kmeans(), is used to cluster the data as per our requirements and it returns the centroids of the clusters. (Docs)

It takes our test data and number of clusters we need as inputs. Other two inputs are optional and is not of big concern now.

>>> centers,dist = vq.kmeans(z,2)
>>> centers
array([[207],
       [ 60]])

First output is 'centers', which are the centroids of clustered data. For our data, it is 60 and 207. Second output is the distortion between centroids and test data. We mark the centroids along with the inputs.

>>> plt.hist(z,256,[0,256]),plt.hist(centers,32,[0,256]),plt.show()

Below is the output we got. Those green bars are the centroids.

Green bars shows centroids after clustering

Now we have found the centroids. From first article, you might have seen our next job is to label the data '0' and '1' according to distance to the centroids. We use vq.vq() function for this purpose.

vq.vq() takes our test data and centroids as inputs and provides us the labelled data,called 'code' and distance between each data and corresponding centroids.

>>> code, distance = vq.vq(z,centers)

If you compare the arrays 'code' and 'z' in git repo, you can see all values near to first centroid will be labelled '0' and next as '1'.

Also check the distance array. 'z[0]' is 47, which is near to 60, so labelled as '1' in 'code'. And distance between them is 13, which is 'distance[0]'. Similarly you can check other data also.

Now we have the labels of all data, we can separate the data according to labels.

>>> a = z[code==0]
>>> b = z[code==1]

'a' corresponds to data with centroid = 207 and 'b' corresponds to remaining data. (Check git repo to see a&b).

Now we plot 'a' in red color, 'b' in blue color and 'centers' in yellow color as below:

>>> plt.hist(a,256,[0,256],color = 'r') # draw 'a' in red color
>>> plt.hist(b,256,[0,256],color = 'b') # draw 'b' in blue color
>>> plt.hist(centers,32,[0,256],color = 'y') # draw 'centers' in yellow color
>>> plt.show()

We get the output as follows, which is our clustered data :

Output of K-Means clustering

So, we have done a very simple and basic example on k-means clustering. Next one, we will try with more than one features.

2 - Data with more than one feature :

In previous example, we took only height for t-shirt problem. Here, we will take both height and weight, ie two features.

Remember, in previous case, we made our data to a single column vector. This is because, it is a good convention, and normally followed by people from all fields. ie each feature is arranged in a column, while each row corresponds to an input sample.

For example, in this case, we set a test data of size 50x2, which are heights and weights of 50 people. First column corresponds to height of all the 50 people and second column corresponds to their weights. First row contains two elements where first one is the height of first person and second one his weight. Similarly remaining rows corresponds to heights and weights of other people. Check image below:

So now we can prepare the data.

>>> x = np.random.randint(25,50,(25,2))
>>> y = np.random.randint(60,85,(25,2))
>>> z = np.vstack((x,y))

Now we got a 50x2 array. We plot it with 'Height' in X-axis and 'Weight' in Y-axis.

>>> plt.scatter(z[:,0],z[:,1]),plt.xlabel('Height'),plt.ylabel('Weight')
>>> plt.show()

(Some data may seem ridiculous. Never mind it, it is just a demo)

Test Data

Now we apply k-means algorithm and label the data.

>>> center,dist = vq.kmeans(z,2)
>>> code,distance = vq.vq(z,center)

This time, 'center' is a 2x2 array, first column corresponds to centroids of height, and second column corresponds to centroids of weight.(Check git repo data)

As usual, we extract data with label '0', mark it with blue, then data with label '1', mark it with red, mark centroids in yellow and check how it looks like.

>>> a = z[code==0]
>>> b = z[code==1]
>>> plt.scatter(a[:,0],a[:,1]),plt.xlabel('Height'),plt.ylabel('Weight')
>>> plt.scatter(b[:,0],b[:,1],c = 'r')
>>> plt.scatter(center[:,0],center[:,1],s = 80,c = 'y', marker = 's')
>>> plt.show()

This is the output we got :

Result of K-Means clustering

So this is how we apply k-means clustering with more than one feature.

Now we go for a simple application of k-means clustering, ie color quantization.

3 - Color Quantization :

Color Quantization is the process of reducing number of colors in an image. One reason to do so is to reduce the memory. Sometimes, some devices may have limitation such that it can produce only limited number of colors. In those cases also, color quantization is performed.

There are lot of algorithms for color quantization. Wikipedia page for color quantization gives a lot of details and references to it. Here we use k-means clustering for color quantization.

There is nothing new to be explained here. There are 3 features, say, R,G,B. So we need to reshape the image to an array of Mx3 size (M is just a number). And after the clustering, we apply centroid values (it is also R,G,B) to all pixels, such that resulting image will have specified number of colors. And again we need to reshape it back to the shape of original image. Below is the code:

import cv2
import numpy as np
from scipy.cluster import vq

img = cv2.imread('home.jpg')
z = img.reshape((-1,3))

k = 2           # Number of clusters
center,dist = vq.kmeans(z,k)
code,distance = vq.vq(z,center)
res = center[code]
res2 = res.reshape((img.shape))
cv2.imshow('res2',res2)
cv2.waitKey(0)
cv2.destroyAllWindows()

Change the value of 'k' to get different number of colors. Below is the original image and results I got for values k=2,4,8 :

Color Quantization with K-Means clustering

So, that's it !!!

In this article, we have seen how to use k-means algorithm with the help of Scipy functions. We also did 3 examples with sufficient number of images and plots. There are two more functions related to it, but I will deal it later.

In next article, we will deal with OpenCV k-means implementation.

I hope you enjoyed it...

And if you found this article useful, don't forget to share it on Google+ or facebook etc.

Regards,

Abid Rahman K.

Ref :

1 - Scipy cluster module documentation

2 - Color Quantization

K-Means Clustering - 1 : Basic Understanding

Hi,

In this section, we will be dealing with K-Means clustering, a machine learning algorithm, which is used for grouping a large set of data to desired number of groups depending on their similarity. It will be a series of articles, may be two or three articles.

In this article, we mainly focus on the understanding what is this algorithm, how it works, what are its applications etc. I think an intuitive understanding is more better than explaining the mathematical background behind the algorithm.

In the coming articles, I will explain how to use in-built functions of Scipy and OpenCV for K-Means clustering.

(Note : All images are sufficiently larger, but resized to fit in page. Just click on the image to see it in its original size)

By the way, What is this algorithm for ?

According to Wikipedia :

k-means clustering is a method of cluster analysis which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean.

( Reading above passage, I hope you might have understood at-least this : It groups large data to some groups based on some features.)

If you didn't understand above statements, I will tell a couple of examples :

1 - Google News :

I am sure you will have visited Google News at-least once. For every news, Google provides a main article and just below it, there are links to corresponding news from all other newspapers. Check below image :

This is one of my favorite features in Google News, because I try to read news on same topic from different papers. (It helps me to understand how media play with our minds. Different newspapers try to present same news in different ways to audience. And it is seen frequently with respect to political news and people reading them are, most often, got biased by these news, while real truth is hidden from them. Presently the power of media is such that, if you control the media, you control the world. Anyway, it is out of our topic. )

I have always wondered how does Google achieve this. I am not claiming Google uses k-means clustering, but this is what K-Means clustering also does, You have a large set of news and group them according to its content.

2 - T-shirt size problem

This example is commonly used to explain clustering problems.

Consider a company, which is going to release a new model of T-shirt to market. Obviously they will have to manufacture models in different sizes to satisfy people of all sizes. So the company make a data of people's height and weight, and plot them on to a graph, as below:

People Height v/s Weight plot

Company can't create t-shirts with all the sizes. Instead, they divide people to Small, Medium and Large, and manufacture only these 3 models which will fit into all the people. This grouping of people into three groups can be done by k-means clustering, and algorithm provides us best 3 sizes, which will satisfy all the people. And if it doesn't, company can divide people to more groups, may be five, and so on. Check image below :


People grouped to different clusters

So, now I hope, you might have understood use of this algorithm. Now we can go to next step.

How does the algorithm work ?

This algorithm is an iterative process. We will explain it step-by-step with the help of images.

Consider a set of data as below ( Don't be bothered about what this data is, or you can consider it as t-shirt problem). We need to cluster this data into two groups.

Test Data

Step :1 - Algorithm randomly chooses two centroids, c1 and c2 (sometimes, any two data are taken as the centroids).

Step :2 - Then it calculates the distance from each point to both centroids. If a test data is more closer to c1, then that data is labelled with '0'. If it is closer to c2, then labelled as '1' (If more centroids are there, labelled as '2','3' etc).

In our case, we will color all '0' labelled with red, and '1' labelled with blue. So we get following image after above operations.

Initial centroid selection and data labelling

Step :3 - Next we calculate the average of all blue points and red points separately and that will be our new centroids. That is c1 and c2 shift to newly calculated centroids. (Remember, the images shown are not true values and not to true scale, it is just for demonstration only. So calculated averages may not be what you expect).

And again, perform step 2 with new centroids and label data to '0' and '1'.

So we get result as below :

New centroid calculated and data re-labelled

Now step - 2 and step - 3 are iterated until both centroids are converged to fixed points. These points are such that sum of distances between test data and their corresponding centroids are minimum. Or simply, sum of distances between red data & c1 and blue data & c2 is minimum.

Final result almost looks like below :

Final Result

So, I hope by this time, you would have obtained a basic understanding how K-Means clustering works. With this intuitive knowledge, read some good text books to understand the mathematics behind this algorithm. I am sure you won't find any difficulty to understand it.

It is just a top layer of K-Means clustering. There are a lot of modifications to this algorithm like, how to choose the initial centroids, how to speed up the iteration process etc. Me too, haven't gone much deeper and I think this would be sufficient to start working with K-Means clustering. Those who need more knowledge can read any good text books.

So that was a complete theoretical, easy-to-read article.

In next article, we will see how to use Scipy's kmeans implementation for data clustering, and we will deal OpenCV implementation in later article.

So that's all for now.

If you find this article useful, don't forget to share this article on Google+, facebook etc. It is just a click away !!!

Regards,

Abid Rahman K.

Ref :

1 - Stanford Machine Learning classes by Prof. Andrew Ng

Thursday, July 19, 2012

BackGround Extraction using Running Average

Hi, this is going to be a very simple article, but you will find it very helpful. It is about Background Extraction from a Video.

Suppose you are given video of footage of traffic, may be some thing like this : Traffic in India, and you are asked to find an approximate background. Or anything like that.

Background extraction comes important in object tracking. If you already have an image of the bare background, then it is simple. But in many cases, you won't have such an image and so, you will have to create one. That is where Running Average comes in handy.

( I thought about this when one guy asked a question in SOF : Link)

The function we use here to find Running Average is cv2.accumulateWeighted(). For example, if we are watching a video, we keep feeding each frame to this function, and the function keep finding the averages of all frames fed to it as per the relation below :

where :

src is nothing but our source image. It can be grayscale or color image and either 8-bit or 32-bit floating point.

dst is the output or accumulator image with same channels as that of source image, and it is either 32-bit or 64-bit floating point. Also, we should declare it first to a value which will be taken as initial value.

alpha is the weight of the input image. According to Docs, alpha regulates the update speed (how fast the accumulator “forgets” about earlier images). In simple words, if alpha is a higher value, average image tries to catch even very fast and short changes in the data. If it is lower value, average becomes sluggish and it won't consider fast changes in the input images. I will explain it a little bit with help of images at the end of article.

Code :

import cv2
import numpy as np

c = cv2.VideoCapture(0)
_,f = c.read()

avg1 = np.float32(f)
avg2 = np.float32(f)

while(1):
    _,f = c.read()
	
    cv2.accumulateWeighted(f,avg1,0.1)
    cv2.accumulateWeighted(f,avg2,0.01)
	
    res1 = cv2.convertScaleAbs(avg1)
    res2 = cv2.convertScaleAbs(avg2)

    cv2.imshow('img',f)
    cv2.imshow('avg1',res1)
    cv2.imshow('avg2',res2)
    k = cv2.waitKey(20)

    if k == 27:
        break

cv2.destroyAllWindows()
c.release()

In above code, I have set two averages, one with higher alpha value and another with lower alpha value so you can understand effect of alpha. At first both are set to initial frame of the capture. And in loop they get updated.

You can see some results in the SOF link I already provided. (I provide those results here, you can check the code and alpha value there):

Result 1 :

I used my webcam and saved original frame and running average at a particular instant.

Original Frame

As you can see, in this frame, my hand blocks the views behind.

Average Frame

But Running Average image shows the background clearly. You can see my hands ( actually lots of hands, which you can interpret as , I was waving my hand)

Result 2 :

Again from SOF :

This is a frame from a typical traffic video taken by a stationary camera. As you can see, a car is going on the road, and the person is trying to cross the road at a particular instant of time.

But see the running average at that time. There is no person and car in this image ( Actually it is there, have a close look, then you will see it, and the person is more clear than car, since car is moving very fast and across the image, it has not much effect on average, but person is there for a long time, since he is slow and moving across the road.)

Now we need to see the effect of alpha on these images.

Result 3 :

Original Frame

Alpha = 0.1

Alpha = 0.01

I was just waving my hand across my face in front of camera. Original frame caught it without any mercy. Running Average with alpha 0.1 has caught it as a transparent hand, with main emphasis on background. As alpha again reduced, you can see no hand there in front of face. ie the effect, as alpha decreases, sudden changes shows no effect on running averages.

Result 4 :

From the traffic video I have given in beginning of this article:

Original Frame

Alpha = 0.1

Alpha = 0.01

As alpha decreases, more vehicles are removed to create an approximate background. Unlike other examples, there is not a single instant in this video without vehicles and camera is shaky and moving, we still got a good background (although blurry)

Result 5:

Original Frame

Alpha = 0.1

Alpha = 0.01

Explanations are same as above. So reducing alpha creates a good background image. So choose the alpha value wisely.

So this is all about today.

As I mentioned, it is a simple article, but there is a good point to learn which we did with help of lot of images and simple code. Try it yourself and have some hacks to create something cool.

Let me know your feedbacks, thoughts etc !!!

Regards,
ARK

Wednesday, July 11, 2012

SOF : Watershed in OpenCV

Hi,

Do you like to have a very simple example for watershed algorithm in OpenCV ?

Please visit : Watershed in OpenCV-Python

See the demo result below :

Original Image

Foreground extracted using watershed.

With Regards,

ARK.

Saturday, July 7, 2012

Sudoku - Extra

Hi,

This is not an important article, but just wanted to show you a nice piece of code on how to remove any convexity defects on the sudoku square to get a perfect square. For now, i am not writing a complete tutorial, but visit :

Remove Curvature on Sudoku Square

With Regards,
ARK

Friday, June 22, 2012

Sudoku Solver - Part 3

Hi friends,

Recently I got busy with some other projects, so I couldn't post remaining part.

In the last article, we found the four corners of Sudoku border. In this session, we will be correcting the perspective of Sudoku and making it straight with a uniform size for further works.

So what we have now ? We have four corners of the Sudoku Square. These are the corners, or approx when printed.

[[[491 68]]
[[ 73 84]]
[[ 34 516]]
[[520 522]]]

This is a 4x1x2 array. But check out the values. First row is the TOP-RIGHT corner. Second row is the TOP-LEFT corner. Third row is the BOTTOM-LEFT corner. Finally, fourth one is the BOTTOM-RIGHT corner.

The problem is that, there is no guarantee that for next image, the corners found out will be in this same order. And why should we have the same order?

In the next step, we have to convert these points into another square of size 450x450 such that, the point at [491,68] will be at [449,0] in the new image, similarly for all points. If the order changes, we get some rotated images or even mirrored images.

Therefore , we have to keep them in uniform order. I used this order, [ TOP-LEFT, TOP-RIGHT, BOTTOM-RIGHT, BOTTOM-LEFT ]. You can use whatever you like.

I did it as follows.

First take the sum of x,y coordinates. TOP-LEFT has least sum, and BOTTOM-RIGHT has maximum sum. Now find the difference, ie y-x. TOP-RIGHT has minimum and BOTTOM-LEFT has maximum. It is written as a function. For this, we need to reshape 'approx' to (4,2).

def rectify(h):
        h = h.reshape((4,2))
        hnew = np.zeros((4,2),dtype = np.float32)

        add = h.sum(1)
        hnew[0] = h[np.argmin(add)]
        hnew[2] = h[np.argmax(add)]
        
        diff = np.diff(h,axis = 1)
        hnew[1] = h[np.argmin(diff)]
        hnew[3] = h[np.argmax(diff)]
 
        return hnew

This function should be included at the beginning. Now let's come back to end of code.

approx=rectify(approx)

Now we have the 4 points in order. Now we need corresponding points to where they should be mapped. I took a 450x450 image, and took points as below:

h = np.array([ [0,0],[449,0],[449,449],[0,449] ],np.float32)

Hope it is clear for you. We want the point [ 73 84] to be at [0 0] in new image, point [520 522] should be at [449,449] and so on.

Why image of 450x450 ? Now it has no particular reason, I just took it.

So earlier ? Yeah, it had some significance back then, when I first trying to develop this. But later I understood it is not needed, and there are more better ways to do it, so gave up the idea,but I retained the size. (Well, that is a little story)

Okay, but why all images should be resized to this ? You will understand it in OCR part. Digits are found using their height and width. So we need all digits to have same size, whatever input image we give.

So now we have the input array (approx) and output array (h). Next we apply the perspective correction.

retval = cv2.getPerspectiveTransform(approx,h)
warp = cv2.warpPerspective(gray,retval,(450,450))

The result is obtained as below :

There is some defects at the top. To correct that also, we need to do some extra efforts, which I will explain in another article.

So we are ready to do OCR work. In next article, I will explain the OCR training.

With Regards,
ARK

Monday, June 18, 2012

Contours - 4 : Ultimate

Hi,

This is the fourth and final article on Contours. This is the continuation of below articles:

1 - Contours - 1 : Getting Started
2 - Contours - 2 : Brotherhood
3 - Contours - 3 : Extraction

In this article we will deal with PointPolygonTest and Convexity Defects.

1 - PointPolygonTest :

This function finds the shortest distance between a point in the image and a contour. It returns the distance which is negative when point is outside the contour, positive when point is inside and zero if point is on the contour.

For example, we can check the point (50,50) as follows:

dist = cv2.pointPolygonTest(cnt,(50,50),True)

In the function, third argument is " measureDist ". If it is True, it finds the signed distance. If False, it finds only if the point is inside or outside or on the contour.

And if you don't want to find the distance, make sure third argument is False, because, it is a time consuming process. So, making it False gives about 2-3X performance boost.

I have written another article on how to speed up programs using Numpy techniques where I have taken PointPolygonTest as the test case.

Visit : Fast Array Manipulation in Numpy

2 - Convexity Defects :

I have already explained convex hull. Any deviation of the object from this hull can be considered as convexity defect. I have explained it with the help of images in second part of this series. ( Please read it ).

OpenCV comes with a ready-made function for this, cv2.convexityDefects(). Let's see how we can use it.

hull = cv2.convexHull(cnt,returnPoints = False)
defects = cv2.convexityDefects(cnt,hull)

Notice that "returnPoints = False" in first line to get indices of the contour points, because input to convexityDefects() should be these indices, not original points.

It returns a defects structure, an array of four values - [ start point, end point, farthest point, approximate distance to farthest point ]

We can visualize it using an image. We draw a line joining start point and end point, then draw a circle at the farthest point.

Now we take each row of the defects, then from that draw, extract four values, draw line using first two values, then draw the point using third value. Remember first three values returned are indices of cnt. So we have to bring those values from cnt.

for i in range(defects.shape[0]):
    s,e,f,d = defects[i,0]
    start = tuple(cnt[s][0])
    end = tuple(cnt[e][0])
    far = tuple(cnt[f][0])
    cv2.line(img,start,end,[0,255,0],2)
    cv2.circle(img,far,5,[0,0,255],-1)

And below are the various results :

So these are two functions I wanted to discuss. With this article, series on Contours is over.

I would like to hear your feedback, comments, suggestions etc.

With Regards,
ARK

Pages

Saturday, December 29, 2012

K-Means Clustering - 2 : Working with Scipy

K-Means Clustering - 1 : Basic Understanding

Thursday, July 19, 2012

BackGround Extraction using Running Average

Wednesday, July 11, 2012

SOF : Watershed in OpenCV

Saturday, July 7, 2012

Sudoku - Extra

Friday, June 22, 2012

Sudoku Solver - Part 3

Monday, June 18, 2012

Contours - 4 : Ultimate

Latex