HIGH CONTRAST TEXT DETECTION WITH CONNECTED COMPONENT ANALYSIS


Taking a cue from my previous post, our original image was 32.jpg.

This is a high-contrast image with the intensity of the letters much lower than the surroundings. In my previous post, I showed how to condense the text into groups called blobs. The result was:

dilate

Though the image may seem unintelligible, this is what I would describe as a near-perfect result. The next step is binarizing the image with Canny edge detection. A big reason to prefer Canny edge detection is that the edge field is binarized via hysteresis thresholding: first, strong edges are obtained with a high threshold value; then weak edges are included, provided they are connected to strong edges. This accentuates connected components even if some part of them is of low intensity. The result after edge detection is as follows:

canny
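To make the hysteresis idea concrete, here is a minimal one-dimensional sketch in plain C. It is not OpenCV's implementation: the function name `hysteresis_1d`, the use of `>=` for both thresholds and the 1-D neighbourhood are my own assumptions for illustration. Strong samples seed the result, and weak samples are kept only if they touch a sample that is already kept.

```c
#include <stddef.h>

/* Marks out[i] = 1 where sample i is kept as an edge, 0 otherwise.
 * Hypothetical helper for illustration; not part of OpenCV. */
void hysteresis_1d(const int *mag, size_t n, int low, int high,
                   unsigned char *out)
{
    size_t i;
    if (n == 0)
        return;

    /* Pass 1: seed with strong edges (at or above the high threshold). */
    for (i = 0; i < n; i++)
        out[i] = (unsigned char)(mag[i] >= high);

    /* Pass 2: grow rightwards into weak samples that touch a kept one. */
    for (i = 1; i < n; i++)
        if (!out[i] && mag[i] >= low && out[i - 1])
            out[i] = 1;

    /* Pass 3: grow leftwards the same way. */
    for (i = n - 1; i > 0; i--)
        if (!out[i - 1] && mag[i - 1] >= low && out[i])
            out[i - 1] = 1;
}
```

With thresholds 50 and 100, a weak sample of 60 next to a strong sample of 120 is kept, while an isolated 60 elsewhere in the row is dropped. That is the behaviour that keeps a faint stroke attached to the rest of its letter.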

Canny gives us this image. Now we need to separate the regions in it and process each one. This is where contours come in handy. cvFindContours, as the name suggests, finds the contours of the connected components in an image and stores them in a sequence of structures.

The diagram below shows one form of storing them.

list

 

The diagram below shows another, hierarchical form of storage, in which a contour that lies inside another contour is stored as a child of that contour.

hierrachy

It can also be entirely hierarchical, like a tree. Anyway, a simple Google search will give you all the possible forms of storing contours. Each contour is stored as a CvSeq, a sequence of points, and the sequences are linked to one another. So basically, contour analysis lets us take each contour individually and analyze it.
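The list and hierarchy layouts are easier to picture as linked nodes. The sketch below uses a stripped-down, hypothetical `Contour` struct of my own that keeps only two links named after the `h_next` and `v_next` fields of CvSeq; the real structure carries much more. It shows the difference between walking only the top level and walking the whole hierarchy.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a contour node. */
typedef struct Contour {
    struct Contour *h_next; /* next contour on the same level */
    struct Contour *v_next; /* first contour nested inside this one */
} Contour;

/* Counts only the contours on the top level (follows h_next). */
int count_top_level(const Contour *first)
{
    int n = 0;
    for (; first != NULL; first = first->h_next)
        n++;
    return n;
}

/* Counts every contour, including nested ones (follows v_next too). */
int count_all(const Contour *first)
{
    int n = 0;
    for (; first != NULL; first = first->h_next)
        n += 1 + count_all(first->v_next);
    return n;
}
```

A loop that follows only `h_next` visits the outer contours and skips anything nested inside them, which is usually what you want when each blob should produce one bounding box.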

So now we just need to extract each contour, find its bounding rectangle and map it to the original image. Here are the extracted contours with their bounding rectangles mapped to the original image:

12Capture

 

The full image generated with the text regions marked is shown below. We can see that when we use CvRect to bound a contour with a rectangle, some bits and pieces of non-contour regions also creep in. This is normal, as not all contours are shaped like rectangles.

blobs
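The bounding-rectangle step boils down to taking the minimum and maximum coordinates of the contour points. Here is a small sketch in plain C; the `Point` and `Rect` types and the `bounding_rect` function are my own stand-ins rather than OpenCV's, and I am assuming the width and height count both end pixels (max - min + 1).

```c
/* Hypothetical stand-ins for a contour point and a bounding box. */
typedef struct { int x, y; } Point;
typedef struct { int x, y, width, height; } Rect;

/* Returns the smallest upright rectangle containing all the points.
 * An empty point set gives an all-zero rectangle. */
Rect bounding_rect(const Point *pts, int count)
{
    Rect r = {0, 0, 0, 0};
    int i, min_x, min_y, max_x, max_y;

    if (count <= 0)
        return r;

    min_x = max_x = pts[0].x;
    min_y = max_y = pts[0].y;
    for (i = 1; i < count; i++) {
        if (pts[i].x < min_x) min_x = pts[i].x;
        if (pts[i].x > max_x) max_x = pts[i].x;
        if (pts[i].y < min_y) min_y = pts[i].y;
        if (pts[i].y > max_y) max_y = pts[i].y;
    }

    r.x = min_x;
    r.y = min_y;
    r.width = max_x - min_x + 1;   /* inclusive of both end pixels */
    r.height = max_y - min_y + 1;
    return r;
}
```

For a triangular contour the rectangle covers roughly twice the area of the shape itself, which is why pieces of the background end up inside the marked regions.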

 

The final result is this. The arrow has also been marked, but that can easily be eliminated at the OCR stage. So now we have successfully isolated the portions of text.

Detection-normal

The source code is as follows:

    // display() is my own helper that shows an image in a window.
    IplImage* img  = cvLoadImage("C:\\samples\\test\\32.jpg");
    IplImage* img1 = cvCreateImage(cvSize(img->width, img->height), img->depth, 1); // grayscale working image
    IplImage* img2 = cvCreateImage(cvSize(img->width, img->height), img->depth, 1); // temp buffer for the top hat
    IplImage* img3 = cvCreateImage(cvSize(img->width, img->height), img->depth, 1); // filled blobs
    cvConvertImage(img, img1, 0);
    cvSetZero(img2);
    cvSetZero(img3);

    CvMemStorage* mem = cvCreateMemStorage(0);
    CvSeq* contours = 0;
    CvSeq* ptr;

    // 21x3 rectangular kernel for horizontal text; created once and reused.
    IplConvKernel* kernel = cvCreateStructuringElementEx(21, 3, 10, 2, CV_SHAPE_RECT, NULL);

    // Preprocessing from the previous post: top hat, threshold, smooth, dilate.
    cvMorphologyEx(img1, img1, img2, kernel, CV_MOP_TOPHAT, 1);
    display(img1);
    cvThreshold(img1, img1, 128, 255, CV_THRESH_BINARY);
    cvSaveImage("thres.png", img1, 0);
    display(img1);
    cvSmooth(img1, img1, CV_GAUSSIAN, 3, 3);
    cvSaveImage("smooth-gaussian.png", img1, 0);
    display(img1);
    cvDilate(img1, img1, kernel, 2);
    display(img1);

    // Canny edge detection with hysteresis thresholds 500 and 900.
    cvCanny(img1, img1, 500, 900, 3);
    display(img1);
    cvSaveImage("canny.png", img1, 0);

    // Extract the contours; note that cvFindContours modifies img1.
    cvFindContours(img1, mem, &contours, sizeof(CvContour), CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE, cvPoint(0, 0));

    for (ptr = contours; ptr != NULL; ptr = ptr->h_next)
    {
        // Optional size filter (disabled):
        // double reg = fabs(cvContourArea(ptr, CV_WHOLE_SEQ));
        // if (reg <= 600 || reg >= 10000) continue;

        // Draw the filled contour in white on the blob image.
        cvDrawContours(img3, ptr, CV_RGB(255, 255, 255), CV_RGB(0, 0, 0), -1, CV_FILLED, 8, cvPoint(0, 0));

        // Mark the bounding rectangle on both the original and the blob image.
        CvRect rectEst = cvBoundingRect(ptr, 0);
        CvPoint pt1 = cvPoint(rectEst.x, rectEst.y);
        CvPoint pt2 = cvPoint(rectEst.x + rectEst.width, rectEst.y + rectEst.height);
        cvRectangle(img,  pt1, pt2, CV_RGB(255, 255, 255), 1);
        cvRectangle(img3, pt1, pt2, CV_RGB(255, 255, 255), 1);

        // Show just this region of the original image.
        cvSetImageROI(img, rectEst);
        display(img);
        cvResetImageROI(img);
    }
    cvSaveImage("Detection-normal.png", img, 0);
    cvSaveImage("blobs.png", img3, 0);

    // Clean up.
    cvReleaseStructuringElement(&kernel);
    cvReleaseMemStorage(&mem);
    cvReleaseImage(&img1);
    cvReleaseImage(&img2);
    cvReleaseImage(&img3);
    cvReleaseImage(&img);

GOD of small things……..



Sathya Sai Baba (Telugu: సత్య సాయిబాబా), born Sathyanarayana Raju (23 November 1926 – 24 April 2011), was an Indian guru, spiritual figure, philanthropist and educator.

This is how Wikipedia describes Shri Sathya Sai Baba. Recently I saw an article in the Times of India that questions Baba’s powers (do read that article to understand my post better). As a non-spiritual and logical guy, I definitely cannot refute the writer’s claim. That is probably the exact reason why Wikipedia describes him as just a “spiritual figure”. Baba was known to conjure “small” things from thin air, and that is why I have named this post as such.

In spite of all this, I really don’t understand why we should question his abilities at all. If he was called “God” by his followers, then it was not only for his miracles. It was for his virtues, for his philanthropic work, his ideals and teachings. His miracles were just a minor part of the big picture. If his “miracles” were the only reason, then a lot of magicians could have had millions of followers.

Obviously the next question you would ask is: if he is just a magician, then why such “pretense”? To understand this we need to understand Indian culture, a culture that can be united by only three things.

  • Politics
  • Religion/Spirituality
  • Sports

This type of philanthropic work can only be done if he can get a mass following by one of the first two methods. Of these, politics is so overcrowded that it is impossible for a single spoke to turn in the opposite direction and help the masses. Moreover, surviving in today’s political scene is not at all easy unless you are from a political family.

Baba chose the second way: he united people through faith and spirituality. The faith of people cannot come in a day. There needs to be something which will make people notice you. This is where the “miracles” came in handy. But before we start shouting, we must realize that it is not possible for a single common man to build hospitals and villages around the world without anyone’s support. He founded healthcare facilities, schools, universities and water projects.

The Sri Sathya Sai General Hospital was opened in Whitefield, Bangalore, in 1977 and provides complex surgeries, food and medicines free of cost. The hospital has treated over 2 million patients.

If he is called God, then it’s because he chose to help people. It’s because he chose to actually do something and not sit and criticize. It’s because he had the ability to inspire faith in millions of people. It’s because he could unite people of different races and religions. These are the things that make him a God, and these are the things that everybody should respect him for.

Morphological Operations for Text Detection in high contrast Images


In my stint with image processing, I have come to believe that the most important part of image processing is not the actual “detection” but the preprocessing that goes before it. In my previous post I talked about how to use cvDilate. cvDilate is particularly useful when the connected components are not properly connected.

As opposed to dilation, we have something called erosion, which involves choosing the local minimum. In this post I will show you the effects of morphological operations on an image with text. Our sample image will be an image with high contrast.

Sample text with Image

Now our first step will be to convert this image to grayscale using cvConvertImage.

Now we apply a morphological operation called Top Hat. Before explaining the Top Hat operation, let me tell you a little about “opening” an image. Opening (morphological open) is basically erosion followed by dilation. The reason for the erosion is to eliminate noise and speckle in an image. The reason for using erosion over blurring is that large, significant regions do not get affected; it is the protrusions that get eroded.

After that, dilation is done, which grows the surviving regions back and reconnects components that are very close to each other.

Thus Morphological Open = Erosion and then Dilation. (Morphological Close is the reverse order: dilation and then erosion.)

Morphological Top Hat Operation:

TopHat(src) = src - open(src)

Thus Top Hat reveals areas that are lighter than their surroundings, which is exactly what we require in this case. Notice that the color of the text is lighter than the surroundings.
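To see why src - open(src) picks out small bright details, here is a one-dimensional sketch in plain C. The function names, the window radius `r` and the border handling (the window is simply clipped at the ends) are my own assumptions and not what cvMorphologyEx does internally.

```c
#include <stdlib.h>

/* Erosion: each sample becomes the minimum of its window [i-r, i+r]. */
static void erode_1d(const unsigned char *src, unsigned char *dst, int n, int r)
{
    int i, k;
    for (i = 0; i < n; i++) {
        unsigned char m = src[i];
        for (k = i - r; k <= i + r; k++)
            if (k >= 0 && k < n && src[k] < m)
                m = src[k];
        dst[i] = m;
    }
}

/* Dilation: each sample becomes the maximum of its window [i-r, i+r]. */
static void dilate_1d(const unsigned char *src, unsigned char *dst, int n, int r)
{
    int i, k;
    for (i = 0; i < n; i++) {
        unsigned char m = src[i];
        for (k = i - r; k <= i + r; k++)
            if (k >= 0 && k < n && src[k] > m)
                m = src[k];
        dst[i] = m;
    }
}

/* Top hat: src minus its opening (erosion followed by dilation).
 * Returns 0 on success, -1 on bad arguments or allocation failure. */
int top_hat_1d(const unsigned char *src, unsigned char *dst, int n, int r)
{
    unsigned char *eroded, *opened;
    int i;

    if (n <= 0 || r < 0)
        return -1;
    eroded = (unsigned char *)malloc((size_t)n);
    opened = (unsigned char *)malloc((size_t)n);
    if (!eroded || !opened) {
        free(eroded);
        free(opened);
        return -1;
    }

    erode_1d(src, eroded, n, r);
    dilate_1d(eroded, opened, n, r);
    for (i = 0; i < n; i++)
        dst[i] = (unsigned char)(src[i] - opened[i]); /* opening never exceeds src */

    free(eroded);
    free(opened);
    return 0;
}
```

A bright run narrower than the window is wiped out by the opening, so it survives the subtraction at nearly full strength. A bright run wider than the window survives the opening and cancels out to zero. Thin text strokes on a large background behave like the first case.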

The GrayScale Image

gray

In my previous post, while explaining dilation, I asked you to imagine a disc (kernel) moving over the image and replacing the pixel under its centre with the local maximum of the cells it covers. Now, instead of a disc, I will be using my own kernel.

Making your own kernel is pretty easy using cvCreateStructuringElementEx(). I have used a 21*3 kernel, specifically because I want my algorithm to work on horizontal text like license plates.

After applying the Top Hat morphological operation:

top_hat

You can notice slight stretch marks of sorts. That is because we have used a fairly large rectangular kernel. The result is exactly what we wanted. The next step is thresholding.

We will apply binary thresholding, in which any pixel above 128 is replaced with 255 and all the others become 0. Thus we are actually brightening the brighter parts of the image.
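The thresholding rule is simple enough to write out by hand. This is a plain C sketch with a hypothetical function name of my own, assuming the comparison is strictly "greater than", so a pixel exactly equal to the threshold becomes 0.

```c
/* Binary threshold: pixels strictly above thresh become max_value,
 * everything else becomes 0. Hypothetical helper for illustration. */
void threshold_binary(const unsigned char *src, unsigned char *dst, int n,
                      unsigned char thresh, unsigned char max_value)
{
    int i;
    for (i = 0; i < n; i++)
        dst[i] = (src[i] > thresh) ? max_value : 0;
}
```

Calling it with a threshold of 128 and a maximum of 255 mirrors the values used in this post.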

Result after Thresholding.

Threshold

Now the stretch marks appear more visible because they have been reduced to 0.

The image is also speckled, so we need to smooth it using cvSmooth.

smooth

Now the last step is obviously cvDilate, as we can see gaps between the connected components. For example, the letter R, which should have been in one piece, is full of gaps, so we need to apply dilation. But since we are also thinking about license plates, we already know that these letters will be closely placed, and it may be difficult to get each letter as a separate connected component. I will talk about that in my next post.

Anyway, I again use my own kernel, optimized for horizontal, closely placed text.

Final Result.

dilate

You can see that the text is not visible at all; instead, it has been blurred together to form a blob, or contour. This is exactly what you would want when going for license plates. Getting each letter is not only tedious but also a waste of time. Localization is best done when you can coalesce the text into a blob.

The Source:

    // img is the loaded source image; img3 is an IplImage* declared earlier.
    // display() is my own helper that shows an image in a window.
    img3 = cvCreateImage(cvSize(img->width, img->height), img->depth, 1);
    IplImage *img_temp = cvCreateImage(cvSize(img->width, img->height), img->depth, 1); // temp buffer for the top hat
    cvSetZero(img_temp);
    cvConvertImage(img, img3, 0);   // convert to grayscale
    display(img3);

    // 21x3 rectangular kernel for horizontal text; created once and reused.
    IplConvKernel *kernel = cvCreateStructuringElementEx(21, 3, 10, 2, CV_SHAPE_RECT, NULL);

    cvMorphologyEx(img3, img3, img_temp, kernel, CV_MOP_TOPHAT, 1);
    display(img3);
    cvThreshold(img3, img3, 128, 255, CV_THRESH_BINARY);
    display(img3);
    cvSmooth(img3, img3, CV_GAUSSIAN, 5, 5);
    display(img3);
    cvDilate(img3, img3, kernel, 2);
    display(img3);

    cvReleaseStructuringElement(&kernel);
    cvReleaseImage(&img_temp);

I have used a high-contrast image so that it resembles a typical license plate.

The next steps will be covered in my next post. Stay tuned.

The Two Contrasting Faces of India


 

deepika siddharth kiss

moral-policing

A picture speaks a thousand words. That is exactly why I have started the post with two pictures. One depicts Siddharth Malya, son of the liquor baron Vijay Malya, kissing his girlfriend Deepika Padukone in front of the entire stadium. The second depicts an act of moral policing, where couples were being arrested for just sitting in the parks.

This is the face of Indian hypocrisy. People criticize “Public Display of Affection”, but when the same is done by a business magnate no one seems to object. The point is, if people think kissing publicly is a problem, then it should be the same for all people. I fail to understand why there is discrimination.

Recently I came across a Bengali cross over film called “Gandu”.

The Bengali Cross Over Film called Gandu

This film depicts the true society. But alas! This film, which has won awards all over the globe, won’t be releasing in India. We have a problem with its name. We have a problem with the “true face” of society.

These days “GANDU” is a word that is used here more often than the word “democracy”.

If our police were as vigilant about crimes as our censor boards are, then India would probably have been a better place. Another classic example of moral policing is “Fahrenheit 9/11”. This film depicts how politicians in the US used the tragic 9/11 event to push forward their own agendas. The film, which was released in the US, was not released in India!

The way we have been brought up makes us overlook the hypocrisy we are living in. A recent example would be “Sheila Ki Jawani”. The song, with its catchy name and sexy dance sequences, has enthralled us all. But when it comes to Mumbai dance bars and night clubs, society is quick to brand them as “immoral”. We have no problem with watching “Sheila Ki Jawani” on screen, but when some girl dances to its tune in a Mumbai dance bar, we get offended.

It is about time we start erasing preconceived ideas about what is “good” and what is “bad”. Society is no one to decide whether “living together” is good or bad. It is you, and solely you, who should decide on your morality.

It is time we start giving a damn to the society and change the face of India.

Windows Azure basics


Windows Azure is the cloud operating system by Microsoft. If you have never heard of it, or don’t know what a cloud operating system means, then head over to Wikipedia. The aim of this post is to give you an inside view of the architecture of Windows Azure.

azure-services

This image pretty much sums up the top-level architecture of Windows Azure. At the bottom layer is Windows Azure itself, which takes care of the “cloud”-related technicalities. The power of Windows Azure is harnessed by the services above it. Things become easier if we imagine a Hyper-V layer between Windows Azure and the services. The services run in their own virtual machines, oblivious of the fact that they are running on Azure. Your application will typically connect to these services and do its processing through them.

azure-platform

This is just an illustration of the above architecture. Both your web applications and your local applications can access these services directly. Your local app can also connect to your web app and access these services through it. For example, suppose you have a web application that requires users to sign in. Instead of building a custom registration system, you use the “Live API”, so that anyone with a valid Hotmail/Live ID can sign in. So basically your web app will harness the power of the cloud through the “Live Services” API.

Inside Windows Azure

Framework

This is the most important part of this post. If you have read about Windows Azure you will already know that it has 3 parts.

  • Compute
  • Storage
  • Fabric

Compute

As the name suggests, “Compute” handles all the programming and computation. It is the Compute part that actually runs your programs and apps, so the main processing is done here. Compute is divided into 3 roles.

  • Web Role
  • Worker Role
  • VM Role (virtual machine)

Web Role: This is the process that actually interacts with the presentation layer. It is similar to a web application running on a server. The user or the local machine will be interacting with the Web Role at any given moment. All web apps that you deploy use the Web Role.

Worker Role: This role is similar to a background process in an operating system. It has no interaction with the presentation layer whatsoever, and the user does not interact with it. It is generally used for background work such as checking the integrity of data before committing it to the database.

VM Role: It runs an image (a VHD) of a Windows Server 2008 R2 virtual machine. This VHD is created using an on-premises Windows Server machine, then uploaded to Windows Azure.

The roles that you would need to code for your first program are the Worker Role and the Web Role.

Storage

As shown in the figure, storage is divided into 3 parts:

  • Blobs
  • Queues
  • Tables

 

Blobs: The diagram below is probably the best way to learn about blobs.

storage

Account represents the Windows Azure account that the user has. Containers can be thought of as “folders” that contain blobs. Blobs are similar to files in our local file system. Lastly, each blob is divided into blocks (similar to frames). The interesting part is that we can manually decide which block will be uploaded and in what sequence. We can actually control the storage of our data right down to the block level.

Blob storage works in this way.

  • Identify the file to be uploaded
  • Split the file into Blocks
  • Upload every block in any order you want to.
  • Commit the blocks to a blob
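The four steps above can be sketched as a tiny in-memory simulation in plain C. To be clear, this does not talk to Windows Azure at all: `BlockStore`, `upload_block`, `commit_blocks` and the 4-byte block size are hypothetical names and values of my own. It only illustrates that blocks can arrive in any order and that the commit step decides the final order.

```c
#include <string.h>

#define BLOCK_SIZE 4   /* tiny on purpose; hypothetical value */
#define MAX_BLOCKS 8

/* Hypothetical in-memory stand-in for uploaded but uncommitted blocks. */
typedef struct {
    char   data[MAX_BLOCKS][BLOCK_SIZE];
    size_t len[MAX_BLOCKS];
    int    uploaded[MAX_BLOCKS];
} BlockStore;

/* Step 2: number of blocks needed for a file of file_len bytes. */
size_t block_count(size_t file_len)
{
    return (file_len + BLOCK_SIZE - 1) / BLOCK_SIZE;
}

/* Step 3: "upload" block number id; calls may come in any order. */
int upload_block(BlockStore *s, const char *file, size_t file_len, size_t id)
{
    size_t off = id * BLOCK_SIZE, n;
    if (id >= MAX_BLOCKS || off >= file_len)
        return -1;
    n = file_len - off;
    if (n > BLOCK_SIZE)
        n = BLOCK_SIZE;
    memcpy(s->data[id], file + off, n);
    s->len[id] = n;
    s->uploaded[id] = 1;
    return 0;
}

/* Step 4: commit by concatenating blocks in the order given by list.
 * Returns the blob length, or -1 if a block is missing or blob is too small. */
long commit_blocks(const BlockStore *s, const size_t *list, size_t count,
                   char *blob, size_t cap)
{
    size_t i, total = 0;
    for (i = 0; i < count; i++) {
        size_t id = list[i];
        if (id >= MAX_BLOCKS || !s->uploaded[id] || total + s->len[id] > cap)
            return -1;
        memcpy(blob + total, s->data[id], s->len[id]);
        total += s->len[id];
    }
    return (long)total;
}
```

Uploading blocks 2, 0, 1 and then committing the list 0, 1, 2 reproduces the original file, while committing before every listed block has arrived fails.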

Tables: These provide structured storage through rows and columns. Relations are not supported, so we cannot use them as an RDBMS. We can run LINQ on them and query them like a table.

Queue: It is primarily used for message passing between the Web Role and the Worker Role. It provides reliable storage and delivery of messages for an application.

 

Fabric

So finally we come to Fabric. This is an abstraction over the datacenters and servers that Microsoft provides, and it is the part where the OS interacts with the hardware. The fabric is so named because Windows Azure maintains a network of datacenters woven together as closely as a fabric.

SQL Azure

Instead of putting just the database on the cloud, Microsoft has put the entire RDBMS on the cloud. This manages your data needs across the cloud, so we don’t have to think about relational data storage. SQL Azure manages our data across several datacenters.

App fabric

The App Fabric, again a software concept, deals with accessing the application from the cloud. It is divided into two parts.

  • Service Bus: Generally you, as a client, will have some on-premises infrastructure through which you would like to connect to the cloud. Your data security depends on how you connect. You can connect through a port, which may not be very secure, or you can use a VPN, which is quite complex. So Microsoft provides us with the Service Bus, through which we can connect to the cloud. The Service Bus takes care of all authentication, authorization and additional access for Azure apps from on-premises systems, and connects across firewalls.
  • Access Control Services: It is used to simplify the process of accessing the cloud. Your company may use one form of authentication, while your clients may use a completely different one. ACS helps with cross-boundary collaboration, such as with external organizations or resources that use different forms of identification. Through ACS you can even decide which parts are accessible to your clients and which parts are hidden from them.

This marks the end of the post. In my next post I will share some links to videos that may help you understand these concepts easily.

IPL vs Saurav Ganguly


With the IPL going on and “No Dada – No IPL” slogans at their peak, I couldn’t help but write about this.

We Bengalis are always looking for an “ICON”. After Rabindranath Tagore’s demise, no one became a mass icon the way Saurav Ganguly did. Saurav Ganguly was Bengal’s pride. With Ganguly being belittled, the whole of Bengal was angry. But the fact was that Bengal was not merely angry because a “good cricketer” was gone, but because Ganguly was the one and only icon of this “star-starved” state.

This feeling was justified 3 years ago, but now I really don’t think there is any need for such things. This only goes to prove how pathetic we Bengalis are.

Let me give you an example. Eden Gardens did not get the World Cup match between India and England. A lot has been said about how Bengal was ignored again. But I think we did not deserve a match at Eden. Here’s why.

  • Eden is probably the only place which has insulted the Indian cricket team. I still don’t see why the Indian cricket team needs to be insulted just because Saurav Ganguly did not get selected. People are so narrow-minded that they do not even think twice about insulting their own country over regional differences.
  • A few days ago, I saw this Facebook group, “No dada…..No eden”: http://www.facebook.com/NoDada.NoEden . Maybe this attitude is the reason why Eden was ignored. Maybe the BCCI thought that some narrow-minded people would visit the stadium with “No DADA…No WORLD CUP” flags and pray for India’s defeat.

Fortunately, the majority of people in Bengal are actually not “India”-haters. It is this handful of lowly people who, with their Facebook groups and media attention, have succeeded in giving a wrong impression of Kolkata.

These so-called “Dada” supporters had lost their steam when India won the World Cup. But they’re back again, and now their opponent is the IPL. They burn SRK’s effigy and call him a “villain”.

Firstly, IMHO

SRK was kind enough to buy the Kolkata team.

We should be thankful to him for that. There was no one from Bengal to even buy our team! Not even Mr. Subrata Roy, Chairman of Sahara, who is a Bengali too!

Look at the Deccan Chargers. They have the local press sponsoring them. But here in Bengal we have the local press supporting such narrow-minded campaigns. I want to say to the local paper, Anandabazar Patrika: if you guys know so much about the team, why don’t you sponsor KKR, if not buy it?

Kolkata, despite being a cosmopolitan metro, is still looked down upon because of these narrow-minded campaigns that make us look like fools. I have interviewed 3 people from my age group (18-22).

This is what they had to say. I asked 2 questions:

  • What do you think about this campaign, “No Dada ..no IPL”?
  • How do you think this is affecting outsiders’ impression of Kolkata?

Somdeb Mittra, Engineering Student, Age 22

I think it is a very disgusting campaign. It’s normal to like a player and have emotions, but there is no point in making it public through FB pages and effigy burning and hence ruining Kolkata’s image. If each state tries to disrupt the IPL because its favorite player did not get selected, then the IPL altogether has to be stopped.

I think these people are indeed ruining Kolkata’s image. For an outsider, this campaign would give the impression that Kolkata is a city of beauty without brains.

Arka Bhattacharyya, Engineering Student, Age 20

Basically this campaign is the work of people with a low mentality who have nothing good to do in life. It is because of these people that Kolkata is still considered a “backward city” compared with Mumbai, Bangalore, etc. These people have no moral ethics and do not think twice before going against their own country.

Kolkata, in spite of being a cosmopolitan metro, still does not have the modern image that new metros like Pune or Bangalore possess. A part of this can be attributed to these people, who have tried their best to keep the “backward” image of Kolkata alive.

Sukrit Mukherjee, Engineering Student, Age 21

No individual is higher than the team or the tournament. It’s true that injustice has been done, but there is no point in bringing it here. The IPL is not a committee to check injustice; it is a tournament where cricketers show their skills. None of the teams selected Saurav in the first round. There must be some reason why they thought Saurav was not fit for their team. Bengalis need to accept the fact that the IPL was not made only for Saurav, nor was Saurav made only for the IPL.

Kolkata’s impression as a sports-loving city is being marred by such stupid campaigns. It may seem that Kolkata isn’t at all passionate about cricket and that people only visited the stadium to watch Saurav, which is totally wrong. I still don’t understand the hue and cry over KKR not choosing Saurav. Punjab has not chosen Yuvraj. It’s about time these people acted like grown-ups.

My post is a testament to the fact that not all Bengalis think the way they are generally portrayed. We are actually a part of modern culture, and we are a truly sports-loving state.

Dilation and Edge detection for License Plate recognition.


One of the first things that one needs to learn in image processing is connected component analysis. Connected components are large discrete regions of similar pixel intensity. There is really no point in talking about what exactly “connected component analysis” is, since you already have Wikipedia for that purpose. 🙂

OpenCV provides various methods for connected component analysis. The foremost methods are morphological operations and contour extraction. The basic morphological transformations are called dilation and erosion, and they arise in a wide variety of contexts such as removing noise, isolating individual elements, and joining disparate elements in an image.

With respect to license plate recognition, I have found dilation particularly effective after localization of the license plate. Let me show you an example of using Canny and dilation together.

Dilation involves scanning an image with a kernel (it may be user-defined, or one of the default ones). Think about a 3*3 square. The square slides across the image. At any moment it is covering 9 pixels. The local maximum of those pixels is computed, and the pixel under the centre of the square is replaced with that maximum.
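Here is a minimal sketch of that sliding 3*3 square in plain C, written by hand rather than taken from OpenCV. The function name and the border handling (neighbours outside the image are simply ignored) are my own assumptions.

```c
/* 3x3 dilation on a row-major grayscale image.
 * src and dst must be different buffers of width * height bytes.
 * Hypothetical helper for illustration; not OpenCV's cvDilate. */
void dilate_3x3(const unsigned char *src, unsigned char *dst,
                int width, int height)
{
    int x, y, dx, dy;
    for (y = 0; y < height; y++) {
        for (x = 0; x < width; x++) {
            unsigned char m = 0;
            /* Take the maximum over the 3x3 neighbourhood. */
            for (dy = -1; dy <= 1; dy++) {
                for (dx = -1; dx <= 1; dx++) {
                    int nx = x + dx, ny = y + dy;
                    if (nx >= 0 && nx < width && ny >= 0 && ny < height &&
                        src[ny * width + nx] > m)
                        m = src[ny * width + nx];
                }
            }
            dst[y * width + x] = m;
        }
    }
}
```

A single bright pixel grows into a 3*3 bright square, so strokes that are one or two pixels apart merge into one connected component.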

With respect to license plates, let’s say we have a sample of a partly localized, preprocessed license plate. The original image is given below.

original

Most people would be tempted to apply Canny edge detection for the text detection phase. The result after cvCanny(img,img,100,500) is:

canny without dilation

As you can see, there is a lot of unnecessary noise that can hamper your text detection process. Adjusting the threshold doesn’t always work either.

So the better process is to apply dilation to the original image first. This is the result after dilation.

dilation first

Then apply Canny edge detection.

dilation then canny

This image now forms the base for connected component analysis and text detection, as well as license plate localization. The image is much cleaner and is devoid of noise. The code is pretty basic:

    cvDilate(img3, img3, 0, 2);      // dilate first: default 3x3 kernel, two iterations
    cvCanny(img3, img3, 100, 500);   // then run Canny on the dilated image
    display(img3);