OPENCV HAAR-TRAINING

AUTHORS: ABHISHEK KUMAR ANNAMRAJU, AKASH DEEP SINGH, ADHESH SHRIVASTAVA

Hi Friends,

Let's have a look at how to do Haar training with OpenCV on Ubuntu systems.

Step1:-GET THE MATERIAL

a) Create a folder named haartraining on the desktop

cd Desktop

mkdir haartraining

cd haartraining

b) Inside the folder, create four new folders

mkdir pos

mkdir neg

mkdir samples

mkdir haar

c) Download an image database of the object you want to detect, or, if you already have a set of images, crop them manually to the region of interest. Copy the positive images into the pos folder and the negative images into the neg folder. The following is a database for cars.

http://cogcomp.cs.illinois.edu/Data/Car/

d) Create a Perl file named createtrainsamples.pl, copy the following code into it, and save it in the haartraining directory

Note:- The following code was written by Naotoshi Seo, who owns the copyright. I am using the code for educational purposes only.

**************code starts here*********************************


#!/usr/bin/perl
use File::Basename;
use strict;
##########################################################################
# Create samples from an image applying distortions repeatedly
# (create many many samples from many images applying distortions)
#
#  perl createtrainsamples.pl <positives.dat> <negatives.dat> <vec_output_dir>
#      [<totalnum = 7000>] [<createsample_command_options = ./createsamples -w 20 -h 20...>]
#  ex) perl createtrainsamples.pl positives.dat negatives.dat samples
#
# Author: Naotoshi Seo
# Date  : 09/12/2008 Add <totalnum> and <createsample_command_options> options
# Date  : 06/02/2007
# Date  : 03/12/2006
#########################################################################
my $cmd = './createsamples -bgcolor 0 -bgthresh 0 -maxxangle 1.1 -maxyangle 1.1 -maxzangle 0.5 -maxidev 40 -w 20 -h 20';
my $totalnum = 7000;
my $tmpfile  = 'tmp';

if ($#ARGV < 2) {
print "Usage: perl createtrainsamples.pl\n";
print "  <positives_collection_filename>\n";
print "  <negatives_collection_filename>\n";
print "  <output_dirname>\n";
print "  [<totalnum = " . $totalnum . ">]\n";
print "  [<createsample_command_options = '" . $cmd . "'>]\n";
exit;
}
my $positive  = $ARGV[0];
my $negative  = $ARGV[1];
my $outputdir = $ARGV[2];
$totalnum     = $ARGV[3] if ($#ARGV > 2);
$cmd          = $ARGV[4] if ($#ARGV > 3);

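open(POSITIVE, "< $positive") or die "Cannot open $positive: $!";
my @positives = <POSITIVE>;
close(POSITIVE);

open(NEGATIVE, "< $negative") or die "Cannot open $negative: $!";
my @negatives = <NEGATIVE>;
close(NEGATIVE);
die "No positive images listed in $positive\n" unless @positives;

# number of generated images from one image so that total will be $totalnum
# (scalar(@positives) is the image count; $#positives is only the last index)
my $numfloor  = int($totalnum / scalar(@positives));
my $numremain = $totalnum - $numfloor * scalar(@positives);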

# Get the directory name of positives
my $first = $positives[0];
my $last  = $positives[$#positives];
while ($first ne $last) {
$first = dirname($first);
$last  = dirname($last);
if ( $first eq "" ) { last; }
}
my $imgdir = $first;
my $imgdirlen = length($first);

for (my $k = 0; $k <= $#positives; $k++ ) {
my $img = $positives[$k];
my $num = ($k < $numremain) ? $numfloor + 1 : $numfloor;

# Pick up negative images randomly
my @localnegatives = ();
for (my $i = 0; $i < $num; $i++) {
my $ind = int(rand(scalar(@negatives)));
push(@localnegatives, $negatives[$ind]);
}
open(TMP, "> $tmpfile");
print TMP @localnegatives;
close(TMP);
#system("cat $tmpfile");

chomp($img);
my $vec = $outputdir . substr($img, $imgdirlen) . ".vec" ;
print "$cmd -img $img -bg $tmpfile -vec $vec -num $num" . "\n";
system("$cmd -img $img -bg $tmpfile -vec $vec -num $num");
}
unlink($tmpfile);

***********************************code ends here**************************
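
To see how the script divides the requested total across the positive images, here is a minimal Python sketch of the same arithmetic. It assumes the division uses the full number of positive images, and `samples_per_image` is a hypothetical helper name used only for illustration; it is not part of Naotoshi Seo's script.

```python
def samples_per_image(total, num_images):
    """Split `total` distorted samples across `num_images` positives.

    The first `total % num_images` images get one extra sample so the
    per-image counts add up to exactly `total`.
    """
    if num_images <= 0:
        raise ValueError("need at least one positive image")
    base, remainder = divmod(total, num_images)
    return [base + 1 if k < remainder else base for k in range(num_images)]


# Example: 2000 samples requested from 14 positive images.
counts = samples_per_image(2000, 14)
print(counts)       # one value per image, passed to createsamples as -num
print(sum(counts))  # adds up to the requested total
```

Each value in the list corresponds to the -num option the script passes to the createsamples command for one positive image, so the script should produce one .vec file per positive image.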

e) Create a C++ file named mergevec.cpp, copy the following code into it, and save it in the haartraining directory

Note:- The following code was written by Naotoshi Seo, who owns the copyright. I have made a few changes so that it compiles. I am using the code for educational purposes only.

***********************code starts here*****************************


#include <cv.h>
#include <highgui.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h> // strcmp
#include <limits.h> // PATH_MAX
#include <math.h>

#include "cvhaartraining.h"
#include "_cvhaartraining.h" // Load CvVecFile
// Write a vec header into the vec file (located at cvsamples.cpp)
void icvWriteVecHeader( FILE* file, int count, int width, int height );
// Write a sample image into file in the vec format (located at cvsamples.cpp)
void icvWriteVecSample( FILE* file, CvArr* sample );
// Append the body of the input vec to the ouput vec
void icvAppendVec( CvVecFile &in, CvVecFile &out, int *showsamples, int winwidth, int winheight );
// Merge vec files
void icvMergeVecs( char* infoname, const char* outvecname, int showsamples, int width, int height );

// Append the body of the input vec to the ouput vec
void icvAppendVec( CvVecFile &in, CvVecFile &out, int *showsamples, int winwidth, int winheight )
{
CvMat* sample;

if( *showsamples )
{
cvNamedWindow( "Sample", CV_WINDOW_AUTOSIZE );
}
if( !feof( in.input ) )
{
in.last = 0;
in.vector = (short*) cvAlloc( sizeof( *in.vector ) * in.vecsize );
if ( *showsamples )
{
if ( in.vecsize != winheight * winwidth )
{
fprintf( stderr, "ERROR: -show: the size of images inside of vec files does not match with %d x %d, but %d\n", winheight, winwidth, in.vecsize );
exit(1);
}
sample = cvCreateMat( winheight, winwidth, CV_8UC1 );
}
else
{
sample = cvCreateMat( in.vecsize, 1, CV_8UC1 );
}
for( int i = 0; i < in.count; i++ )
{
icvGetHaarTraininDataFromVecCallback( sample, &in );
icvWriteVecSample ( out.input, sample );
if( *showsamples )
{
cvShowImage( "Sample", sample );
if( cvWaitKey( 0 ) == 27 )
{
*showsamples = 0;
}
}
}
cvReleaseMat( &sample );
cvFree( (void**) &in.vector );
}
}

void icvMergeVecs( char* infoname, const char* outvecname, int showsamples, int width, int height )
{
char onevecname[PATH_MAX];
int i = 0;
int filenum = 0;
short tmp;
FILE *info;
CvVecFile outvec;
CvVecFile invec;
int prev_vecsize;

// fopen input and output file
info = fopen( infoname, "r" );
if ( info == NULL )
{
fprintf( stderr, "ERROR: Input file %s does not exist or not readable.\n", infoname );
exit(1);
}
outvec.input = fopen( outvecname, "wb" );
if ( outvec.input == NULL )
{
fprintf( stderr, "ERROR: Output file %s is not writable.\n", outvecname );
exit(1);
}

// Header
rewind( info );
outvec.count = 0;
for ( filenum = 0; ; filenum++ )
{
if ( fscanf( info, "%s", onevecname ) == EOF )
{
break;
}
invec.input = fopen( onevecname, "rb" );
if ( invec.input == NULL )
{
fprintf( stderr, "ERROR: Input file %s does not exist or not readable.\n", onevecname );
exit(1);
}
fread( &invec.count,   sizeof( invec.count )  , 1, invec.input );
fread( &invec.vecsize, sizeof( invec.vecsize ), 1, invec.input );
fread( &tmp, sizeof( tmp ), 1, invec.input );
fread( &tmp, sizeof( tmp ), 1, invec.input );

outvec.count += invec.count;
if( filenum > 0 &&  invec.vecsize != prev_vecsize ) // compare from the second file on
{
fprintf( stderr, "ERROR: The size of images in %s(%d) is different with the previous vec file(%d).\n", onevecname, invec.vecsize, prev_vecsize );
exit(1);
}
prev_vecsize = invec.vecsize;
fclose( invec.input );
}
outvec.vecsize = invec.vecsize;
icvWriteVecHeader( outvec.input, outvec.count, outvec.vecsize, 1);

// Contents
rewind( info );
outvec.count = 0;
for ( i = 0; i < filenum ; i++ )
{
if (fscanf( info, "%s", onevecname ) == EOF) {
break;
}
invec.input = fopen( onevecname, "rb" );
fread( &invec.count,   sizeof( invec.count )  , 1, invec.input );
fread( &invec.vecsize, sizeof( invec.vecsize ), 1, invec.input );
fread( &tmp, sizeof( tmp ), 1, invec.input );
fread( &tmp, sizeof( tmp ), 1, invec.input );

icvAppendVec( invec, outvec, &showsamples, width, height );
fclose( invec.input );
}
fclose( outvec.input );
}

int main( int argc, char **argv )
{
int i;
char *infoname   = NULL;
char *outvecname = NULL;
int showsamples  = 0;
int width        = 24;
int height       = 24;

if( argc == 1 )
{
printf( "Usage: %s\n  <collection_file_of_vecs>\n"
"  <output_vec_filename>\n"
"  [-show] [-w <sample_width = %d>] [-h <sample_height = %d>]\n",
argv[0], width, height );
return 0;
}
for( i = 1; i < argc; ++i )
{
if( !strcmp( argv[i], "-show" ) )
{
showsamples = 1;
// width = atoi( argv[++i] ); // obsolete -show width height
// height = atoi( argv[++i] );
}
else if( !strcmp( argv[i], "-w" ) )
{
width = atoi( argv[++i] );
}
else if( !strcmp( argv[i], "-h" ) )
{
height = atoi( argv[++i] );
}
else if( argv[i][0] == '-' )
{
fprintf( stderr, "ERROR: The option %s does not exist.\n", argv[i] );
exit(1);
}
else if( infoname == NULL )
{
infoname = argv[i];
}
else if( outvecname == NULL )
{
outvecname = argv[i];
}
}
if( infoname == NULL )
{
fprintf( stderr, "ERROR: No input file\n" );
exit(1);
}
if( outvecname == NULL )
{
fprintf( stderr, "ERROR: No output file\n" );
exit(1);
}
icvMergeVecs( infoname, outvecname, showsamples, width, height );
return 0;
}

**************************code ends here********************

f) Compile mergevec.cpp to get an executable file

Copy the following files from your OpenCV source folder (in the OpenCV 2.x source tree they are under apps/haartraining) and save them in the haartraining folder:

cvboost.cpp
cvclassifier.h
cvcommon.cpp
_cvcommon.h
cvhaarclassifier.cpp
cvhaartraining.cpp
cvhaartraining.h
_cvhaartraining.h
cvsamples.cpp

Open a terminal in the haartraining folder and type

1. chmod +x mergevec.cpp (optional: g++ does not need the source file to be executable)

2. g++ `pkg-config --cflags opencv` -o mergevec mergevec.cpp cvboost.cpp cvcommon.cpp cvsamples.cpp cvhaarclassifier.cpp cvhaartraining.cpp `pkg-config --libs opencv`

Step2:-Starting haartraining

Open a terminal, go to the haartraining folder, and type

1. find ./neg -iname "*.(EXTENSION OF YOUR IMAGES)" > negatives.txt 

2. find ./pos -iname "*.(EXTENSION OF YOUR IMAGES)" > positives.txt 

3. perl createtrainsamples.pl positives.txt negatives.txt samples 2000 "opencv_createsamples -bgcolor 0 -bgthresh 0 -maxxangle 1.1 -maxyangle 1.1 -maxzangle 0.5 -maxidev 40 -w 20 -h 20"

Note:- The width (-w) and height (-h) should match the images in the pos folder (I assume you keep every image at the same dimensions). Use the same -w and -h values again in the training command in step 6.
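
Since positive images of differing sizes cause trouble later, a quick check of the list file can help. The following is a minimal sketch that assumes the images are PGM files (as in the car database linked above); `pgm_size` and `size_histogram` are hypothetical helper names, and other image formats would need an image library instead of this header parser.

```python
from collections import Counter


def pgm_size(path):
    """Return (width, height) read from a PGM (P2/P5) header."""
    with open(path, "rb") as f:
        data = f.read(512)  # the header sits in the first few bytes
    tokens = []
    for line in data.split(b"\n"):
        line = line.split(b"#", 1)[0]  # drop header comments
        tokens.extend(line.split())
        if len(tokens) >= 3:  # magic number, width, height
            break
    if len(tokens) < 3 or tokens[0] not in (b"P2", b"P5"):
        raise ValueError(f"{path}: not a PGM file")
    return int(tokens[1]), int(tokens[2])


def size_histogram(list_file):
    """Count how many images of each (width, height) a list file names."""
    with open(list_file) as f:
        paths = [line.strip() for line in f if line.strip()]
    return Counter(pgm_size(p) for p in paths)
```

Calling `size_histogram("positives.txt")` returns a counter keyed by (width, height); if it holds more than one key, the positive images are not all the same size.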

4. find ./samples -name '*.vec' > samples.txt

5. ./mergevec samples.txt samples.vec
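
To confirm how many samples ended up in a vec file, you can read its header. This is a minimal sketch based on the header layout that mergevec.cpp reads (an int count, an int vector size, then two unused shorts); it assumes a little-endian machine with 4-byte ints, and `read_vec_header` is a hypothetical helper name.

```python
import struct

# count, vecsize, two unused shorts (12 bytes in total)
VEC_HEADER = struct.Struct("<iihh")


def read_vec_header(path):
    """Return (sample_count, vector_size) from the start of a .vec file."""
    with open(path, "rb") as f:
        raw = f.read(VEC_HEADER.size)
    if len(raw) != VEC_HEADER.size:
        raise ValueError(f"{path}: file too short to hold a vec header")
    count, vecsize, _, _ = VEC_HEADER.unpack(raw)
    return count, vecsize
```

For example, `read_vec_header("samples.vec")` should report a vector size of 400 when the samples were created with -w 20 -h 20, and a count equal to the sum of the counts of the individual vec files.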

6. opencv_haartraining -data haar -vec samples.vec -bg negatives.txt -nstages 20 -nsplits 2 -minhitrate 0.999 -maxfalsealarm 0.5 -npos 7000 -nneg 3019 -w 20 -h 20 -nonsym -mem 512 -mode ALL

Note:

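a) -npos and -nneg must reflect the number of positive samples and negative images you actually have (not the example values 7000 and 3019).
b) Choosing minhitrate and maxfalsealarm

For example, say you have 1000 positive samples and you want your system to detect 900 of them. The desired hit rate is then 900/1000 = 0.9. -minhitrate is a per-stage value, so the overall hit rate is roughly minhitrate^(number of stages); a common per-stage choice is 0.999, which over 20 stages gives about 0.999^20 ≈ 0.98.

Similarly, say you have 1000 negative samples. Because they are negative, you don't want your system to detect them, but since it is not perfect it will detect some of them. If it wrongly detects about 490 samples, the false alarm rate is 490/1000 = 0.49. -maxfalsealarm is also a per-stage value, so the overall false alarm rate is roughly maxfalsealarm^(number of stages); a common per-stage choice is 0.5, which over 20 stages gives about 0.5^20 ≈ 0.000001.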
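
The relation between per-stage and whole-cascade rates can be worked out with a few lines of Python. This sketch assumes -minhitrate and -maxfalsealarm are per-stage targets and that stages behave independently, which is only an approximation; `overall_rates` is a hypothetical helper name.

```python
def overall_rates(min_hit_rate, max_false_alarm, num_stages):
    """Estimate whole-cascade rates from the per-stage targets."""
    hit_rate = min_hit_rate ** num_stages
    false_alarm = max_false_alarm ** num_stages
    return hit_rate, false_alarm


# Values from the training command above: 20 stages, 0.999 and 0.5 per stage.
hit, fa = overall_rates(0.999, 0.5, 20)
print(f"overall hit rate    ~ {hit:.4f}")
print(f"overall false alarm ~ {fa:.2e}")
```

Trying a smaller number of stages in the same function shows why short test runs give a much higher overall false alarm rate than a full 20-stage training.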

Note:

1)The number of negative images must be greater than twice the number of positive images.

2)Try to set npos = 0.9 * number_of_positive_samples and 0.99 as a minHitRate.

3) The vec file has to contain >= (npos + (nstages - 1) * (1 - minhitrate) * npos) + S samples, where S is the count of samples from the vec file that can be recognized as background right away.
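
The rule of thumb for the vec file size can be turned into a small calculator. This is only a sketch: S is not known before training, so it defaults to 0 here (an optimistic assumption), and `min_vec_samples` and `max_npos` are hypothetical helper names.

```python
import math


def min_vec_samples(npos, num_stages, min_hit_rate, background_like=0):
    """Lower bound on the samples the vec file should hold.

    background_like is S: positives rejected straight away as background.
    """
    extra = (num_stages - 1) * (1.0 - min_hit_rate) * npos
    # round() guards against float noise such as 19.000000000000018
    return npos + math.ceil(round(extra, 9)) + background_like


def max_npos(vec_count, num_stages, min_hit_rate, background_like=0):
    """Largest npos that still satisfies the bound for a given vec file."""
    usable = vec_count - background_like
    per_pos = 1.0 + (num_stages - 1) * (1.0 - min_hit_rate)
    return math.floor(usable / per_pos)


# Example: a vec file with 2000 samples, 20 stages, minhitrate 0.999.
print(max_npos(2000, 20, 0.999))
print(min_vec_samples(1800, 20, 0.999))
```

In practice it is safer to pick an npos somewhat below the value `max_npos` reports, since S is usually greater than zero.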

Step3:-

This is the most important step.

KEEP INFINITE PATIENCE TILL THE PROCESS ENDS. FOR A GOOD TRAINING IT SOMETIMES TAKES ABOUT 10 DAYS. YOU CAN TEST THE ABOVE PROCESS FIRST BY TAKING A FEW IMAGES AND REDUCING THE NUMBER OF STAGES.

Note: To speed up the process, refer to

http://www.computer-vision-software.com/blog/2009/11/faq-opencv-haartraining/

Thus an XML file will be created in the haar folder, which you can use further.

For example, the face detection code in the OpenCV examples can be used with just the XML file name changed.

A better way of using the XML file is to run it through the code I made for object detection using multiple XML files.
Refer: https://abhishek4273.wordpress.com/2014/03/16/object-detection-using-multiple-traincascaded-xml-files/

Code and other XML files (do read README.md): https://github.com/abhi-kumar/CAR-DETECTION.git

This completes the process.

Thank you 🙂

####################CHEERS#########################

See also :- http://blindperception.wordpress.com/


27 thoughts on “OPENCV HAAR-TRAINING”

  1. Hi Abhishek..
    Thank you for the the tutorial of training Haar cascade.I using opencv 2.4.6 on Ubuntu.while executing this command
    ./mergevec samples.txt samples.vec
    i got the following error::
    ./mergevec: error while loading share libraries:libopencv_calib3d.so.2.4: cannot open shared object file: No such file or directory
    Can you please suggest me something to solve out the error.
    I an new to opencv and ubuntu.

    Thankyou

    1. Sorry for the late reply,
      1) First make sure that the mergevec executable got generated in the same folder. I am sure it will, but just check it once.
      2) Now open a terminal and type:
      sudo gedit /etc/ld.so.conf.d/opencv.conf
      and in that file type:
      /usr/local/lib

      then run sudo ldconfig (or reboot the system).
      I hope with this the errors won't reappear.
      Let me know if you still get errors.

      Hope this helps you. 🙂

  2. Hi Abhishek..
    Thanks for this tutorial. I am trying for the haartraining and whenever I am trying to execute the command,
    ./mergevec samples.txt samples.vec
    I am getting error as:
    OpenCV Error: Assertion failed (elements_read == 1) in icvGetHaarTraininDataFromVecCallback, file cvhaartraining.cpp, line 1859
    terminate called after throwing an instance of ‘cv::Exception’
    what(): cvhaartraining.cpp:1859: error: (-215) elements_read == 1 in function icvGetHaarTraininDataFromVecCallback

    Aborted (core dumped).
    The mergevec.exe is generated in the same folder.
    Can you please guide me how to solve this error?
    I am using opencv-2.4.7 and working on ubuntu.
    Thanks!

    1. Hi Sneha,

      The error which you are getting is mostly observed when you run the final command to start the training, i.e. when the haartraining.cpp is unable to fetch the required number of samples to train the classifier. This is the first time I have heard that the error is coming while running the mergevec command.

      Just make sure the positives.txt, negatives.txt and samples.txt are not empty documents.

      If none of them are empty, just mail the following details to abhishek4273@gmail.com:
      1. Number of positive images
      2. Number of negative images
      3. Any cropping you have done on the image data
      4. The commands you used to create positives.txt, negatives.txt and the samples

      Thank you for contacting 🙂

      1. Hi Abhishek,

        Thanks for your reply. I got the error. My positive images were not of the same size. Now the training is running properly.

  3. After executing perl script i mentioned 200 samples but it has created only 12(as that of my sample images) positive11.jpg.vec files, and created samples.txt file successfully,

    then I executed this,
    $./mergevec samples.txt samples.vec

    And got the following error,

    OpenCV Error: Assertion failed (elements_read == 1) in icvGetHaarTraininDataFromVecCallback, file cvhaartraining.cpp, line 1859
    terminate called after throwing an instance of ‘cv::Exception’
    what(): cvhaartraining.cpp:1859: error: (-215) elements_read == 1 in function icvGetHaarTraininDataFromVecCallback

    Aborted (core dumped)

    Please help me author, i followed everything you specified.

    –Thanks in advance–

      1. Thanks a lot man, that fixed my error.

        I’ve started trainig the haar classifier for the car database you provided.
        I thought there was an error with number of positive and negative samples I chose.

        I have one question, Can we able to do the training for 100 positive images and 200 negative images. or do I have to go for atleast 1000 positive and 2000 negative images for trainig always.

        Can I use 10 stages to train?

        It started 3rd august evening it’s still running and in first stage,
        How long it may take to end ( Used 1000 positive and 2000 negative samples and -nstages 20 and -mem 1024 as specifed in your blog) and it is creating folders with name 0 and 1 …. how to get xml file out of them please explain,

        This is the recent output of my training.

        Tree Classifier
        Stage
        +---+---+
        |  0|  1|
        +---+---+

           0---1

        Parent node: 1

        *** 1 cluster ***
        POS: 999 999 1.000000
        NEG: 2000 0.170372
        BACKGROUND PROCESSING TIME: 16.00
        Precalculation time: 0.00
        +----+----+-+---------+---------+---------+---------+
        |  N |%SMP|F|  ST.THR |    HR   |    FA   | EXP. ERR|
        +----+----+-+---------+---------+---------+---------+
        |   1|100%|-|-0.381544| 1.000000| 1.000000| 0.166722|
        +----+----+-+---------+---------+---------+---------+
        |   2|100%|-|-1.051057| 1.000000| 1.000000| 0.198066|
        +----+----+-+---------+---------+---------+---------+
        |   3|100%|-|-1.445988| 1.000000| 1.000000| 0.112371|
        +----+----+-+---------+---------+---------+---------+
        |   4| 91%|-|-1.848670| 1.000000| 1.000000| 0.052351|
        +----+----+-+---------+---------+---------+---------+
        |   5| 95%|-|-1.118966| 1.000000| 0.551000| 0.057352|
        +----+----+-+---------+---------+---------+---------+
        |   6| 77%|-|-1.271395| 1.000000| 0.503000| 0.027342|
        +----+----+-+---------+---------+---------+---------+
        |   7| 79%|-|-1.761679| 1.000000| 0.615000| 0.016005|
        +----+----+-+---------+---------+---------+---------+

      2. Hi Punith,
        I am happy that the solution worked for you. As to your question, there is no rule that you have to train using a larger set of images only; it is only recommended for good training. You can do a test with 100 images too. Any training is a game of parameters: you must research them according to your image base and then set them. Don't go directly for a 20-stage training; go for 5 stages for now. You can run 2-3 trainings on your system simultaneously, just make sure the system doesn't heat up too much. Go on experimenting.

  4. I think i stuck in parent node4:

    Parent node: 3

    *** 1 cluster ***
    POS: 999 999 1.000000
    NEG: 2000 0.00611975
    BACKGROUND PROCESSING TIME: 38.00
    Precalculation time: 0.00
    +----+----+-+---------+---------+---------+---------+
    |  N |%SMP|F|  ST.THR |    HR   |    FA   | EXP. ERR|
    +----+----+-+---------+---------+---------+---------+
    |   1|100%|-| 1.000000| 1.000000| 0.000000| 0.000000|
    +----+----+-+---------+---------+---------+---------+
    Stage training time: 3060.00
    Number of used features: 2

    Parent node: 3
    Chosen number of splits: 0

    Total number of splits: 0

    Tree Classifier
    Stage
    +---+---+---+---+---+
    |  0|  1|  2|  3|  4|
    +---+---+---+---+---+

       0---1---2---3---4

    Parent node: 4

    *** 1 cluster ***
    POS: 999 999 1.000000
    I can see above line since 1 hour. Can I expect the training to continue properly.

    I have another question, how to create .xml file from folders in data folder (i.e I have folders 1,2,3,4 in my data folder).

    I found this code to create xml,

    $ convert_cascade --size="<width>x<height>" <input_cascade_path> <output_cascade_filename>.xml

    1. Hi Punith,
      Let the process run on its own; it will definitely take time. Keep patience. As to your second question, the command to generate the cascade file is correct. Its format on Linux is: convert_cascade --size="<width>x<height>" <input_cascade_path> <output_cascade_filename>

  5. hi abhishek,
    i’ve got a problem i’ll be happy if you can help
    here’s the issue:
    when i run the code : perl createtestsamples.pl positives.txt negatives.txt samples 200 “opencv_createsamples -bgcolor 0 -bgthresh 0 -maxxangle 1.1 -maxyangle 1.1 -maxzangle 0.5 maxidev 40 -w 25 -h 56”
    i get sample images in the samples folder but no vec file what i am doing wrong..

  6. i think i solved the problem by giving “-vec samples.vec” parameter to the opencv_createsamples. this gives you a .vec file without using mergevec. but now when i run opencv_haartraining it seems like stucks at parent node 4. i dont know if its working. im gonna wait i think. i’ll let you know the result for feedback. thank you..

  7. Hello, Abhishek
    My name is Matheus, live in Brazil, Sorry for the mistakes of gramatic…

    I followed the tutorial you posted, about Haar Training with OpenCV;
    I also used the database of cars and everything went well,
    but when he arrives on stage 11, no continues, still stands.

    You know what possible reason?

    Thank you​

    1. how many positives and negatives are you using?
      what are the resolutions of positives and negatives?
      and what is minhitrate and maxfalse alarm?

      1. Hi Ozgur,

        It depends on what the object of training is. In my post I think I used 500 positives, 550 negatives of resolution 100×40, but you must use more number of images. minhitrate be around 0.7-0.9 and maxfalsealarm be around 0.1-0.3. Again I must emphasize that all the parameters are subject to trial and error basis with respect to the object of interest and number of training images.

        Regards,
        Abhishek

    2. Hi Matheus,

      Sorry for the extreme delay in my reply.

      This is one of the biggest problems with Haar training: at higher stages the probability of finding false positives among the negatives becomes extremely small (about 1 in 100 million), which is why the process seems to be stuck. Stop that process and adjust a few parameters, for example by increasing the number of samples or increasing maxfalsealarm.

      Regards,
      Abhishek

  8. Hello abhishek.
    I have the same error as Sneha reported:

    OpenCV Error: Assertion failed (elements_read == 1) in icvGetHaarTraininDataFromVecCallback, file cvhaartraining.cpp, line 1859
    terminate called after throwing an instance of ‘cv::Exception’
    what(): cvhaartraining.cpp:1859: error: (-215) elements_read == 1 in function icvGetHaarTraininDataFromVecCallback

    Aborted (core dumped)

    But my input files have all the same size (25×150). Moreover, none of the files (positives.txt, negatives.txt and samples.txt) are empty. I have 13 sample.vec files. I am using 14 positive images and 600 negative images.
    OpenCV 2.4.5 on Ubuntu 14.04 LTS.

  9. ./cardetect --cascade=data/cascade.xml test-1.pgm
    ./cardetect: line 1: syntax error near unexpected token `newline’
    ./cardetect: line 1: `’
