AUTHORS: Abhishek Kumar Annamraju, Akashdeep Singh, Adhesh Shrivastava

Hello Friends

My last post explained how segmentation can be used to detect roads.
This post will explain the following:
1. Optimum use of traincascade
2. Creating XML files for object detection
3. Using multiple XML files to detect an object (here, cars)
4. Using multiple XML files without detecting a single object twice

Haartraining is said to give better results than traincascade, but it is extremely slow: training a classifier can take one to two weeks. So we switched to traincascade.

The rest of the procedure focuses on car detection.

Download the image database from here :

Inside a folder:
1. Copy all the positive images into a folder named pos.
2. Copy all the negative images into a folder named neg.
3. Create a folder named data to store the cascade file generated later on.

Open a terminal, navigate to the required folder and type:

1. find pos -iname "*.pgm" -exec echo \{\} 1 0 0 100 40 \; >

The result will be a file like this:

2. find neg -iname "*.pgm" > bg.txt
The result will be a file like this:

3. opencv_createsamples -info -num 550 -w 48 -h 24 -vec cars.vec
(The width and height parameters depend on the database; -num is the number of images in the pos folder.)

4. opencv_traincascade -data data -vec cars.vec -bg bg.txt -numStages 10 -minHitRate 0.999 -maxFalseAlarmRate 0.5 -numPos 500 -numNeg 500 -w 48 -h 24
a) -numPos and -numNeg are the numbers of positive and negative samples used to train each stage; base them on the number of images you have in the pos and neg folders.
b) numPos must be less than the number of samples in the vec file.
c) Choosing minHitRate and maxFalseAlarmRate:
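To understand how the training parameters affect the output, refer to:

For example, suppose you have 1000 positive samples and you want your system to detect 900 of them. The desired hit rate is then 900/1000 = 0.9. That is the hit rate of the whole cascade; -minHitRate is the target for each stage, so the overall hit rate is roughly minHitRate^(number of stages). A common per-stage choice is 0.999.

For example, suppose you have 1000 negative samples. Because they are negative, you don't want your system to detect them, but since it is not perfect it will detect some of them. If it wrongly detects about 490 samples, the false alarm rate is 490/1000 = 0.49. Again, -maxFalseAlarmRate is the target for each stage and is commonly set to 0.5, so the overall false alarm rate is roughly maxFalseAlarmRate^(number of stages).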
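To make the arithmetic concrete, here is a minimal Python sketch. The function name overall_rates is made up for this illustration, and it assumes every stage meets its target exactly and that the stages behave independently, so treat the numbers as rough estimates only.

```python
def overall_rates(min_hit_rate, max_false_alarm, num_stages):
    """Estimate whole-cascade rates from the per-stage targets.

    Assumption: each stage hits its target exactly and the stages are
    independent, so the per-stage rates simply multiply.
    """
    overall_hit = min_hit_rate ** num_stages
    overall_false_alarm = max_false_alarm ** num_stages
    return overall_hit, overall_false_alarm


if __name__ == "__main__":
    # Values taken from the training command used in this post.
    hit, false_alarm = overall_rates(0.999, 0.5, 10)
    print(f"overall hit rate    ~ {hit:.4f}")
    print(f"overall false alarm ~ {false_alarm:.6f}")
```

With the values from the training command in this post (0.999, 0.5 and 10 stages), the estimate is an overall hit rate of about 0.990 and an overall false alarm rate of about 0.001.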


1) The number of negative images must be greater than the number of positive images.

2) Try to set numPos = 0.9 * number_of_positive_samples and use 0.99 as minHitRate.

3) The vec-file has to contain >= (numPos + (numStages-1) * (1 - minHitRate) * numPos) + S samples, where S is the count of samples from the vec-file that can be recognized as background right away.
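The rule in point 3 is easy to check before starting a long training run. The sketch below is only an illustration: min_vec_samples is a hypothetical helper name, and because S cannot be known before training it defaults to 0, so the result is a lower bound rather than an exact requirement.

```python
import math


def min_vec_samples(num_pos, num_stages, min_hit_rate, background_like=0):
    """Lower bound on the number of samples the vec-file must contain.

    Implements numPos + (numStages - 1) * (1 - minHitRate) * numPos + S.
    background_like is S, the count of vec samples rejected as background
    right away; it is unknown before training, hence the default of 0.
    """
    extra = (num_stages - 1) * (1.0 - min_hit_rate) * num_pos
    # Round away floating-point noise before taking the ceiling.
    return num_pos + math.ceil(round(extra, 6)) + background_like


if __name__ == "__main__":
    needed = min_vec_samples(num_pos=500, num_stages=10, min_hit_rate=0.999)
    print("vec-file needs at least", needed, "samples")
```

For the command used in this post (numPos 500, 10 stages, minHitRate 0.999) the bound is 505 samples, which fits inside the 550 samples written to cars.vec.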

This will generate an XML file in the data folder. The face detection code in the OpenCV examples can be used with it; just change the XML file name.

A better way of using the XML file is to run it through the code I wrote for object detection using multiple XML files.

Code and other XML files (do read README.md):

The cascade files used can be downloaded from here:
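When several XML files are run on the same image, the same car can be reported more than once. The sketch below shows one common way to drop such duplicates, by comparing the overlap (intersection over union) of the rectangles. It is an assumption made for illustration and not necessarily what the linked code does; the names iou and merge_detections and the 0.3 threshold are hypothetical. Rectangles are (x, y, w, h) tuples, the layout OpenCV's detectMultiScale returns.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0


def merge_detections(per_cascade, threshold=0.3):
    """Keep a rectangle only if it does not overlap one already kept.

    per_cascade holds one list of rectangles per cascade file. Earlier
    cascades win: a later rectangle is dropped when its overlap with any
    kept rectangle reaches the threshold.
    """
    kept = []
    for rects in per_cascade:
        for rect in rects:
            if all(iou(rect, k) < threshold for k in kept):
                kept.append(tuple(rect))
    return kept


if __name__ == "__main__":
    first = [(10, 10, 100, 40)]
    second = [(12, 11, 100, 40), (300, 50, 100, 40)]
    print(merge_detections([first, second]))
```

In the small example the second cascade reports the first car again, shifted by a couple of pixels, so only the new rectangle at (300, 50) is added to the result.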

These are some of the results:
Detection time:38.2002 ms




Detection time:38.4898 ms




You are always welcome to download the code and modify it for better use. Any contribution to the code is highly appreciated.

Thank you 🙂



  1. Hello, I have tried to make an XML file for car detection using traincascade. I have 1098 positive images (front, back and side views of different sizes) and 1198 negative images, and I trained 20 stages. I tested it on positive images, and a car was detected in 4 out of 5 images. The problem is that the rectangle does not fully cover the detected car. Can you help me fix it?

    1. Hi Shiloh,
      If your problem is that the rectangles being drawn do not cover the cars properly, then most probably the training was not done well and the classifier is detecting garbage. Just mail me at least three to four result images on which you have tested your classifier.

      Thank you for contacting,

  2. Hi. I'm trying to use your recognizer in an iOS application, but I'm not getting good results. I have doubts about which XML files to load: I'm loading all of them with this code, but maybe this is not the right way?


    NSString *checkcascade_path = [NSString stringWithFormat:@"%@/CAR-DETECTION/%@",
    [NSBundle mainBundle].bundlePath, @"checkcas.xml"];

    NSLog(@"%@", checkcascade_path);

    _detectcars.checkcascade_load([checkcascade_path UTF8String]);

    // load all four XML files, or only one? and which one?
    for (int i=0; i<4; i++) {

    NSString *cascade_path = [NSString stringWithFormat:@"%@/CAR-DETECTION/cas%d.xml",
    [NSBundle mainBundle].bundlePath, i+1];

    NSLog(@"%@", cascade_path);

    _detectcars.cascade_load([cascade_path UTF8String]);
    }


    1. Hi atrebbi,
      If you load every XML classifier in the same way to detect a single object, the result of one classifier will contradict the other. It would be better if you mailed me the full details.


      1. I'm having trouble compiling the C++ program. Can you create a C++ Visual Studio 10 project?

        : Command line error D8016: '/ZI' and '/clr' command-line options are incompatible

  3. Hi.
    I followed the steps, but when I call traincascade.exe I get this error:

    ===== TRAINING 0-stage =====
    POS count : consumed 500 : 500
    Train dataset for temp stage can not be filled. Branch training terminated.
    Cascade classifier can't be trained. Check the used training parameters.

    I tried the old haartraining.exe and it worked, but traincascade.exe in OpenCV 2.4.6 doesn't work.

    Please help me if you know the reason for this error.


    1. Hello Keivan,

      Please specify the number of training images you used (both positive and negative).
      Also briefly describe the process you used to create the samples.
      Then tell me the command line you used when training with traincascade.


  4. Hi,
    I have followed the same steps as you, and it works. It detects the cars most of the time, but there is one problem. It also detects other stuff as cars. Even the sky or some buildings, it creates a rectangle around them. I did a stage 16 training (which is as far as the trainer goes). Even when I show it a picture of a forest with no cars in it, it will draw rectangles all over the picture. Hope to hear from you soon.

    1. Hi Patrick,
      That is the problem with traincascade using LBP features: the training is fast compared to haarcascade, but there is a major problem with false positives. To avoid most of them, I came up with the code that uses multiple XML files, which reduced the rate of false positive detections by almost 80% on a rough estimate. I think you should try Haar training. In the opencv_traincascade command, add this parameter at the end: "-featureType HAAR", or go for the basic haarcascade training.

      If you get stuck at some point in that, just ping me up at

      1. Thank you for your reply Mr. Kumar. I will try using Haar training.
        And yes, I did try your code and was surprised at its accuracy. The problem was that in certain pictures it would not recognize some cars because it was in a different angle while the one used in this tutorial did recognize it.
        By the way, this was a great tutorial, and thank you for taking your time to write it. I will ping you if I get stuck at some point.

  5. Hello Kumar,

    I created a vec file from your example pictures, but at the end I have only one XML file. When I looked at your git repo I saw the car1, car2 and car3 XMLs and the checkcas XML. How did you create car1, car2 and so on? Did you create the car1 XML from a small set of positive car pictures?

    1. Hi Suleyman,

      Every training run creates only one XML file. The other XMLs you found were computed on the same dataset but were the result of poorer training (fewer stages, low-resolution samples, etc.).

      Thanks for contacting 🙂

  6. I downloaded the database from the link you mentioned, but it has 550 positive and 500 negative images. Each image is 100*40, but you used 48*24 while creating the .vec file, and you are using fewer negative images than you specified. I am confused by the output.

    Everything went well until the 2nd stage, after which I got the following error:
    ===== TRAINING 2-stage =====
    POS current samples:OpenCV Error: Bad argument (Can not get new positive sample. The most possible reason is insufficient count of samples in given vec-file.
    ) in get, file /home/project/OpenCV/opencv-2.4.9/apps/traincascade/imagestorage.cpp, line 162
    terminate called after throwing an instance of 'cv::Exception'
    what(): /home/project/OpenCV/opencv-2.4.9/apps/traincascade/imagestorage.cpp:162: error: (-5) Can not get new positive sample. The most possible reason is insufficient count of samples in given vec-file.
    in function get

    Aborted (core dumped)548

    Please do help me.

    1. Hello Mr.Punith,
      All the parameters used in the tutorial are not specific to any training data. The number of negatives should be greater than the number of positives, ideally by about double. The reason I used 48*24, a dimension smaller than the actual 100*40 image size, is that training on a downsampled image set is faster and more reliable. Your training stopped because it could not find enough samples. What you should do is increase the negatives to around 1000, assuming the positives stay at the same number (make sure all the new images are resized to the same size as the previous ones). Then create around 4000-6000 samples. The main error can be eliminated like this: when you give the final training command, the number of positives you choose must be at least 1000-1500 less than the actual number of samples. If any other problem comes up, just ping me.

      1. Hi Abhishek
        I tried it on a live preview but it's not working at all. As soon as I get a frame, even if no car is present, the count of detected cars keeps increasing. Because of this it doesn't draw the box over the detection area at all.

    1. Hi,

      I have mentioned in the blog post that the cascades I trained will not work for real-world applications. They were trained with minimal stages and with LBP features, which are not suitable for the real world. I think you should collect a dataset according to your requirements and then train on it using Haar features. This training is again a very tiring and frustrating job. I will tell you where the problem is:
      According to my research, these processes take time because the program finds it difficult to find negatives in the negative image dataset. It's because of the algorithm and you can do nothing to change it. Have a look at this :

      I was able to get as far as OpenMP, but I have not been able to apply Open MPI to haarcascades, as my background is not in deep computer science; I am an electrical engineering student. It will take time for me to implement parallel computing.

      For training, make sure all the images are grayscale, histogram equalized and of the same size. You will need around 10K positives and around 15K negatives for a perfect training. This is the maximum dataset you should need. Again, I am emphasizing that these trainings are not suitable if you have limited time. You need to be patient.

      You may also go for "good features to track" for tracking cars and similar objects. I have worked on these features with traincascade to improve efficiency for a project at IIM Ahmedabad, but I cannot share the project material with you.

      Hope this helps you 🙂

  7. Good evening sir,
    I am trying to create a classifier using your database images.
    But these images are in .pgm format, which my system does not understand.
    I have created a positive image folder and a negative image folder, with positive.txt and negative.txt files.
    When I tried to create the .vec file, it gave this error:
    Unable to open image: mypath/pos-532.pgm
    Please help me create the classifier file.

      1. Hi Sir.
        I am mailing my positives.txt and negatives.txt files along with the commands I used to make the .vec file and the haarcascade.xml file.
        Two days ago I created an XML file, but it is too small: only two stages were created although I asked for 20.

  8. Hi Abhishek,

    When I ran opencv_traincascade, only 2 stages executed correctly. When the 3rd stage started, Windows just threw an exception and stopped. I have looked for a solution in many places for the past week and can't understand why it is happening or how to avoid it. Please help.

      1. Hi,

        Sorry for the inconvenience, I had made a very fundamental mistake that I had not noticed before. The names of a few of the neg images were different because I had created the list manually, so when the program tried to find the images listed in neg.txt it couldn't, and threw an exception.
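A mistake like this can be caught before training by checking that every path in the negatives list really exists. The following is a minimal sketch: missing_images is a hypothetical helper, and resolving relative paths against the folder that holds the list file is an assumption (depending on the OpenCV version they may instead be resolved against the current working directory, which is why base_dir can be passed explicitly).

```python
import os


def missing_images(list_file, base_dir=None):
    """Return the entries of list_file that do not exist on disk.

    list_file is a negatives list such as bg.txt, with one image path per
    line. base_dir is the folder that relative paths are resolved against;
    by default the folder containing list_file (an assumption).
    """
    if base_dir is None:
        base_dir = os.path.dirname(os.path.abspath(list_file))
    missing = []
    with open(list_file) as handle:
        for line in handle:
            path = line.strip()
            if not path or path.startswith("#"):
                continue  # skip blank lines and comment lines
            if not os.path.isfile(os.path.join(base_dir, path)):
                missing.append(path)
    return missing
```

Calling missing_images("bg.txt") from the training folder and getting back an empty list means every listed negative image was found.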

  9. Good morning sir,
    I have created the .vec file successfully.
    The command is:
    -info "E:\ObjectDetection1\positives.txt" -vec "E:\ObjectDetection1\positives.vec" -num 550 -w 24 -h 24 PAUSE

    Then I ran the following command for training:
    -data "E:\ObjectDetection1\haarcascade" -vec "E:\ObjectDetection1\positives.vec" -bg "E:\ObjectDetection1\negatives.txt" -nstages 10 -featureType HOG -precalcValBufSize 1024 -precalcIdxBufSize 1024 -npos 50 -nneg 500 -w 24 -h 24 -mem 512 -mode ALL

    An XML file gets created, but training does not complete all stages.
    It exits after the following stage is completed:
    parent node: 1
    within 2 to 3 minutes.

    What's going wrong? Please help me.

  10. Hi Abhishek!
    Thank you for your post.
    Can you help me with a problem related to using traincascade? I have a positive sample vec of 2010 images and a negative data file with 2350 images. The original positive image size is 49×49.

    I'm trying to train the cascade with the command:

    opencv_traincascade -data data -vec positive.vec -bg negative.txt -numPos 1900 -numNeg 2350 -numStages 20 -w 49 -h 49 -minhitrate 0.999 -maxFalseAlarmRate 0.5 -precalcValBufSize 22528 -precalcIdxBufSize 22528 -mode ALL

    As you point out, numPos has to be less than numNeg. However, when I try to run this training, it always throws: Segmentation fault (core dumped). I'm working on a CentOS 7 machine.

    The only way it works is creating sample vec with -w 32 -h 32 size, and also numPos had to be greater than numNeg:

    opencv_traincascade -data data -vec positive.vec -bg negative.txt -numPos 1578 -numNeg 1000 -numStages 20 -w 32 -h 32 -minhitrate 0.999 -maxFalseAlarmRate 0.5 -precalcValBufSize 25152 -precalcIdxBufSize 25152 -mode ALL

    If I increase numNeg or numPos it doesn't work. It also fails if I try to use the same number of positive and negative images (-numPos 1000 -numNeg 1000) or if I only increase numNeg (-numPos 1500 -numNeg 2000). I also tried using fewer stages but it didn't work.

    Can you help me to understand why different parameters don’t work?
    Thanks in advance.
    Have a nice day!

    1. I’m sorry to bother you with this issue. I really appreciate any help you can provide me.

      Many thanks for considering my request. 🙂

    2. Hello Diana,

      Sorry for the delay in reply.

      According to my experience of traincascade/haarcascade, segmentation faults occurred in the following situations for me:
      a) The path to the images/vec files doesn’t exist, which in your case is not an issue as you are able to work with 32×32 dimensioned samples.
      b) Allocated memory is not available with the machine or currently not available with the system as some other big program is running.
      c) I remember, a few of the times just by changing the mode to BASIC solved the issue.

      Try doing this:
      Keep numPos around 500-600 (if there are more than 15 stages) or around 1000-1200 otherwise, numNeg around 2000-2200, w and h as 49x49, minHitRate around 0.9 to 0.95, the false alarm rate around 0.1-0.3, memory at 4-6 GB (4096-6144 MB) and mode as BASIC.

      Please also let me know whether CentOS is actually installed on your PC or you are using it as a virtual machine. If it is being used as a virtual machine, these problems will arise at almost every step.

      If you have any other issue, or if this is not solved, just ping me here or on my email ID :


      1. Hi Abhishek!
        Thank you so much for your answer.
        I'm working on a CentOS Linux 7 64-bit PC. It has 32GB of RAM and an Intel® Xeon(R) CPU E3-1245 v3 @ 3.40GHz × 8 processor, and it isn't a virtual machine.
        I tried your suggestion but it didn't work 😦 Segmentation Fault (core dumped) again.
        I'm going to try on an Ubuntu PC to see if the problem persists.
        Thank you so much for all your help 🙂

      2. Hi Diana,

        I have searched for some more sources of seg faults :
        a) if in your negative training image textfile [ie: bg.txt], you have the path specified to an image that doesn’t exist, or the path is misspelled.
        b) Sometimes changing featureType to LBP solves the issue.

        I will search for more on this issue if I get time.


  11. Hi Abhishek!
    Thank you so much for all your help and for taking the time to help me.
    I will check the name files.
    Have a nice day! 😀

    1. Hi Diana

      I think it occurs because the 25152 you set for precalcValBufSize and precalcIdxBufSize is too large and couldn't be allocated.

      So try changing it to something like:
      -precalcValBufSize 512 -precalcIdxBufSize 512

      It should work.

  12. hi….
    I am doing face detection and am using the MIT database for positive and negative images. My question is: I have downloaded the pos and neg images into separate folders, but how do I feed them into OpenCV for training? They are 19*19 pixel images.

  13. Thanks for your blog. It is helpful.


    There's a bug in your multiple cascade thing: you're not loading the first cascade, as you start at argv[3]. It should be argv[2].

    Also your use of -w 48 -h 24 is a bit puzzling as it is a different ratio to the training images.

    1. Hello,

      At argv[2] the code loads a cascade file, namely "checkcas" (line 236), which is a starting criterion for detecting the various parts of the car. Also, the aspect ratio used for training was chosen on the assumption that a car is rectangular.


  14. This is a great page with useful links and feedback. Good work on the detector.
    I tried to train my own detector using the process herein with the exact same commands.
    What is puzzling is that even for a few stages (which I know will not give me the best results) the hit rate (HR) and false acceptance (FA) constantly remain the same no matter what I train. There doesn’t seem to be even the slightest improvement. I have checked the vec files and the images load correctly.
    As a side note, when I try to run the cascade it only detects something at the centre of the image, regardless of the content. This of course stems from the obviously erroneous training process. I use the facedetect file to run the cascade, which I know works for face detection.

    What am I missing in the training process?? I appreciate your help!!

    Thank you!!

    1. Hello Chris,
      When it is detecting nothing but the center of image, then it means that the trained cascade has learnt almost zero features. If changing the HR and FA is not working in your case, then you may try changing the width and height parameters in sampling as well as training command with the intuition of ratio of object dimension to that of the image.

      Thank you for contacting,

  15. Hi Abhishek,

    Thank you very much for your speedy reply! You have confirmed my thoughts that the cascade learns nothing which is good!! So I wonder what could cause this since I use the same training set and parameters in the links you provide?
    Also after close examination of the xml file I have noticed that the stageThreshold is set to NaN as follows:
    whereas in other xml files it is a number. I couldn’t find a setting that affects this however, and was wondering why it is so.

    1. Hey Chris,
      To be very frank with you, keeping the same parameters, training set and computer never gave me the same results, so I stopped using traincascade. I believe it would be more professional to extract genuine object-oriented features and train on them using machine learning. I think the NaN in stageThreshold may be a bug in the version of OpenCV you are using; try going for haarcascade if this is creating these sorts of issues.


    1. Dear Abhishek,

      I would like to inform you that I ran the code on another machine and it worked!! It is up and running now! So perhaps it is a bug or a version issue as you suggested!! 🙂 Thank you for your useful suggestions that allowed me to pinpoint the problem!!!

      1. Hello Chris,

        I am glad that a solution popped up. Also, as I mentioned earlier do try implementing other techniques rather than just sticking to haar-wavelets and LBP for your project.

        Best Luck!!!

  16. Hey Abhishek,
    Thanks for the tutorial. I was going through the XML files that you shared and noticed that cas2.xml uses HOG features instead of LBP. I tested all the XML files and saw that cas2.xml had the best performance, which is what I would have expected, given that HOG features are most often used to detect objects with well-defined silhouettes like cars and pedestrians. My question is, how exactly did you manage to train a HOG cascade that is compatible with OpenCV's cascade classifier when the documentation only describes how to train Haar and LBP cascades? It would be really helpful for a project I'm working on.

  17. Hello Abhishek,
    how are you?
    Hope you’re doing well .
    I have some questions:
    1. How can I create my own pgm files from my own pictures?
    2. How can I make sure all the images are grayscale, histogram equalized and of the same size?
    I want the fastest way, please.

    thank you

    1. Hello Abdulla,

      First of all, you don't need to create pgm files from your images. "pgm" is just a format like "jpeg", so just make sure every image is of the same type, whether "jpeg", "png", "pgm" or anything else. Write code that reads all your images, converts them to grayscale using the cvtColor() function, equalizes them using the equalizeHist() function, and writes them out using imwrite().


  18. Hi Abhishek

    I am new to OpenCV. I tried this tutorial because I am trying to implement vehicle detection. My question is: the training images are 100x40 pixels, and the project aims at detecting cars in real time. Will training with small images affect the outcome when detecting cars in real time? Also, how do I determine the values of the -w and -h parameters for training?

    1. Hello Harita,

      Your statement is correct, size of training images may affect the situation when you are working in the real-time domain. The -w and the -h parameters are selected on the basis of object-to-image ratio in test images.


  19. Thank you very much for this tutorial. I have a question about the image type: some tutorials use PNG, but here we use pgm. What is the benefit of using the pgm format rather than other image types, and how can we generate images of this type as in this tutorial?

  20. Hi Abhishek,
    I was able to run your code as it is,using this command at the terminal:
    ./outputfile image.jpg checkcas.xml cas1.xml cas2.xml cas3.xml cas4.xml , however i got a success rate of 2/5 for different images that i chose.
    Is there any changes that have to be made for running video files?
    Also, as per your previous suggestion I changed LBP to HOG in all the XML files and I got the following error:
    Segmentation fault(core dumped)
    Eagerly awaiting your response…Thank you!


    1. Hello Sammy,

      By changing LBP to HOG I meant, training new cascades based on HOG, sorry if my statement reflected something else. Changing LBP to HOG in XML file will never work. I think for a high quality video you must train new cascade files.


      1. Hi Abhishek,

        Thanks for your post!! I want to use HOG, which is more useful for pedestrian tracking, but I don’t understand what you mean with “training new cascades based on HOG”.

        Could you tell me which parameters would I have to change in opencv_traincascade??

        I would appreciate a response.


  21. Hi Abhishek,

    I created an XML file using opencv_traincascade with LBP, but when I test it in code the file can't be loaded and I get this error: opencv error unspecified error <the node does not represent a user object> in cvRead

    how can i know if my xml correct or not?
    any help please .


  22. Hi… Am Balaji… I am actually trying to detect human upper bodies… I have trained my samples by HOG cascade technique using traincascade in OpenCV.. Now I would like to detect the trained .xml in Opencv., but am not able to find a code which can detect the HOG cascade model… I am using OpenCV V3.0….. Please guide me on how to proceed…

    1. Hi Balaji,

      It's been a long time since I worked on this cascade classifier. From what I remember, I used a call somewhat like this: "face_cascade.detectMultiScale( frame_gray, faces, 1.1, 2, 0|CV_HAAR_SCALE_IMAGE, Size(20, 20) );". I have the old file too; if you want it, just mail me at


  23. Hi Mr.
    I want to test object detection with a cascade using opencv_traincascade on Windows. I have 160 positive images at 30*40 resolution, but just 140 negative images, whose sizes differed from the positive images, so I resized them to the same resolution. In the end my negatif.txt file begins with negatif/positive1.jpg
    and the positive list with
    positif/positive92.jpg 1 0 0 30 40

    I generated file.vec and used this instruction to train on the samples:

    opencv_traincascade -data data -vec file.vec -bg negatif.txt -numPos 140 -numNeg 140 -numStages 15 -w 30 -h 40 -featureType LBP -precalcValBufSize 1024 -precalcIdxBufSize 1024

    but in stage 1 , I have this problem :
    Train dataset for temp stage can not be filled. Branch training terminated.
    Can you help me please?

      1. Thank you for your answer.
        In fact, I had already created an empty "data" folder in my working folder,
        but when I run the training it crashes and gives me this problem:
        Train dataset for temp stage can not be filled. Branch training terminated.

  24. Hello Mr.Abhishek

    First of all, thank you a lot for this tutorial. I have just one problem: when I run the code, the maximum number of cars that can be detected is two.

    can you help me please
