Comments on dlib C++ Library: Dlib 18.7 released: Make your own object detector in Python!

Your labels are most likely inaccurate or inconsis...

2020-11-01T07:52:20.070-05:00

Your labels are most likely inaccurate or inconsistent in some way. Train on a smaller dataset that you are sure is labeled the way you really want. Get it working on that, then run that resulting model on the other images and see where it disagrees with labels. Or add more images but review them to make sure the boxes are in the right places.
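One way to do the "see where it disagrees with labels" step is to compare each labeled box against the model's output and list the images where they differ. A minimal sketch, assuming boxes are (left, top, right, bottom) tuples and that the predictions have already been collected from the model; `iou`, `find_disagreements`, and the 0.5 threshold are illustrative names and values, not part of dlib.

```python
def iou(a, b):
    """Intersection over union of two (left, top, right, bottom) boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def find_disagreements(labels, predictions, threshold=0.5):
    """Return names of images whose labels and predictions do not line up.

    labels / predictions: dict mapping image name -> list of boxes.
    """
    flagged = []
    for name, truth in labels.items():
        preds = predictions.get(name, [])
        # Labeled boxes the model did not find.
        missed = [t for t in truth if not any(iou(t, p) >= threshold for p in preds)]
        # Model boxes that match no label.
        extra = [p for p in preds if not any(iou(t, p) >= threshold for t in truth)]
        if missed or extra:
            flagged.append(name)
    return flagged
```

The flagged images are the ones worth opening by hand: either the label is wrong or the model is, and both cases are informative.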

Hello Davis, First, thank you so much for your wor...

2020-10-27T08:18:59.660-04:00

Hello Davis,
First, thank you so much for your work; it is really helpful in many ways.

I am trying to retrain the face detector on some thermal images. To do so I am using the Python example train_object_detector.py, and I am having some issues with the dlib.train_simple_object_detector() function.

My first goal was to train it on 5000 images of dimension 160x120 pixels.

But I have been having some RAM issues. I tried to resize the images, but then the bounding boxes were too small ("smaller than about 400 pixels in area").
So I found out that 500 images were the maximum I could use.
So now I am training on those 500 images and I am always getting:

Training accuracy: precision: 1, recall: 0, average precision: 0
Testing accuracy: precision: 1, recall: 0, average precision: 0

Do you have any idea of what could be wrong in what I am doing?

Thanks a lot
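The 400-pixel figure quoted above is an area, so shrinking an image by a factor s shrinks every box area by s squared. A quick check, as a sketch: the helper names and the sample boxes are made up for illustration, and 400 is taken from the quoted message.

```python
MIN_BOX_AREA = 400  # threshold quoted in the message above


def scaled_area(width, height, scale):
    """Area of a width x height box after resizing the image by `scale`."""
    return (width * scale) * (height * scale)


def boxes_too_small(boxes, scale=1.0, min_area=MIN_BOX_AREA):
    """Return the (width, height) boxes whose scaled area falls below min_area."""
    return [(w, h) for (w, h) in boxes if scaled_area(w, h, scale) < min_area]
```

For example, a 30x30 box has an area of 900 at full size but only 225 after halving the image, so it would fall under the limit.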

Many thanks for the reply.

2019-04-01T04:22:57.750-04:00

Many thanks for the reply.

The output includes the SVM confidence values. Co...

2019-03-31T15:39:48.427-04:00

The output includes the SVM confidence values. Consult the documentation to see how to get it.
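For the HOG object detectors this post covers, dlib's Python face detection example reads the scores through `detector.run(...)`, which returns the boxes together with their confidence scores. A minimal sketch, assuming `detector` is a `dlib.fhog_object_detector` and `image` is an already loaded image; the helper name and default arguments are illustrative. These are raw detection scores, not calibrated probabilities.

```python
def detections_with_scores(detector, image, upsample=1, adjust_threshold=0.0):
    """Pair each detected box with its confidence score.

    Assumes detector.run(image, upsample, adjust_threshold) returns
    (boxes, scores, weight_indices), as dlib.fhog_object_detector does.
    """
    boxes, scores, _ = detector.run(image, upsample, adjust_threshold)
    # Highest-scoring detections first.
    return sorted(zip(boxes, scores), key=lambda pair: pair[1], reverse=True)
```

Lowering `adjust_threshold` below zero makes the detector report weaker candidates as well, which can be useful when inspecting scores.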

Hi Davis, In my training data with images...

2019-03-30T12:44:08.310-04:00

Hi Davis,
In my image training data I have six classes, including a background (negative) class. I would like to know whether it is possible to obtain multiclass SVM probabilities in dlib. I want the SVM to output not only the class labels but also its confidence values. Please help me in this regard.

Hi Davis I am working on blind spot detection pro...

2018-08-26T16:40:07.325-04:00

Hi Davis

I am working on a blind spot detection problem for a vehicle: I want to detect cars, motorbikes, pedestrians, or any vulnerable vehicles while changing lanes. I thought of detecting vehicles using dlib, and we wanted to try a HOG + SVM detector. I tried detection window sizes of 80x80, 60x60, 40x40, etc., and also changed the pyramid parameter from 6 to 12, but training always fails with the exception shown below. I think the problem is the varying aspect ratios.

An impossible set of object labels was detected. This is happening because none
of the object locations checked by the supplied image scanner is a close enough
match to one of the truth boxes in your training dataset. To resolve this you
need to either lower the match_eps, adjust the settings of the image scanner so
that it is capable of hitting this truth box, or adjust the offending truth
rectangle so it can be matched by the current image scanner. Also, if you are
using the scan_fhog_pyramid object then you could try using a finer image
pyramid. Additionally, the scan_fhog_pyramid scans a fixed aspect ratio box
across the image when it searches for objects. So if you are getting this error
and you are using the scan_fhog_pyramid, it's very likely the problem is that
your training dataset contains truth rectangles of widely varying aspect
ratios. The solution is to make sure your training boxes all have about the
same aspect ratio.

image index 2
match_eps: 0.5
best possible match: 0.457987
truth rect: [(561, 484) (621, 566)]
truth rect width/height: 0.73494
truth rect area: 5063
nearest detection template rect: [(572, 492) (652, 572)]
nearest detection template rect width/height: 1
nearest detection template rect area: 6561

Would you be kind enough to tell me what match_eps stands for? I could only work out that "image index 2" is the 3rd image in order, but what do the other parameters represent? Could you please suggest how to approach the problem? We would prefer to do it on the CPU, if possible.
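The figures in the message can be reproduced by hand, which helps show what each one means. In this sketch the rectangles are inclusive pixel corners, so width is right - left + 1. The "best possible match" value is reproduced here by dividing the overlap area by the area of the smallest rectangle enclosing both boxes; that formula is inferred from the numbers in the log, not quoted from dlib's documentation, and the function names are illustrative.

```python
def size(rect):
    """Width and height of an inclusive ((left, top), (right, bottom)) rect."""
    (l, t), (r, b) = rect
    return max(0, r - l + 1), max(0, b - t + 1)


def area(rect):
    w, h = size(rect)
    return w * h


def match_score(a, b):
    """Overlap area divided by the area of the box enclosing both rects."""
    (al, at), (ar, ab) = a
    (bl, bt), (br, bb) = b
    inter = ((max(al, bl), max(at, bt)), (min(ar, br), min(ab, bb)))
    outer = ((min(al, bl), min(at, bt)), (max(ar, br), max(ab, bb)))
    return area(inter) / area(outer)


truth = ((561, 484), (621, 566))      # "truth rect" from the log
template = ((572, 492), (652, 572))   # "nearest detection template rect"

truth_w, truth_h = size(truth)
print(truth_w / truth_h)              # width/height of the truth box
print(area(truth), area(template))    # box areas
print(match_score(truth, template))   # compare against match_eps
```

The score comes out below the match_eps of 0.5, so the trainer treats this truth box as unreachable: the scanning template is square (ratio 1) while the labeled box is noticeably taller than it is wide (ratio about 0.73).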

Hi Davis, get_frontal_face_detector is based on HO...

2018-05-26T15:25:49.969-04:00

Hi Davis,
get_frontal_face_detector is based on HOG features and a linear classifier (SVM). You call get_frontal_face_detector in face detection programs without deserializing the previous SVM training results. I wonder how get_frontal_face_detector works without training data.

Nothing in dlib cares about the file extension.

2018-03-19T07:07:28.592-04:00

Nothing in dlib cares about the file extension.
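In other words, no conversion step is needed: the bytes are the same whatever the file is called. If some tool insists on a `.dat` name, copying or renaming the file is enough. A small sketch with a hypothetical helper name:

```python
import shutil
from pathlib import Path


def copy_with_extension(src, new_suffix=".dat"):
    """Copy a saved model to the same name with a different extension."""
    src = Path(src)
    dst = src.with_suffix(new_suffix)
    shutil.copyfile(src, dst)  # contents are copied byte for byte
    return dst
```

Calling `copy_with_extension("detector.svm")` would leave the original in place and write an identical `detector.dat` next to it.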

How to convert SVM file to DAT file extension?

2018-03-19T06:23:30.752-04:00

How to convert SVM file to DAT file extension?

Not all images in the training data need labels. ...

2018-02-20T21:33:05.903-05:00

Not all images in the training data need labels. Any part of any image that doesn't have a box on it is treated as negative data and the algorithm will learn to not put boxes there.
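In the XML files used by the training example, that means an image entry can simply have no `<box>` children. A sketch that builds such a file with the standard library; the file names and box coordinates are placeholders, and the element and attribute names follow the imglab-style layout (`dataset/images/image/box`), which is worth checking against a file produced by your own labeling tool.

```python
import xml.etree.ElementTree as ET


def build_dataset(entries):
    """entries: list of (image_file, [(top, left, width, height), ...])."""
    dataset = ET.Element("dataset")
    images = ET.SubElement(dataset, "images")
    for image_file, boxes in entries:
        image = ET.SubElement(images, "image", file=image_file)
        for top, left, width, height in boxes:
            ET.SubElement(image, "box", top=str(top), left=str(left),
                          width=str(width), height=str(height))
    return dataset


dataset = build_dataset([
    ("sign_001.jpg", [(40, 60, 80, 80)]),  # one labeled object
    ("empty_road.jpg", []),                # no boxes: pure negative image
])
xml_text = ET.tostring(dataset, encoding="unicode")
print(xml_text)
```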

Hi Davis; Is it possible to use images that conta...

2018-02-20T13:24:57.680-05:00

Hi Davis;

Is it possible to use images that do not contain any target object (so no box in the XML for that image) in training?

As you would expect, more training data makes trai...

2017-06-16T06:47:46.706-04:00

As you would expect, more training data makes training take longer.

how about training time, how different would that ...

2017-06-16T03:35:54.520-04:00

How about training time? How different would that be?

The detection time is always the same.

2016-08-16T06:50:49.523-04:00

The detection time is always the same.

hi,if I use more training data to train an object ...

2016-08-16T05:14:10.406-04:00

Hi, if I use more training data to train an object detector, will the detection time be longer than before? For example, I use 50 people to train detector V1 and 100 people to train detector V2. If I then use V1 and V2 to detect faces, will the detection time be the same? Many thanks.

i am using the Python example, to train a custom o...

2016-01-28T05:16:11.687-05:00

I am using the Python example to train a detector for a custom object (a road sign), but the detection window draws a bigger, arbitrary box around the detection area. I thought it would be accurate and draw the box exactly around the matching object. Obviously something has gone wrong with the training. Has anyone else experienced this before? I resized all boxes to 80x80 and set my detection window size to 6400.

Tuning is difficult for me Thank you Davis

2015-12-17T05:19:46.187-05:00

Tuning is difficult for me
Thank you Davis

Which is best really depends on your application, ...

2015-12-16T18:54:08.362-05:00

Which is best really depends on your application, and in particular, how much you care about different types of errors.

Thank you Davis I have created my original detecto...

2015-12-16T16:03:25.794-05:00

Thank you Davis.
I have created my own detector.
The results are:

Trained with C: 5
Training accuracy: precision: 0.991111, recall: 0.771626, average precision: 0.769863
Testing accuracy: precision: 0.986111, recall: 0.731959, average precision: 0.723225

Trained with C: 10
Training accuracy: precision: 0.991701, recall: 0.82699, average precision: 0.82468
Testing accuracy: precision: 0.975309, recall: 0.814433, average precision: 0.804037

Trained with C: 20
Training accuracy: precision: 0.992248, recall: 0.885813, average precision: 0.883479
Testing accuracy: precision: 0.976744, recall: 0.865979, average precision: 0.854488

Trained with C: 25
Training accuracy: precision: 0.996169, recall: 0.899654, average precision: 0.897599
Testing accuracy: precision: 0.977011, recall: 0.876289, average precision: 0.864523

Trained with C: 30
Training accuracy: precision: 0.996212, recall: 0.910035, average precision: 0.908016
Testing accuracy: precision: 0.967033, recall: 0.907216, average precision: 0.894226

Trained with C: 40
Training accuracy: precision: 0.996255, recall: 0.920415, average precision: 0.918458
Testing accuracy: precision: 0.967033, recall: 0.907216, average precision: 0.895631

Trained with C: 50
Training accuracy: precision: 0.99631, recall: 0.934256, average precision: 0.932443
Testing accuracy: precision: 0.967391, recall: 0.917526, average precision: 0.904212

Trained with C: 100
Training accuracy: precision: 0.996377, recall: 0.951557, average precision: 0.949977
Testing accuracy: precision: 0.9375, recall: 0.927835, average precision: 0.913309

I think C: 30 is the best
What do you think about this?
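One way to make "best" concrete is to fold precision and recall into a single F-beta score and see how the ranking moves as the weighting changes. The sketch below uses the testing figures listed above; the F-beta formula is standard, but choosing it as the criterion (and these beta values) is an assumption for illustration, not a recommendation from the thread.

```python
# Testing precision and recall per C, copied from the results above.
TEST_RESULTS = {
    5: (0.986111, 0.731959),
    10: (0.975309, 0.814433),
    20: (0.976744, 0.865979),
    25: (0.977011, 0.876289),
    30: (0.967033, 0.907216),
    40: (0.967033, 0.907216),
    50: (0.967391, 0.917526),
    100: (0.9375, 0.927835),
}


def f_beta(precision, recall, beta=1.0):
    """F-beta score: beta > 1 favors recall, beta < 1 favors precision."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def best_c(results, beta=1.0):
    """Return the C value with the highest F-beta score on the test set."""
    return max(results, key=lambda c: f_beta(*results[c], beta=beta))


for beta in (0.5, 1.0, 2.0):
    print(beta, best_c(TEST_RESULTS, beta))
```

On these numbers, equal weighting (beta = 1) selects C = 50, while weighting recall more heavily (beta = 2) selects C = 100, so the choice shifts with how costly a missed detection is compared with a false alarm.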

hi davis, how do i reuse .svm generated by hog_obj...

2015-12-11T07:54:57.951-05:00

Hi Davis,
How do I reuse the .svm file generated by hog_object_detector? I am using Visual Studio 12 as the compiler.

how to put text on the top of the detection rectan...

2015-09-10T09:12:34.051-04:00

How do I put text on top of the detection rectangle?

Tracking is a little bit more than just detection....

2015-07-09T04:53:44.471-04:00

Tracking is a little bit more than just detection. You might want to use dlib's Real Time Video Object Tracking: http://blog.dlib.net/2015/02/dlib-1813-released.html

Can i use the dlib training method to detect peopl...

2015-07-09T04:49:32.111-04:00

Can I use the dlib training method to detect people by feeding it pictures of people's body shapes etc., and track them using a Raspberry Pi? Would it have enough power to do tracking with dlib in real time?