Wednesday, April 9, 2014

Dlib 18.7 released: Make your own object detector in Python!

A while ago I boasted about how dlib's object detection tools are better than OpenCV's. However, one thing OpenCV had on dlib was a nice Python API, but no longer! The new version of dlib is out and it includes a Python API for using and creating object detectors. What does this API look like? Well, let's start by imagining you want to detect faces in this image:


You would begin by importing dlib and scikit-image:
import dlib
from skimage import io
Then you load dlib's default face detector, the image of Obama, and then invoke the detector on the image:
detector = dlib.get_frontal_face_detector()
img = io.imread('obama.jpg')
faces = detector(img)
The result is an array of boxes called faces. Each box gives the pixel coordinates that bound each detected face. To get these coordinates out of faces you do something like:
for d in faces:
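    # Formatting the string first keeps the output identical on Python 2 and Python 3.
    print("left,top,right,bottom: {} {} {} {}".format(d.left(), d.top(), d.right(), d.bottom()))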
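If you would rather carry the detections around as plain data, the same four accessors are enough to build (left, top, width, height) tuples. The following is only a sketch: Box, to_boxes, and FakeRect are names made up for this illustration and are not part of dlib, and it assumes the right and bottom edges are inclusive pixel coordinates, as in dlib's rectangle class.

```python
from collections import namedtuple

# Hypothetical container for this sketch; not a dlib type.
Box = namedtuple("Box", "left top width height")


def to_boxes(rects):
    """Convert rectangle-like objects into Box tuples.

    Each object only needs left(), top(), right() and bottom() methods,
    which is what the boxes returned by the detector provide.
    """
    boxes = []
    for r in rects:
        # Assumption: right and bottom are inclusive, hence the +1.
        width = r.right() - r.left() + 1
        height = r.bottom() - r.top() + 1
        boxes.append(Box(r.left(), r.top(), width, height))
    return boxes


class FakeRect(object):
    """Stand-in for a detected box so the sketch runs without dlib."""

    def __init__(self, left, top, right, bottom):
        self._edges = (left, top, right, bottom)

    def left(self):
        return self._edges[0]

    def top(self):
        return self._edges[1]

    def right(self):
        return self._edges[2]

    def bottom(self):
        return self._edges[3]


demo = to_boxes([FakeRect(10, 20, 109, 119)])
```

With the real detector you would call to_boxes(faces) instead of passing FakeRect objects.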
We can also view the results graphically by running:
win = dlib.image_window()
win.set_image(img)
win.add_overlay(faces)

But what if you wanted to create your own object detector? That's easy too. Dlib comes with an example program and a sample training dataset showing how to do this. But to summarize, you do:
options = dlib.simple_object_detector_training_options()
options.C = 5  # Set the SVM C parameter to 5.  
dlib.train_simple_object_detector("training.xml", "detector.svm", options)
That will run the trainer and save the learned detector to a file called detector.svm. The training data is read from training.xml which contains a list of images and bounding boxes. The example that comes with dlib shows the format of the XML file. There is also a graphical tool included that lets you mark up images with a mouse and save these XML files. Finally, to load your custom detector you do:
detector = dlib.simple_object_detector("detector.svm")
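For a rough idea of what training.xml contains, here is a sketch that writes a file in the layout I understand the imglab tool to produce: a dataset element holding an images list, where each image has a file attribute and one box element per object with top, left, width, and height attributes. Treat that layout as an assumption and check it against the example XML that ships with dlib. The function name build_training_xml and the file name image1.jpg are invented for this illustration.

```python
import xml.etree.ElementTree as ET


def build_training_xml(annotations):
    """Return XML text listing images and their bounding boxes.

    annotations is a list of (filename, boxes) pairs, where boxes is a
    list of (top, left, width, height) tuples in pixels.
    """
    dataset = ET.Element("dataset")
    images = ET.SubElement(dataset, "images")
    for filename, boxes in annotations:
        image = ET.SubElement(images, "image", {"file": filename})
        for top, left, width, height in boxes:
            # Assumed attribute names; verify against dlib's example file.
            ET.SubElement(image, "box", {
                "top": str(top),
                "left": str(left),
                "width": str(width),
                "height": str(height),
            })
    return ET.tostring(dataset, encoding="unicode")


# Hypothetical data: one image containing two 80x80 boxes.
xml_text = build_training_xml([
    ("image1.jpg", [(50, 60, 80, 80), (200, 310, 80, 80)]),
])
```

Writing xml_text to a file named training.xml would give you something to pass to the trainer, though in practice the graphical tool is the easier way to make these files.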
If you want to try it out yourself you can download the new dlib release here.

24 comments :

Manuel said...

This library looks amazing!

How can I go and install it? I cannot seem to find a good tutorial.

Davis King said...

The comment at the top of each python example tells you what to do to compile the library.

Tester said...

Hello and thanks for your work.

I tried to train a detector with the .py file you provided. It works well on about 10 images (each about 2000x2000, jpg), but it fails with "Memory Error" on more than 10 images.
Sorry if the solution to this problem is obvious.

OS: Windows 7 64bit (using 32bit Python 2.7)

Tester said...

Oh, I guess I forgot an actual question: do you know why exactly this error occurs and how I can prevent it while still training on more images? My goal is to train on some hundreds of images, each of the same size.

manas dalal said...

I used the imglab executable to make the file with the boxes. While running the code to build the SVM file, it sometimes fails partway through. When I checked and changed the width and height to random values, it worked, but that will increase the chances of misclassifications. How do the bounding boxes affect the training process?

Davis King said...

What happens when it fails? Is there an error message?

manas dalal said...

Hi Davis,

There's absolutely no error message. The last checkpoint is when it counts the number of images, and then it crashes.

manas dalal said...

So is there a certain aspect ratio to be maintained while drawing the bounding box over the object? On certain occasions the default window size of 80 x 80 does not seem to work unless changed to 50 x 50. What features should be common? Similar height, width, aspect ratio, area, etc.?

Davis King said...

There is no error message at all? What happens? The program terminates and nothing is output to disk or the screen?

You should try to make all your boxes have a similar aspect ratio.

manas dalal said...

There is absolutely no message on the screen; it just crashes. I think most of the boxes were made to maintain the aspect ratio. I can share the XML with you if you wish to analyse it.

Davis King said...

Sure, if you can post a complete program that demonstrates the error you are seeing that would be great.

Anon Anon said...

How do I save the image to a file? I don't have a GUI.

manas dalal said...

I am using evaluate_detectors(). How do I know which detector returned a given rectangle?

Thanks

Davis King said...

The documentation for evaluate_detectors() tells you how: http://dlib.net/dlib/image_processing/scan_fhog_pyramid_abstract.h.html#evaluate_detectors

manas dalal said...

I managed to complete the entire thing. The only thing stopping me is this:

I have put the entire training into a function. The training runs fine and it generates the detector, but it crashes at the function exit point. Any idea about that?

I have tried everything. I have a hunch that the thread (dlib::Svm_thread) is not getting released. Could that be the issue? If so, how do I ask the function to wait for the thread to finish?

Davis King said...

Do the example programs run without crashing if you don't modify them? If yes then there is probably a bug in your code, not in dlib.

Anon Anon said...

I'm using Dlib to redact people's heads from body camera footage to post at https://www.youtube.com/channel/UCcdSPRNt1HmzkTL9aSDfKuA Should I be making different svm files for the various head positions? How many different videos do I need to train on in order to create a very reliable head detector?

Davis King said...

If you have heads it isn't detecting then yes, you need to train more models for those head poses. A few hundred examples is usually sufficient for the training to give quite good results.

mdfwn said...

Hello Mr. King,
I have two questions:

i) Do I have independent control over the width and height of the detection_window_size? I could not assign a tuple to this option and I need the detection area to be a non-square rectangle.

ii) Do I have control over the pyramid size? For a current project, I don't need or want to apply the algorithm at different scales.

I tried experimenting and reading the docs, so I suspect the answer is 'no' to both questions. Since these options are available in the C++ implementation: would it be much work to re-compile the C++ code to get a new dlib.pyd file which uses the needed options?

Thanks for your time.

Davis King said...

The python interface picks the best aspect ratio for the detection window based on your training data. So if most of your training boxes are two times as tall as wide then the detection window will be like that too.

If you want more control then you need to use the C++ API rather than trying to modify the python API as that is a lot of work. I mean, you can, but if you have enough ability to modify the underlying C++->Python API implementation then you can just work in C++ in a fraction of the time.

Tim S said...

Hello. Looks cool. I tried following along but am too dumb to install dlib so that python import works.

I followed the usual install instructions as far as

cmake --build . --config Release

which seemed to work, but Python remains unaware. Any ideas, or is there an idiot's guide as to how to do this?

Ta

Tim S said...

Oops - just saw the comment to read the python examples - I'll try that

mdfwn said...

Hello Mr. King,
can you elaborate on the .svm file that is produced (and re-used) by the object detector/trainer?

i) What information is stored in this file?
ii) How can I read and modify it?
iii) Is this exact .svm file compatible with the pure C++ implementation and therefore also usable by this (e.g. if I want to train in python but someday decide to switch to C++)?
iv) Am I dependent on dlib or can I somehow access the svm parameters which are stored in that file (and therefore use it with another SVM module)?

Thanks for your time.

Davis King said...

The python code is just a wrapper around dlib's C++ code. So you can load and use the object detectors without issue in C++.

The file isn't somehow encrypted, so you can read the values out of it and do whatever you want if you were motivated to write your own processing code. It is however highly technical, but all the details are documented in the main C++ side of dlib and in this paper: http://arxiv.org/abs/1502.00046