Tuesday, February 3, 2015

Python Stuff and Real-Time Video Object Tracking

The new version of dlib is out today. As promised, there is now a full Python API for using dlib's state-of-the-art object pose estimation and learning tools.  You can see examples of this API here and here.  Thanks to Patrick Snape, one of the main developers of the menpo project, for this addition.

Also, I've added an implementation of the winning algorithm from last year's Visual Object Tracking Challenge.  This is the method described in the paper:
Danelljan, Martin, et al. "Accurate scale estimation for robust visual tracking." Proceedings of the British Machine Vision Conference (BMVC), 2014.
You can see some videos showing dlib's implementation of this new tracker in action on YouTube:


All these videos were processed by exactly the same piece of software.  No hand tweaking or any funny business.  The only required input (other than the raw video) is a bounding box on the first frame; after that, the tracker automatically follows whatever is inside the box.  The whole thing runs at over 150fps on my desktop.  You can see an example program showing how to use it here, or just go download the new dlib instead :)
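
For the curious, the heart of that example boils down to something like the following sketch (the frame paths and the initial box are placeholders, and dlib.load_rgb_image assumes a reasonably recent dlib build; any routine that yields a numpy image works):

    import glob
    import dlib

    # Frames on disk, in order (the path pattern is a placeholder).
    frame_paths = sorted(glob.glob("video_frames/*.jpg"))

    tracker = dlib.correlation_tracker()
    win = dlib.image_window()

    # Seed the tracker with a bounding box on the first frame.
    img = dlib.load_rgb_image(frame_paths[0])
    tracker.start_track(img, dlib.rectangle(74, 67, 112, 153))

    # Every later frame only needs update(); the box follows the object.
    for path in frame_paths[1:]:
        img = dlib.load_rgb_image(path)
        tracker.update(img)
        win.clear_overlay()
        win.set_image(img)
        win.add_overlay(tracker.get_position())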

I've also finally posted the paper I've been writing on dlib's structural SVM based training algorithm, which is the algorithm behind the easy-to-use object detector.

88 comments:

  1. Great addition, the object tracker seems quite robust in the video!

  2. Thanks! :)

    Yeah, for objects that don't undergo rapid out-of-plane rotations it works pretty well.

  3. Hey Davis,

    I tried to test the tracker and when I execute the sample I get something like this:

    Error detected in function void __thiscall dlib::matrix<...,struct dlib::row_major_layout>::set_size(long, long).

    Failing expression was (NR == 0 || NR == rows) && (NC == 0 || NC == cols) && rows >= 0 && cols >= 0.
    void matrix::set_size(rows, cols)
    You have supplied conflicting matrix dimensions
    rows: 0
    cols: 0
    NR: 0
    NC: 1
    this: 008EFBF0

    I used the frames provided in the library.

    Best regards,
    Stefan

  4. Oops. I have an assert statement triggering in debug mode that I need to fix. However, if you run it in release mode it will work fine.

    Running in debug mode is very slow anyway (http://dlib.net/faq.html#Why%20is%20dlib%20slow?)

  5. It did the trick. From the posted video the tracker looks really cool.

  6. Dear Davis,

    I'm trying to classify some image patches (96 x 96) which are faces. I have a few images per subject, around 30 to 50 patches.

    I have played with the image recognition in OpenCV but it is very sensitive to a lot of things.

    My question is: can I use a feature descriptor from dlib and then train a classifier? In dlib I saw SURF and a few other feature descriptors.

    What is your advice? Would it make sense to use some descriptors from dlib for image recognition?

    Best regards,
    Stefan

  7. Sure, you can try using extract_highdim_face_lbp_descriptors(), extract_fhog_features(), or extract_uniform_lbp_descriptors() with a linear SVM. Any of those features generally give reasonable results for this kind of thing.
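
    (Those descriptor functions are part of dlib's C++ API. If you do the classifier half from Python, a minimal sketch of training a linear SVM on precomputed feature vectors could look like this; the toy 3-dimensional vectors stand in for whatever real descriptors you extract.)

    import dlib

    # One feature vector per face patch; +1 = target subject, -1 = others.
    x = dlib.vectors()
    y = dlib.array()
    x.append(dlib.vector([0.1, 0.5, 0.2]))  # toy descriptor for the subject
    y.append(+1)
    x.append(dlib.vector([0.9, 0.1, 0.7]))  # toy descriptor for someone else
    y.append(-1)

    trainer = dlib.svm_c_trainer_linear()
    trainer.c = 10  # regularization strength; pick via cross-validation
    classifier = trainer.train(x, y)
    print(classifier(dlib.vector([0.2, 0.4, 0.3])))  # > 0 means "subject"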

  8. Many thanks, I'll give it a try.

  9. Dear Davis,

    I noticed the tracker runs at over 150fps on your desktop, but in my testing it only gets about 50fps.
    Can you help me figure out where the problem occurred?

    Best,
    Max

  10. Does this answer your question? http://dlib.net/faq.html#Whyisdlibslow

  11. Hello Davis, this seems interesting and the tracker looks robust, but what happens, for example, when the object isn't on screen in the next frame? Do you get an error message? Or is there a way I can initialize the tracker again on another object I have detected? Thanks in advance!

  12. The correlation tracker doesn't deal with or detect any of those cases. To get an entire tracking system you must combine it with many other tools, and how you do that depends on your application.

  13. Thanks for the response! Somehow I was able to reset the tracking of the pedestrian when it left my region of interest, and it worked pretty well. Now I have another doubt: sometimes on semi-occluded pedestrians the bounding box jumps from the tracked one to another. Can I improve the performance by tweaking something in the algorithm, or do I have to use another approach for those cases?

  14. The most common approach is to use something like this correlation tracker to generate short tracks (people call them tracklets). So you have to be able to identify when the tracker will fail so you can chop its output into tracklets. Then you use some additional processing to figure out which tracklets should associate together. There is a large literature on this. I would google for "tracklet association" and terms like that.

    Personally, I would use http://dlib.net/ml.html#structural_assignment_trainer to perform tracklet association.

  15. How could I make my own landmark detector XML file so that I could train the program to detect cars, for example?

  16. You can use the imglab program in the tools subfolder to label images.

  17. Is there a Python version of imglab?

  18. Imglab is a graphical program. You don't need to look at its source code to use it, so it doesn't matter what language it's written in.

  19. When do you expect that this will be usable from Python?

  20. Right now :)

    see https://github.com/davisking/dlib/blob/master/python_examples/correlation_tracker.py

  21. Thank you! Nice to be able to do object detection AND tracking from Python.

  22. @Davis King, could you make a .exe for the imglab program so that everyone could run it? If not, is there any similar program?

  23. Why not compile it yourself? You just run cmake in the folder and it will shoot out the exe.

  24. I have no experience with C++ and I've tried to compile it but I get errors. I also have a problem with correlation_tracker.py: I receive a "no module" error.

  25. The README.txt file in tools/imglab tells you exactly what to type to compile it. It should have worked. What happened when you tried those commands?

  26. I believe there were some import errors. How would I solve the correlation tracker problem?

    AttributeError: 'module' object has no attribute 'correlation_tracker'

  27. Did you compile the dlib library code from https://github.com/davisking/dlib?

    The correlation tracker was only added to the python interface a few days ago so if you are trying to use an older version it won't work.

  28. I receive an error regarding cl when running cmake .. to compile imglab.

  29. My goal is to blur all heads in any police body camera video. My thought is to do head detection and then track each detection forwards and backwards with this real-time video object tracking script. Hopefully then it won't miss much. Any suggestions for making it efficient? Is there a better way than running the script for each detection?

  30. I would just run the face detector on each frame and not worry about tracking.

  31. How do I deal with half a head etc then?

  32. The tracker might not work any better for partially occluded heads. You will just have to experiment and see what works.

  33. Shucks. I was hoping to be able to keep people's heads blurred as they leave the frame. Thank you for the quick responses.

  34. How do I integrate the code with a live video stream?

  35. I ran tracking on the guy on the left side of frame 1 of https://www.youtube.com/watch?v=F0HkplIekOQ When the officer walks away for a few frames, the guy from frame 1 is never tracked again. Is there any way to track in a situation like this without redrawing the rectangles each time? Head detection hasn't been very successful; I ran into too many instances of "killed" so I gave up on training.

  36. Hi Davis,

    Thanks for the wonderful algorithm, it looks great in the video.

    I followed the instructions at the top of the Python example and successfully ran the .bat file. But I still cannot import dlib; I get the error 'ImportError: No module named dlib'. I'm totally a beginner in both computer vision and Python; could you please help me out with this problem? (BTW, I'm using Mac OS 10.10 and Python 2.7.)

    Thanks~

  37. Did you run the python example by typing

    python correlation_tracker.py

    from within the python_examples folder?

  38. Thanks for the quick response. I solved the importing problem by moving dlib.so to the site-packages folder, but another error arose: AttributeError: 'module' object has no attribute 'image_window'. Any suggestions?

    I found a hint on GitHub that says "Deleted examples build and recompiled, and now I can't recreate the error." But I'm kind of confused about what should be deleted and what should be recompiled (maybe run compile_dlib_python_module.bat again)?

  39. Sorry, I missed your comment at the bottom of the GitHub issue. Now I can use dlib as well as the correlation tracking method perfectly. I tried the tracking algorithm on my testing video and the performance is awesome!!

  40. Hi,
    I'm trying to build a camera which is able to detect people staying in one place for a long time... Can I use dlib to detect multiple people from a live video feed that has already been subjected to background subtraction (BackgroundSubtractorGMG)? Thank you...

  41. Yes, there are many tools useful for that in dlib. I would try training a HOG filter to find the people. See http://blog.dlib.net/2014/02/dlib-186-released-make-your-own-object.html for example.
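
    A rough sketch of that workflow from Python, using dlib's simple object detector (the XML and .svm file names are hypothetical; the XML comes from labeling your images with imglab):

    import dlib

    options = dlib.simple_object_detector_training_options()
    options.add_left_right_image_flips = True  # people are left/right symmetric
    options.C = 5  # SVM regularization; tune on your data
    dlib.train_simple_object_detector("people.xml", "people_detector.svm", options)

    # Later, run the trained HOG filter on each frame.
    detector = dlib.simple_object_detector("people_detector.svm")
    img = dlib.load_rgb_image("frame.jpg")  # placeholder frame
    boxes = detector(img)  # a list of dlib.rectangles, one per person found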

  42. Thank you Mr. Davis for your advice.

  43. Hello, I have a problem with the speed of the correlation tracker. I am using Linux and I compiled the library as written in the compilation instructions (with release mode on). Anyhow, with the provided test example at resolution 320x480, I only get about 25-30 fps, which is far from 150 fps.
    Is this because I am using dlib through the Python API?

    Thank you for your answer!

  44. No, the Python API isn't too much slower than the C++ API. Maybe your computer is super slow? Maybe you didn't really compile it in release mode. I don't know. I just tried it on my computer and I get 150fps in C++ including file I/O. In Python I'm getting 107fps but only because Python's image loading is much slower.

  45. Any chance you'll submit dlib to PyPI? It would make the impact orders of magnitude higher. Thanks for the great utility.

  46. I'm not going to but you are welcome to do it if you are interested :)

  47. Hi Davis, I'm having problems when I try to compile the program in Eclipse.

    make all
    Building target: Tracker_dlib
    Invoking: Cross G++ Linker
    g++ -L/usr/lib/ -o "Tracker_dlib" ./src/Tracker_dlib.o -lpthread -lblas -llapack -ljpeg -lpng -lX11
    /usr/bin/ld: cannot find -lblas
    /usr/bin/ld: cannot find -llapack

    However, when I check for libblas and liblapack, these are the outputs that I get:


    anfedres@anfedres-ThinkPad-W530:~/Documents/dlib-18.17/examples/build$ ldconfig -p | grep liblapack
    liblapack.so.3 (libc6,x86-64) => /usr/lib/liblapack.so.3
    anfedres@anfedres-ThinkPad-W530:~/Documents/dlib-18.17/examples/build$ ldconfig -p | grep libblas
    libblas.so.3 (libc6,x86-64) => /usr/lib/libblas.so.3

    I've already added /usr/lib/ to the library search path. Any clue what the problem could be?

    Thank you.

  48. There are instructions for compiling dlib here http://dlib.net/compile.html. CMake can also generate an eclipse project if you really want to use eclipse.

  49. Is there a way to make the tracker recover if it is lost? Thanks Davis.

  50. No, you will need to include it inside some larger tracking framework to deal with that sort of issue.

  51. This comment has been removed by the author.

  52. Hi! I have multiple (x,y) coordinates for multiple people in each frame. Is it possible to track multiple people using this tracker? I saw in one of your previous comments that you suggested using HOG, but I already have my coordinates and I just need to track those multiple targets.
    I see that after we start tracking, all I can do is send my image to tracker.update().
    Is it possible to see what tracker.update() is actually doing?

  53. Create multiple instances of the tracker, one for each object to track.
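
    In Python that could look like the following sketch (the boxes and frame paths are placeholders you would fill in):

    import dlib

    # One (left, top, right, bottom) box per person on the first frame.
    initial_boxes = [(10, 10, 60, 120), (200, 15, 250, 130)]  # placeholders
    first_frame = dlib.load_rgb_image("frames/0001.jpg")      # placeholder

    trackers = []
    for l, t, r, b in initial_boxes:
        tracker = dlib.correlation_tracker()
        tracker.start_track(first_frame, dlib.rectangle(l, t, r, b))
        trackers.append(tracker)

    # On every later frame, update each tracker independently.
    frame = dlib.load_rgb_image("frames/0002.jpg")            # placeholder
    for tracker in trackers:
        tracker.update(frame)
    positions = [tracker.get_position() for tracker in trackers]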

  54. Hi, I want to do some visual tracking of objects in video. I'm a little bit confused about why this kind of tracking does not need object detection. What's the difference between the following two approaches?

    1. object detection for each frame, and then do some post-processing
    2. object detection for the first frame, then do what was done in dlib.

    Thanks.

  55. Hi,

    I'm trying to use dlib with the Qt framework (http://www.qt.io/). How do I push a QImage or QVideoFrame object into the correlation_tracker?

    Thanks!

  56. Nice video,I am reading your post from the beginning, it was so interesting to read & I feel thanks to you for posting such a good blog, keep updates regularly.
    Regards,
    Python Training in Chennai|Python Taining

  57. Hi Davis, thanks for the nice video. May I ask if we can get the unboxed videos, so that I can test it?

    Kind regards,
    Tian

  58. Hi Davis,

    First of all, thanks for your work! I would like to ask you a question. I am using the vot-toolkit to evaluate your correlation tracker (DSST) in order to compare your results with the ones obtained with the original DSST (Matlab). I get worse results on the VOT 2014 data set with your tracker than with the original DSST. Why does this happen?

    Thanks again! Have a nice day.

  59. I also ran dlib's version on the vot-toolkit and didn't get results as good as what was reported in the DSST paper. I'm not sure why that is but my guess is that there are additional things they did beyond what is reported in the paper. Maybe there is some reacquisition logic? I'm not sure.

  60. Hi Davis,

    Yeah! I was examining the dlib code (which I think is really good) and compared it with the original Matlab version (https://github.com/gnebehay/DSST). The most notable difference I encountered was that the dlib version always uses square filters (64 x 64) but the Matlab version adapts the filter size to the patch size. Do you think that could be an important point?

    Thanks!

  61. I tried it both ways when I was implementing it and there didn't seem to be any significant difference in accuracy between square vs. non-square filter shapes. It was a little bit faster and simpler to use square filters so I did it that way. Although you never know, maybe that's part of the difference.

  62. Hi Davis, can this video object detector be used as a head counter,
    as in this video:
    https://www.youtube.com/watch?v=OWab2_ete7s

    The head area (camera above) could also be seen as one "type of" visual object.

  63. This comment has been removed by the author.

  64. Hi Davis,
    Are you open for custom work? I need to create an app which can track objects. mike.sorochev@gmail.com

  65. Hello Davis.

    First off, Kudos for the good work.

    We are noticing that the ConfidenceLevel of the tracker is high even when the subject has moved out of the video frame. Can you please confirm the range of values for the tracker confidence level we should be looking at? Should we be looking at some other parameter for continued tracking?

    Thanks.

  66. The confidence value is only loosely correlated with track breaking. To get a good estimate of track failure you need to include additional machinery; what exactly depends on your application.

  67. In our tests, we are tracking faces of pedestrians. They walk in front of the camera and move out of view. Can you please clarify what you mean by additional machinery?

    Thanks for the quick response.

  68. I would run a face detector every few frames to make sure the objects are still present.
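
    One possible shape for that, sketched in Python with a hypothetical re-detection interval:

    import glob
    import dlib

    detector = dlib.get_frontal_face_detector()
    tracker = dlib.correlation_tracker()
    REDETECT_EVERY = 10  # hypothetical interval; tune for your frame rate
    tracking = False

    for i, path in enumerate(sorted(glob.glob("frames/*.jpg"))):  # placeholder
        frame = dlib.load_rgb_image(path)
        if i % REDETECT_EVERY == 0:
            dets = detector(frame)
            tracking = len(dets) > 0
            if tracking:
                tracker.start_track(frame, dets[0])  # re-anchor on the detection
        elif tracking:
            tracker.update(frame)
        if tracking:
            pos = tracker.get_position()  # current estimate of the face box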

  69. We are running a face detector every few frames.

    Wondering how this would behave when one or more tracked subjects leave the scene and a few others enter the camera's field of view. Not sure if running just a face detector suffices in this case.

    Can you please confirm if ConfidenceLevel is a good indicator of effective tracking when the subject just moves within the frame? Should we be looking at some other indicators?

    Would a delimiter defining the perimeter of the frame for effective tracking help? I.e., don't track beyond a predefined boundary or something similar.

  70. Those are good ideas and you will need to test them out when you develop your system to see what works. That is the only way to know.

  71. Great work. Can I modify the tracking parameters in Python or do I need to recompile dlib for each change?

  72. Hi Davis

    Thanks for creating dlib. I find it really useful.

    I have a couple of questions about correlation_tracker:

    1) How can I obtain the relevant values of the tracking rectangle (e.g., center position, width, height, etc.)? It seems that I need to do something with get_position() but I cannot get the value I want.

    2) When a tracked object moves fast, the tracker can lose it. Is there any way to minimize this possibility? (I know that I need to detect the object again in case it is lost.)

    Thanks in advance.

  73. This comment has been removed by the author.

  74. Hi Davis, I tried to use dlib to place face landmarks but it gives out this error whenever I try to use shape_predictor:

    error : predictor = dlib.shape_predictor(predictor_path)
    RuntimeError: Error deserializing a floating point number.
    while deserializing a dlib::matrix
    while deserializing object of type std::vector
    while deserializing object of type std::vector
    while deserializing object of type std::vector

    I am using the latest version on my Raspberry Pi.
    I need to get this done because I have a presentation for my final year project. Please, I need help!

  75. This comment has been removed by the author.

  76. This comment has been removed by the author.

  77. That is a good tracker. However, it is unfortunately not possible to track multiple objects.

    In my case there are many tiny objects, and I do not have any problem detecting them. However, when it comes to tracking, I have no clue how to do it.

    How exactly can one run multiple tracker.update() calls without much speed degradation?
    I mean, how is it possible to use parallel processing on the GPU to track each object individually?
    I have tried to reproduce such a system: https://www.youtube.com/watch?v=3IR1h6i31JI
    So far the detection is good, but tracking fails miserably.

  78. Hi Davis,

    I am particularly interested in the part of the video from 3:32 - 3:44, where two people cross and intersect.

    I have implemented the tracker to track faces, and it works very well with one face on screen. When I tried to track two faces that cross, it still works well while the tracked face is in front. However, when the tracked face is in the back, the front face will 'bring' the tracker away, and the tracker ends up following the front face instead.

    I am running the tracker on roughly 30fps video (real time from a webcam). This is unlike your video, where the tracker still recognizes the tracked object even when two people cross and intersect, regardless of whether the tracked object is in the back or the front.

    Is there any additional algorithm applied in order to achieve the performance shown in the video? My understanding is that the tracker looks at the closest pixels in the bounding box between the current and subsequent frames, so what I observed from my implementation seems expected.

    Thank you.

    Regards
    YZ

  79. The video doesn't use any additional processing tricks. But in general this kind of algorithm will often, but not always, get confused if two similar looking objects briefly occlude each other. To make it more robust to this kind of thing you need to add stronger appearance based features, like pulling out a face descriptor and using that to deal with track swaps. There is also an extended version of this algorithm that is better at disambiguating this kind of issue (http://openaccess.thecvf.com/content_cvpr_2017/papers/Mueller_Context-Aware_Correlation_Filter_CVPR_2017_paper.pdf), which was presented at last year's CVPR. I haven't added it to dlib yet though.

  80. Hi Davis, is it possible to use a GPU with this algorithm? For one to 5 objects it works fine, but add one more and everything becomes very slow.
    Best regards.
    Martín.

  82. There isn't any GPU accelerated version of this.

  83. Hi Davis,
    I am currently using the correlation tracker to track speed limit signs in videos. The tracker works fine; however, when the speed sign goes out of the image the tracker returns negative x and y values. I used:

    tracker.update(current_image);
    const dlib::drectangle pos = tracker.get_position();
    cout << pos.left() << " " << pos.top() << " " << pos.right() << " " << pos.bottom() << endl;

    I tested the tracker with three videos and I observe this behaviour whenever the speed sign goes out of the image. Please confirm whether this is the expected behaviour or not. Thanks.

  84. I assume it's going out of the image to the left or top? Those areas have negative coordinates, so this is expected.
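
    If you only care about the visible part, one option is to clamp the returned box to the image before using it; a small Python sketch (frame is the image you tracked on):

    pos = tracker.get_position()  # a drectangle; it may extend past the image
    img_height, img_width = frame.shape[:2]
    left = max(0, int(pos.left()))
    top = max(0, int(pos.top()))
    right = min(img_width - 1, int(pos.right()))
    bottom = min(img_height - 1, int(pos.bottom()))
    if left >= right or top >= bottom:
        pass  # the box is entirely off-screen; treat the track as gone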

  85. Many thanks for confirming that negative coordinates are expected. As I drive, the speed signs in the captured video go out of the image to the top and left. I have now handled this case in my code.
