Also, I've added an implementation of the winning algorithm from last year's Visual Object Tracking Challenge. This is the method described in the paper:
Danelljan, Martin, et al. "Accurate scale estimation for robust visual tracking." Proceedings of the British Machine Vision Conference (BMVC), 2014.
You can see some videos showing dlib's implementation of this new tracker in action on YouTube:
All these videos were processed by exactly the same piece of software. No hand tweaking or any funny business. The only required input (other than the raw video) is a bounding box on the first frame and then the tracker automatically follows whatever is inside the box after that. The whole thing runs at over 150fps on my desktop. You can see an example program showing how to use it here, or just go download the new dlib instead :)
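For reference, using the tracker from C++ comes down to one start_track() call followed by update() on each new frame. Here is a minimal sketch along the lines of the bundled example program; the frame file names and the initial box are placeholders:

#include <dlib/image_processing.h>
#include <dlib/image_io.h>
#include <cstdio>

int main()
{
    dlib::array2d<unsigned char> img;
    dlib::load_image(img, "frame_0001.jpg");  // placeholder file names

    // The only required input: a bounding box on the first frame.
    dlib::correlation_tracker tracker;
    tracker.start_track(img, dlib::centered_rect(dlib::point(93,110), 38, 86));

    // After that the tracker follows the object on its own.
    for (int i = 2; i <= 100; ++i)
    {
        char name[32];
        std::snprintf(name, sizeof(name), "frame_%04d.jpg", i);
        dlib::load_image(img, name);
        tracker.update(img);

        const dlib::drectangle pos = tracker.get_position();
        std::printf("frame %d: (%.0f,%.0f)-(%.0f,%.0f)\n",
                    i, pos.left(), pos.top(), pos.right(), pos.bottom());
    }
}

update() also returns a confidence value you can inspect, though it is only a rough signal of track quality.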
I've also finally posted the paper I've been writing on dlib's structural SVM-based training algorithm, which is the algorithm behind the easy-to-use object detector.
Great addition, the object tracker seems quite robust in the video!
Thanks! :)
Yeah, for objects that don't undergo rapid out-of-plane rotations it works pretty well.
Hey Davis,
I tried to test the tracker, and when I execute the sample I get something like this:
Error detected in function void __thiscall dlib::matrix,struct dlib::row_major_layout>::set_size(long,long).
Failing expression was (NR == 0 || NR == rows) && (NC == 0 || NC == cols) && rows >= 0 && cols >= 0.
void matrix::set_size(rows, cols)
You have supplied conflicting matrix dimensions
rows: 0
cols: 0
NR: 0
NC: 1
this: 008EFBF0
I used the frames provided in the library.
Best regards,
Stefan
Oops. I have an assert statement triggering in debug mode that I need to fix. However, if you run it in release mode it will work fine.
Running in debug mode is very slow anyway (http://dlib.net/faq.html#Why%20is%20dlib%20slow?)
It did the trick; from the posted video the tracker looks really cool.
Dear Davis,
I'm trying to classify some image patches (96 x 96) which are faces. I have a few images per subject, around 30 to 50 patches.
I have played with the image recognition from OpenCV but it is very sensitive to a lot of things.
My question is: can I use a feature descriptor from dlib and then train a classifier? In dlib I saw SURF and a few other feature descriptors.
What is your advice? Would it make sense to use some descriptors from dlib for image recognition?
Best regards,
Stefan
Sure, you can try using extract_highdim_face_lbp_descriptors(), extract_fhog_features(), or extract_uniform_lbp_descriptors() with a linear SVM. Any of those features generally give reasonable results for this kind of thing.
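For concreteness, a rough C++ sketch of the FHOG-descriptor-plus-linear-SVM route; the patch and label loading is assumed to happen elsewhere, and the C value is only illustrative:

#include <dlib/image_transforms.h>
#include <dlib/svm.h>
#include <dlib/array.h>
#include <dlib/array2d.h>
#include <vector>

using namespace dlib;

int main()
{
    typedef matrix<double,0,1> sample_type;
    typedef linear_kernel<sample_type> kernel_type;

    // Assume the 96x96 face patches and their +1/-1 labels were loaded elsewhere.
    dlib::array<array2d<unsigned char>> patches;
    std::vector<double> labels;

    std::vector<sample_type> samples;
    for (unsigned long i = 0; i < patches.size(); ++i)
    {
        sample_type feats;
        extract_fhog_features(patches[i], feats);  // flattened FHOG descriptor
        samples.push_back(feats);
    }

    svm_c_linear_trainer<kernel_type> trainer;
    trainer.set_c(10);  // illustrative; pick via cross_validate_trainer()

    decision_function<kernel_type> df = trainer.train(samples, labels);
    // df(x) > 0 predicts the +1 class for a new descriptor x.
}

For more than two subjects you would wrap the same kind of trainer in one_vs_one_trainer rather than training a single binary classifier.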
Many thanks, I'll give it a try.
Dear Davis,
I noticed the tracker runs at over 150fps on your desktop, but in my testing it only reaches about 50fps.
Can you help me figure out where the problem occurred?
Best,
Max
Does this answer your question? http://dlib.net/faq.html#Whyisdlibslow
Hello Davis, this seems interesting and the tracker looks robust, but what happens, for example, when in the next frame the object isn't on screen anymore? Do you get an error message? Or is there a way I can initialize the tracker again on another object I have detected? Thanks in advance!
The correlation tracker doesn't deal with or detect any of those cases. To get an entire tracking system you must combine it with many other tools, and how you do that depends on your application.
Thanks for the response! Somehow I was able to reset the tracking of the pedestrian when it left my region of interest and it worked pretty well. Now I have another doubt: sometimes on semi-occluded pedestrians the bounding box jumps from the tracked one to another. Can I improve the performance by tweaking something in the algorithm, or do I have to use another approach for those cases?
The most common approach is to use something like this correlation tracker to generate short tracks (people call them tracklets). So you have to be able to identify when the tracker will fail so you can chop its output into tracklets. Then you use some additional processing to figure out which tracklets should be associated together. There is a large literature on this; I would google for "tracklet association" and similar terms.
Personally, I would use http://dlib.net/ml.html#structural_assignment_trainer to perform tracklet association.
How could I make my own landmark detector xml file so that I could train the program to detect cars for example?
You can use the imglab program in the tools subfolder to label images.
Is there a Python version of imglab?
Imglab is a graphical program. You don't need to look at its source code to use it, so it doesn't matter what language it's written in.
When do you expect that this will be usable from Python?
Right now :)
See https://github.com/davisking/dlib/blob/master/python_examples/correlation_tracker.py
Thank you! Nice to be able to do object detection AND tracking from Python.
@Davis King, could you make a .exe of the imglab program so that everyone could run it? If not, is there any similar program?
Why not compile it yourself? You just run cmake in the folder and it will shoot out the exe.
I have no experience with C++ and I've tried to compile it but I get errors. I also have a problem with correlation_tracker.py: I receive a "no module" error.
The README.txt file in tools/imglab tells you exactly what to type to compile it. It should have worked. What happened when you tried those commands?
I believe there were some import errors. How would I solve the correlation tracker problem?
AttributeError: 'module' object has no attribute 'correlation_tracker'
Did you compile the dlib library code from https://github.com/davisking/dlib?
The correlation tracker was only added to the Python interface a few days ago, so if you are trying to use an older version it won't work.
I receive an error regarding cl when running cmake .. to compile imglab.
My goal is to blur all heads in any police body camera video. My thought is to do head detection and then track, forwards and backwards, any detection with this real-time video object tracking script. Hopefully then it won't miss much. Any suggestions for making it efficient? Is there a better way than running the script for each detection?
I would just run the face detector on each frame and not worry about tracking.
How do I deal with half a head, etc., then?
The tracker might not work any better for partially occluded heads. You will just have to experiment and see what works.
Shucks. I was hoping to be able to keep people's heads blurred as they leave the frame. Thank you for the quick responses.
How do I integrate the code with a live video stream?
I ran tracking on the guy on the left side of frame 1 of https://www.youtube.com/watch?v=F0HkplIekOQ. When the officer walks away for a few frames, the guy from frame 1 is never tracked again. Is there any way to track in a situation like this without redrawing the rectangles each time? Head detection hasn't been very successful; I ran into too many instances of "killed" so I gave up on training.
Hi Davis,
Thanks for the wonderful algorithm; it looks great in the video.
I followed the instructions at the top of the Python example and successfully ran the bat file. But I still cannot import dlib; I get the error 'ImportError: No module named dlib'. I'm a total beginner in both computer vision and Python, could you please help me out of this problem? (BTW, I'm using Mac OS 10.10 and Python 2.7.)
Thanks~
Did you run the Python example by typing
python correlation_tracker.py
from within the python_examples folder?
Thanks for the quick response. I solved the importing thing by moving dlib.so to the site-packages folder, but another error arose: AttributeError: 'module' object has no attribute 'image_window'. Any suggestions?
I found some hints on GitHub saying "deleted examples build and recompiled, and now I can't recreate the error", but I'm kind of confused about what should be deleted and what should be recompiled (maybe run compile_dlib_python_module.bat again)?
Sorry, I missed your comment at the bottom of the GitHub issue. Now I can use dlib as well as the correlation tracking method perfectly. I tried the tracking algorithm on my test video and the performance is awesome!!
Hi,
I'm trying to build a camera which is able to detect people staying in one place for a long time... Can I use dlib to detect multiple people from a live video feed that has already been subjected to background subtraction (BackgroundSubtractorGMG)? Thank you...
Yes, there are many tools useful for that in dlib. I would try training a HOG filter to find the people. See http://blog.dlib.net/2014/02/dlib-186-released-make-your-own-object.html for example.
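If it helps, the training part of that post boils down to something like this C++ sketch (patterned after dlib's fhog_object_detector example; the dataset file name "people.xml", window size, and C value are placeholders):

#include <dlib/svm_threaded.h>
#include <dlib/image_processing.h>
#include <dlib/data_io.h>

using namespace dlib;

int main()
{
    // Images and boxes come from an imglab-labeled dataset.
    dlib::array<array2d<unsigned char>> images;
    std::vector<std::vector<rectangle>> boxes;
    load_image_dataset(images, boxes, "people.xml");

    typedef scan_fhog_pyramid<pyramid_down<6>> image_scanner_type;
    image_scanner_type scanner;
    scanner.set_detection_window_size(80, 80);  // illustrative size

    structural_object_detection_trainer<image_scanner_type> trainer(scanner);
    trainer.set_num_threads(4);
    trainer.set_c(1);  // illustrative; tune on held-out data
    trainer.be_verbose();

    object_detector<image_scanner_type> detector = trainer.train(images, boxes);

    // Run the detector on a frame; each returned rectangle is one person.
    std::vector<rectangle> dets = detector(images[0]);
}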
Thank you Mr. Davis for your advice.
Hello, I have a problem with the speed of the correlation tracker. I am using Linux and I compiled the library as written in the compilation instructions (with release mode on). Anyhow, with the provided test example at 320x480 resolution, I only get about 25-30 fps, which is far from 150 fps.
Is this because I am using dlib through the Python API?
Thank you for your answer!
No, the Python API isn't too much slower than the C++ API. Maybe your computer is super slow? Maybe you didn't really compile it in release mode. I don't know. I just tried it on my computer and I get 150fps in C++ including file I/O. In Python I'm getting 107fps but only because Python's image loading is much slower.
Any chance you'll submit dlib to PyPI? It would make the impact orders of magnitude higher. Thanks for the great utility.
I'm not going to, but you are welcome to do it if you are interested :)
Hi Davis, I'm having problems when I try to compile the program in Eclipse.
make all
Building target: Tracker_dlib
Invoking: Cross G++ Linker
g++ -L/usr/lib/ -o "Tracker_dlib" ./src/Tracker_dlib.o -lpthread -lblas -llapack -ljpeg -lpng -lX11
/usr/bin/ld: cannot find -lblas
/usr/bin/ld: cannot find -llapack
However, when I check for libblas and liblapack, these are the outputs that I get.
anfedres@anfedres-ThinkPad-W530:~/Documents/dlib-18.17/examples/build$ ldconfig -p | grep liblapack
liblapack.so.3 (libc6,x86-64) => /usr/lib/liblapack.so.3
anfedres@anfedres-ThinkPad-W530:~/Documents/dlib-18.17/examples/build$ ldconfig -p | grep libblas
libblas.so.3 (libc6,x86-64) => /usr/lib/libblas.so.3
I've already added /usr/lib/ to the library search path. Any clue what the problem could be?
Thank you.
There are instructions for compiling dlib here http://dlib.net/compile.html. CMake can also generate an eclipse project if you really want to use eclipse.
Solved, thanks Davis.
Is there a way to make the tracker recover if it is lost? Thanks Davis.
No, you will need to include it inside some larger tracking framework to deal with that sort of issue.
Hi! I have multiple (x,y) coordinates for multiple people in each frame. Is it possible to track multiple people using this tracker? I saw in one of your previous comments that you suggested using HOG, but I already have my coordinates and I just need to track those multiple targets.
I see that after we start tracking, all I can do is send my image to tracker.update().
Is it possible to see what tracker.update() is actually doing?
Create multiple instances of the tracker, one for each object to track.
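A minimal C++ sketch of that, assuming the initial per-person boxes are built from your known (x,y) coordinates:

#include <dlib/image_processing.h>
#include <vector>

using namespace dlib;

// One correlation_tracker per person; 'boxes' holds the initial
// bounding boxes for the first frame.
std::vector<correlation_tracker> start_trackers(
    const array2d<unsigned char>& first_frame,
    const std::vector<drectangle>& boxes)
{
    std::vector<correlation_tracker> trackers(boxes.size());
    for (size_t i = 0; i < boxes.size(); ++i)
        trackers[i].start_track(first_frame, boxes[i]);
    return trackers;
}

// Call once per new frame; each tracker updates independently.
void update_all(std::vector<correlation_tracker>& trackers,
                const array2d<unsigned char>& frame)
{
    for (auto& t : trackers)
        t.update(frame);  // also returns a confidence you can inspect
}

Each update() call is independent, so the per-frame cost grows linearly with the number of tracked objects.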
Hi, I want to do some visual tracking of objects in video. I'm a little bit confused about why this kind of tracking does not need object detection. What's the difference between the following two approaches?
1. Object detection for each frame, and then some post-processing.
2. Object detection for the first frame only, then do what was done in dlib.
Thanks.
Hi,
I'm trying to use dlib with the Qt framework (http://www.qt.io/). How do I push a QImage or QVideoFrame object into a correlation_tracker object?
Thanks!
Hi Davis, thanks for the nice video. May I ask if we can get the unboxed videos, so that I can test it?
Kind regards,
Tian
Hi Davis,
First of all, thanks for your work! I would like to ask you a question. I am using the vot-toolkit to evaluate your correlation tracker (DSST) in order to compare your results with those obtained with the original DSST (Matlab). I get worse results on the VOT 2014 data set with your tracker than with the original DSST. Why does this happen?
Thanks again! Have a nice day.
I also ran dlib's version on the vot-toolkit and didn't get results as good as what was reported in the DSST paper. I'm not sure why that is but my guess is that there are additional things they did beyond what is reported in the paper. Maybe there is some reacquisition logic? I'm not sure.
Hi Davis,
Yeah! I was examining the dlib code (which I think is really good) and comparing it with the original Matlab version (https://github.com/gnebehay/DSST). The most notable difference I encountered was that the dlib version always uses square filters (64 x 64) while the Matlab version adapts the filter size to the patch size. Do you think that could be an important point?
Thanks!
I tried it both ways when I was implementing it and there didn't seem to be any significant difference in accuracy between square vs. non-square filter shapes. It was a little bit faster and simpler to use square filters so I did it that way. Although you never know, maybe that's part of the difference.
Hi Davis, can this video object detector be used as a head counter?
As in this video:
https://www.youtube.com/watch?v=OWab2_ete7s
The head area (camera from above) could also be seen as one "type of" visual object.
Hi Davis,
Are you open for a custom job? I need to create an app which can track objects. mike.sorochev@gmail.com
Hello Davis.
First off, kudos for the good work.
We are noticing that the confidence level of the tracker is high even when the subject has moved out of the video frame. Can you please confirm the range of values for the tracker confidence level we should be looking at? Should we be looking at some other parameter for continued tracking?
Thanks.
The confidence value is only loosely correlated with track breaking. To get a good estimate of track failure you need to include additional machinery; what exactly depends on your application.
In our tests, we are tracking faces of pedestrians. They walk in front of the camera and move out of view. Can you please clarify what you mean by additional machinery?
Thanks for the quick response.
I would run a face detector every few frames to make sure the objects are still present.
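Roughly like this in C++; the every-few-frames scheduling is left out, and the 0.5 overlap threshold is illustrative, not a dlib constant:

#include <dlib/image_processing.h>
#include <dlib/image_processing/frontal_face_detector.h>

using namespace dlib;

// Returns true if some face detection still overlaps the track box.
// 'detector' comes from get_frontal_face_detector().
bool track_confirmed(frontal_face_detector& detector,
                     const array2d<unsigned char>& frame,
                     const correlation_tracker& tracker)
{
    const drectangle p = tracker.get_position();
    const rectangle track((long)p.left(), (long)p.top(),
                          (long)p.right(), (long)p.bottom());
    for (const rectangle& det : detector(frame))
    {
        // Intersection over union of the track box and the detection.
        const double overlap = track.intersect(det).area() /
                               (double)(track + det).area();
        if (overlap > 0.5)
            return true;  // a detection still supports this track
    }
    return false;  // treat the track as broken; re-detect and restart
}

You can combine this with the confidence value returned by tracker.update(), but as noted above that value alone is only a weak signal.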
We are running a face detector every few frames.
Wondering how this would behave when one or more tracked subjects leave the scene and a few others enter the camera's field of view. Not sure if running just a face detector suffices in this case.
Can you please confirm whether the confidence level is a good indicator of effective tracking when the subject just moves within the frame? Should we be looking at some other indicators?
Would a delimiter defining the perimeter of the frame for effective tracking help? I.e., don't track beyond a predefined boundary or something similar.
Those are good ideas and you will need to test them out when you develop your system to see what works. That is the only way to know.
Great work. Can I modify the tracking parameters in Python or do I need to recompile dlib for each change?
You have to use C++ to do that.
Really cool!
Hi Davis,
Thanks for creating dlib. I found it really useful.
I have a couple of questions about correlation_tracker:
1) How can I obtain relevant values of the tracking rectangle (e.g., center position, width, height, etc.)? It seems that I need to do something with get_position() but I cannot get the values I want.
2) When a tracked object moves fast, the tracker can lose it. Is there any way to minimize this possibility? (I know that I need to detect the object again in case it is lost.)
Thanks in advance.
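On question 1: get_position() returns a dlib::drectangle, and the values you want fall straight out of it. A small C++ sketch:

#include <dlib/image_processing.h>
#include <cstdio>

// Given an updated tracker, print center, width, and height of the track box.
void print_track_box(const dlib::correlation_tracker& tracker)
{
    const dlib::drectangle pos = tracker.get_position();
    const double cx = (pos.left() + pos.right()) / 2;   // center x
    const double cy = (pos.top() + pos.bottom()) / 2;   // center y
    std::printf("center=(%.1f,%.1f) width=%.1f height=%.1f\n",
                cx, cy, pos.width(), pos.height());
}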
Hi Davis, I tried to use dlib to place face landmarks but it gives this error whenever I try to use shape_predictor:
error: predictor = dlib.shape_predictor(predictor_path)
RuntimeError: Error deserializing a floating point number.
while deserializing a dlib::matrix
while deserializing object of type std::vector
while deserializing object of type std::vector
while deserializing object of type std::vector
I am using the latest version on my Raspberry Pi.
I need to get this done because I have a presentation for my final year project. Please, I need help!
That is a good tracker. However, it is unfortunately not possible to track multiple objects.
In my case there are many tiny objects which I have no problem detecting. However, when it comes to tracking, I have no clue how to do so.
How exactly can one run multiple tracker.update() calls without much speed degradation?
I mean, how is it possible to use parallel processing on the GPU to track each object individually?
I have tried to reproduce such a system: https://www.youtube.com/watch?v=3IR1h6i31JI
So far detection is good, but tracking fails miserably.
Hi Davis,
I am particularly interested in the part of the video from 3:32 - 3:44, where two people cross and intersect.
I have implemented the tracker to track faces, and it works very well with one face on screen. When I tried to track two faces that cross, it still works well while the tracked face is in front. However, when the tracked face is in the back, the front face will 'bring' the tracker away from it.
I am running the tracker on roughly 30fps video (real time from a webcam), unlike in your video, where the tracker still recognizes the tracked object even when two people cross and intersect, regardless of whether the tracked object is in the back or the front.
Is there any additional algorithm applied in order to achieve the performance shown in the video? My understanding is that the tracker looks at the closest pixels in the bounding box between the current and subsequent frames, so what I observed from my implementation seems expected.
Thank you.
Regards
YZ
The video doesn't use any additional processing tricks. But in general this kind of algorithm will often, but not always, get confused if two similar looking objects briefly occlude each other. To make it more robust to this kind of thing you need to add some stronger appearance-based features, like pulling out a face descriptor and using that to deal with track swaps. There is also an extended version of this algorithm that is better at disambiguating this kind of issue (http://openaccess.thecvf.com/content_cvpr_2017/papers/Mueller_Context-Aware_Correlation_Filter_CVPR_2017_paper.pdf), which was presented at last year's CVPR. I haven't added it to dlib yet though.
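As a sketch of the face-descriptor idea in C++: store an appearance signature when the track starts and compare against it periodically; a jump in distance suggests a swap. This is not part of the correlation tracker itself; it assumes the 68-point shape model file is loaded into a shape_predictor, and dlib's high-dim LBP face features stand in here for a stronger descriptor:

#include <dlib/image_processing.h>
#include <dlib/image_transforms.h>
#include <vector>
#include <cmath>

using namespace dlib;

// Appearance signature for the face inside the track box.
// 'sp' must be deserialized from shape_predictor_68_face_landmarks.dat.
std::vector<double> face_signature(const array2d<unsigned char>& frame,
                                   const rectangle& box,
                                   const shape_predictor& sp)
{
    full_object_detection shape = sp(frame, box);
    std::vector<double> feats;
    extract_highdim_face_lbp_descriptors(frame, shape, feats);
    return feats;
}

// Euclidean distance between two signatures; if the distance between the
// signature stored at track start and the current one jumps, suspect a swap.
double signature_distance(const std::vector<double>& a,
                          const std::vector<double>& b)
{
    double d = 0;
    for (size_t i = 0; i < a.size(); ++i)
        d += (a[i] - b[i]) * (a[i] - b[i]);
    return std::sqrt(d);
}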
Hi Davis, is it possible to use a GPU with this algorithm? For one up to 5 objects it works fine, but add one more and everything becomes very slow.
Best regards.
Martín.
There isn't any GPU accelerated version of this.
Hi Davis,
I am currently using the correlation tracker to track speed limit signs in videos. The tracker works fine; however, when the speed sign goes out of the image the tracker returns negative x and y values. I used:
tracker.update(current_image);
cout << tracker.get_position().left() << " " << tracker.get_position().top() << " " << tracker.get_position().right() << " " << tracker.get_position().bottom() << endl;
I tested the tracker with three videos and I observe this behaviour whenever the speed sign goes out of the image. Please suggest whether this is expected behaviour or not. Thanks.
I assume it's going out of the image to the left or top? Those areas have negative coordinates, so this is expected.
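If you only care about the visible part of the sign, you can clip the returned position against the image area yourself; a tiny hand-rolled C++ sketch:

#include <dlib/image_processing.h>
#include <algorithm>

using namespace dlib;

// Clip the (possibly out-of-frame) track box to an nc-by-nr image.
drectangle clip_to_image(const drectangle& pos, long nc, long nr)
{
    const double l = std::max(pos.left(),   0.0);
    const double t = std::max(pos.top(),    0.0);
    const double r = std::min(pos.right(),  nc - 1.0);
    const double b = std::min(pos.bottom(), nr - 1.0);
    if (l > r || t > b)
        return drectangle();  // box is entirely outside the frame
    return drectangle(l, t, r, b);
}

For example, clip_to_image(tracker.get_position(), num_columns(img), num_rows(img)); an empty result tells you the sign has completely left the frame.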
Many thanks for confirming that negative coordinates are expected. As I drive, the speed signs in the captured video go out of the image to the top and left. I have now handled this case in my code.