Thursday, August 28, 2014

Real-Time Face Pose Estimation

I just posted the next version of dlib, v18.10, and it includes a number of new minor features.  The main addition in this release is an implementation of an excellent paper from this year's Computer Vision and Pattern Recognition Conference:
One Millisecond Face Alignment with an Ensemble of Regression Trees by Vahid Kazemi and Josephine Sullivan
As the name suggests, it allows you to perform face pose estimation very quickly. In particular, this means that if you give it an image of someone's face it will add this kind of annotation:

In fact, this is the output of dlib's new face landmarking example program on one of the images from the HELEN dataset.  To get an even better idea of how well this pose estimator works, take a look at this video where it has been applied to each frame:


It doesn't just stop there though.  You can use this technique to make your own custom pose estimation models.  To see how, take a look at the example program for training these pose estimation models.

330 comments:

«Oldest   ‹Older   201 – 330 of 330
Davis King said...

It depends on your computer. On mine, I saw about 2ms per face for 68 landmarks.

BCat said...

What can I do to reduce false positives? With dlib 18.18, the face_landmark_detection_ex example draws a face on someone's lap.

Kevin Wood said...

Brian, you need to use a face detector before using the shape predictor. The shape predictor will always fit the shape (a face in this case) the best it can. In your case it happened to be in someone's lap. You have to give a pretty good bounding box for the detection and the shape predictor will fit the pose.

Unknown said...

I searched for how to estimate head pose from the landmarks:
use the Active Appearance Model and the POSIT function.
First, prepare average frontal-face points for people.
Second, get the landmarks from your image.
Then use the POSIT function to estimate the direction.

But I don't know how to do this in Python.
Where is the Python tutorial code?

BCat said...

Kevin, the face_landmark_detection_ex example does use a frontal_face_detector

mohanraj said...

I am running the webcam_face_pose_ex code, but for me it runs very slowly. When configuring with CMake I enabled -DUSE_SSE2_INSTRUCTIONS=ON, -DUSE_SSE4_INSTRUCTIONS=ON, and -DUSE_AVX_INSTRUCTIONS=ON.


Davis King said...

Does this resolve your problem? http://dlib.net/faq.html#Whyisdlibslow

Unknown said...

Hi Davis

I used dlib 18.12 to train a landmark detection model with the HELEN dataset successfully, but if I replace the data with 80,000+ images, I get an invalid mean training error of -1.#IND (a NaN).

Is there any limit on the size of the training dataset? Could you give me some suggestions to solve this problem?

Thanks a lot!

Davis King said...

There is no limit in the software. You can use as big a dataset as you want so long as your machine has enough RAM to load it.

You probably have something wrong with your data. Maybe it has erroneous values in it or you didn't load it correctly. I can't say.

Unknown said...

Zhang, what data set are you using that has 80,000 images?

Unknown said...

Hi, Stephen Moore. The data was collected by our lab's students.

Unknown said...

Hi Davis,

Thanks for your quick response!
With the data you provided (http://dlib.net/files/data/), I trained a model and tested it with a mean error of 0.0553159, which is worse than the model you provide (mean error 0.0356526).

During training, all parameters were set to their default values except the cascade depth, which I set to 15. Why can't I reproduce your model? Is there anything else that needs to be considered?

Unknown said...

Hao Zhang can you email me stephen.maurice.moore@gmail.com please

Unknown said...

Hi, Stephen Moore. I'm afraid that I can't send the dataset to you directly, because it's not open currently.

Davis King said...

How are you testing it? The model that comes with dlib was trained on the entire dataset. So you can't check how good it is by running the test portion of that dataset. You would need to check it against something else.

Unknown said...

hi Davis,

I use labels_ibug_300W_train.xml to train the model and labels_ibug_300W_test.xml to test.

Should I use all images (included in labels_ibug_300W.xml) to train?

Davis King said...

I trained it on everything when I made the dlib model. I don't know what you should do because I don't know what you are trying to accomplish or why you want to train at all. What you should do depends on what you want to accomplish. Only you can answer that question :)

Unknown said...

Hi Davis,

Thanks for your support!

Recently I read the paper "One Millisecond Face Alignment with an Ensemble of Regression Trees". I tried to reproduce the paper's results, but I failed.

Then I found that dlib provides a landmark prediction model and it's pretty good. Now I want to train a model with the dataset and parameter settings you provide to reproduce this result. But, again, I failed....

In the first training run, I used the data in "labels_ibug_300W_train.xml" as the training set, and tested both my model and dlib's on the data in "labels_ibug_300W_test.xml". My model's mean error is 0.0553159; dlib's is 0.0356526.

In the second run, I used all of the data in "labels_ibug_300W.xml" as the training set, and tested my new model again. Its mean error is 0.0398477, which is close to dlib's but not exactly the same.

In both training runs I used the default values for all parameters except the cascade depth, which was 15. I'm wondering why I can't train a model the same as dlib's with the same parameter settings and data?
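For context, the mean errors quoted in this thread are typically the average point-to-point landmark distance normalized by the interocular distance. A minimal sketch of that metric (my own illustration, not dlib's implementation; the eye indices are whatever your annotation scheme uses):

```python
import math

def mean_normalized_error(pred, truth, left_eye, right_eye):
    """Average landmark error divided by the interocular distance.
    pred/truth are lists of (x, y) tuples; left_eye/right_eye are
    indices into truth (hypothetical choices for illustration)."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    interocular = dist(truth[left_eye], truth[right_eye])
    total = sum(dist(p, t) for p, t in zip(pred, truth))
    return total / (len(truth) * interocular)

# Two ground-truth points one unit apart, predictions off by 0.1 each:
print(mean_normalized_error([(0.1, 0.0), (1.1, 0.0)],
                            [(0.0, 0.0), (1.0, 0.0)], 0, 1))  # → 0.1
```

Normalizing by interocular distance makes the number comparable across images with differently sized faces, which is why the papers report errors this way.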


Davis King said...

I don't remember the exact parameter settings I used to create the model. They were similar to the defaults but I may have tweaked something. It was long ago so I don't recall the exact settings (should have written them down). So you will just have to experiment with it yourself and see what you can get.

Unknown said...

Hi Davis,
Thanks a lot ! I will try different parameter settings!

Best Regards.

Unknown said...

Hello Davis, very interesting blog and application. Is there a way to know the orientation of the face, like fitting a plane to the nose and getting the plane's normal vector, or something like that with dlib?
I await your answer, many thanks!
Regards,
Richi

Davis King said...

Unfortunately, that's not currently a feature of dlib.

Unknown said...

Hi Davis
Thanks for dlib. I tested webcam_face_pose_ex.cpp in Visual Studio on my 64-bit desktop. I compiled it in release mode as a 32-bit console app, and the FPS of the example was 3. I found that most of the time was spent in this call:
std::vector<rectangle> faces = detector(cimg);
Can you help me?

Unknown said...

And I used AVX instructions.

Davis King said...

So you definitely read this? http://dlib.net/faq.html#Whyisdlibslow :)

If so then either your computer is just not fast enough to run it faster than that, or maybe your images are really large (e.g. HD images)

Unknown said...

Well, my desktop's CPU is a Core i7 and the picture is 640*480, captured by a web camera. I set the AVX instruction flag directly in Visual Studio; I didn't use CMake to compile. I will give it a try.

Hardik Jain said...

Hi Davis,
Thanks a lot for the wonderful code. Could you please give some detail about the bounding box detector used in the implementation?

Kevin Wood said...

@Hardik, it's all here: http://blog.dlib.net/2014/02/dlib-186-released-make-your-own-object.html

videodoes said...

This is a very nice library. Thanks for making it available to us!
So far I have only used the face detection and it works great.

Is there a way to get an age and track ID for each face? Or would it be better to use my own point tracker that keeps track of positions, ages, labels, etc.?

thanks, stephan.

Davis King said...

Thanks.

There isn't any age or face recognition tool in dlib.

Hardik Jain said...

Hi Davis,
Could you also tell us about the HOG paper used for face detection?

Davis King said...

It's described here: http://arxiv.org/abs/1502.00046

Алексей said...

Hi Serhat Aygun,

I'm also interested in using the facial landmark detector in a mobile app. Have you obtained a smaller trained model? What are its size and accuracy?

Maybe someone else is researching in this direction?

The trained model provided by Davis is very nice. Thank you, Davis! But it's very big (90 MB).

I will be very thankful for any information or comments.

Regards, Alexey

Unknown said...

Hi Davis, I've been training on part of the HELEN dataset (900 images, because I don't have enough RAM) but using 143 face points. The results have not been completely satisfactory. This is one of the pictures I used for training: http://imgur.com/QzAadZ1, and here is a successful one: http://imgur.com/Flo4SeX
By the way, I used the train_shape_predictor example to train on the images. I also adapted the interocular_distance function for the range covered by the eyes, and used the default parameters in the constructor.
Do you know if there is something else to configure to get better results? It's strange that one of the images used for training failed.

Unknown said...

Hello!

I am trying to run the landmark detection example. However, it throws the following error: "Unexpected version found while deserializing dlib::shape_predictor." To load the initial model I am using the file provided at http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2

Any ideas?

Thanks!

RHV said...
This comment has been removed by the author.
RHV said...

Is it possible to use http://dlib.net/train_shape_predictor_ex.cpp.html and http://dlib.net/face_landmark_detection_ex.cpp.html to train with a different number of landmarks? Both detect 68 landmarks in a frontal face by default. I would like to detect only 39 landmarks. (I can train fine using train_shape_predictor_ex; however, face_landmark_detection_ex outputs the attached message.)

exception thrown!

Error detected at line 25.
Error detected in file /Users/Vareto/Documents/Dlib/dlib/../dlib/image_processing/render_face_detections.h.
Error detected in function std::vector<image_window::overlay_line> dlib::render_face_detections(const std::vector<full_object_detection>&, const dlib::rgb_pixel).

Failing expression was dets[i].num_parts() == 68.
std::vector<image_window::overlay_line> render_face_detections()
Invalid inputs were given to this function.
dets[0].num_parts(): 39

Unknown said...

Hi,
I want to extract the eye and mouth coordinates.
Can you please help me? I am just a newbie in this field.

stephanschulz said...

@sara sara:

Each face's features are described by a range of points stored in the .part list of each full_object_detection variable.
For example, the left eyebrow is point range 18 to 21 and can be extracted like this:
for (unsigned long i = 18; i <= 21; ++i){
    temp_pLine.addVertex(ofPoint(d.part(i).x(), d.part(i).y()));
    // lines.push_back(image_window::overlay_line(d.part(i), d.part(i-1), color));
}

see my full code here:
https://gist.github.com/antimodular/25b58df209e20b0bd541#file-ofxdlib-cpp-L386

My code wraps dlib for openFrameworks, so it might not be straight-up C++, but it should give you a good idea of how to make this work.

Unknown said...

Thank you.
But I didn't really understand the code. Can you please give me some documentation to help me understand it?
And what is the range for each point of the face?
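For reference, the question about point ranges can be answered by the conventional iBUG 68-point groupings (zero-indexed), which dlib's 68-landmark model follows. This table records the annotation convention, not a dlib API:

```python
# Conventional iBUG 68-point landmark groups (zero-indexed,
# end-exclusive ranges). Point 38, for example, lies on the right eye.
FACE_68 = {
    "jaw": range(0, 17),
    "right_eyebrow": range(17, 22),
    "left_eyebrow": range(22, 27),
    "nose": range(27, 36),
    "right_eye": range(36, 42),
    "left_eye": range(42, 48),
    "mouth": range(48, 68),
}

assert sum(len(r) for r in FACE_68.values()) == 68
print([name for name, r in FACE_68.items() if 38 in r])  # → ['right_eye']
```

With a table like this you can slice any feature out of the shape: iterate the indices of the group you need and collect d.part(i) for each.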

Unknown said...
This comment has been removed by the author.
Unknown said...

Hi,
I tried to download the ibug 300W large face landmark dataset from http://dlib.net/files/data/ibug_300W_large_face_landmark_dataset.tar.gz, but it can't be downloaded, nor from the ibug website. Could you send a copy of it to me, please? My email address is adnewer@gmail.com.
Thank you

ryoji said...

Hi Davis,

Why did you use "unsigned long" to store the indexes of "split_feature" and the "anchor_idx" values? Would "unsigned short" be valid as well?

Thanks in advance.

Anonymous said...

How can you get an XML file for the 68 landmarks dlib predicts for each image?

Unknown said...

Hello Davis

As far as I understand uniform LBP, the value of x computed in lbp.h will consist of values in the range 0 to 58 if a 59-bin histogram is considered.

In lbp.h I found this:
const static unsigned char uniform_lbps[] = {
0, 1, 2, 3, 4, 58, 5, 6, 7, 58, 58, 58, 8, 58, 9, 10, 11, 58, 58, 58, 58, 58,
58, 58, 12, 58, 58, 58, 13, 58, 14, 15, 16, 58, 58, 58, 58, 58, 58, 58, 58, 58,
58, 58, 58, 58, 58, 58, 17, 58, 58, 58, 58, 58, 58, 58, 18, 58, 58, 58, 19, 58,
20, 21, 22, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58,
58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 23, 58, 58, 58, 58, 58,
58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 24, 58, 58, 58, 58, 58, 58, 58, 25, 58,
58, 58, 26, 58, 27, 28, 29, 30, 58, 31, 58, 58, 58, 32, 58, 58, 58, 58, 58, 58,
58, 33, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 34, 58, 58,
58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58,
58, 58, 58, 58, 58, 58, 58, 58, 58, 35, 36, 37, 58, 38, 58, 58, 58, 39, 58, 58,
58, 58, 58, 58, 58, 40, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58,
58, 41, 42, 43, 58, 44, 58, 58, 58, 45, 58, 58, 58, 58, 58, 58, 58, 46, 47, 48,
58, 49, 58, 58, 58, 50, 51, 52, 58, 53, 54, 55, 56, 57
};

I just want to know how these values have been arranged in this manner. What is the logic behind this arrangement of values? Please give me some reference where I could understand the logic behind the code above.

Thanking you
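The table above can be generated mechanically: an 8-bit pattern is "uniform" if it has at most two 0/1 transitions when its bits are read circularly; the 58 uniform patterns are numbered in increasing order, and every non-uniform pattern maps to the catch-all bin 58. A sketch that reproduces the table (my own reconstruction of the rule, not dlib's code):

```python
def uniform_lbp_table():
    """Rebuild the uniform_lbps[] lookup table: uniform patterns
    (at most 2 circular bit transitions) get labels 0-57 in increasing
    numeric order; all non-uniform patterns map to 58."""
    table, label = [], 0
    for v in range(256):
        bits = [(v >> i) & 1 for i in range(8)]
        transitions = sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
        if transitions <= 2:
            table.append(label)
            label += 1
        else:
            table.append(58)
    return table

t = uniform_lbp_table()
print(t[:8])  # → [0, 1, 2, 3, 4, 58, 5, 6], matching the table above
```

For example, 5 = 00000101 has four circular transitions, so it is non-uniform and maps to 58, while 6 = 00000110 has two and becomes the sixth uniform pattern (label 5), exactly as in the printed array.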

Unknown said...

Hi,

I tested one of the provided examples, dlib-18.18\examples\face_landmark_detection_ex.cpp,
and measured the execution time with omp_get_wtime.
I commented out all cout calls and window visualization, and also pyramid_up(img) (because I actually need to detect only the single biggest face).
The image size is 320*240 and it takes 3 seconds.
I use Visual Studio 2010 in release mode with optimization.
Enable Enhanced Instruction Set is Streaming SIMD Extensions (/arch:SSE).
Can you give some guidance on making it run faster, in milliseconds?
The goal is to use it in web and mobile applications.
Thanks for any help.

Davis King said...

http://dlib.net/faq.html#Whyisdlibslow

Sounds like it's not really in release mode.

Unknown said...

No, I am in release Win32 (I do not see another option to build for x64).
I've added screenshots; you can verify via this link:
https://drive.google.com/open?id=0B3Ve8Gmg6qDhN1cxWTlmUG9RNUU

Also, I did not succeed in compiling on Windows with the instructions provided on the website. I did it directly in Visual Studio, exactly as mentioned here (in the answer):
http://stackoverflow.com/questions/32736149/how-to-load-jpeg-file-using-dlib-libarary

After performing those steps it compiles, but as I mentioned it takes 3 seconds. In debug it is very slow (64 seconds per image).
Thanks for any help.

PS: Pascal and I work together.

Unknown said...

Hi, Davis!

I think I understand why it is slow: I do not have the option to optimize with SSE4 or AVX, so I ended up using SSE2.

However, I have some questions:

- Is it possible to use the library in a mobile/web application so that it works in real time? I am not familiar with optimization settings and so on, so I'm interested in how many restrictions there are.

- The file shape_predictor_68_face_landmarks.dat weighs 95 MB. I need just face + landmark detection; does this file contain redundant information? Is it possible to regenerate this file?

Thanks for any help.

Best regards

Unknown said...
This comment has been removed by the author.
Unknown said...
This comment has been removed by the author.
Unknown said...

Hi Davis, first of all I want to thank you for your help, and I would be glad if you could help me with something.

When I tried cout << "the coordinate: " << round(shape.part(38)) << endl;
I get the x and y coordinates together,
but I need x and y separately to continue my work. Help, please!
I tried something like cout << shape.part(38).x << endl; but got nothing. Thank you again.

Unknown said...

Hey Aniche Ahlem,
did you try round(shape.part(38).x()) and round(shape.part(38).y())? (x() and y() are member functions, hence the parentheses.) That should return the correct x and y values of the point.

Hope that helps :-)

quaffle29 said...

Hi!

Thank you for this implementation; it works fine! Have you created a .dat file for the 194 landmarks that are mentioned in Kazemi's and Sullivan's paper? Or do you know where I could find such a file in order to detect more than 68 landmarks?

Thanks a lot!

Davis King said...

Thanks.

I don't have any other .dat files. For the 194 landmarks you will have to train the model yourself I'm afraid.

Xan63 said...

@quaffle29
We have a .dat file for 194 points (the model used in Kazemi's paper). It is 183 MB.
We also have other .dat files for models with 27, 75, 77 or 197 points.

you can contact us at contact@wisimage.com

Unknown said...
This comment has been removed by the author.
Unknown said...
This comment has been removed by the author.
Unknown said...
This comment has been removed by the author.
Unknown said...

Hello! :)

I encountered a strange error: whenever I include #include <dlib/gui_widgets.h> in my project and declare a variable (for example dlib::image_window win), the following errors appear: "DLIB_NO_GUI_SUPPORT is defined so you can't use the GUI code. Turn DLIB_NO_GUI_SUPPORT off if you want to use it." and "Also make sure you have libx11-dev installed on your system" (from gui_core_kernel_2.h). I searched and found suggestions that dlib's CMake configuration could have failed, but I really doubt so, since I'm already doing landmark detection.

The reason I'm trying to include and declare one of the widgets is to display values on the screen (there is no equivalent of OpenCV's putText, is there?).

I would be very grateful for any help. :)

Cheers,

Teresa

Unknown said...

1. How do I get shape_predictor_194_face_landmarks.dat? Are there any links to download it directly?
2. Also, do we need to edit any field in the dlib face landmark project to indicate whether the .dat is for 68 landmarks or 194?

I see the procedure to generate the .dat file using HELEN images gets stuck due to the unavailability of the XMLs requested by the dlib project (train_shape_predictor_ex).
>> http://stackoverflow.com/questions/36711905/dlib-train-shape-predictor-ex-cpp?answertab=votes#tab-top has a somewhat puzzled answer.

Regards
Gopi. J

Davis King said...

I don't have a model file for that dataset, so you will need to train your own. You don't need to edit any dlib code. The example programs and documentation give full details of how to use dlib.

Unknown said...

Trained the 194 points successfully :)
Posted the details here http://stackoverflow.com/questions/36711905/dlib-train-shape-predictor-ex-cpp

Davis King said...

Sweet :)

Unknown said...

Hi Davis,
I tried to train a model of 68 landmarks with the iBUG 300-W dataset as you did. However, only 535 images were detected correctly, and I used train_shape_predictor_ex with default parameters. The mean test error is 0.07 while yours is 0.04 on the same test dataset. Can you provide some details of the training process?

Thank you.

Bin

Davis King said...

I'm not sure what you mean by only 535 images detected correctly. The detector in dlib is more accurate than that. Maybe you are using the wrong dataset. The dataset I used is available here: http://dlib.net/files/data/ibug_300W_large_face_landmark_dataset.tar.gz

Unknown said...

Hi Davis
I tested your landmark training on my 64-bit desktop and set
trainer.set_oversampling_amount(300);
trainer.set_nu(0.05);
trainer.set_tree_depth(10);

But the resulting "sp.dat" is 3.8 GB. The data I used is http://dlib.net/files/data/ibug_300W_large_face_landmark_dataset.tar.gz.
Could you tell me why "sp.dat" is so big? I hope you can understand my poor English.

Davis King said...

It's because you set the tree depth to 10.
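That growth is easy to estimate: the dominant cost is one (Δx, Δy) value pair per landmark stored at every leaf, and the leaf count doubles with each level of tree depth. A rough back-of-envelope calculator (my own sketch, assuming 4-byte floats, 500 trees per cascade level, and ignoring the smaller split-node data):

```python
def predictor_size_bytes(cascade_depth, trees_per_cascade,
                         tree_depth, num_landmarks, bytes_per_value=4):
    """Approximate shape-predictor model size: every leaf stores a
    2 * num_landmarks shape-update vector. Split nodes and headers
    are ignored, so this is a lower bound."""
    leaves = 2 ** tree_depth
    return (cascade_depth * trees_per_cascade * leaves
            * 2 * num_landmarks * bytes_per_value)

# cascade depth 15, 500 trees per cascade, tree depth 10, 68 landmarks:
print(predictor_size_bytes(15, 500, 10, 68) / 2**30)  # ≈ 3.9 GiB

# the default tree depth of 4 gives a far smaller model:
print(predictor_size_bytes(15, 500, 4, 68) / 2**20)   # ≈ 62 MiB
```

Depth 10 means 1024 leaves per tree instead of 16, which by itself accounts for the jump from tens of megabytes to gigabytes reported above.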

Unknown said...

Hi,
when you trained the model, are these the settings you used?
trainer.set_oversampling_amount(300);
trainer.set_nu(0.05);
trainer.set_tree_depth(2);
Thank you very much.

Davis King said...

No, if I recall correctly, I used the default settings except the cascade depth was set to 15.

Unknown said...

Hi
when I trained model, I set
trainer.set_oversampling_amount(300);
trainer.set_nu(0.05);
trainer.set_tree_depth(15);

but I got an "sp.dat" as big as 8.7 GB; it is driving me mad.

I hope you can tell me why.
Thank you.

Unknown said...

Hi Davis,
When I used the train_shape_predictor example, I printed the landmarks read from the original test XML file by the load_image_dataset function one by one, and found that the order of some landmarks is not the same as what is written in the XML file. I read your code and found nothing explaining this behavior. Do you have any hints?

Thank you.

Davis King said...

This is discussed in the documentation: http://dlib.net/dlib/data_io/load_image_dataset_abstract.h.html#load_image_dataset

Note that it says: "parts_list is in lexicographic sorted order."

Unknown said...

Thanks for your reply.
However, I just passed three parameters to the function as the example did. I printed both the landmarks' values and indexes in the test dataset. Some points, for example some landmarks on the face edge, appeared after the eyes, which I believed should be the first 41 points in the XML file (I used 194 points to train). I couldn't see any order in these points, and that gave me huge trouble testing the accuracy of the model because of the wrong indexes of the eye landmarks.

Davis King said...

https://en.wikipedia.org/wiki/Lexicographical_order

tzutalin said...

Mobile version (Android)
Demo video : https://www.youtube.com/watch?v=TbX3t7QNhvs&feature=autoshare

Source : https://github.com/tzutalin/dlib-android-app

Unknown said...

Hi Davis,
I've found the reason and solved the problem. It was because the part name in my xml file was, for example "1", rather than "001".
Anyway thank you.
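The pitfall above is easy to reproduce in a few lines: lexicographic order compares digit strings character by character, so zero-padding the part names restores numeric order:

```python
names = ["1", "2", "10", "100", "20"]
# Lexicographic (string) order interleaves the numbers:
print(sorted(names))   # → ['1', '10', '100', '2', '20']
# Zero-padding makes lexicographic and numeric order coincide:
padded = [n.zfill(3) for n in names]
print(sorted(padded))  # → ['001', '002', '010', '020', '100']
```

This is why part names like "001" behave as expected while "1" does not: load_image_dataset returns parts in lexicographic sorted order regardless of how they appear in the XML.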

Hoc Nguyen said...

I am not sure you mean real time. I built and executed 'webcam head pose' and it's really, really slow :(

Unknown said...

Davis,

I am trying to build an app which applies real-time makeup to a webcam-detected face, like https://www.youtube.com/watch?v=C6hI65ZMSxM. I am trying to extract different facial features. Which code should I start changing, and what changes should I make?

Thanks.

Unknown said...

Davis,

I've been trying to train a model using your example program and dataset with your 68-point markup, but I have trimmed 60 of the 68 points and kept the remaining 8 that I need. Unfortunately, once training completes I receive -nan for the testing and training error. The model that is saved is invalid as well (as I would expect from the -nans). I've tried training with the original unedited 68-point xml files, and training completes successfully with valid error values and a valid model.

The only difference between the two sets of .xml files in these two cases is the # of parts. Here is an example of my edited parts list with the 8 points I need:









The xml file opens appropriately in a viewer and the parts are all in the correct spot, as is the bounding box.

If you have any ideas on what could be causing the -nan, or where I should be looking, let me know.

Thanks

Shubham said...

Hi Davis,
I tested landmark detection on an embedded platform (Linux/ARM); however, the speed is very slow. Can you please tell me how to optimize for the ARM architecture using SIMD extensions (similar to SSE2, SSE4, and AVX on x86 processors)?

Thanks.

Unknown said...

Hi,

I am using face_landmark_detection_ex.cpp to detect and then extract cropped faces, but I need the faces at their original width and height. Unfortunately this code gives faces with equal width and height (I can change it to crop faces with the height and width of the rectangles returned from the detection function; however, those width and height values are the same). What I need is the real height and width of the face (presumably the height is greater than the width of a usual face). Is there any way I can do it with this library or any other open source library?

Thanks
Rakib

Davis King said...

I'm not sure I understand exactly what you want to do, but you can definitely crop out images any way you like with dlib.
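One way to do that: since the detector returns a square box, you can derive a taller crop rectangle yourself before cropping. A hypothetical helper (the function name and the 1.3 aspect ratio are my own illustration, not part of dlib):

```python
def face_crop_box(left, top, right, bottom, aspect=1.3):
    """Given a square detection box, keep its width but expand the
    height to width * aspect about the box center, approximating a
    natural taller-than-wide face crop."""
    width = right - left
    new_half_h = width * aspect / 2.0
    center_y = (top + bottom) / 2.0
    return left, center_y - new_half_h, right, center_y + new_half_h

print(face_crop_box(0, 0, 100, 100))  # → (0, -15.0, 100, 115.0)
```

The resulting rectangle may extend past the image border (as in the example), so clamp it to the image bounds before cropping.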

Mike said...

Hi Davis,
we want to use the landmark detection on a DSP with no floating support. Is that feasible?

Davis King said...

I imagine the algorithm would work fine in fixed point arithmetic.

Mike said...

Thanks for your feedback. Do you know of any ports to plain C?

Davis King said...

no

Mike said...

Hi Davis,
can you give me an idea of the smallest size the pose model might be while still producing meaningful results? Our embedded system has severe restrictions on memory footprint, and the 95 MB for the data is far too much. We have only 128 MB of RAM for the complete system. The number of landmarks would have to be restricted to, let's say, 16 and the image size to VGA resolution. Before I go through a training process I wanted to get your take on this.
Thanks.

Davis King said...

The size of the model is linear in the number of points, so you can reduce the size a lot by dropping landmarks. Other parameters affect the size as well. You can definitely make a small model.

Mike said...

Hi Davis, I looked into training for a smaller number of landmarks, but couldn't find anything in the code for setting the number of points being trained. Do I have to edit the training data and remove all unwanted landmarks, or is there a better way of doing this? I was shooting for < 10 landmarks per face.

Davis King said...

You have to make a dataset that has whatever landmarks you want. There isn't some kind of hard coded thing in dlib that assumes you have a particular set of landmarks.

Mike said...

Are you familiar with the CMU OpenFace project? To my knowledge they are using dlib's face detector but a different facial landmark detector. Do you know why?

Do I need 32 GB of memory for training? I read something in the blog...

Davis King said...

I know about openface. It says right on the front of their web page that openface uses dlib's facial landmark detector.

Mike said...

You are right. The pose model is really fast, but I am struggling with the face detector that yields the bounding boxes. Is there a way to speed it up? Restrict face size, number of faces, or the resolution of the input image? Any ideas are appreciated.

Mike said...

Hi Davis,
can I use imglab to remove landmarks (I tried --rmlabel without success) and renumber the remaining ones?
Or do I have to write a program that modifies the .xml files?

Mike said...

Dear Davis,
just to give you some feedback: I trained the landmark detector on the ibug training set for only 10 landmarks and reduced the file size of the sp.dat file down to 2.8 MB.
Now I have to remove the floating point variables in the detector and the pose model for implementation on a fixed-point DSP.

Unknown said...

Hi Davis
How would I know which landmark belongs to which position on the face? E.g., does landmark number 5 belong to the eye, the lips, etc.?

Unknown said...

I want to use ASM for feature extraction, then use an SVM classifier for recognition.
Which is better for feature extraction, the asm-opencv library or dlib, and what's the difference?

Unknown said...

Hi Davis
There is a problem with jitter of the predicted shape in real-time video.
Can you suggest which training parameters help reduce this effect?

Luis Beltran said...

Hi Davis,

I would like to examine the .dat file. Could you please tell me the format inside, or how to open it in a readable way? I would like to know "what's inside" and not just "take it for granted" :)

Thank you

Davis King said...

Then you should read the code (http://dlib.net/faq.html#Whereisthedocumentationforobjectfunction) and the referenced paper in this blog post. Then you will understand.

Unknown said...

Hello Mr.King,

In the HELEN training images there are many landmarks (annotations provided by HELEN) that fall outside the box detected and generated by dlib's detector.

Is this an issue for training?

Thank you very much,
Regards.

Davis King said...

No

Unknown said...

Hello Mr. King,

I've trained a shape predictor with your data (6,666 training + 1,008 test images).
Parameter values are the defaults plus cascade_depth = 15.
Results: mean training error 0.0369526, mean testing error 0.0555288, sp size 97,302.
Are those results normal?

Thank you very much, best regards.

Unknown said...

Hi, how can I find face landmarks with dlib in Android Studio in real time using Java? Please help me :(

Unknown said...

Hi Davis!

Just started using dlib, and it is simply wonderful! Thanks for putting this together.
I am interested in an application where a single keypoint of a given image must be localized very accurately. Do you think I should go with a shape predictor with only one point, or rather try to recast the problem as a detection problem?

Laszlo

Davis King said...

Thanks, glad you like it :)

If it's just one point I would go with a normal detector. However, the specific details of the problem can be important and for some things you need specialized solutions.

Unknown said...

Hi Davis,
Thanks again for such a great library!
I am training a shape detector for the human face, as I need additional landmarks on the ear; I'm training for a total of 12 landmarks. After creating sp.dat from training, I use it to detect landmarks on images. However, 5 landmarks are always off; the rest all align perfectly. Can you give some pointers on how to fix it?

Probably related is this training log:
Fitting trees...
...
mean training error: -nan
mean testing error: 0.00216595

K.Pravallika said...

Hi Davis King,
First of all, I really appreciate your efforts in developing dlib and helping others by replying to almost everyone. I am trying to use dlib for real-time face recognition and it is a bit slow, so I compiled it after turning AVX instructions on. This sped up the code, but the accuracy has fallen drastically. Can you tell me why this is so?


thanking you ,

K.P

Davis King said...

Thanks, I'm glad you like dlib :)

Turning AVX on or off should make no difference in the output. Something else must have gone wrong. Also: http://dlib.net/faq.html#Whyisdlibslow

Unknown said...

Hi Davis,

Thanks for your amazing work. I was wondering why the .dat file provided with the example is so big. How can I make it smaller? Would training the shape predictor on fewer landmarks and/or images reduce its size? Any other suggestions?

Cheers

Davis King said...

The size is linear in the number of landmarks, so if you retrain with fewer landmarks it will be smaller. It is also linear in a number of other training parameters. The dlib API documents all the parameters, which are also described in great detail in the original Kazemi paper.

Unknown said...

Hi Davis,

Thanks again for letting everyone use this great library.
I noticed that the shape_predictor has an option to return some kind of feature vector. How do I interpret this feature vector? I understand that it contains 15 times 500 values, because the landmark detector has a cascade of 15 forests with 500 trees each, but I do not completely understand what these 7500 values represent.

Do you think that it would be possible to use this feature vector to get some kind of quality measure of the landmark fit (or check if the input image contained a face)?

Davis King said...

See: http://dlib.net/dlib/image_processing/shape_predictor_abstract.h.html#shape_predictor

It might give you a quality measure, it's tough to say. You can certainly use it to decide if a point is occluded or something similar. Maybe to know if a face is present. But probably not much about landmark fit since if it had excess information about landmark fit it probably would have fit the landmark better in the first place. But you never know until you try.

Unknown said...

Thanks!

Could you give me a hint on how to use it for detecting occluded landmarks?

Davis King said...

Use a linear SVM to train a classifier.

Unknown said...

Hi Davis;
Are there any additional issues to consider, such as restrictions on using the iBUG 300-W dataset for training in a commercial product?

thanks.

Davis King said...

Not that I am aware of.

Unknown said...

Hi, I am working on a project involving face detection and eye-blink detection. Face detection is working fine, but can anyone tell me how to detect eye blinking in video? Please let me know; it's an urgent project.

Unknown said...

Hi guys, I am working on a project involving face detection and eye blinking. Face detection works perfectly, but blink detection is not working reliably on the video stream. Which method would help me? So far I have used dlib. If anyone has sample source code, please mail me.

Unknown said...

When I use the predictor shape_predictor_68_face_landmarks.dat, does it come from the "One Millisecond Face Alignment..." paper or the "300 Faces In-The-Wild Challenge" paper? Could you check GitHub (https://github.com/davisking/dlib-models)? Please.

Unknown said...

Hi Davis,

Amazing library! Thank you!

I made a Python module to align faces together by landmarks, and am currently importing dlib as a dependency. It works like magic, but dlib is massive and I really only need the pre-trained face detector and the 68-point landmark finder.

I'm looking into taking dlib apart to make the relevant bits part of my module, but after looking through all the files I'm completely overwhelmed. Would you have any advice on how best to proceed?

Any kind of response would be welcome, like:
"I never built this for taking apart so I'm as lost as you, my friend :("
"It's not worth it as detection and landmark localization require a large part of DLIB anyway :("
"This has been done before. Look it up fool!"
"Unless you're a pro at PYBIND11, C++ and Make, do not even bother, fool!"

As my starting point, I'm tracing all the dependencies of /tools/python/src/object_detection.cpp.
Then I'll try to compile a version with only that (plus any Python code and pybind11). But this seems really naive.


Carl

Davis King said...

Yeah you can delete files from tools/python/src. You will have to modify the tools/python/src/dlib.cpp file and the CMakeLists.txt file in there to remove the files you deleted. Nothing wrong with that. Same with removing files you aren't using from dlib/CMakeLists.txt.

Unknown said...

The landmarks here are very stable. I am using the 68_face_landmarks.dat model, but it's not as stable as this. So where can I download this landmark model?

root said...

I'm trying to understand how the algorithm for detecting face landmark keypoints works.
Is there any video or reference I can go to?

Davis King said...

The paper One Millisecond Face Alignment with an Ensemble of Regression Trees by Vahid Kazemi and Josephine Sullivan explains it well.

root said...

Hi Davis, could you share the values that you used for this training?

tree_depth:
nu:
cascade_depth:
feature_pool_size:
num_test_splits:
oversampling_amount:
oversampling_translation_jitter:

Davis King said...

Use the default settings in dlib or look at what the Kazemi paper says to use.
