One Millisecond Face Alignment with an Ensemble of Regression Trees by Vahid Kazemi and Josephine Sullivan. As the name suggests, it allows you to perform face pose estimation very quickly. In particular, this means that if you give it an image of someone's face it will add this kind of annotation:
In fact, this is the output of dlib's new face landmarking example program on one of the images from the HELEN dataset. To get an even better idea of how well this pose estimator works take a look at this video where it has been applied to each frame:
It doesn't just stop there though. You can use this technique to make your own custom pose estimation models. To see how, take a look at the example program for training these pose estimation models.
330 comments:
It depends on your computer. On mine, I saw about 2ms per face for 68 landmarks.
What can I do to reduce the false positives? With dlib 18.18, the example face_landmark_detection_ex draws a face on someone's lap.
Brian, you need to use a face detector before using the shape predictor. The shape predictor will always fit the shape (a face in this case) the best it can. In your case it happened to be in someone's lap. You have to give a pretty good bounding box for the detection and the shape predictor will fit the pose.
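In code, that detector-then-predictor pipeline looks like this (a minimal sketch condensed from the face_landmark_detection_ex example; the image path and model filename are placeholders):

    #include <dlib/image_processing/frontal_face_detector.h>
    #include <dlib/image_processing.h>
    #include <dlib/image_io.h>

    using namespace dlib;

    int main()
    {
        // The detector proposes face bounding boxes; the shape predictor
        // then fits the 68 landmarks inside each box.
        frontal_face_detector detector = get_frontal_face_detector();
        shape_predictor sp;
        deserialize("shape_predictor_68_face_landmarks.dat") >> sp;

        array2d<rgb_pixel> img;
        load_image(img, "face.jpg");

        std::vector<rectangle> faces = detector(img);
        for (const rectangle& face : faces)
        {
            full_object_detection shape = sp(img, face);
            // shape.part(i) is the i-th landmark as a dlib::point.
        }
    }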
I searched for how to estimate head pose from the landmarks.
Use an Active Appearance Model and the POSIT function:
First, prepare average frontal face points.
Second, get the landmarks from your image.
Then use the POSIT function to estimate the direction.
But I don't know how to do this in Python. Where is the Python tutorial code?
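dlib itself doesn't ship a head pose estimator (see Davis's reply to a similar question further down), but a common recipe replaces POSIT with OpenCV's solvePnP: pair a handful of detected landmarks with a generic 3D face model. A sketch in C++ (the 3D model coordinates below are illustrative approximations, not calibrated measurements):

    #include <opencv2/calib3d.hpp>
    #include <vector>

    // Landmark indices in the 68-point scheme: nose tip 30, chin 8,
    // eye outer corners 36/45, mouth corners 48/54. Pass those six
    // detected points, in this order, as image_points.
    cv::Vec3d estimate_head_pose(const std::vector<cv::Point2d>& image_points,
                                 double focal_length, cv::Point2d center)
    {
        // Generic 3D face model in millimeters (illustrative values).
        std::vector<cv::Point3d> model_points = {
            {0.0, 0.0, 0.0},          // nose tip
            {0.0, -330.0, -65.0},     // chin
            {-225.0, 170.0, -135.0},  // left eye outer corner
            {225.0, 170.0, -135.0},   // right eye outer corner
            {-150.0, -150.0, -125.0}, // left mouth corner
            {150.0, -150.0, -125.0}   // right mouth corner
        };
        cv::Mat camera_matrix = (cv::Mat_<double>(3, 3) <<
            focal_length, 0, center.x,
            0, focal_length, center.y,
            0, 0, 1);
        cv::Mat rvec, tvec;
        cv::solvePnP(model_points, image_points, camera_matrix,
                     cv::Mat::zeros(4, 1, CV_64F), rvec, tvec);
        // rvec is the head rotation in axis-angle form; convert with
        // cv::Rodrigues if you need a rotation matrix.
        return cv::Vec3d(rvec.at<double>(0), rvec.at<double>(1), rvec.at<double>(2));
    }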
Kevin, the face_landmark_detection_ex example does use a frontal_face_detector
I am running the webcam_face_pose_ex code, but for me it runs very slowly. During configuration in CMake I enabled -DUSE_SSE2_INSTRUCTIONS=ON, -DUSE_SSE4_INSTRUCTIONS=ON, and -DUSE_AVX_INSTRUCTIONS=ON.
Does this resolve your problem? http://dlib.net/faq.html#Whyisdlibslow
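If not, double-check that the flags actually reached the compiler; a typical out-of-source release build looks like this (note the leading -D on each option):

    mkdir build && cd build
    cmake .. -DCMAKE_BUILD_TYPE=Release -DUSE_AVX_INSTRUCTIONS=ON
    cmake --build . --config Release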
Hi Davis
I used dlib 18.12 to train a landmark detection model with the HELEN dataset successfully, but when I replace the data with 80,000+ images, I get an invalid value (-1.#IND) for the mean training error.
Are there any limitations on the size of the training dataset? Could you give me some suggestions to solve this problem?
Thanks a lot!
There is no limit in the software. You can use as big a dataset as you want so long as your machine has enough RAM to load it.
You probably have something wrong with your data. Maybe it has erroneous values in it or you didn't load it correctly. I can't say.
Zhang, what data set are you using that has 80,000 images?
Hi, Stephen Moore. The data was collected by our lab's students.
Hi Davis,
Thanks for your quick response!
With the data you provided (http://dlib.net/files/data/), I trained a model and tested it, getting a mean error of 0.0553159, which is worse than the model you provided (mean error 0.0356526).
During training, all parameters were set to their default values except the cascade depth, which was set to 15. Why can't I reproduce your model? Is there anything else that needs to be considered?
Hao Zhang can you email me stephen.maurice.moore@gmail.com please
Hi, Stephen Moore. I'm afraid that I can't send the dataset to you directly, because it's not open currently.
How are you testing it? The model that comes with dlib was trained on the entire dataset. So you can't check how good it is by running the test portion of that dataset. You would need to check it against something else.
hi Davis,
I use labels_ibug_300W_train.xml to train model and labels_ibug_300W_test.xml to test.
Should I use all images (included in labels_ibug_300W.xml) to train?
I trained it on everything when I made the dlib model. I don't know what you should do because I don't know what you are trying to accomplish or why you want to train at all. What you should do depends on what you want to accomplish. Only you can answer that question :)
Hi Davis,
Thanks for you support!
Recently I read the paper "One Millisecond Face Alignment with an Ensemble of Regression Trees". I tried to reproduce the paper's results, but I failed.
Then I found that dlib provides a landmark prediction model and it's pretty good. Now I want to train a model with the dataset and parameter settings you provide to reproduce it. But, again, I failed...
In the first training run, I used the data in "labels_ibug_300W_train.xml" as the training set, and I tested my model and dlib's on the dataset in "labels_ibug_300W_test.xml". My model's mean error is 0.0553159; dlib's is 0.0356526.
In the second run, I used all of the data in "labels_ibug_300W.xml" as the training set and tested my new model again. Its mean error is 0.0398477, which is close to dlib's but not exactly the same.
In both training runs, I used the default values for all parameters except the cascade depth, which was set to 15. I'm wondering why I can't train a model the same as dlib's with the same parameter settings and data?
I don't remember the exact parameter settings I used to create the model. They were similar to the defaults but I may have tweaked something. It was long ago so I don't recall the exact settings (should have written them down). So you will just have to experiment with it yourself and see what you can get.
Hi Davis,
Thanks a lot ! I will try different parameter settings!
Best Regards.
Hello Davis, very interesting blog and application. Is there a way to know the orientation of the face, like fitting a plane to the nose and getting the plane's normal vector, or something like that with dlib?
I await your answer, many thanks!
regards
Richi
Unfortunately, that's not currently a feature of dlib.
Hi Davis
Thanks for dlib. I tested webcam_face_pose_ex.cpp in Visual Studio on my 64-bit desktop. I compiled it in release mode as a 32-bit console application, and the FPS of the example was 3. I found that most of the time was spent in this call:
std::vector<rectangle> faces = detector(cimg);
Can you help me?
And I used AVX instructions.
So you definitely read this? http://dlib.net/faq.html#Whyisdlibslow :)
If so then either your computer is just not fast enough to run it faster than that, or maybe your images are really large (e.g. HD images)
Well, my desktop's CPU is a Core i7 and the picture is 640*480, captured by a web camera. I set the AVX instructions directly in Visual Studio; I didn't use CMake to compile. I will have a try.
Hi Davis,
Thanks a lot for the wonderful code. Could you please give some detail about the bounding box detector used in the implementation?
@Hardik, it's all here: http://blog.dlib.net/2014/02/dlib-186-released-make-your-own-object.html
this is a very nice library. thanks for making it available to us !
so far i have only used the face detection and it works great.
is there a way to get age and track IDs for each face? or would it be better to use my own point tracker that keeps track of positions, age, labels etc?
thanks, stephan.
Thanks.
There isn't any age or face recognition tool in dlib.
Hi Davis,
Could you also tell us about the HOG paper used for face detection.
It's described here: http://arxiv.org/abs/1502.00046
Hi Serhat Aygun,
I'm also interested in using the facial landmark detector within a mobile app. Have you obtained a smaller trained model? What are its size and accuracy?
Maybe someone else is researching in this direction?
The trained model provided by Davis is very nice. Thank you, Davis! But it's very big (90 MB).
I will be very thankful for any information or comments.
Regards, Alexey
Hi Davis, I've been training on part of the HELEN dataset (900 images, because I don't have enough RAM) but using 143 face points. The results have not been completely satisfactory; this is one of the pictures I used for training http://imgur.com/QzAadZ1, and here is a successful one http://imgur.com/Flo4SeX
By the way, I used the Train Shape Predictor example to train the images, and I adapted the interocular_distance function for the range covered by the eyes; I also used the default parameters in the constructor.
Do you know if there is something else to configure to get better results? It's strange that one of the images used for training failed.
Hello!
I am trying to run the landmarks detection example. However it throws the following error: Unexpected version found while deserializing dlib::shape_predictor. To load the initial model I am using the file provided at: http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2
Any ideas?
Thanks!
Is it possible to use http://dlib.net/train_shape_predictor_ex.cpp.html and http://dlib.net/face_landmark_detection_ex.cpp.html to train with a different number of landmarks? Both detect 68 landmarks in a frontal face by default. I would like to detect 39 landmarks only. I can train all right using train_shape_predictor_ex; however, face_landmark_detection_ex outputs the message below:
exception thrown!
Error detected at line 25.
Error detected in file /Users/Vareto/Documents/Dlib/dlib/../dlib/image_processing/render_face_detections.h.
Error detected in function std::vector<dlib::image_window::overlay_line> dlib::render_face_detections(const std::vector<dlib::full_object_detection>&, const dlib::rgb_pixel).
Failing expression was dets[i].num_parts() == 68.
std::vector<image_window::overlay_line> render_face_detections()
Invalid inputs were given to this function.
dets[0].num_parts(): 39
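render_face_detections() hard-codes the 68-point layout (it draws fixed line segments between specific part indices), so any other part count trips that assertion. For a 39-point model you can draw the parts yourself; a minimal sketch using dlib's drawing helpers:

    #include <dlib/image_processing.h>
    #include <dlib/image_transforms.h>

    // Draw each landmark as a dot instead of using render_face_detections(),
    // which assumes exactly 68 parts.
    template <typename image_type>
    void draw_landmarks(image_type& img, const dlib::full_object_detection& det)
    {
        for (unsigned long i = 0; i < det.num_parts(); ++i)
            dlib::draw_solid_circle(img, det.part(i), 2, dlib::rgb_pixel(255, 0, 0));
    }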
Hi,
I want to extract the eye and mouth coordinates.
Can you please help me? I am just a newbie in this field.
@sara sara:
each face's features are described by a range of points that are stored in the .part list of each full_object_detection variable.
For example, the left eyebrow covers points 18 to 21 and can be extracted like this:
for (unsigned long i = 18; i <= 21; ++i){
    temp_pLine.addVertex(ofPoint(d.part(i).x(), d.part(i).y()));
    // lines.push_back(image_window::overlay_line(d.part(i), d.part(i-1), color));
}
see my full code here:
https://gist.github.com/antimodular/25b58df209e20b0bd541#file-ofxdlib-cpp-L386
my code wraps dlib for openFrameworks, so it might not be straight-up C++, but it should give you a good idea of how to make this work.
Thank you.
But I didn't really understand the code. Can you please give me some documentation to help me understand it?
And what is the range for each point of the face?
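For the 68-point model the indices follow the iBUG 300-W annotation (0-based in dlib): jaw 0-16, eyebrows 17-26, nose 27-35, one eye 36-41, the other eye 42-47, mouth 48-67 (whether 36-41 is the "left" or "right" eye depends on whether you count from the subject's or the viewer's side). A sketch that collects the eye and mouth coordinates:

    #include <dlib/image_processing.h>
    #include <vector>

    // Collect a contiguous range of landmarks from a fitted shape.
    std::vector<dlib::point> get_range(const dlib::full_object_detection& shape,
                                       unsigned long first, unsigned long last)
    {
        std::vector<dlib::point> pts;
        for (unsigned long i = first; i <= last; ++i)
            pts.push_back(shape.part(i));
        return pts;
    }

    // usage:
    //   auto eye1  = get_range(shape, 36, 41);
    //   auto eye2  = get_range(shape, 42, 47);
    //   auto mouth = get_range(shape, 48, 67);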
Hi,
I tried to download the ibug 300W large face landmark dataset from http://dlib.net/files/data/ibug_300W_large_face_landmark_dataset.tar.gz, but it can't be downloaded, nor from the ibug website. Could you send a copy of it to me, please? My email address is adnewer@gmail.com.
Thank you
Hi Davis,
Why did you use "unsigned long" to store the indexes of "split_feature" and the "anchor_idx" values? Would "unsigned short" be valid as well?
Thanks in advance.
How can you get an XML file for the 68 landmarks dlib predicts for each image?
Hello Davis
As far as I understand uniform LBP, the computed value of x in lbp.h should consist of values in the range 0 to 58 if a 59-bin histogram is considered.
In lbp.h I found this:
const static unsigned char uniform_lbps[] = {
0, 1, 2, 3, 4, 58, 5, 6, 7, 58, 58, 58, 8, 58, 9, 10, 11, 58, 58, 58, 58, 58,
58, 58, 12, 58, 58, 58, 13, 58, 14, 15, 16, 58, 58, 58, 58, 58, 58, 58, 58, 58,
58, 58, 58, 58, 58, 58, 17, 58, 58, 58, 58, 58, 58, 58, 18, 58, 58, 58, 19, 58,
20, 21, 22, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58,
58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 23, 58, 58, 58, 58, 58,
58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 24, 58, 58, 58, 58, 58, 58, 58, 25, 58,
58, 58, 26, 58, 27, 28, 29, 30, 58, 31, 58, 58, 58, 32, 58, 58, 58, 58, 58, 58,
58, 33, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 34, 58, 58,
58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58,
58, 58, 58, 58, 58, 58, 58, 58, 58, 35, 36, 37, 58, 38, 58, 58, 58, 39, 58, 58,
58, 58, 58, 58, 58, 40, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58,
58, 41, 42, 43, 58, 44, 58, 58, 58, 45, 58, 58, 58, 58, 58, 58, 58, 46, 47, 48,
58, 49, 58, 58, 58, 50, 51, 52, 58, 53, 54, 55, 56, 57
};
I just want to know how these values have been arranged in this manner. What is the logic behind this arrangement of values? Please give me some reference where I can understand the logic of the above code.
Thanking you
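The ordering isn't arbitrary: a byte is a "uniform" pattern if its circular bit string contains at most two 0/1 transitions. Scanning values 0-255 in increasing order, each uniform pattern gets the next free label (0-57) and every non-uniform pattern maps to the shared bin 58. A sketch that regenerates the table (my reading of the code, not an official dlib utility):

    #include <iostream>

    // Count 0/1 transitions around the circular 8-bit pattern.
    int circular_transitions(unsigned char v)
    {
        int count = 0;
        for (int i = 0; i < 8; ++i)
        {
            int a = (v >> i) & 1;
            int b = (v >> ((i + 1) % 8)) & 1;
            if (a != b)
                ++count;
        }
        return count;
    }

    int main()
    {
        int next_label = 0;
        for (int v = 0; v <= 255; ++v)
        {
            // Uniform patterns (<= 2 transitions) get sequential labels;
            // everything else shares the final bin, 58.
            int label = (circular_transitions((unsigned char)v) <= 2) ? next_label++ : 58;
            std::cout << label << ", ";
        }
        std::cout << "\n";  // prints 0, 1, 2, 3, 4, 58, 5, 6, 7, 58, ... matching lbp.h
    }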
Hi,
I tested one of the provided examples: dlib-18.18\examples\face_landmark_detection_ex.cpp
and measured the execution time with omp_get_wtime.
I commented out all cout statements and the window visualization, and also pyramid_up(img) (because I actually need to detect only one face, the biggest one).
The image size is 320*240 and it takes 3 seconds.
I use Visual Studio 2010 in release mode with optimization.
Enable Enhanced Instruction Set is Streaming SIMD Extensions (/arch:SSE).
Can you give some instructions so that it runs faster, in milliseconds?
The goal is to use it in web and mobile applications.
Thanks to any provided help.
http://dlib.net/faq.html#Whyisdlibslow
Sounds like it's not really in release mode.
No, I am in release Win32 (I do not see another option to be in x64).
I added the screenshots and you can verify via the link:
https://drive.google.com/open?id=0B3Ve8Gmg6qDhN1cxWTlmUG9RNUU
Also, I did not succeed in compiling on Windows with the instructions provided on the website. I did it directly in Visual Studio, exactly as mentioned here (in the answer):
http://stackoverflow.com/questions/32736149/how-to-load-jpeg-file-using-dlib-libarary
After performing those instructions it compiles, but as I mentioned it takes 3 seconds. In debug it is very slow (64 seconds per image).
Thanks for any help.
PS: Pascal and I work together.
Hi, Davis!
I think I understand why it is slow: I do not have the option to optimize up to SSE4 or AVX, so I end up using SSE2.
However, I have some questions:
- Is it possible to use the library in a mobile/web application so that it works in real time? I am not familiar with optimization settings and so on, which is why I am interested in what the restrictions are.
- The file shape_predictor_68_face_landmarks.dat weighs 95 MB. I need just face + landmark detection; does this file contain redundant information? Is it possible to regenerate it?
Thanks for any help
Best regards
Hi Davis, first of all I want to thank you for your help, and I would be glad if you could help me with something.
When I try cout << "the coordinate: " << round(shape.part(38)) << endl
I get the x and y coordinates together,
but I need x and y separately to continue my work. HELP PLEASE !!
I tried something like cout << shape.part(38).x << endl and got NOTHING. Thank you again.
Hey Aniche Ahlem,
did you try round(shape.part(38).x()) and round(shape.part(38).y())? I think it should return you the correct x and y values of the point.
Hope that helps :-)
Hi!
Thank you for this implementation, it works fine! Have you created a .dat file for the 194 landmarks that are mentioned in Kazemi's and Sullivan's paper? Or do you know where I could find such a file in order to detect more than 68 landmarks?
Thanks a lot!
Thanks.
I don't have any other .dat files. For the 194 landmarks you will have to train the model yourself I'm afraid.
@quaffle29
We have a .dat file for 194 points (the model used in Kazemi's paper). It is 183 MB.
We also have other .dat files for models with 27, 75, 77 or 197 points.
You can contact us at contact@wisimage.com
Hello! :)
I encountered a strange error: whenever I include < dlib/gui_widgets.h > in my projects and declare a variable (for example dlib::image_window win), the following errors appear: 'DLIB_NO_GUI_SUPPORT is defined so you can't use the GUI code. Turn DLIB_NO_GUI_SUPPORT off if you want to use it.' and 'Also make sure you have libx11-dev installed on your system' (from gui_core_kernel_2.h). I searched and found some suggestions that the CMake build of dlib could have failed, but I really doubt that, since I'm already doing landmark detection.
The reason I'm trying to include and declare one of the widgets is to display values on the screen (there is no equivalent of OpenCV's putText, is there?).
I would be very grateful for any help. :)
Cheers,
Teresa
1. How do I get shape_predictor_194_face_landmarks.dat? Any links to download it directly?
2. Also, do we need to edit any field in the dlib face landmark project to indicate whether the .dat is for 68 landmarks or for 194?
I see the procedure to generate the .dat file using HELEN images gets stuck due to the non-availability of the XMLs requested by the dlib project (train_shape_predictor_ex).
>> http://stackoverflow.com/questions/36711905/dlib-train-shape-predictor-ex-cpp?answertab=votes#tab-top has some puzzled answers.
Regards
Gopi. J
I don't have a model file for that dataset, so you will need to train your own. You don't need to edit any dlib code. The example programs and documentation give full details of how to use dlib.
Trained the 194 points successfully :)
Posted the details here http://stackoverflow.com/questions/36711905/dlib-train-shape-predictor-ex-cpp
Sweet :)
Hi Davis,
I tried to train a model with 68 landmarks on the iBUG-300W dataset as you did. However, only 535 images were detected correctly, and I used train_shape_predictor_ex with default parameters. The mean_test_error is 0.07 while yours is 0.04 on the same test dataset. Can you provide some details of the training process?
Thank you.
Bin
I'm not sure what you mean by only 535 images detected correctly. The detector in dlib is more accurate than that. Maybe you are using the wrong dataset. The dataset I used is available here: http://dlib.net/files/data/ibug_300W_large_face_landmark_dataset.tar.gz
Hi Davis
I tested your landmark detection on my 64-bit desktop and set
trainer.set_oversampling_amount(300);
trainer.set_nu(0.05);
trainer.set_tree_depth(10);
But the resulting "sp.dat" is 3.8 GB. The data I used is "http://dlib.net/files/data/ibug_300W_large_face_landmark_dataset.tar.gz".
Could you tell me why the "sp.dat" is so big? I hope you can understand my poor English.
It's because you set the tree depth to 10.
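Roughly why (a back-of-the-envelope sketch based on my reading of the Kazemi paper's tree structure: each leaf stores a full shape update of two floats per landmark; split-feature storage and serialization overhead are ignored):

    #include <cmath>

    // Lower-bound estimate of shape predictor model size: every tree has
    // 2^tree_depth leaves, and each leaf holds an (x, y) float delta for
    // every landmark, so size grows exponentially with tree depth.
    double estimate_model_bytes(int cascade_depth, int trees_per_cascade,
                                int tree_depth, int num_parts)
    {
        double leaves_per_tree = std::pow(2.0, tree_depth);
        double bytes_per_leaf  = num_parts * 2.0 * 4.0;  // 2 floats per landmark
        return cascade_depth * trees_per_cascade * leaves_per_tree * bytes_per_leaf;
    }

    // estimate_model_bytes(10, 500, 10, 68) is about 2.8 GB, in the same
    // ballpark as the 3.8 GB file above; the default tree depth of 4 gives
    // roughly 44 MB instead.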
Hi,
when you trained the model, are these the settings you used?
trainer.set_oversampling_amount(300);
trainer.set_nu(0.05);
trainer.set_tree_depth(2);
Thank you very much.
No, if I recall correctly, I used the default settings except the cascade depth was set to 15.
Hi
when I trained the model, I set
trainer.set_oversampling_amount(300);
trainer.set_nu(0.05);
trainer.set_tree_depth(15);
but I got an "sp.dat" as big as 8.7 GB; it is driving me mad.
I hope you can tell me why.
thank you.
Hi Davis,
When I used the train_shape_predictor example, I printed the landmarks read from the original test XML file by the load_image_dataset function one by one and found that the order of some landmarks is not the same as what is written in the XML file. So I read your code and found nothing about this phenomenon. Do you have any hints about this?
Thank you.
This is discussed in the documentation: http://dlib.net/dlib/data_io/load_image_dataset_abstract.h.html#load_image_dataset
Note that it says: "parts_list is in lexicographic sorted order."
Thanks for your reply.
However, I just passed three parameters to the function as the example did. And I printed both the landmarks' values and indexes in the test data set. Some points, for example some landmarks on the face edge, appeared after the eyes, which I believed should be the first 41 points in the XML file (I used 194 points to train). I couldn't see any order in these points. And that gave me huge trouble in testing the accuracy of the model because of the wrong indexes of the eye landmarks.
https://en.wikipedia.org/wiki/Lexicographical_order
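Concretely, lexicographic order compares part names character by character, so numeric names without zero padding come back in a surprising order; this is exactly the "1" vs "001" problem resolved a few comments below. A quick demonstration:

    #include <algorithm>
    #include <iostream>
    #include <string>
    #include <vector>

    int main()
    {
        std::vector<std::string> names = {"1", "2", "10", "41", "100"};
        std::sort(names.begin(), names.end());  // lexicographic, not numeric
        for (const auto& n : names)
            std::cout << n << " ";              // prints: 1 10 100 2 41
    }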
Mobile version (Android)
Demo video : https://www.youtube.com/watch?v=TbX3t7QNhvs&feature=autoshare
Source : https://github.com/tzutalin/dlib-android-app
Hi Davis,
I've found the reason and solved the problem. It was because the part name in my xml file was, for example "1", rather than "001".
Anyway thank you.
I am not sure that you mean real time. I built and executed 'webcam head pose', and it's really, really slow :(
Davis,
I am trying to build an app which applies real-time makeup to a webcam-detected face, like https://www.youtube.com/watch?v=C6hI65ZMSxM. I am trying to extract different facial features. Which code should I start changing and what changes should I make?
Thanks.
Davis,
I've been trying to train a model using your example program and dataset with your 68-point markup, but I have trimmed 60 of the 68 points and kept the remaining 8 that I need. Unfortunately, once training completes I receive -nan for the testing and training error. The model that is saved is invalid as well (as I would expect from the -nans). I've tried training with the original unedited 68-point XML files and training completes successfully with valid error values and a valid model.
The only difference between the two sets of .xml files is the number of parts. Here is an example of my edited parts list with the 8 points I need:
The XML file opens appropriately in a viewer and the parts are all in the correct spots, as is the bounding box.
If you have any ideas on what could be causing the -nan, or where I should be looking, let me know.
Thanks
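One possible cause worth checking (a guess based on the example code, not a confirmed diagnosis): train_shape_predictor_ex normalizes error by an interocular_distance() helper that indexes eye parts 36-47 of the 68-point scheme. With only 8 parts those indices are invalid, and the normalization can produce -nan. A sketch of an adapted normalizer; the indices 0 and 4 are hypothetical placeholders for two of your kept points that are far apart:

    #include <dlib/image_processing.h>

    // Hypothetical replacement for the example's interocular_distance():
    // normalize by the distance between two parts that actually exist in
    // the trimmed 8-point scheme (indices 0 and 4 are placeholders).
    double custom_normalizer(const dlib::full_object_detection& det)
    {
        return dlib::length(det.part(0) - det.part(4));
    }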
Hi Davis,
I tested landmark detection on an embedded platform (linux-arm); however, the speed is very slow. Can you please tell me how to optimize for the ARM architecture using SIMD extensions (similar to SSE2, SSE4 and AVX on x86 processors)?
Thanks.
Hi,
I am using face_landmark_detection_ex.cpp to detect and then extract cropped faces. But I need the faces in their original width and height. Unfortunately this code gives faces with the same width and height (I can change it to crop faces with the height and width of the rectangles returned from the detection function; however, the width and height are the same). What I need is the real height and width of the face (presumably height is greater than width for a usual face). Is there any way I can do it with this library or any other open source library?
Thanks
Rakib
I'm not sure I understand exactly what you want to do, but you can definitely crop out images any way you like with dlib.
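If you want crops that follow the detected landmarks rather than the square detector box, one option (a sketch using dlib's chip-extraction helpers) is:

    #include <dlib/image_processing.h>
    #include <dlib/image_transforms.h>

    // Crop a face using the fitted 68-point landmarks. chip_details can
    // also take an arbitrary rectangle, so you could instead pass a
    // taller-than-wide box computed from the landmark extents if you
    // want a non-square crop.
    template <typename image_type>
    void crop_face(const image_type& img, const dlib::full_object_detection& shape,
                   dlib::array2d<dlib::rgb_pixel>& chip)
    {
        dlib::extract_image_chip(img, dlib::get_face_chip_details(shape, 200, 0.25), chip);
    }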
Hi Davis,
we want to use the landmark detection on a DSP with no floating-point support. Is that feasible?
I imagine the algorithm would work fine in fixed point arithmetic.
Thanks for your feedback. Do you know about any ports to plain C ?
no
Hi Davis,
can you give me an idea what the smallest size for the pose model might be in order to produce meaningful results? Our embedded system has severe restrictions on memory footprint and the 95 MB for the data is far too much. We have only 128 MB RAM for the complete system. The number of landmarks would have to be restricted to, let's say, 16 and the image size to VGA resolution. Before I go through a training process I wanted to get your take on this.
Thanks.
The size of the model is linear in the number of points. So you can reduce the size a lot by dropping landmarks. Other parameters affect the size as well. You can definitely make a small model.
Hi Davis, I looked into training for a smaller number of landmarks but could not find anything in the code where to set the number of points being trained. Do I have to edit the training data and remove all landmarks which are not wanted, or is there a better way of doing this? I was shooting for < 10 landmarks per face.
You have to make a dataset that has whatever landmarks you want. There isn't some kind of hard coded thing in dlib that assumes you have a particular set of landmarks.
Are you familiar with the CMU OpenFace project? To my knowledge they are using dlib's face detector but a different facial landmark detector. Do you know why?
Do I need 32 GB of memory for training? I read something in the blog...
I know about openface. It says right on the front of their web page that openface uses dlib's facial landmark detector.
You are right. The pose_model is really fast, but I am struggling with the face detector yielding the bounding boxes. Is there a way to speed it up? Restrict the face size, the number of faces, the resolution of the input image? Any ideas are appreciated.
Hi Davis,
can I use imglab to remove landmarks (I tried --rmlabel without success) and renumber the remaining ones?
Or do I have to write a program that does the modifications to the .xml files?
Dear Davis,
just to give you some feedback: I trained the landmark detector on the ibug training set for only 10 landmarks and reduced the file size of the sp.dat file down to 2.8 MB.
Now I have to remove the floating-point variables in the detector and the pose_model for implementation on a fixed-point DSP.
Hi Davis
How would I know which of the landmarks belongs to which position on the face? E.g., does landmark number 5 belong to the eye or the lips, etc.?
I want to use ASM for feature extraction and then use SVM classification for recognition.
What's better for feature extraction, the asm-opencv lib or dlib, and what's the difference?
Hi Davis
There is a problem with trembling of the predicted shape in real-time video.
Can you suggest which training parameters help reduce this effect?
Hi Davis,
I would like to examine the .dat file. Could you please tell me the format inside, or how to open it in a readable way? I would like to know "what's inside" and not just "take it for granted" :)
Thank you
Then you should read the code (http://dlib.net/faq.html#Whereisthedocumentationforobjectfunction) and the referenced paper in this blog post. Then you will understand.
Hello Mr.King,
In the HELEN database training images there are many landmarks (annotations provided by HELEN) outside the box detected and generated by dlib's detector.
Is this an issue for training ?
Thank you very much,
Regards.
No
Hello Mr. King,
I've trained the shape predictor with your data (6666 train + 1008 test images).
Parameter values are the defaults + cascade_depth = 15.
Results are: mean training error: 0.0369526, mean testing error: 0.0555288, sp size: 97,302.
Are those results normal ?
Thank you very much, best regards.
Hi, how can I find face landmarks with dlib in Android Studio in real time using Java? Please help me :(
Hi Davis!
Just started using dlib, and it is simply wonderful! Thanks for putting this together.
I am interested in an application where a single keypoint of a given image is to be localized very accurately. Do you think I should go with a shape predictor with only 1 point, or rather treat it as a detection problem?
Laszlo
Thanks, glad you like it :)
If it's just one point I would go with a normal detector. However, the specific details of the problem can be important and for some things you need specialized solutions.
Hi Davis,
Thanks again for such a great library!
I am training a shape detector for the human face, as I need additional landmarks on the ear; I am training for a total of 12 landmarks. After creating sp.dat from training, I am using it to detect landmarks on images. However, 5 landmarks are always off; the rest all line up perfectly. Can you give some pointers on how to fix it?
Probably related is this training log:
Fitting trees...
...
mean training error: -nan
mean testing error: 0.00216595
Hi Davis King ,
First of all, I really appreciate your efforts in developing dlib and helping others by replying to almost everyone. I am trying to use dlib for real-time face recognition and it is a bit slow, so I compiled it after turning AVX instructions on. This sped up the code, but the accuracy has fallen drastically. Can you tell me why this is so?
thanking you ,
K.P
Thanks, I'm glad you like dlib :)
Turning AVX on or off should make no difference in the output. Something else must have gone wrong. Also: http://dlib.net/faq.html#Whyisdlibslow
Hi Davis,
Thanks for your amazing work. I was wondering why the .dat file provided with the example is so big? How can I make it smaller? Can training the shape predictor on fewer landmarks and/or images reduce its size? Any other suggestions?
Cheers
The size is linear in the number of landmarks. So if you retrain with fewer landmarks it will be smaller. It is also linear in a number of other training parameters. The dlib API documents all the parameters, which are also described in great detail in the original Kazemi paper.
Hi Davis,
Thanks again for letting everyone use this great library.
I noticed that the shape_predictor has an option to return some kind of feature vector. How do I interpret this feature vector? I understand that it contains 15 times 500 values, because the landmark detector has a cascade of 15 forests with 500 trees each. But I do not completely understand what these 7500 values represent.
Do you think that it would be possible to use this feature vector to get some kind of quality measure of the landmark fit (or check if the input image contained a face)?
See: http://dlib.net/dlib/image_processing/shape_predictor_abstract.h.html#shape_predictor
It might give you a quality measure, it's tough to say. You can certainly use it to decide if a point is occluded or something similar. Maybe to know if a face is present. But probably not much about landmark fit since if it had excess information about landmark fit it probably would have fit the landmark better in the first place. But you never know until you try.
Thanks!
Could you give me a hint on how to use it for detecting occluded landmarks?
Use a linear SVM to train a classifier.
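A sketch of that suggestion, assuming you have already extracted the roughly 7500-value feature vector for a set of labeled training faces (occluded vs. visible) into dlib matrices:

    #include <dlib/svm.h>
    #include <vector>

    typedef dlib::matrix<double, 0, 1> sample_type;
    typedef dlib::linear_kernel<sample_type> kernel_type;

    // Train a linear SVM on shape predictor feature vectors.
    dlib::decision_function<kernel_type> train_occlusion_classifier(
        const std::vector<sample_type>& samples,  // one feature vector per face
        const std::vector<double>& labels)        // +1 occluded, -1 visible
    {
        dlib::svm_c_linear_trainer<kernel_type> trainer;
        trainer.set_c(1);  // tune with cross-validation
        return trainer.train(samples, labels);
    }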
Hi Davis;
Is there any additional issue that should be considered, like restrictions on the usage of the iBUG 300-W dataset in training, for commercial use?
thanks.
Not that I am aware of.
Hi, I am working on a project with face detection and eye blinking. Face detection is working fine, but can anyone tell me how to detect eye blinking in video? Please let me know; it's an urgent project.
Hi guys, I am working on a project with face detection and eye blinking. Face detection works perfectly, but eye blink detection is not working well on the video stream. Which method would be helpful for me? So far I have used dlib. If anyone has sample source code, please mail me.
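One widely used approach (Soukupova and Cech's eye aspect ratio, not something built into dlib) computes a ratio from the six landmarks of each eye and flags a blink when it dips below a threshold for a few consecutive frames. A sketch using the 68-point indices (one eye starts at 36, the other at 42):

    #include <dlib/image_processing.h>

    // Eye aspect ratio: (|p1-p5| + |p2-p4|) / (2 * |p0-p3|) over the six
    // points of one eye. It drops sharply while the eye is closed.
    double eye_aspect_ratio(const dlib::full_object_detection& shape, unsigned long start)
    {
        const dlib::point p0 = shape.part(start),     p1 = shape.part(start + 1),
                          p2 = shape.part(start + 2), p3 = shape.part(start + 3),
                          p4 = shape.part(start + 4), p5 = shape.part(start + 5);
        return (dlib::length(p1 - p5) + dlib::length(p2 - p4)) /
               (2.0 * dlib::length(p0 - p3));
    }

    // usage: count a blink if eye_aspect_ratio(shape, 36) and
    // eye_aspect_ratio(shape, 42) stay below roughly 0.2 for 2-3
    // consecutive frames (the threshold needs tuning per camera).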
When I am using this predictor, shape_predictor_68_face_landmarks.dat, does it come from the "One Millisecond Face Alignment..." paper or the "300 Faces In-The-Wild Challenge" paper? Could you check github: https://github.com/davisking/dlib-models? Please.
Hi Davis,
Amazing library! Thank you!
I made a Python module to align faces together by landmarks, and am currently importing DLIB as a dependency. It works magic but DLIB is massive and I really only need to use the pre-trained face detector, and 68-point landmark finder.
Looking into taking DLIB apart to make the relevant bits a part of my module. But after looking through all the files, I'm completely overwhelmed. Would you have any advice on how best to proceed?
Any kind of response would be welcome, like:
"I never built this for taking apart so I'm as lost as you, my friend :("
"It's not worth it as detection and landmark localization require a large part of DLIB anyway :("
"This has been done before. Look it up fool!"
"Unless you're a pro at PYBIND11, C++ and Make, do not even bother, fool!"
As my starting point, I'm tracing all the dependencies of /tools/python/src/object_detection.cpp
Then I'll be trying to compile a version with only that (and any Python code and pybind11). But this seems really naive.
Carl
Yeah you can delete files from tools/python/src. You will have to modify the tools/python/src/dlib.cpp file and the CMakeLists.txt file in there to remove the files you deleted. Nothing wrong with that. Same with removing files you aren't using from dlib/CMakeLists.txt.
The landmarks there are very stable. I am using the 68_face_landmarks.dat model, but it's not as stable as that. So where can I download that landmark model?
I'm trying to understand how the algorithm for detecting face landmark keypoints works.
Are there any videos or references I can go to?
The paper One Millisecond Face Alignment with an Ensemble of Regression Trees by Vahid Kazemi and Josephine Sullivan explains it well.
Hi Davis, could you share the values that you used for this training?
tree_depth:
nu:
cascade_depth:
feature_pool_size:
num_test_splits:
oversampling_amount:
oversampling_translation_jitter:
Use the default settings in dlib or look at what the Kazemi paper says to use.
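Putting Davis's answers from this thread into code: the published model's settings were reportedly the defaults with the cascade depth raised to 15 (he notes above that he didn't record the exact values, so treat this as approximate):

    #include <dlib/image_processing.h>

    dlib::shape_predictor_trainer make_trainer()
    {
        dlib::shape_predictor_trainer trainer;
        // dlib defaults: cascade_depth 10, tree_depth 4, nu 0.1,
        // oversampling_amount 20, feature_pool_size 400, num_test_splits 20.
        trainer.set_cascade_depth(15);  // the one deviation mentioned in this thread
        return trainer;
    }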