Wednesday, April 9, 2014

Dlib 18.7 released: Make your own object detector in Python!

A while ago I boasted about how dlib's object detection tools are better than OpenCV's. However, one thing OpenCV had on dlib was a nice Python API, but no longer! The new version of dlib is out and it includes a Python API for using and creating object detectors. What does this API look like? Well, let's start by imagining you want to detect faces in this image:


You would begin by importing dlib and scikit-image:
import dlib
from skimage import io
Then you load dlib's default face detector, the image of Obama, and then invoke the detector on the image:
detector = dlib.get_frontal_face_detector()
img = io.imread('obama.jpg')
faces = detector(img)
The result is an array of boxes called faces. Each box gives the pixel coordinates that bound one detected face. To get these coordinates out of faces you do something like:
for d in faces:
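    # Formatting the string first makes this line behave the same on Python 2 and 3.
    print("left,top,right,bottom: {} {} {} {}".format(d.left(), d.top(), d.right(), d.bottom()))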
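If you want to crop the detected regions out of the image, it helps to clip each box to the image first, since a box can poke past the image border. Here is a minimal sketch of that. The names clamp_box and FakeBox and the 640x480 image size are made up for illustration and are not part of dlib. It assumes only that each box has the left(), top(), right() and bottom() methods used above, and that right and bottom are inclusive pixel coordinates.

```python
def clamp_box(d, img_width, img_height):
    """Return (left, top, right, bottom) clipped to the image bounds.

    Assumes right/bottom are inclusive pixel coordinates (an assumption).
    """
    left = max(d.left(), 0)
    top = max(d.top(), 0)
    right = min(d.right(), img_width - 1)
    bottom = min(d.bottom(), img_height - 1)
    return left, top, right, bottom


class FakeBox(object):
    """Hypothetical stand-in for a detection so the sketch runs without dlib."""

    def __init__(self, left, top, right, bottom):
        self._coords = (left, top, right, bottom)

    def left(self):
        return self._coords[0]

    def top(self):
        return self._coords[1]

    def right(self):
        return self._coords[2]

    def bottom(self):
        return self._coords[3]


# Two made-up detections, each partly outside a 640x480 image.
detections = [FakeBox(-5, 10, 100, 120), FakeBox(600, 400, 700, 500)]
clipped = [clamp_box(d, 640, 480) for d in detections]
print(clipped)
```

With a real image you would pass in the boxes from faces and take the width and height from the image you loaded.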
We can also view the results graphically by running:
win = dlib.image_window()
win.set_image(img)
win.add_overlay(faces)

But what if you wanted to create your own object detector? That's easy too. Dlib comes with an example program and a sample training dataset showing how to do this. But to summarize, you do:
options = dlib.simple_object_detector_training_options()
options.C = 5  # Set the SVM C parameter to 5.  
dlib.train_simple_object_detector("training.xml","detector.svm", options)
That will run the trainer and save the learned detector to a file called detector.svm. The training data is read from training.xml which contains a list of images and bounding boxes. The example that comes with dlib shows the format of the XML file. There is also a graphical tool included that lets you mark up images with a mouse and save these XML files. Finally, to load your custom detector you do:
detector = dlib.simple_object_detector("detector.svm")
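If your annotations already live in some other format, you may prefer to generate the training XML with a script rather than the graphical tool. The sketch below is one way to do that using only the Python 3 standard library. The element and attribute names (dataset, images, image with a file attribute, box with top, left, width and height) are my assumption about the layout, so check them against the example XML file that ships with dlib before relying on them. The helper name build_training_xml and the file name are made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_training_xml(annotations):
    """Build a training XML string.

    annotations maps an image file name to a list of
    (left, top, width, height) boxes. The XML layout is an assumed
    one; compare it with the sample file included with dlib.
    """
    dataset = ET.Element("dataset")
    images = ET.SubElement(dataset, "images")
    for filename, boxes in sorted(annotations.items()):
        image = ET.SubElement(images, "image", file=filename)
        for left, top, width, height in boxes:
            ET.SubElement(
                image,
                "box",
                top=str(top),
                left=str(left),
                width=str(width),
                height=str(height),
            )
    return ET.tostring(dataset, encoding="unicode")


# Hypothetical annotation: one image with a single box.
xml_text = build_training_xml({"example.jpg": [(194, 90, 37, 37)]})
print(xml_text)
```

You would write the returned string to training.xml and then pass that file name to the trainer as shown above.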
If you want to try it out yourself you can download the new dlib release here.

Thursday, April 3, 2014

MITIE: A completely free and state-of-the-art information extraction tool

I work at an MIT lab and there are a lot of cool things about my job. In fact, I could go on all day about it, but in this post I want to talk about one thing in particular, which is that we recently got funded by the DARPA XDATA program to make an open source natural language processing library focused on information extraction.

Why make such a thing when there are already open source libraries out there for this (e.g. OpenNLP, NLTK, Stanford IE, etc.)? Well, if you look around you quickly find out that everything that exists is either expensive, not state-of-the-art, or GPL licensed. If you want to use this kind of NLP tool in a non-GPL project then you are either out of luck, have to pay a lot of money, or have to settle for something of low quality. Well, not anymore! We just released the first version of our MIT Information Extraction library which is built using state-of-the-art statistical machine learning tools.

At this point it has just a C API and an example program showing how to do English named entity recognition. Over the next few weeks we will be adding bindings for other languages like Python and Java. We will also be adding a lot more NLP tools in addition to named entity recognition, starting with relation extractors and part-of-speech taggers. But in the meantime you can use the C API or the streaming command line program. For example, if you had the following text in a file called sample_text.txt:
Meredith Vieira will become the first woman to host Olympics primetime coverage on her own when she fills on Friday night for the ailing Bob Costas, who is battling a continuing eye infection.  
Then you can simply run:
cat sample_text.txt | ./ner_stream MITIE-models/ner_model.dat
And you get this as output:
 [PERSON Meredith Vieira] will become the first woman to host [MISC Olympics] primetime coverage on her own when she fills on Friday night for the ailing [PERSON Bob Costas] , who is battling a continuing eye infection .
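Until the Python bindings arrive, one way to use this output from a script is to pull the bracketed entities out with a regular expression. This is just a sketch based on the sample output above: it assumes every entity is written as [TAG words] with an upper-case tag, and it would get confused by input text that contains its own square brackets. The name extract_entities is made up for illustration and is not part of MITIE.

```python
import re

# Matches "[TAG some words]" as printed in the sample output above.
ENTITY = re.compile(r"\[([A-Z]+) ([^\]]+)\]")


def extract_entities(line):
    """Return a list of (tag, text) pairs found in one line of output."""
    return [(m.group(1), m.group(2)) for m in ENTITY.finditer(line)]


line = (
    "[PERSON Meredith Vieira] will become the first woman to host "
    "[MISC Olympics] primetime coverage on her own when she fills on "
    "Friday night for the ailing [PERSON Bob Costas] , who is battling "
    "a continuing eye infection ."
)
print(extract_entities(line))
# prints [('PERSON', 'Meredith Vieira'), ('MISC', 'Olympics'), ('PERSON', 'Bob Costas')]
```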
It's all up on github so if you want to try it out yourself then just run these commands and off you go:
git clone https://github.com/mit-nlp/MITIE.git
cd MITIE
./fetch_submodules.sh
make examples
make MITIE-models
cat sample_text.txt | ./ner_stream MITIE-models/ner_model.dat