Wednesday, December 20, 2017

Dlib 19.8 is Out

Dlib 19.8 is officially out. There are a lot of changes, but the two most interesting ones are probably the new global optimizer and semantic segmentation examples.  The global optimizer is definitely my favorite as it allows you to easily find the optimal hyperparameters for machine learning algorithms.  It also has a very convenient syntax.  For example, consider the Holder table test function:
[Figure: plot of the Holder table function]
Here is how you could use dlib's new optimizer from Python to optimize the difficult Holder table function:
import dlib
from math import sin, cos, pi, exp, sqrt

def holder_table(x0,x1):
    return -abs(sin(x0)*cos(x1)*exp(abs(1-sqrt(x0*x0+x1*x1)/pi)))

x,y = dlib.find_min_global(holder_table, 
                           [-10,-10],  # Lower bound constraints on x0 and x1 respectively
                           [10,10],    # Upper bound constraints on x0 and x1 respectively
                           80)         # The number of times find_min_global() will call holder_table()

Or in C++: 
auto holder_table = [](double x0, double x1) {return -abs(sin(x0)*cos(x1)*exp(abs(1-sqrt(x0*x0+x1*x1)/pi)));};

// obtain result.x and result.y
auto result = find_min_global(holder_table, 
                             {-10,-10}, // lower bounds
                             {10,10}, // upper bounds
                             max_function_calls(80)); // call holder_table at most 80 times

Both of these methods find holder_table's global optima to about 12 digits of precision in about 0.1 seconds. The documentation has much more to say about this new tooling.  I'll also make a blog post soon that goes into much more detail on how the method works.
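As a quick sanity check on what the optimizer should find, the Holder table function has four symmetric global minima, commonly tabulated at roughly (±8.05502, ±9.66459) with a value near -19.2085. The sketch below uses only Python's standard library (no dlib) and evaluates the function at those reference points. The coordinates are assumed from standard test-function tables and are quoted to six digits; they are not output from find_min_global.

```python
from math import sin, cos, exp, sqrt, pi

def holder_table(x0, x1):
    return -abs(sin(x0) * cos(x1) * exp(abs(1 - sqrt(x0 * x0 + x1 * x1) / pi)))

# Reference minimizers (an assumption taken from published test-function tables).
REFERENCE_MINIMA = [(sx * 8.05502, sy * 9.66459) for sx in (1, -1) for sy in (1, -1)]

def reference_values():
    """Evaluate holder_table at each of the four reference minimizers."""
    return [holder_table(x0, x1) for x0, x1 in REFERENCE_MINIMA]

if __name__ == "__main__":
    for (x0, x1), value in zip(REFERENCE_MINIMA, reference_values()):
        print(f"f({x0:+.5f}, {x1:+.5f}) = {value:.4f}")
```

Comparing the value returned by the optimizer against these reference points is an easy way to confirm that a run actually reached one of the global minima rather than a local one.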

Finally, here are some fun example outputs from the new semantic segmentation example program:





  1. Awesome!! Looking forward to more explanations of the global optimizer.

  2. Awesome!! Looking forward to more explanations of the global optimizer.

  3. This comment has been removed by the author.

  4. Thanks a lot, Davis, great library!
    Is it possible to build a dlib GPU-based solution so that the resulting program works both on user computers with a GPU and, in CPU-only mode, on computers without a CUDA-capable GPU?

  5. Andrey Zakharoff >> Yes, you can do it. You need a check_cuda launcher that loads the CUDA library dynamically and checks CUDA compatibility. It then runs one of two executables: the first compiled with CUDA, the second compiled for CPU only. For the CPU build you can use LAPACK, which dramatically improves CPU performance. Internally the CPU and GPU versions are the same; you only need to set the compiler flags.

    I do this for Windows and Mac (4 resulting files) and it works perfectly.

  6. What is the speed of the semantic segmentation?

  7. Hi Davis;

    As I understand from the semantic segmentation training code in file, I can train a new model on another dataset as long as I give the corresponding vector to the trainer. Is that right, and if so, do I need to make any changes to the network types in file?

    thanks in advance

  8. Yes, that's right. Giving it different training data is fine.

  9. Hi again Davis,

    I need the probability as well as the class label for each pixel. I thought of changing the to_label function of the loss_multiclass_log_per_pixel_ class so that I also get the value, not only the label. Is this the correct way, or do you have any other suggestions?


  10. Yes, the log loss optimizes the log likelihood and is what you should use. Then you will get something that outputs log likelihoods. Convert them to probabilities by passing them through a sigmoid.

  11. In the segmentation example code, after these lines:

    anet_type net;
    deserialize("semantic_segmentation_voc2012net.dnn") >> net;

    I add these lines:

    softmax probabilities;
    probabilities.subnet() = net.subnet();

    but at this point I couldn't figure out how to iterate over the class likelihoods for each pixel.

    Can you give me some tips?

  12. The CNN face detector can run on the GPU. But the HOG detector is unchanged and single threaded. If you want to use multiple threads to run the HOG detector, you should do so yourself by creating multiple threads and processing many images in parallel.
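
For the per-pixel probability question raised in comments 9 through 11, here is a minimal standard-library Python sketch of the arithmetic involved. It is not dlib's API. It assumes the network's raw per-class scores for one image are available as a flat buffer laid out class-major, with index (cls*nr + r)*nc + c, and it applies a softmax over the class channel, which is the normalization comment 11 reaches for. The names pixel_probabilities and label_and_probability are hypothetical helpers introduced only for illustration.

```python
from math import exp

def pixel_probabilities(scores, k, nr, nc, r, c):
    """Softmax over the k class scores at pixel (r, c).

    `scores` is a flat sequence of length k*nr*nc laid out class-major:
    index = (cls*nr + r)*nc + c. This layout is an assumption of the sketch.
    """
    raw = [scores[(cls * nr + r) * nc + c] for cls in range(k)]
    m = max(raw)  # subtract the max before exponentiating, for numerical stability
    exps = [exp(v - m) for v in raw]
    total = sum(exps)
    return [e / total for e in exps]

def label_and_probability(scores, k, nr, nc, r, c):
    """Return (most likely class index, its probability) for pixel (r, c)."""
    probs = pixel_probabilities(scores, k, nr, nc, r, c)
    label = max(range(k), key=probs.__getitem__)
    return label, probs[label]

if __name__ == "__main__":
    # Toy input: 3 classes, a 1x2 image.
    toy_scores = [0.0, 2.0,   # class 0 scores for pixels (0,0) and (0,1)
                  1.0, 0.0,   # class 1
                  0.0, 0.0]   # class 2
    for col in range(2):
        print(col, label_and_probability(toy_scores, 3, 1, 2, 0, col))
```

Looping this over every (r, c) gives both the label and its probability per pixel. In a real program the same indexing would be applied to the output tensor of the layer just below the loss, with the layout checked against the library documentation first.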