Friday, June 5, 2015

Reinforcement Learning, Control, and 3D Visualization

Over the last few months I've spent a lot of time studying optimal control and reinforcement learning. Aside from reading, one of the best ways to learn about something is to do it yourself, which in this case means a lot of playing around with the well-known algorithms and, for those I really like, including them in dlib, which is the subject of this post.  So far I've added two methods.  The first, added in a previous dlib release, is the well-known least squares policy iteration reinforcement learning algorithm.  The second, and my favorite so far due to its practicality, is a tool for solving model predictive control problems.

There is a dlib example program that explains the new model predictive control tool in detail.  The basic idea is that it takes as input a simple linear equation defining how some process evolves in time, and then tells you what control input to apply to drive the process into some user-specified state.  For example, imagine you have an air vehicle with a rocket on it and you want it to hover at some specific location in the air.  You could use a model predictive controller to find out what direction to fire the rocket at each moment to get the desired outcome.  In fact, the dlib example program does just that.  It produces a visualization in which the vehicle is the black dot, the green dot marks the location where you want it to hover, and the red line shows the rocket thrust.
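To make the receding-horizon idea concrete, here is a minimal Python sketch. It is not dlib's tool and not how dlib solves the problem: it swaps in brute-force enumeration over a tiny grid of thrust values, which is only workable for toy problems. The 1-D vehicle model, the names `step`, `plan_cost`, `next_control`, and `simulate`, and all of the constants are assumptions made up for illustration.

```python
from itertools import product

# Hypothetical 1-D "rocket" model (an assumption for illustration only):
# state = (position, velocity), control = thrust applied for one time step.
CONTROLS = (-1.0, 0.0, 1.0)   # coarse grid standing in for bounds -1 <= u <= 1
HORIZON = 4                   # number of future steps considered per decision
CONTROL_WEIGHT = 0.1          # penalty on thrust magnitude


def step(state, u):
    """Linear dynamics: position += velocity, velocity += thrust."""
    position, velocity = state
    return (position + velocity, velocity + u)


def plan_cost(state, controls, target):
    """Cost of applying `controls` in order, starting from `state`."""
    cost = 0.0
    for u in controls:
        state = step(state, u)
        # Squared distance to the target plus a small thrust penalty.
        cost += (state[0] - target) ** 2 + CONTROL_WEIGHT * u ** 2
    return cost


def next_control(state, target):
    """Return the first control of the cheapest plan over the horizon."""
    best_plan = min(product(CONTROLS, repeat=HORIZON),
                    key=lambda plan: plan_cost(state, plan, target))
    return best_plan[0]


def simulate(state, target, steps):
    """Closed loop: re-plan at every step and apply only the first control."""
    trajectory = [state]
    for _ in range(steps):
        state = step(state, next_control(state, target))
        trajectory.append(state)
    return trajectory
```

Calling `simulate((0.0, 0.0), 10.0, 30)` returns the list of visited states. The point of the sketch is the loop in `simulate`: the controller plans several steps ahead, applies only the first control, and then plans again from the new state.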


Another fun new tool in dlib is the perspective_window.  It's a super easy to use tool for visualizing 3D point cloud data.  For instance, the included example program shows how to make this:


Finally, Patrick Snape contributed Python bindings for dlib's video tracker, so now you can use it from Python.  To try out these new tools, download the newest dlib release.



8 comments:

Zubrycki Igor said...

You should play with ADRC (active disturbance rejection control). It is quite powerful and robust.

Shuang Liu said...

Hi Davis,

Is there any way to make the frontal face detector use all the available cores when it's running on a single image?

Davis King said...

No, that's not one of its features.

Shuang Liu said...

But is it possible, though? For example, build the image pyramid beforehand and use an OpenMP parallel for to extract the features and such?

Shuang Liu said...

Hi Davis, the correlation tracker is very slow on my machine (i7, 32GB desktop) when I enable the use of BLAS and LAPACK. It seems to be spending lots of time in:

Function Name: dlib::blas_bindings::matrix_assign_blas_helper,0,1,dlib::memory_manager_stateless_kernel_1,dlib::row_major_layout>,dlib::matrix,0,1,dlib::memory_manager_stateless_kernel_1,dlib::row_major_layout>,void>::assign
Total CPU: 32.06 % (513 ms)
Self CPU: 0.06 % (1 ms)
Module: linear.exe

Is there a way to fix this, i.e., to avoid calling BLAS routines for complex number operations?

Davis King said...

Does this help? http://dlib.net/faq.html#Whyisdlibslow

James Harper said...

It's normal (60 fps) if I disable BLAS. I'm using a CMake optimization script to detect the available instruction set, and I am sure it's using SSE and AVX in release mode. It seems to be a memory transfer problem (5~6 fps with use_blas).

Also, I noticed the SVD routine gives different results with use_lapack enabled. Apparently, on matrices larger than 3x3, the algorithm from Numerical Recipes gives different results than LAPACK. Disabling LAPACK will break my app.

Davis King said...

I don't see such large differences between using and not using BLAS on my machines. I always use either the Intel MKL or OpenBLAS, since those are high-quality BLAS implementations. However, there are a lot of bad or improperly compiled BLAS libraries floating around. Many of them don't compute matrix decompositions correctly or have other problems. You have probably installed one of those.