Therefore, dlib has been updated to work with cuDNN 5.1 and hence we have a dlib 19.1 release, which you can download from dlib's home page.
I also recently realized that the fancy std::async() in C++11, an API for launching asynchronous tasks, is not backed by any kind of load balancing at all. For example, if you call std::async() at a faster rate than the tasks complete then your program will create an unbounded number of threads, leading to an eventual crash. That's awful. But std::async() is a nice API and I want to use it. So dlib now contains dlib::async() which has the same interface, except instead of the half baked launch policy as the first argument, dlib::async() takes a dlib::thread_pool, giving dlib::async() all the bounded resource use properties of dlib::thread_pool. Moreover, if you don't give dlib::async() a thread pool it will default to a global thread pool instance that contains std::thread::hardware_concurrency() threads. Yay.