June 14, 2017

Deep Reinforcement Learning: Mark Hammond’s GTC Presentation

Last month, I had the chance to speak at NVIDIA’s 2017 GPU Technology Conference in San Jose. The conference brought together academics, startups, and enterprises, all leveraging the incredible recent advancements in GPU computing for machine learning. With such a GPU-centric audience, I focused my talk on how to deal with the headaches of latency, hazards, and pipeline stalls in the GPU era.

In the realm of deep reinforcement learning, stateful, interactive, simulation-based workloads push these problems to the extreme, necessitating a handoff to the simulator on every iteration – and that simulator may not even be running on the same machine as the deep reinforcement learning model!

For control and optimization problems using simulation-based workloads, deep reinforcement learning models are highly relevant, and there are many nuances to optimizing usage of the GPU. Since you can’t simply batch data to load into GPU memory for learning, new techniques are needed to keep the GPU humming rather than stalled waiting for other parts of the system to catch up.

For example, in a typical interactive controller application, any control action that is given must be sent to the simulator or physical system to be carried out, the resulting new state must be returned and evaluated, and that data is then available to be used to update the learning system. Consequently, there are seemingly unavoidable transitions between the deep reinforcement learning model and the simulator, as well as latency for transmission and execution. In these cases, optimizing use of the GPUs becomes a non-trivial concern.
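The loop described above can be sketched in a few lines of Python. This is a minimal, hypothetical illustration – `ToySimulator` and the stub policy stand in for a real simulator and learned model, neither of which comes from the talk itself:

```python
import random

class ToySimulator:
    """Hypothetical stand-in for a simulator or physical system."""
    def __init__(self):
        self.state = 0.0

    def step(self, action):
        # Carry out the action; return the resulting new state and a reward.
        self.state += action + random.uniform(-0.1, 0.1)
        reward = -abs(self.state)  # e.g. reward driving the state toward zero
        return self.state, reward

def control_loop(sim, policy, update, steps=100):
    """One episode: act, hand off to the simulator, learn from the result."""
    state = sim.state
    history = []
    for _ in range(steps):
        action = policy(state)            # model picks an action (GPU side)
        state, reward = sim.step(action)  # handoff to the simulator (CPU side)
        update(state, action, reward)     # learner consumes the new experience
        history.append((state, action, reward))
    return history

# Trivial policy/update stubs just to exercise the loop.
experiences = []
history = control_loop(ToySimulator(),
                       policy=lambda s: -0.5 * s,
                       update=lambda s, a, r: experiences.append((s, a, r)))
```

Note the forced alternation: the model cannot choose its next action until the simulator returns the new state, which is exactly the stall the talk addresses.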

In this talk, I explore lessons we’ve learned about avoiding these modern, performance-degrading hazards. You’ll learn some tricks and techniques – including approaches to pooling multiple concurrent simulations for use with a single network – that you can employ in your own systems to increase the performance of your deep reinforcement learning workloads. The talk presents data from Bonsai’s platform (a mix of Python + TensorFlow, C++, and Inkling), but the lessons learned and the resulting tricks and techniques can be leveraged broadly.
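The pooling idea mentioned above can be sketched as follows. This is an illustrative toy, not Bonsai's implementation: `ToySimulator` and `batched_policy` are made-up stand-ins, and in practice the batched evaluation would be one GPU forward pass over a stacked tensor rather than a list comprehension:

```python
import random

class ToySimulator:
    """Hypothetical simulator instance; many of these run concurrently."""
    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.state = self.rng.uniform(-1.0, 1.0)

    def step(self, action):
        self.state += action + self.rng.uniform(-0.05, 0.05)
        return self.state

def batched_policy(states):
    # Stand-in for a single network evaluated over a whole batch of states;
    # batching amortizes the GPU handoff across many simulations.
    return [-0.5 * s for s in states]

def pooled_rollout(sims, steps=50):
    """Step a pool of simulators in lockstep, batching network evaluations."""
    transitions = []
    for _ in range(steps):
        states = [sim.state for sim in sims]   # gather states from every sim
        actions = batched_policy(states)       # one batched GPU evaluation
        for sim, action in zip(sims, actions): # scatter actions back out
            sim.step(action)
        transitions.extend(zip(states, actions))
    return transitions

pool = [ToySimulator(seed=i) for i in range(8)]
data = pooled_rollout(pool)
```

While any one simulator still forces a handoff per step, the GPU now sees a batch of work per iteration instead of a single sample, which keeps it far better utilized.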

You can watch the talk below or view the slides here. If your organization has a use case that can benefit from reinforcement learning, check out the Bonsai Early Access Program and start leveraging AI in your own industrial systems.

