Michael Breiter

Lecture Description:

There are several solutions for bringing artificial intelligence to devices like vehicles, smart cameras or handhelds. However, the gap between what is possible in artificial intelligence and what can actually be deployed is growing because of hardware limitations. Hardware accelerators include FPGAs, GPUs and specialized ASICs. Since deep learning is still a fast-evolving field, a solution has to be extensible to accommodate future features like new activation functions, layer types or network architectures. We show the benefits of using an FPGA as a deep neural network inference accelerator platform with regard to extensibility, energy consumption and safety. FPGAs offer practical advantages such as energy efficiency, a wide operating-temperature range, electromagnetic compatibility and the availability of safety-certified hardware. Not depending on an operating system can be a major advantage from a safety perspective. FPGA implementations execute in a time-deterministic and controllable manner while remaining scalable, flexible and adaptable. As an example, we present a Tiny Yolo v3 object detection network running in real time on an Artix-7 XC7A200T and compare it to a Google Coral TPU on a Raspberry Pi host system and an Nvidia Jetson Nano SoC. The comparison covers inference speed, energy consumption and safety aspects. We present FPGA-oriented neural network optimization techniques such as pruning, batch-norm fusing and quantization, which reduce network size or arithmetic complexity. We also show how a use-case-specific object detector can be trained. Finally, we discuss the practical advantages and disadvantages of minimized neural network architectures such as MobileNet SSD compared to Tiny Yolo v3 and take a critical look at quality metrics.
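To make the three optimization techniques named above concrete, the following is a minimal NumPy sketch of magnitude pruning, batch-norm fusing and symmetric int8 post-training quantization. The function names, array shapes and parameter choices are illustrative assumptions for a generic convolution layer, not the toolchain used in the lecture.

```python
import numpy as np

def fuse_batchnorm(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold batch-norm parameters into the preceding conv layer.

    w: conv weights, shape (out_ch, in_ch, kh, kw); b: conv bias, shape (out_ch,)
    gamma, beta, mean, var: batch-norm parameters, shape (out_ch,).
    Returns (w_fused, b_fused) so that conv(x, w_fused) + b_fused
    equals bn(conv(x, w) + b), removing the batch-norm at inference time.
    """
    scale = gamma / np.sqrt(var + eps)           # per-output-channel factor
    w_fused = w * scale[:, None, None, None]     # scale each filter
    b_fused = (b - mean) * scale + beta          # absorb mean shift into bias
    return w_fused, b_fused

def prune_magnitude(w, sparsity=0.5):
    """Unstructured magnitude pruning: zero the smallest weights."""
    threshold = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) < threshold, 0.0, w)

def quantize_int8(w):
    """Symmetric per-tensor post-training quantization to int8."""
    s = np.abs(w).max() / 127.0                  # one scale for the tensor
    w_q = np.clip(np.round(w / s), -127, 127).astype(np.int8)
    return w_q, s                                # dequantize as w_q * s

# Illustrative usage on a random 16x3x3x3 conv layer (hypothetical data)
rng = np.random.default_rng(0)
w = rng.normal(size=(16, 3, 3, 3)).astype(np.float32)
b = np.zeros(16, dtype=np.float32)
gamma, beta = np.ones(16), np.zeros(16)
mean, var = rng.normal(size=16), np.abs(rng.normal(size=16)) + 0.1
w_f, b_f = fuse_batchnorm(w, b, gamma, beta, mean, var)
w_q, s = quantize_int8(prune_magnitude(w_f))
```

All three steps shrink the arithmetic an FPGA has to implement: fusing removes the batch-norm multiply-add entirely, pruning zeroes out multiplications, and int8 quantization replaces floating-point with narrow fixed-point operations.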
