Lecture Description:
 FPGAs are increasingly employed as co-processors in data centers, on accelerator
 cards that connect to x86 processors over a PCI Express interface. This transition is
 driven in part by AI applications that leverage the parallel nature of FPGAs. To program
 these FPGA-based accelerator cards, you can use either a top-down approach, starting
 from a top-level C/C++ and OpenCL application and working towards lower-level kernels,
 or a bottom-up approach, in which the kernel blocks are compiled separately and
 linked together into a binary at a later stage.
 The bottom-up flow has several advantages over the top-down flow. (1) It allows the
 design, validation, and optimization of kernels separately from the main application. (2) It
 provides faster iteration cycles for the development and optimization of kernels by
 splitting the design into smaller components. (3) It facilitates reuse; a collection of kernels
 can be reused like a library.
 In this presentation, we use a face detection application as a reference design to show
 how designers can optimize a kernel when using Xilinx's Vitis tool and the bottom-up flow.
 Note that the same methodology is also applicable when designing a kernel from scratch
 or importing an existing kernel from Vitis HLS. 



