
FPGA Acceleration of Quantised Neural Networks for Remote-Sensed Cloud Detection

Wednesday, September 23, 2020
3:20 PM - 3:45 PM

Speaker

Attendee29
University of Strathclyde


Abstract Submission

The capture and transmission of remote-sensed imagery for Earth observation is both computationally and bandwidth expensive. In analyses of remote-sensed imagery in the visible band, atmospheric cloud cover can obstruct up to two-thirds of observations, resulting in costly imagery being discarded [1]. Mission objectives and satellite operational details vary; however, assuming a cloud-free observation requirement, effective cloud detection could double the useful data downlinked and halve its delivery cost. A minimal-resource, real-time inference neural network is ideally suited to performing automatic cloud detection, both for pre-processing captured images prior to transmission and for preventing unnecessary captures by larger payload cameras.

Much of the hardware complexity of modern neural network implementations resides in high-precision floating-point calculation pipelines. In recent years, research has identified quantised, low-precision integer equivalents of known deep learning models that do not require the extensive resources of their full-precision floating-point counterparts. Our work leverages existing research on binary, ternary and quantised neural networks (QNNs) to develop a real-time, remote-sensed cloud detection solution on a low-cost, commodity system-on-chip (SoC). It follows on from the development of the Forward Looking Imager [2] for predictive cloud detection by Craft Prospect, a space engineering practice based in Glasgow, UK.

Recent advances in neural network minimisation have examined reducing the number of layers and parameters in models while maintaining a high degree of precision and accuracy at a fraction of the complexity. Yet even after reductions of several orders of magnitude in network size, these networks still comprise hundreds of thousands of weights. At 32 bits per weight, the storage demands and the computational load of matrix calculations over such large parameter arrays continue to prevent novel networks from being deployed on real-time embedded devices.
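
To make the storage pressure concrete, a back-of-the-envelope sketch in Python (the 500,000-weight count is an illustrative assumption, not a measured model size):

    # Weight storage for a hypothetical 500,000-parameter network
    # at the precisions discussed in this abstract.
    num_weights = 500_000

    for bits in (32, 8, 4, 2, 1):
        megabytes = num_weights * bits / 8 / 1e6
        print(f"{bits:2d}-bit weights: {megabytes:.3f} MB")

    # float32 needs 2 MB for these weights; binarising the same
    # network needs ~62.5 kB, a 32x reduction in storage.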

To further reduce the computational load, weight quantisation is required. Mapping each weight from a 32-bit floating-point value down to a fixed 8-bit integer representation is common practice. Such a reduction in weight precision preserves the structure of an existing network while minimally impacting its inference accuracy. For resource-strapped, low-power devices, however, storing and computing 8-bit integer weights can remain too taxing, and a more extreme method of quantisation is required in the form of 4-, 2- and 1-bit networks.
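
A minimal sketch of such post-training quantisation, assuming a simple symmetric (scale-only) scheme; the function and variable names are illustrative, not the project's implementation:

    import numpy as np

    def quantise_symmetric(weights: np.ndarray, bits: int = 8):
        """Map float32 weights to signed integers with a single scale.

        The largest-magnitude weight is mapped to the edge of the
        integer range; dequantisation is simply q * scale.
        """
        qmax = 2 ** (bits - 1) - 1              # e.g. 127 for 8-bit
        scale = np.abs(weights).max() / qmax    # float value per integer step
        q = np.clip(np.round(weights / scale), -qmax, qmax)
        return q.astype(np.int8), scale         # int8 holds any width <= 8

    w = np.random.randn(256, 256).astype(np.float32)
    q, scale = quantise_symmetric(w, bits=8)
    max_error = np.abs(w - q.astype(np.float32) * scale).max()  # <= scale / 2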

Field-programmable gate arrays (FPGAs) offer a number of advantages over application-specific embedded processors, most notably the potential for creating custom instruction sets. They present opportunities for high-precision results while maintaining adequate performance and a hardware footprint amenable to the resource-restricted domain of remote sensing.

To achieve a highly parallel FPGA architecture, we use the open-source, Xilinx-supported FINN framework to synthesise QNNs [3]. FINN efficiently supports the optimisations described above, including variable quantisation, and can adjust designs for maximal usage of FPGA block memory, reducing latency and increasing system performance [4]. For real-time prediction, images are compressed ahead of FPGA processing on the SoC's general-purpose processing cores. Our inference implementation targets a Xilinx Zynq-7000 SoC with integrated FPGA.
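
As an illustration of the training side of this flow, the sketch below defines one quantised convolutional block in Brevitas, the PyTorch library commonly used to train networks that FINN then synthesises into a streaming dataflow architecture [3]. The channel counts and bit widths are illustrative assumptions, not the deployed topology:

    import torch.nn as nn
    from brevitas.nn import QuantConv2d, QuantIdentity, QuantReLU

    # One quantised conv block; all widths here are assumptions.
    block = nn.Sequential(
        QuantIdentity(bit_width=8, return_quant_tensor=True),  # 8-bit input
        QuantConv2d(3, 16, kernel_size=3, padding=1,
                    weight_bit_width=2, bias=False),           # 2-bit weights
        nn.BatchNorm2d(16),
        QuantReLU(bit_width=4, return_quant_tensor=True),      # 4-bit activations
    )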

The primary dataset used for QNN training is a custom annotated image repository built from ESA Copernicus Sentinel-2 captures. A fully convolutional binary neural network served as the benchmark for the deep learning pipeline. Ternary and low-integer networks were then evaluated against this baseline on performance, accuracy, and resource utilisation, using roofline models across the heterogeneous processing architecture.
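
For context on how a binary baseline of this kind is typically trained, a generic PyTorch sketch of sign-binarised weights with a straight-through estimator; this illustrates the standard technique, not the project's training code:

    import torch

    class BinariseSTE(torch.autograd.Function):
        """Sign-binarise weights in the forward pass; pass gradients
        straight through, clipped to |w| <= 1, in the backward pass."""

        @staticmethod
        def forward(ctx, w):
            ctx.save_for_backward(w)
            return torch.sign(w)             # values in {-1, 0, +1}

        @staticmethod
        def backward(ctx, grad_output):
            (w,) = ctx.saved_tensors
            # Straight-through estimator: zero the gradient once the
            # latent full-precision weight saturates beyond [-1, 1].
            return grad_output * (w.abs() <= 1).to(grad_output.dtype)

    class BinaryLinear(torch.nn.Linear):
        def forward(self, x):
            wb = BinariseSTE.apply(self.weight)   # binary weights at inference
            return torch.nn.functional.linear(x, wb, self.bias)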

References:
[1] Mohajerani, Sorour, Thomas A. Krammer, and Parvaneh Saeedi. “Cloud detection algorithm for remote sensing images using fully convolutional neural networks.” arXiv preprint arXiv:1810.05782 (2018).
[2] Greenland, Steve, Murray Ireland, Chisato Kobayashi, Peter Mendham, Mark Post, and David White. "Development of a miniaturised forwards looking imager using deep learning for responsive operations." ESA, 2018.
[3] Blott, Michaela, Thomas B. Preußer, Nicholas J. Fraser, Giulio Gambardella, Kenneth O'Brien, Yaman Umuroglu, Miriam Leeser, and Kees Vissers. "FINN-R: An end-to-end deep-learning framework for fast exploration of quantized neural networks." ACM Transactions on Reconfigurable Technology and Systems (TRETS) 11, no. 3 (2018): 1-23.
[4] Umuroglu, Yaman, Nicholas J. Fraser, Giulio Gambardella, Michaela Blott, Philip Leong, Magnus Jahre, and Kees Vissers. "FINN: A framework for fast, scalable binarized neural network inference." In Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, pp. 65-74, 2017.
