
On the Embedded GPU Parallelisation of On-Board CCSDS Compressors: a Benchmarking Approach

Monday, September 21, 2020
4:45 PM - 5:10 PM

Speaker

Attendee107
Barcelona Supercomputing Center

Abstract Submission

The on-board processing requirements of future missions are constantly increasing, calling for new hardware solutions that can meet this demand while staying within the strict power and thermal limits of space systems. Embedded GPUs are a promising candidate, combining high performance with low power consumption close to the target limits. The ESA-funded GPU4S (GPU for Space) project [1] studies whether on-board processing algorithms are amenable to GPU parallelisation and whether embedded GPUs can satisfy the performance requirements of future space missions, paving the way for their adoption.
Early project results with commonly used processing algorithms [2], as well as an infrared space observatory case study demonstrator [3], indicate that embedded GPUs can provide processing improvements of several orders of magnitude compared to existing space processors such as LEON/SPARC or PowerPC-based processors. Compared to FPGAs, which are commonly used in on-board processing applications, GPUs have the advantage that on-board processing can be reconfigured quickly in software.
To cover as many space domains as possible, we analysed different classes of on-board algorithms, and we are currently designing a benchmarking suite for the evaluation of both embedded GPUs and their programming models. The benchmark suite includes applications such as image pre-processing, standard compression, FFT, FIR, and other common on-board processing tasks.
While our analysis during algorithm selection indicates that most of the image-related processing algorithms used in observation systems are a good fit for embedded GPUs, we have identified the compression algorithms as among the most challenging ones. In this paper, we present our experience with the GPU parallelisation of several components of the CCSDS 121, 122 and 123 compression standards, which are included in our upcoming benchmark suite. In particular, we implement the following parts of each compression standard:
• CCSDS 121.0-B-2
o Predictor
- Unit-delay
- Error mapper
o Encoder
- Fundamental sequence
- Sample splitting
- Zero block
• CCSDS 122.0-B-2
o 2D multilevel wavelet transform
o Bit plane encoder
• CCSDS 123.0-B-1
o Predictor
- Adaptive weighted predictor
o Encoder
- Sample-adaptive and block-adaptive encoder
We focus only on the parts of the standards related to encoding (e.g. forward transforms), since this is the part expected to be used on-board. Moreover, we prioritise the parts that are a better fit for GPU parallelisation based on our analysis of their access patterns. In a future edition of the benchmark suite, we may gradually add the parts of the CCSDS standards that are not yet implemented.
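As an illustration of the access patterns mentioned above, the following is a minimal sequential sketch of the CCSDS 121 unit-delay predictor with its prediction error mapper, together with the fundamental sequence codeword. It is not the benchmark suite's implementation: the function names are hypothetical, samples are assumed to be unsigned n-bit integers, the first sample is assumed to be passed through as a reference, and block handling and option selection are omitted.

```python
def unit_delay_map(samples, n_bits):
    """Unit-delay prediction followed by prediction error mapping.

    Assumption: samples are unsigned integers in [0, 2**n_bits - 1] and the
    first sample is kept unmodified as a reference sample.
    Returns (reference_sample, mapped_errors).
    """
    x_min, x_max = 0, (1 << n_bits) - 1
    mapped = []
    for i in range(1, len(samples)):
        pred = samples[i - 1]            # unit-delay prediction: previous input sample
        delta = samples[i] - pred        # signed prediction error
        # Largest error magnitude reachable on both sides of the prediction
        theta = min(pred - x_min, x_max - pred)
        if 0 <= delta <= theta:
            mapped.append(2 * delta)             # small non-negative error -> even value
        elif -theta <= delta < 0:
            mapped.append(2 * (-delta) - 1)      # small negative error -> odd value
        else:
            mapped.append(theta + abs(delta))    # large error, only one sign is possible
    return samples[0], mapped


def fs_codeword(value):
    """Fundamental sequence codeword: `value` zeros terminated by a single one."""
    return "0" * value + "1"


if __name__ == "__main__":
    reference, errors = unit_delay_map([10, 12, 11, 255, 0], n_bits=8)
    print(reference, errors)
    print([fs_codeword(e) for e in errors[:2]])
```

Each loop iteration reads only two neighbouring input samples and writes one output value, so in a GPU version one thread per sample could compute the mapped errors independently. The codewords, in contrast, have variable length, so the output position of each one depends on the lengths of all preceding codewords, which is one reason the encoder stage is harder to parallelise than the predictor.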
The algorithms are implemented in multiple parallel programming models for GPUs (CUDA, OpenCL), which are our primary focus. In addition, CPU implementations (OpenMP) have been developed to exploit the multicore capabilities of existing COTS SoCs featuring embedded GPUs. In our paper, which is an early preview of a significant part of our benchmarking suite, we will describe in detail the approach we have followed for each algorithm and provide insights into the different parallelisation approaches for GPUs and CPUs. We will present results on several state-of-the-art embedded GPU platforms, which were selected as good candidates earlier in the project [2], including the latest NVIDIA platform, Xavier, showing the performance benefits provided by embedded GPUs. One major challenge for the use of GPUs in space is the requirement for fault tolerance. Hence, we have also targeted the benchmarks at a fault-tolerant model based on a COTS GPU.
This work has been performed in the framework of the ESA GPU4S (GPU for Space) Project ITT AO/1-9010/17/NL/AF “Low Power GPU Solutions for High Performance On-Board Data Processing”.
References:
[1] Leonidas Kosmidis, Jérôme Lachaize, Jaume Abella, Olivier Notebaert, Francisco J. Cazorla, David Steenari. GPU4S: Embedded GPUs in Space, 22nd Euromicro Conference on Digital System Design (DSD), 2019
[2] Leonidas Kosmidis, Iván Rodriguez, Álvaro Jover, Sergi Alcaide, Jérôme Lachaize, Jaume Abella, Olivier Notebaert, Francisco J. Cazorla, David Steenari. GPU4S: Embedded GPUs in Space - Latest Project Updates, Elsevier Microprocessors and Microsystems, Volume 77, September 2020
[3] Iván Rodriguez, Leonidas Kosmidis, Olivier Notebaert, Francisco J. Cazorla, David Steenari. An On-board Algorithm Implementation on an Embedded GPU: a Space Case Study, Design, Automation & Test in Europe Conference & Exhibition (DATE), 2020
