Simplified entropy model for reduced-complexity end-to-end variational auto-encoder with application to on-board satellite image compression
Wednesday, September 23, 2020 |
5:20 PM - 5:45 PM |
Speaker
Attendee6
University of Toulouse, TéSA Toulouse
Abstract Submission
In recent years, neural networks have emerged as data-driven tools to solve problems that were previously addressed with model-based methods. In particular, image processing has been strongly impacted by convolutional neural networks (CNNs). Recently, CNN-based auto-encoders have been successfully employed for lossy image compression [1,2,3,4]. These end-to-end optimized architectures dramatically outperform traditional compression schemes in terms of rate-distortion trade-off. The auto-encoder is composed of an encoder and a decoder, both learned from the data. The encoder is applied to the input data to produce a latent representation with minimum entropy after quantization. This representation, derived through several convolutional layers composed of filters and activation functions, is multi-channel (the output of a particular filter is called a channel or a feature) and non-linear. The representation is then quantized to produce a discrete-valued vector, which a standard entropy coder losslessly compresses using the entropy model inferred from the representation.
A key element of these frameworks is the entropy model. In earlier works [1,2,3], the learned representation was assumed independent and identically distributed within each channel, and the channels were assumed independent of each other, resulting in a fully-factorized entropy model. Moreover, a fixed entropy model was learned once, from the training set, preventing any adaptation to the input image during the operational phase. The variational auto-encoder of [4] instead uses a hyperprior auxiliary network, which estimates the hyper-parameters of the representation distribution for each input image. It thus avoids the fully-factorized assumption, which conflicts with the need for context modeling.
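As a concrete illustration of this pipeline, the sketch below quantizes a toy latent representation and estimates its rate under a fully-factorized empirical entropy model. It is illustrative only: the tensor shape and the Laplacian-distributed stand-in for the learned latent are assumptions, not the architecture or statistics of [1,2,3,4].

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "latent representation": in a real CNN auto-encoder this would be
# the multi-channel output of the learned encoder; here we draw
# Laplacian-like values as a stand-in.  Shape: (channels, H, W).
latent = rng.laplace(loc=0.0, scale=2.0, size=(4, 16, 16))

# Quantization: rounding to the nearest integer.
q = np.round(latent)

# Fully-factorized entropy estimate: empirical symbol probabilities
# per channel, assuming i.i.d. samples within each channel and
# independent channels.
total_bits = 0.0
for c in range(q.shape[0]):
    symbols, counts = np.unique(q[c], return_counts=True)
    p = counts / counts.sum()
    total_bits += -(counts * np.log2(p)).sum()  # ideal code length

bpp = total_bits / (q.shape[1] * q.shape[2])
print(f"estimated rate: {bpp:.2f} bits per pixel")
```

The ideal code length `-log2 p(k)` per symbol is what an arithmetic coder approaches in practice; replacing the fixed empirical table with an image-adaptive model is exactly the role of the hyperprior in [4].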
This variational auto-encoder achieves compression performance close to that of BPG (Better Portable Graphics), at the expense of a considerable increase in complexity.
However, in the context of on-board compression, a trade-off between compression performance and complexity must be considered to account for the strong computational constraints. For this reason, the CCSDS (Consultative Committee for Space Data Systems) lossy compression standard was designed as a highly simplified version of JPEG2000. This work follows the same logic, but in the context of learned image compression. The aim of this paper is to design a simplified version of the variational auto-encoder proposed in [4] that meets the on-board complexity constraints while preserving high rate-distortion performance. Apart from straightforward simplifications of the transform (e.g. reducing the number of filters in the convolutional layers), we mainly propose a simplified entropy model that preserves the adaptability to the input image.
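To see why reducing the filter count has such a large effect, here is a minimal FLOP count for a single stride-1 convolutional layer. The layer sizes are hypothetical, chosen only to illustrate the scaling; they are not those of [4].

```python
def conv_flops(h, w, c_in, c_out, k):
    # Cost of one stride-1, k-by-k conv layer on an h-by-w feature map:
    # one multiply-accumulate per tap, counted as 2 FLOPs.
    return 2 * h * w * c_in * c_out * k * k

# Hypothetical sizes for illustration: halving the filter count roughly
# quarters the cost of inner layers, since both c_in and c_out shrink.
full = conv_flops(128, 128, 192, 192, 5)
reduced = conv_flops(128, 128, 96, 96, 5)
print(f"reduction: {1 - reduced / full:.0%}")  # prints "reduction: 75%"
```

Inner layers scale quadratically with the filter count, while the first and last layers scale only linearly (their other operand is the fixed image or latent channel count), which is why a network-wide reduction lands below the per-layer figure.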
A preliminary reduction of the number of filters reduces the complexity by 62% in terms of FLOPs with respect to [4]. It also reduces the number of learned parameters, with a positive impact on memory occupancy. The entropy model simplification exploits a statistical analysis of the learned representation for satellite images, similar to the one performed in [5] for natural images. This analysis reveals that most of the features are well fitted by centered Laplacian distributions. The complex hyperprior model of [4], based on a non-parametric distribution, can thus be replaced by a simpler parametric centered Laplacian model. The problem then amounts to the classical and simple estimation of a single parameter, referred to as the scale. Our simplified entropy model reduces the complexity of the variational auto-encoder coding part by 22% and outperforms the end-to-end model proposed in [1] at high target rates.
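A centered Laplacian entropy model of this kind can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's implementation: the maximum-likelihood scale estimate of a centered Laplacian is the mean absolute value of the feature, and the probability of each quantized integer is the Laplacian mass of the corresponding unit bin.

```python
import numpy as np

def laplace_cdf(x, b):
    # CDF of a centered Laplacian with scale b.
    return np.where(x < 0, 0.5 * np.exp(x / b), 1.0 - 0.5 * np.exp(-x / b))

def quantized_code_lengths(q, b):
    # P(k) = F(k + 1/2) - F(k - 1/2): probability mass of the unit bin
    # around integer k, giving the ideal code length -log2 P(k).
    p = laplace_cdf(q + 0.5, b) - laplace_cdf(q - 0.5, b)
    return -np.log2(np.maximum(p, 1e-12))

rng = np.random.default_rng(1)
feature = rng.laplace(scale=1.5, size=4096)  # synthetic stand-in feature

# Maximum-likelihood scale of a centered Laplacian: mean absolute value.
b_hat = np.mean(np.abs(feature))
bits = quantized_code_lengths(np.round(feature), b_hat).sum()
print(f"scale estimate: {b_hat:.3f}, rate: {bits / feature.size:.2f} bits/sample")
```

The single-parameter estimate replaces the non-parametric density evaluation of the hyperprior in [4], which is where the reduction in coding complexity comes from.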
Acknowledgment: This work has been carried out under the financial support of the French space agency CNES and Thales Alenia Space.
References
[1] Ballé, Laparra, Simoncelli, “End-to-end optimized image compression,” ICLR 2017.
[2] Rippel, Bourdev, “Real-time adaptive image compression,” ICML 2017.
[3] Theis, Shi, Cunningham, Huszar, “Lossy image compression with compressive autoencoders,” ICLR 2017.
[4] Ballé, Minnen, Singh, Hwang, Johnston, “Variational image compression with a scale hyperprior,” ICLR 2018.
[5] Dumas, Roumy, Guillemot, “Autoencoder based image compression: Can the learning be quantization independent?” ICASSP 2018.