Invited Speaker: CompressAI: A PyTorch library and evaluation platform for end-to-end compression research
Wednesday, September 23, 2020 |
4:55 PM - 5:20 PM |
Speaker
Attendee9
InterDigital US
CompressAI: A PyTorch library and evaluation platform for end-to-end compression research
Abstract Submission
This paper presents CompressAI, an open-source library that provides custom operations, layers, models, and tools to research, develop, and evaluate end-to-end image and video codecs. In particular, CompressAI includes pre-trained models and evaluation tools to compare learned methods with traditional codecs. Multiple state-of-the-art models for learned end-to-end image compression have been reimplemented in PyTorch [1] and trained from scratch on the Vimeo90K training dataset [2].
The current deep learning ecosystem is dominated by two frameworks: PyTorch and TensorFlow. Discussing the merits and features of one framework over the other is beyond the scope of this document, but PyTorch has seen major growth in academic and industrial research circles over the last few years. However, building end-to-end architectures for image and video compression from scratch in PyTorch requires significant re-implementation work, as PyTorch does not ship with the custom operations required for compression, such as entropy bottleneck and entropy estimation tools. These tools are also largely absent from the wider PyTorch ecosystem, whereas TensorFlow already has a compression library.
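As context for the entropy-estimation tools mentioned above, learned codecs typically estimate the rate during training from the likelihoods that the entropy model assigns to the quantized latents, measured in expected bits per pixel (bpp). The following is a minimal NumPy sketch of that estimate; the `latent_likelihoods` name and the array shapes are illustrative assumptions, not CompressAI's API.

```python
import numpy as np

def estimated_bpp(latent_likelihoods, image_height, image_width):
    """Estimate bits per pixel from per-element likelihoods.

    Each latent element with likelihood p costs approximately
    -log2(p) bits under the learned entropy model.
    """
    total_bits = -np.log2(latent_likelihoods).sum()
    return total_bits / (image_height * image_width)

# Toy example: 192 latent elements, each with likelihood 0.5
# (i.e. 1 bit each), for a 16x16 image -> 192 / 256 = 0.75 bpp.
likelihoods = np.full(192, 0.5)
print(estimated_bpp(likelihoods, 16, 16))  # 0.75
```

During training, this differentiable rate estimate is combined with a distortion term (e.g. MSE) into a rate-distortion loss; at inference, the actual entropy coder produces a bitstream whose size closely tracks the estimate.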
CompressAI aims to implement the above-mentioned operations for building deep neural network architectures for data compression in PyTorch, and to provide evaluation tools for comparing learned methods against traditional codecs. CompressAI includes custom layers, entropy models, operations, and models to build end-to-end codecs. It also provides pre-defined model architectures from the state of the art [4]–[6], including pre-trained weights that achieve performance similar to that reported in the original papers. All the implemented models fully support end-to-end compression and decompression of images, with a bit-stream representation leveraging an entropy coder based on the range Asymmetric Numeral Systems (rANS) algorithm [3]. Additionally, CompressAI provides tools to facilitate research on learned codecs. For example, the following traditional codecs can be used for evaluation within CompressAI: JPEG, JPEG 2000, WebP, BPG, HEVC, AV1, and VVC.
This paper also reports objective comparisons with other implementations and with traditional codecs, in terms of PSNR and MS-SSIM versus bitrate, using the Kodak image dataset [7] as the test set.
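To make the evaluation protocol concrete, each point on such a rate-distortion curve pairs the bitrate of the compressed file (bits per pixel) with a distortion score such as PSNR against the original image. Below is a minimal NumPy sketch of these two quantities for an 8-bit image; the function names are illustrative, not CompressAI's evaluation API.

```python
import numpy as np

def bits_per_pixel(num_compressed_bytes, height, width):
    """Bitrate of a compressed image in bits per pixel."""
    return 8.0 * num_compressed_bytes / (height * width)

def psnr(original, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio between two 8-bit images, in dB."""
    diff = original.astype(np.float64) - reconstruction.astype(np.float64)
    mse = np.mean(diff ** 2)
    return 10.0 * np.log10(max_value ** 2 / mse)

# Toy example: a 4x4 image reconstructed with a constant error of 5,
# compressed to 6 bytes -> 48 bits / 16 pixels = 3.0 bpp.
orig = np.full((4, 4), 100, dtype=np.uint8)
recon = orig + 5
print(bits_per_pixel(6, 4, 4))       # 3.0
print(round(psnr(orig, recon), 2))   # 34.15
```

Sweeping a model's quality parameter and plotting these pairs yields the PSNR-vs-bitrate curves used for the comparisons reported in the paper; MS-SSIM replaces PSNR on the distortion axis for the perceptual variant.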
Several extensions to CompressAI are planned. Upcoming releases will include additional state-of-the-art models for learned image compression, as well as new weights pre-trained for perceptual metrics (e.g., MS-SSIM). One critical envisioned extension is support for video compression. In particular, CompressAI is expected to support the evaluation of traditional video codecs and of end-to-end networks with compressible motion-information modules, in both low-delay and random-access configurations.
The platform is made available to the research and open-source communities under the Apache License, version 2.0. We plan to continue supporting and extending CompressAI publicly on GitHub, and we welcome feedback, questions, and contributions.