High-Fidelity Audio Compression with Improved RVQGAN

High-Fidelity Audio Compression with Improved RVQGAN

Abstract Language models have been successfully used to model natural signals such as images, speech, and music. A key component of these models is high-quality neural compression models that can compress high-dimensional natural signals into lower-dimensional discrete tokens. To this end, we propose a high-fidelity universal neural audio compression algorithm that can compress 44.1 KHz … Read more