Perceptual Evaluation of a Mix Presentation for Immersive Audio with IAMF

Carlos Tejeda-Ocampo
Toni Hirvonen
Ema Souza-Blanes
Mahmoud Namazi
AES 158th Convention of the Audio Engineering Society (2025) (to appear)

Abstract

Immersive audio mix presentations involve transmitting and rendering several audio elements simultaneously. This enables next-generation applications, such as personalized playback. Using immersive loudspeaker and headphone MUSHRA tests, we investigate bitrate vs. quality for a typical mix presentation use case of a foreground stereo element plus a background Ambisonics scene. For coding, we use Immersive Audio Model and Formats (IAMF), a recently proposed system for Next-Generation Audio. Excellent quality is achieved at 384 kbit/s even with a reasonable amount of personalization. We also propose a framework for content-aware analysis that can significantly reduce the bitrate when using underlying legacy audio coding instances.