Dr. Grill, mp3 was developed by researchers in Germany, but is it really other people who have made “big business with compressed file formats”?
Bernhard Grill: mp3 was and remains a success story for the Fraunhofer-Gesellschaft, but also for the German economy. For many years, annual revenue from licensing mp3, our first-generation audio codec, has been between 50 and 100 million euros. And our second- and third-generation codecs – AAC, HE-AAC and so on – are installed on more than 12 billion devices worldwide, which translates into similar commercial success. Fraunhofer IIS is now one of the world’s largest and most important research institutions for audio technology.
Most people still associate Fraunhofer IIS largely with the development of mp3. What is the institute doing now?
Bernhard Grill: We’ve diversified and specialized our development of new technologies so that we can now handle virtually any new audio-related advancement. We’re currently seeing progress in our work on automotive audio, smart speakers and 3D soundbars, and we’re also active in the field of motion picture technologies such as JPEG XS. But coding remains our main focus, with great strides being made in fourth-generation audio codecs:
We developed xHE-AAC specially for the streaming of AV content. Thanks to a further drop in data rates combined with consistently high quality, it’s possible to broadcast content almost anywhere without interruptions – even in areas with only 2G coverage. This codec has a firm foothold in the market, is integrated into Android and iOS and is also about to be adopted by major streaming services.
While for xHE-AAC it’s all about bringing data rates down, with MPEG-H Audio it’s possible to broadcast immersive surround sound. What’s more, because MPEG-H takes an object-based approach, users can tailor the playback to their individual preferences. Sony has discovered that immersive sound can be used in music streaming and has based its new 360 Reality Audio format on MPEG-H.
Intensive research into voice signal processing has allowed us to tap into an entirely new area. In collaboration with the world’s leading mobile communications companies, we created a new format – the EVS codec – which enables phone calls to be experienced in Hi-Fi quality. EVS is already a feature of most new smartphones.
We’re also improving our methodology, for instance by combining signal processing with artificial intelligence. Here we’re aiming to resolve issues that have long plagued the AV community, where traditional approaches have yet to yield results.
Amid all these successes, are there still “blind spots” in the audio world that don’t use Fraunhofer technology at all?
Bernhard Grill: Not really. Our solutions can be found in any device capable of audio playback – every mobile phone, every TV. We’ve achieved almost 100 percent market penetration.
But that’s not to say that there aren’t still challenges to overcome. Although EVS has made it onto the devices, it’s not yet used throughout the networks. EVS was standardized as the compulsory speech codec for 5G voice services, so we’re hoping that in the future all cell phone calls will use it.
MPEG-H has been included in a variety of TV standards, but many countries have yet to decide whether and when they will roll out new technologies. We’ll just have to keep working and highlighting the benefits of our solutions, especially for MPEG-H, where we face a strong proprietary competitor.