Scarlet Devil Mansion

14 June 2025

Introducing the Extended QOA Format for Audio

OK, so I’ve alluded to this in my last few blog posts, but it’s time to make the official announcement.

The Extended QOA Format is a new audio format I’ve developed, based on the Quite OK Audio Format (QOA). QOA is a lossy audio codec that “…decodes audio 3x faster than Ogg-Vorbis, while offering better quality and compression than ADPCM (278 kbits/s for 44khz stereo).” As the authors explain, it sits in what they call the “triangle of neglect”: the area where codecs combine good quality with low complexity.

A chart from the original QOA announcement post showing the 'triangle of neglect'.

I fell in love with the format when I discovered it, especially with how it sounds and how fast it decodes, but I was disappointed that it didn’t support anything like Vorbis Comments for metadata. I mean, it makes sense - I’m pretty sure QOA was developed for applications like audio for video games or embedded use. But QOA sounds good and is a solid format, and most importantly, I wanted to use it for everyday music.

That’s where the new Extended QOA Format (XQAF or XQA) comes in. This new format extends QOA by adding metadata and other features that make it suitable as an everyday audio codec, akin to how you already use MP3 or Ogg Vorbis. The official reference implementation is part of CL-RemiAudio, and the full format specifications can be found here.

Internally, XQAF uses the same codec as normal QOA, just with a slightly different frame format. So XQAF files sound identical to QOA, and any QOA decoder can easily be extended with support for XQAF. The real benefit of XQAF is what it adds on top of QOA.

Metadata (Tags), Flags, and ReplayGain

XQAF supports Vorbis Comments directly within the file for tagging purposes. These can be added with the official reference tool (see the next section). Since Vorbis Comments are widely supported, this should make it extremely easy for existing applications to support the tags in XQAF files.

XQAF also supports a few flags within each file. These flags are all “off” by default, and the official reference tool can toggle them:

Lastly, XQAF fully supports ReplayGain. This is stored in the Vorbis Comments just like with Ogg Vorbis files, and both per-file and album modes are supported.
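To make the tagging idea a bit more concrete, here is a minimal Python sketch that parses a raw Vorbis Comment block and pulls a ReplayGain value out of it. The block layout (little-endian lengths, a vendor string, then `KEY=value` entries) and the `REPLAYGAIN_*` tag names are the standard ones used with Ogg Vorbis. Whether XQAF stores the block in exactly this raw form is an assumption here, so check the specification before relying on it. The function names are made up for this example and are not part of CL-RemiAudio or XQATool.

```python
import struct


def build_vorbis_comments(vendor: str, entries: list[str]) -> bytes:
    """Serialize a vendor string and 'KEY=value' entries (used here for the demo)."""
    raw_vendor = vendor.encode("utf-8")
    out = bytearray(struct.pack("<I", len(raw_vendor)) + raw_vendor)
    out += struct.pack("<I", len(entries))
    for entry in entries:
        raw = entry.encode("utf-8")
        out += struct.pack("<I", len(raw)) + raw
    return bytes(out)


def parse_vorbis_comments(block: bytes) -> tuple[str, dict[str, list[str]]]:
    """Parse a raw Vorbis Comment block into (vendor, {KEY: [values]})."""
    pos = 0
    (vendor_len,) = struct.unpack_from("<I", block, pos)
    pos += 4
    vendor = block[pos:pos + vendor_len].decode("utf-8")
    pos += vendor_len
    (count,) = struct.unpack_from("<I", block, pos)
    pos += 4
    tags: dict[str, list[str]] = {}
    for _ in range(count):
        (length,) = struct.unpack_from("<I", block, pos)
        pos += 4
        entry = block[pos:pos + length].decode("utf-8")
        pos += length
        key, _, value = entry.partition("=")
        # Field names are case-insensitive, so normalize them to upper case.
        tags.setdefault(key.upper(), []).append(value)
    return vendor, tags


def replaygain_db(tags: dict[str, list[str]], album: bool = False):
    """Return the track (or album) gain in dB, or None if the tag is absent."""
    key = "REPLAYGAIN_ALBUM_GAIN" if album else "REPLAYGAIN_TRACK_GAIN"
    values = tags.get(key)
    if not values:
        return None
    text = values[0].strip()
    if text.lower().endswith("db"):
        text = text[:-2]  # drop the unit suffix, e.g. "-6.50 dB"
    return float(text)


if __name__ == "__main__":
    demo = build_vorbis_comments(
        "example vendor",
        ["TITLE=Example", "replaygain_track_gain=-6.50 dB"],
    )
    vendor, tags = parse_vorbis_comments(demo)
    print(vendor, tags, replaygain_db(tags))
```

A player would typically prefer the album gain when playing a whole album and fall back to the track gain otherwise.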

The format is designed to be extensible, so new flags or features can be added later without breaking backwards compatibility.

XQATool, the Official Reference Tool

Creating a new format is pointless unless users have some way of using it. XQATool is the official reference tool, and lets you do everything you need with XQAF files (and normal QOA if you wish):

Right now it’s source-only, but I’ll be getting some AppImages up within a few days of this blog post.

Player Support

Benben v0.7.0 will fully support XQAF, as well as normal QOA. If you want to listen to XQAF right now, you can build the current Benben trunk code from source, or download one of its development AppImage builds.

Sound Examples

These examples were created by taking a FLAC file, creating an XQAF file from it using XQATool, decompressing the XQAF file back to WAV, then creating a new FLAC from that. So essentially what you hear here is how the actual XQAF encoder sounds. The original FLAC files are also linked for reference.

On a side note, if you enjoy these tracks, then thank you! They’re written by me. You may want to check out my catalog on Bandcamp. The original FLACs are straight from the master WAV files.

Internal Stuff

Internally, the format has a larger header to support tag data and flags, and to make a few editing operations easier for downstream users. Unlike the normal QOA format, the sample rate and number of channels are stored only once, in the main header. Each frame header is thus four bytes smaller than in normal QOA. The LMS and Slice formats are identical.
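As a rough illustration of where those four bytes go, here is a Python sketch that parses a normal QOA frame header (a big-endian 64-bit word holding the channel count, sample rate, samples per channel, and frame size) next to a guess at the smaller XQAF one. The QOA layout follows the published QOA specification. The XQAF layout shown, with just the two remaining 16-bit fields in the same order and byte order, is only an assumption; the official specification is the authority on the real layout.

```python
import struct
from typing import NamedTuple


class QoaFrameHeader(NamedTuple):
    channels: int             # 8 bits
    sample_rate: int          # 24 bits
    samples_per_channel: int  # 16 bits
    frame_size: int           # 16 bits, in bytes


class XqafFrameHeader(NamedTuple):
    # Hypothetical layout: what is left once channels and sample rate
    # have moved to the main header.
    samples_per_channel: int
    frame_size: int


def parse_qoa_frame_header(data: bytes) -> QoaFrameHeader:
    """Unpack the 8-byte frame header of a normal QOA frame."""
    (word,) = struct.unpack_from(">Q", data, 0)
    return QoaFrameHeader(
        channels=(word >> 56) & 0xFF,
        sample_rate=(word >> 32) & 0xFFFFFF,
        samples_per_channel=(word >> 16) & 0xFFFF,
        frame_size=word & 0xFFFF,
    )


def parse_xqaf_frame_header(data: bytes) -> XqafFrameHeader:
    """Unpack an assumed 4-byte XQAF frame header (two big-endian u16 fields)."""
    samples, size = struct.unpack_from(">HH", data, 0)
    return XqafFrameHeader(samples, size)


if __name__ == "__main__":
    qoa_raw = struct.pack(">Q", (2 << 56) | (44100 << 32) | (5120 << 16) | 4136)
    print(parse_qoa_frame_header(qoa_raw))
    print(len(qoa_raw) - struct.calcsize(">HH"), "bytes saved per frame")
```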

Like QOA, XQAF supports sample rates from 1 Hz to 16,777,215 Hz, inclusive. It also supports between 1 and 255 channels of audio data, inclusive.
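Those limits are exactly what fits in a 24-bit and an 8-bit unsigned field, which is how normal QOA stores the two values. A small Python check along these lines could guard an encoder's inputs; the function name is invented for this example, and the assumption that XQAF uses the same field widths is inferred from the stated ranges rather than taken from the specification.

```python
MAX_SAMPLE_RATE = (1 << 24) - 1  # 16,777,215 Hz, the largest 24-bit value
MAX_CHANNELS = (1 << 8) - 1      # 255, the largest 8-bit value


def validate_stream_params(sample_rate: int, channels: int) -> None:
    """Raise ValueError if the parameters fall outside the supported ranges."""
    if not 1 <= sample_rate <= MAX_SAMPLE_RATE:
        raise ValueError(f"sample rate {sample_rate} Hz is outside 1..{MAX_SAMPLE_RATE}")
    if not 1 <= channels <= MAX_CHANNELS:
        raise ValueError(f"channel count {channels} is outside 1..{MAX_CHANNELS}")


if __name__ == "__main__":
    validate_stream_params(44100, 2)  # accepted silently
    try:
        validate_stream_params(0, 2)
    except ValueError as err:
        print("rejected:", err)
```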

Offset fields within the main header are 32-bit unsigned integers by default. When the 64-bit offset flag is set, three of them are widened to 64-bit unsigned integers: the offset to the encoded audio data, the size of the encoded audio data in bytes, and the offset to the Vorbis Comment data. The size of the Vorbis Comment data is always 32-bit. This flag was added to support really, really long tracks (or long tracks with overkill sample rates).
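Here is a minimal Python sketch of how a reader might handle the two widths. Only the field widths come from the description above. The field order, the big-endian byte order, and the idea that the four fields sit next to each other are assumptions made for the example, and the names are hypothetical, so treat this as an illustration rather than a parser for real files.

```python
import struct
from typing import NamedTuple


class OffsetFields(NamedTuple):
    audio_offset: int    # offset to the encoded audio data
    audio_size: int      # size of the encoded audio data, in bytes
    comment_offset: int  # offset to the Vorbis Comment data
    comment_size: int    # size of the Vorbis Comment data (always 32-bit)


def read_offset_fields(buf: bytes, pos: int, use_64bit: bool) -> tuple[OffsetFields, int]:
    """Read the offset fields at `pos`; return them plus the position just past them."""
    wide = "Q" if use_64bit else "I"  # 64-bit or 32-bit unsigned
    fmt = f">{wide}{wide}{wide}I"     # assumed order and byte order
    fields = OffsetFields(*struct.unpack_from(fmt, buf, pos))
    return fields, pos + struct.calcsize(fmt)


if __name__ == "__main__":
    narrow = struct.pack(">IIII", 64, 1_000_000, 1_000_064, 512)
    wide = struct.pack(">QQQI", 64, 5_000_000_000, 5_000_000_064, 512)
    print(read_offset_fields(narrow, 0, use_64bit=False))
    print(read_offset_fields(wide, 0, use_64bit=True))
```

Under these assumptions the flag only changes how many bytes each of the three fields occupies, so the rest of a reader can stay the same.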

The current format version is v1.1, which removed a pointless “seek table” from the format. The seek table was designed to make seeking easier, but I soon realized that it was already easy to seek within a QOA file, so this was dropped. XQATool and the reference codec in CL-RemiAudio still support v1.0 files, though I highly doubt anyone found the code and created any :-P
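To show why a seek table isn't really needed, here is a small Python sketch of seeking in a normal QOA file with a fixed channel count. Per the QOA specification, every frame except the last holds 256 slices of 20 samples per channel, so all full frames have the same size and a frame's byte offset can be computed directly. The same idea should carry over to XQAF with its own data offset and smaller frame header, but that part is an assumption and isn't shown here.

```python
QOA_FILE_HEADER_SIZE = 8     # "qoaf" magic + total samples per channel
QOA_FRAME_HEADER_SIZE = 8
QOA_LMS_STATE_SIZE = 16      # per channel
QOA_SLICE_SIZE = 8           # bytes per slice
QOA_SLICES_PER_FRAME = 256   # per channel, in every frame but the last
QOA_SAMPLES_PER_SLICE = 20
QOA_SAMPLES_PER_FRAME = QOA_SLICES_PER_FRAME * QOA_SAMPLES_PER_SLICE  # 5120


def qoa_frame_size(channels: int) -> int:
    """Size in bytes of a full QOA frame for the given channel count."""
    return (
        QOA_FRAME_HEADER_SIZE
        + QOA_LMS_STATE_SIZE * channels
        + QOA_SLICES_PER_FRAME * QOA_SLICE_SIZE * channels
    )


def qoa_seek(sample_index: int, channels: int) -> tuple[int, int]:
    """Return (byte offset of the frame holding the sample, samples to skip in it)."""
    frame_index, skip = divmod(sample_index, QOA_SAMPLES_PER_FRAME)
    offset = QOA_FILE_HEADER_SIZE + frame_index * qoa_frame_size(channels)
    return offset, skip


if __name__ == "__main__":
    # Seek to the one-second mark of a 44.1 kHz stereo file.
    print(qoa_seek(44100, channels=2))
```

A decoder would jump to the returned offset, decode that one frame, and discard the first few samples to land on the exact position.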

The official specification document explains all of this in more detail.