1 November 2025
Reworking CL-RemiAudio's "Formats API" Yet Again
Other programmers with personal projects have this same thing in their lives, right? You’ve got a library or some sort of subsystem that you wrote, and you rewrite it every so often. Maybe it’s every month, every six months, or once a year. But regardless, you end up rewriting it, either because you had a better idea, or your last good idea ended up being a bad idea, or maybe you’re just plain bored and need something to do.
CL-RemiAudio’s Formats API is what I seem to rewrite over and over. Hopefully this is the last time I do it.
A Brief History
Old versions of the Formats API took a very straightforward approach to handling
things: you had an AU-FILE and a WAV-FILE class (I think), they didn’t share
a lot of code, and you needed to provide samples in the correct format. I think
it may have also had some sort of format conversion built-in, but it’s been so
long that I don’t remember. A lot of runtime checks happened as well, so it
wasn’t the fastest thing in the sock drawer.
The first rewrite changed things to have dedicated classes for reading and
writing, and also switched to using FUNCALL’d functions for its reading and
writing. This sped things up dramatically at the cost of some internal
complexity and general messiness. But it’s the first time the public-facing API
functions were mostly solidified.
The next rewrite went back to just AU-FILE and WAV-FILE (and I think may
have introduced a parent AUDIO-FILE class? Again, it’s been quite a while),
and changed things so that you could both read and write without having separate
instances. This cleaned up the internal code in some ways, and complicated it
in others. But it worked, and was still pretty fast. This rewrite also
introduced the Sample Info stuff into the API, and switched the semantics to
reading/writing either the native sample format (e.g. T/INT16 for 16-bit
signed integer samples), SINGLE-FLOAT, or DOUBLE-FLOAT.
The most recent rewrite (and the one that landed in Benben v0.7.0) was huge. I
kept the AU-FILE and WAV-FILE classes, but then had a subclass for each
supported encoding. Internally I generated a huge number of inline functions
with macros, where each encoding got dedicated reading and writing functions for
a single native or float sample, and vectors of samples (native and float), and
also low-level ones that did the actual I/O on the stream. Then I would
generate methods for each subclass that would use these internal methods, and
also (on SBCL) compiler macros to skip as much runtime dispatch as possible.
This complicated the heck out of the internal code, but it was faster than
previous versions (assuming you got your type declarations and OPTIMIZE settings right).
It wasn’t a huge increase in speed like I had hoped, but one that, at the time,
I figured was significant enough. The downside was an added dependency
(sb-cltl2, for the compiler macros - SBCL only), and increased compile times.
Leading Up to the Latest Rewrite
People around my age may remember a game from 1995 for DOS called Command & Conquer. It was one of the early RTS games for PC, and it had really awesome music. The music data was encoded at 22050 Hz mono in a slightly customized ADPCM format (“Westwood ADPCM”, very similar to IMA ADPCM but with a somewhat different on-disk structure), then stored in “Mix” files (.mix). This let the game ship a large amount of very high quality music that wasn’t in a MIDI or module format, while keeping storage requirements reasonable. The next game, Command & Conquer: Red Alert, used the same format except that the .mix files were encrypted. Later games in the series either continued with this format or, as is the case with Command & Conquer: Red Alert 2 (maybe later games as well, I’m not sure), switched to standard IMA ADPCM .wav files stored in the .mix files.
The C&C games were my favorite RTSes back in the day, and I got back into them a few years ago. Last month I played through many of the C&C games again and realized just how awesome the music was in these games. That’s where I started thinking: wouldn’t it be cool if Benben could play the .mix files directly? ADPCM is not a difficult or CPU-intense compression scheme, after all.
So, over a couple of nights, I hacked together a prototype .mix reader and Westwood ADPCM decoder in Common Lisp that worked with C&C, Red Alert, and Tiberian Sun. It was awesome. RA2 followed soon after, once I learned the difference between Westwood’s ADPCM format and normal IMA ADPCM stored in a RIFF WAVE.
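To give a sense of how little work ADPCM decoding takes per sample, here is a minimal sketch of the standard IMA ADPCM nibble-decoding step in Common Lisp. It is not the prototype's code or CL-RemiAudio's: the names `*ima-step-table*`, `*ima-index-table*`, and `ima-decode-nibble` are made up for illustration, and it ignores block headers, channel interleaving, and everything that is specific to Westwood's on-disk layout.

```lisp
;; Hypothetical names throughout; this is a sketch, not CL-RemiAudio code.

;; The standard 89-entry IMA ADPCM step size table.
(defparameter *ima-step-table*
  #(7 8 9 10 11 12 13 14 16 17 19 21 23 25 28 31 34 37 41 45 50 55 60 66
    73 80 88 97 107 118 130 143 157 173 190 209 230 253 279 307 337 371
    408 449 494 544 598 658 724 796 876 963 1060 1166 1282 1411 1552 1707
    1878 2066 2272 2499 2749 3024 3327 3660 4026 4428 4871 5358 5894 6484
    7132 7845 8630 9493 10442 11487 12635 13899 15289 16818 18500 20350
    22385 24623 27086 29794 32767))

;; Step index adjustment, looked up by the low three bits of the nibble.
(defparameter *ima-index-table* #(-1 -1 -1 -1 2 4 6 8))

(defun ima-decode-nibble (nibble predictor index)
  "Decode one 4-bit NIBBLE given the current PREDICTOR and step INDEX.
Returns two values: the new predictor (which is also the output sample)
and the new step index."
  (let* ((step-size (aref *ima-step-table* index))
         (diff (ash step-size -3)))
    ;; Bits 2..0 add step, step/2, and step/4 to the difference.
    (when (logbitp 2 nibble) (incf diff step-size))
    (when (logbitp 1 nibble) (incf diff (ash step-size -1)))
    (when (logbitp 0 nibble) (incf diff (ash step-size -2)))
    (values
     ;; Bit 3 is the sign; the result is clamped to the int16 range.
     (max -32768 (min 32767 (if (logbitp 3 nibble)
                                (- predictor diff)
                                (+ predictor diff))))
     ;; The step index is clamped to the bounds of the step table.
     (max 0 (min 88 (+ index (aref *ima-index-table* (logand nibble 7))))))))
```

A decoder built on this would just loop over the nibbles in a block, feeding each call's two return values back in as the next call's `predictor` and `index`. A handful of shifts, adds, and table lookups per sample is all there is to it.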
To get this included in Benben, though, would require more than just prototype code. It needed formal support in CL-RemiAudio’s Codecs API, and the IMA ADPCM RIFF WAVE stuff needed to be supported by the Formats API. It was the latter that prompted the rewrite. And I wanted it done for a small patch version that I’m planning to release next week, Benben v0.7.1.
The Rewrite
The Formats API has, up until now, only supported linear PCM (8-bit through 64-bit) and IEEE floating point (32-bit and 64-bit) samples. I’ve also always planned to include µLaw and Alaw support sometime in the future, but just never got around to it. Unfortunately, when I went to add IMA ADPCM support, it was quite obvious that the API had been designed only to support PCM/float, and just as obvious that my last rewrite made it incredibly annoying to add new features.
Time for another rewrite, preferably with minimal API breakage, and preferably in a way that it’s easily extended so that I never have to do a rewrite like this again.
The first steps were to get rid of the compiler macros, which never did speed
things up as much as I had hoped; remove all the encoding-specific classes;
remove all the generated internal functions and methods that handled
per-encoding I/O; and go back to using FUNCALL’d internal functions for
low-level I/O (just things like write-int16, %read-int32-samples, etc.).
However, this time I decided to store them in a dedicated DECODER structure,
held in a class slot, to keep the code nice and cleanly separated. Then, the
public API methods such as AUDIO-FILE-WRITE-SAMPLE-F32 just FUNCALL the
appropriate function in the DECODER structure. This is very similar to my second
and third iterations of the Formats API, but has been cleaned up and optimized
quite a bit.
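As a rough illustration of that pattern, here is a minimal sketch of a structure of function slots held in a class slot, with a public generic function that does nothing but FUNCALL the stored function. Everything in it is an assumption made for the example: the `sketch-` names, the single `read-f32` slot, and the in-memory byte vector standing in for a real stream are not CL-RemiAudio's actual internals.

```lisp
;; Hypothetical sketch; none of these names are CL-RemiAudio's real internals.

(defstruct sketch-decoder
  ;; A low-level I/O function, chosen once when the file is opened.
  (read-f32 (error "READ-F32 is required") :type function))

(defclass sketch-audio-file ()
  ((bytes   :initarg :bytes :reader sketch-file-bytes) ; stands in for a stream
   (pos     :initform 0 :accessor sketch-file-pos)
   (decoder :initarg :decoder :reader sketch-file-decoder)))

(defun sketch-next-byte (file)
  "Return the next octet from FILE and advance its position."
  (prog1 (aref (sketch-file-bytes file) (sketch-file-pos file))
    (incf (sketch-file-pos file))))

(defun sketch-read-int16-le (file)
  "Read one little-endian signed 16-bit sample."
  (let* ((lo (sketch-next-byte file))
         (hi (sketch-next-byte file))
         (u  (logior lo (ash hi 8))))
    (if (>= u #x8000) (- u #x10000) u)))

(defun sketch-read-int16-le-as-f32 (file)
  "Read one 16-bit sample and scale it into the range [-1.0, 1.0)."
  (/ (float (sketch-read-int16-le file) 1.0f0) 32768.0f0))

(defgeneric sketch-read-sample-f32 (file)
  (:documentation "Public entry point: read one sample as a SINGLE-FLOAT."))

(defmethod sketch-read-sample-f32 ((file sketch-audio-file))
  ;; No per-encoding dispatch here, just a FUNCALL of the stored function.
  (funcall (sketch-decoder-read-f32 (sketch-file-decoder file)) file))

(defun sketch-open-int16 (octets)
  "Wrap a list of OCTETS as a 16-bit little-endian PCM 'file'."
  (make-instance 'sketch-audio-file
                 :bytes (coerce octets '(vector (unsigned-byte 8)))
                 :decoder (make-sketch-decoder
                           :read-f32 #'sketch-read-int16-le-as-f32)))
```

With this, `(sketch-read-sample-f32 (sketch-open-int16 '(0 64)))` reads the sample #x4000 and returns 0.5. The point of the design is that the encoding is decided once, when the structure is filled in, so the public method never has to branch on it per sample.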
One nice thing is that most of the public API has not been broken. What did
change is the class hierarchy, which also means going back to the older
semantics of “one class (tree) for reading, one class (tree) for writing”. It
now looks like this (classes with % prepended to their name are internal only
and not exported):
- AUDIO-FILE
- AUDIO-FILE/READABLE, inherits AUDIO-FILE.
- AUDIO-FILE/WRITEABLE, inherits AUDIO-FILE.
- %AU-FILE, inherits AUDIO-FILE.
- AU-FILE/READABLE, inherits %AU-FILE and AUDIO-FILE/READABLE.
- AU-FILE/WRITEABLE, inherits %AU-FILE and AUDIO-FILE/WRITEABLE.
- %WAV-FILE, inherits AUDIO-FILE.
- WAV-FILE/READABLE, inherits %WAV-FILE and AUDIO-FILE/READABLE. This is the class for reading linear PCM WAVE files.
- WAV-FILE/WRITEABLE, inherits %WAV-FILE and AUDIO-FILE/WRITEABLE. This is the class for writing linear PCM WAVE files.
- FLOAT-WAV-FILE/READABLE, inherits WAV-FILE/READABLE. This is the class for reading IEEE floating point WAVE files.
- FLOAT-WAV-FILE/WRITEABLE, inherits WAV-FILE/WRITEABLE. This is the class for writing IEEE floating point WAVE files.
- EXTENSIBLE-WAV-FILE/READABLE, inherits WAV-FILE/READABLE. This is the class for reading “extensible” WAVE files.
- EXTENSIBLE-WAV-FILE/WRITEABLE, inherits WAV-FILE/WRITEABLE. This is the class for writing “extensible” WAVE files.
- IMA-ADPCM-WAV-FILE/READABLE, inherits WAV-FILE/READABLE. This is the class for reading IMA ADPCM WAVE files.
- IMA-ADPCM-WAV-FILE/WRITEABLE, inherits WAV-FILE/WRITEABLE. This is the class for writing IMA ADPCM WAVE files.
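Written out as CLOS definitions, the hierarchy would look roughly like the sketch below. Only the inheritance relationships come from the list above; it assumes the FLOAT-, EXTENSIBLE-, and IMA-ADPCM- classes inherit from WAV-FILE/READABLE and WAV-FILE/WRITEABLE, the slots are left empty because they aren't described here, and `sketch-direction` is a made-up generic function that exists only to show dispatch on the two trees.

```lisp
;; Sketch only: slots and class options are omitted, and SKETCH-DIRECTION
;; is a hypothetical function, not part of the real API.
(defclass audio-file () ())
(defclass audio-file/readable  (audio-file) ())
(defclass audio-file/writeable (audio-file) ())

(defclass %au-file (audio-file) ())
(defclass au-file/readable  (%au-file audio-file/readable) ())
(defclass au-file/writeable (%au-file audio-file/writeable) ())

(defclass %wav-file (audio-file) ())
(defclass wav-file/readable  (%wav-file audio-file/readable) ())
(defclass wav-file/writeable (%wav-file audio-file/writeable) ())

(defclass float-wav-file/readable       (wav-file/readable) ())
(defclass float-wav-file/writeable      (wav-file/writeable) ())
(defclass extensible-wav-file/readable  (wav-file/readable) ())
(defclass extensible-wav-file/writeable (wav-file/writeable) ())
(defclass ima-adpcm-wav-file/readable   (wav-file/readable) ())
(defclass ima-adpcm-wav-file/writeable  (wav-file/writeable) ())

;; Methods can specialize on the direction, the container, or the encoding.
(defgeneric sketch-direction (file)
  (:method ((file audio-file/readable))  :read)
  (:method ((file audio-file/writeable)) :write))
```

Here `(sketch-direction (make-instance 'ima-adpcm-wav-file/readable))` returns `:READ`, and the same instance is also a `%WAV-FILE`, so code shared by every WAVE class can specialize on that instead.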
This design was chosen to simplify internal code while still keeping it fast, and also because RIFF WAVE is an awful format with redundant fields and ways to represent encodings. Seriously, I’ve come to dislike WAVEs, mostly because of the “RIFF” part of that name[1], but also because of the whole “extensible WAVE format” thing. The audio is fine in them, it’s just PCM/Float/whatever, but whew are RIFF files an annoying, inelegant relic. It’s like when I implemented a reader for Ogg containers and suddenly lost my appreciation for the format (I’ll still take it over an .mp4 file any day, though[2]). Au files are just plain nice and I’ll prefer those any day.
*cough* Anyway…
Moving forward, all I need to do to add a new encoding is to add a new pair of
subclasses. So when I finally add µLaw and Alaw, they’ll just be new subclasses
of %AU-FILE and %WAV-FILE. Special handling ends up being abstracted away
by the DECODER structure and the generic functions that make up the public
API. Additionally, the internal code has been refactored so that it’s easier to
navigate and is just generally cleaner.
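To make "a new pair of subclasses" a little more concrete, here is a hedged sketch of what the reading half of a µLaw addition might involve: one new subclass, one small decoding function, and one method that hands that function over. The class names, `sketch-ulaw->int16`, and `sketch-sample-decoder` are all hypothetical stand-ins rather than anything from CL-RemiAudio; only the decoding arithmetic is the standard G.711 µ-law expansion.

```lisp
;; Hypothetical stand-ins for the real classes; slots omitted.
(defclass sketch-au-file/readable () ())
(defclass sketch-ulaw-au-file/readable (sketch-au-file/readable) ())

(defun sketch-ulaw->int16 (code)
  "Expand one G.711 mu-law CODE (0-255) to a signed 16-bit linear sample."
  (let* ((u (logxor code #xFF))          ; mu-law codes are stored inverted
         (exponent (ldb (byte 3 4) u))   ; bits 6..4
         (mantissa (ldb (byte 4 0) u))   ; bits 3..0
         ;; Add the bias (#x84), scale by the exponent, remove the bias.
         (magnitude (- (ash (+ (ash mantissa 3) #x84) exponent) #x84)))
    ;; Bit 7 of the inverted code is the sign.
    (if (logbitp 7 u) (- magnitude) magnitude)))

(defgeneric sketch-sample-decoder (file)
  (:documentation
   "Return the function that turns one stored code into an int16 sample."))

;; The only encoding-specific piece: the new subclass supplies its decoder.
(defmethod sketch-sample-decoder ((file sketch-ulaw-au-file/readable))
  #'sketch-ulaw->int16)
```

For example, `(funcall (sketch-sample-decoder (make-instance 'sketch-ulaw-au-file/readable)) #x80)` returns 32124, the largest positive µ-law value, while `#xFF` decodes to 0. Nothing outside the new subclass and its method has to know µLaw exists.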
Speed is almost as good as the last rewrite, and seems to be a bit better than
the last time I did FUNCALL’d stuff, though I haven’t done any formal
benchmarks. I can probably still improve it a little before the release of
CL-RemiAudio v0.1.3.
There are a few other changes to the public API, but these are all minor. Check the NEWS file if you’re curious.
CL-RemiAudio v0.1.3
So, my plan is to have CL-RemiAudio v0.1.3 out in the next few days. It’ll have a new IMA ADPCM decoder and encoder, the reworked Formats API, and a few other things. If I have the time and energy this weekend, I’ll also add in Westwood ADPCM (just the codec), µLaw, and Alaw support just to test that my internal API design is indeed easily extended.
Benben v0.7.1 will come out next week. It’ll mainly just fix two bugs (one in the NES emulator core for VGM files, and one with config generation), but will also include experimental support for playing back IMA ADPCM .wav files thanks to the new Formats API refresh. So you won’t be able to play .mix files directly yet (that’ll be v1.0, sometime late next year probably), but you’ll at least be able to play the .wav files you extract from Red Alert 2, or any other sources that use IMA ADPCM in a RIFF WAVE container.
Footnotes
[1] And yes, this is one reason why I’ll not implement AIFF support. The other reason being I hate Apple. The other other reason being Au, RIFF WAVE, and WAVE64 are all more than capable and fulfill all needs.
[2] I’m convinced the MPEG people do not know the first thing about writing good file formats. Codecs, sure, generally. Formats, no. Either that or the formats are just so corpo-saturated that they’re over-engineered dumpster fires designed to keep the public in-line with what the corpo overlords want from us rather than what we want and need. Yeah, I hate .mp4 files with a passion. Ogg or Matroska, please.
