Scarlet Devil Mansion

14 March 2025

Small Update on Benben v0.7.0

The other day I ran into an interesting bug. I normally use Crystal version 1.12.2 to compile Benben on my own machines since I’ve found that Benben performs best with this version. But as an experiment, I decided to upgrade to Crystal v1.15.1 and see how Benben behaved with it. For the most part it works fine… except when you have remote control enabled and the CRYSTAL_WORKERS environment variable is set to a value less than 5 (it defaults to 4, iirc). Then it stutters BAD (like 1-second gaps), to the point of being totally unusable. When Benben is compiled with Crystal versions earlier than 1.15.x, the problem goes away.

How Benben Works Internally, and Why

So what’s happening? Well, after doing some detective work, I traced the issue back to this bit of code in the main.cr file:

fun main(argc : Int32, argv : UInt8**) : Int32
  rendering : Bool = false
  jobsNext : Bool = false
  jobs : Int64 = -1
  minWorkers : Int32 = 9

  # Attempt to detect --render/-n and --jobs.
  argc.times do |idx|
    str = String.new(argv[idx])
    if str == "--render" || str == "-n"
      rendering = true
    elsif str == "--jobs"
      jobsNext = true
    elsif jobsNext
      jobs = LibC.atoll(str) # Will return 0 if it can't be converted
      jobsNext = false
    end
  end

  if rendering
    # We don't really care if jobs is 0 or negative or whatever.  We just want
    # to spawn a positive number of worker threads.  The actual command line
    # checking will handle invalid values later on.
    if jobs > 0
      # +2 so we don't starve ourselves of worker threads.  We always want at
      # least minWorkers, though.
      LibC.setenv("CRYSTAL_WORKERS", Math.max((jobs + 2), minWorkers).to_s, 1)
    else
      # This will need to be raised in the future if Benben gets more heavily
      # parallel.  We should ideally always have a few spare workers for the
      # extra fibers we spawn.
      LibC.setenv("CRYSTAL_WORKERS", Math.max(System.cpu_count + 2, minWorkers).to_s, 1)
    end
  else
    # This will need to be raised in the future if Benben gets more heavily
    # parallel.  We should ideally always have one spare worker.
    LibC.setenv("CRYSTAL_WORKERS", minWorkers.to_s, 1)
  end

  Crystal.main(argc, argv)
end

Normally, defining your own fun main(argc : Int32, argv : UInt8**) : Int32 isn’t needed in a Crystal program - it’s implicitly defined by the runtime, and so you just write top-level code as needed. But Benben has some special needs. Let me explain…

At the current time, true multithreading isn’t enabled by default in Crystal. Instead it uses fibers (basically green threads), which then get cooperatively scheduled by the runtime for execution on a single real OS thread. This allows for concurrency, but is not a true multithreaded design. But, if you pass in the -Dpreview_mt option during compilation, all of this changes. Instead the runtime will schedule fibers on multiple threads, the number of which is controlled by the CRYSTAL_WORKERS environment variable. This allows for true multithreading, where any number of fibers can be scheduled on the worker threads (N:M threading). But, there’s no (documented) first-class access to OS threads.
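To make the fiber model a bit more concrete, here’s a minimal sketch (not taken from Benben) that spawns a few fibers and collects their results over a channel. The same source runs either way: built normally, all of the fibers share one OS thread, and built with -Dpreview_mt the runtime is free to spread them across the CRYSTAL_WORKERS threads.

```crystal
# Spawn a few fibers and collect their results over a channel.
# Nothing here asks for a thread; the runtime decides where fibers run.
results = Channel(Int32).new

4.times do |i|
  spawn do
    # Each fiber does a little work, then reports back.
    results.send(i * i)
  end
end

# Receiving blocks the main fiber, which lets the scheduler run the others.
squares = Array.new(4) { results.receive }
puts squares.sort.inspect # prints [0, 1, 4, 9]
```

The order the results arrive in isn’t guaranteed, which is why the sketch sorts them before printing.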

Benben is designed around multithreading, and requires the -Dpreview_mt flag. The UI, decoding/playback, the remote control listener, each remote control client, and input are all on their own fibers, and each of these expects its own worker thread so that the program behaves smoothly and properly. When rendering, the number of fibers spawned is at least three (progress bar, job receiver, and rendering one file when --jobs 1 is used or you’re on a single core machine), but it can spawn a lot more to render multiple files in parallel.

Because of this, Benben’s custom fun main() function specifically overrides the value of CRYSTAL_WORKERS to ensure enough true worker threads are present so that each fiber gets its own thread (plus an extra just to be safe). Then it’s just up to the runtime’s scheduler to schedule the fibers on the threads. Essentially it tries its best to emulate a 1:1 threading model without having direct, first-class access to OS threads.
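If it helps, the arithmetic can be pulled out into a small pure function and tried in isolation. This is only a sketch: worker_count is a hypothetical helper that doesn’t exist in Benben, the floor of 9 is the minWorkers value from the listing, and I’m assuming the two spare workers that the listing’s comments describe.

```crystal
# Hypothetical helper, not part of Benben: the worker-count arithmetic
# from the custom main, without the argument parsing or setenv calls.
def worker_count(rendering : Bool, jobs : Int64, cpus : Int64, min_workers : Int64) : Int64
  # Normal playback: the fixed floor is used as-is.
  return min_workers unless rendering

  # Rendering: start from --jobs if it was a positive number, otherwise
  # from the CPU count, add two spare workers, and never go below the floor.
  base = jobs > 0 ? jobs : cpus
  Math.max(base + 2, min_workers)
end

puts worker_count(false, -1_i64, 8_i64, 9_i64) # playback: 9
puts worker_count(true, 16_i64, 8_i64, 9_i64)  # --render --jobs 16: 18
puts worker_count(true, -1_i64, 4_i64, 9_i64)  # --render, 4 cores, no --jobs: 9
```

The floor matters mostly on small machines: with four cores and no --jobs, the CPU count plus two is only 6, so the floor wins.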

I do it this way because, during my early experiments with Benben, I found that not using -Dpreview_mt, or having too few worker threads, either locked the program up or caused severe stuttering. The lockups were a bug on my end, and were fixed, but the stuttering is just a consequence of the N:M threading model. The only time it wasn’t an issue was when Benben rendered files to WAV with --render.

TL;DR: Benben is designed to be multithreaded, expects one thread per fiber to run properly, and uses a hack to forcefully override the CRYSTAL_WORKERS environment variable so that the runtime spawns enough real threads. This breaks when compiling it with Crystal 1.15.x.

Benben's Issue With Crystal v1.15.x

So what happens when you build Benben with v1.15.x? Basically, fun main() runs, but the adjusted CRYSTAL_WORKERS doesn’t seem to make a difference anymore. I’m not sure if the runtime is not using this environment variable properly, or if the runtime’s scheduler has changed enough that it just behaves differently and is now less efficient with respect to how Benben is designed. But Benben now behaves as if it doesn’t have enough worker threads with v1.15.x by default.
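One way to narrow this down a little is to have the program report what it actually sees once regular Crystal code is running. The sketch below is a generic diagnostic, not something Benben ships. It only shows the value of the environment variable and whether the build used -Dpreview_mt; it can’t tell you how many threads the scheduler really started.

```crystal
# Diagnostic sketch: report what the running program sees.
# ENV[...]? returns nil instead of raising when the variable is unset.
workers = ENV["CRYSTAL_WORKERS"]?
if workers
  puts "CRYSTAL_WORKERS=#{workers}"
else
  puts "CRYSTAL_WORKERS is not set"
end

# flag? is evaluated at compile time, so this reflects how the binary was built.
{% if flag?(:preview_mt) %}
  puts "built with -Dpreview_mt"
{% else %}
  puts "built without -Dpreview_mt"
{% end %}
```

If the value printed matches what the custom main set, then the override itself is landing, and the difference is more likely somewhere in how the runtime uses it.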

Workaround?

For now, if you use the official AppImage, you aren’t affected. It’s built with an earlier version of Crystal anyway (v1.12.2, I believe).

But, if you build your own copy of Benben, you might run into the stuttering issue, and as I mention in this ticket, there isn’t really a good workaround. The choices are basically 1) use an earlier version of Crystal to compile Benben, 2) set CRYSTAL_WORKERS to at least 5 before running Benben, or 3) don’t use the remote control feature. None of these sound good from a casual end-user perspective.

Basically, Crystal v1.15.x cripples Benben in a way that’s out of my hands :(

Adjusting Plans

There is hope, however… If you remember from my last post, my plan in the coming months was to slowly port Benben to Common Lisp. Benben v0.7.0 was going to land first and still be in Crystal, but then Benben v0.8.0 would come and be ported to Common Lisp. But now that I’ve discovered this issue, I’ve rethought this plan and decided to modify it.

The new plan is to 1) drop all development on the Crystal version of Benben[1], effective immediately, and 2) instead focus on porting Benben and its related libraries to Common Lisp.

So Benben v0.7.0 will be out later this year, and will be written in Common Lisp. This release will not only have new features, but will also implicitly fix the threading problems described above.

And the good news? The port is already progressing faster than I expected :D

Porting Status

If you want to keep track of the porting status of Benben and all its related libraries, you can do so here: https://chiselapp.com/user/MistressRemilia/repository/benben/wiki?name=Lisp+Port+Status

Currently I’m focusing mainly on porting the VGM library (the Lisp version is named SatouSynth), and getting the base audio processing up to date. I also happen to have an old, VGM-only, incomplete version of Benben already ported to Common Lisp, so I’m able to use that as a starting point when I get around to porting Benben itself.

A logo for Common Lisp that has an alien holding a Lisp flag, and the caption says 'Made with Alien Technology'

Footnotes