Music driver: when you can't do a thing, find another way!



Hey! Today's topic is music! ... *hmm* OK, it is about the code that plays the music.

Super Tilt Bro. version 2 has better music than the old version, by a long shot. Getting there was not easy. The old music driver is a minimal one, written while figuring things out. It is very light but seriously lacks features, to the point that importing any track requires multiple days of work and forces us to butcher the original song. When Tuï came with a title screen theme, there was no way to butcher it: the audio engine needed an upgrade.

It would have been too easy to use a well-known audio engine. GGSound and Famitone are solid, do the job perfectly, and are battle-tested. Super Tilt Bro. needs technologies that are experimental, unstable, and ... *hmm* AMBITIOUS!

OK, and having a home-made audio driver opens some possibilities, we will come back to it. Before that, let's discuss some common knowledge about NES audio, detail how Super Tilt Bro.'s driver operates, and offer a biased comparison with other drivers.

Here is a glimpse at Super Tilt Bro.'s audio features, so you know where we are headed:

  • Support for Famitracker effects: Axx, Bxx, Dxx, Fxx, Gxx, Qxx, Rxx, Sxx, 1xx, 2xx, 3xx, 4xx (More to come, aiming to support them all)
  • Support for instruments
  • Support for tempo
  • Faster than Famitone

An audio driver, how does it work?

At the lowest level there is the audio chip in the NES, which we call the APU, for Audio Processing Unit. The APU has a lot of functionality, but at the end of the day it exposes some registers, and the sound played depends on the values we write to them. The goal is then to regularly change the values in these registers to play music.

The audio driver is essentially a chunk of code that is called regularly, reads music data, interprets it, and writes values to the APU registers. We'll strive to do it as fast as we can (to leave CPU cycles for the game) while minimizing the size of music data (to leave ROM space for the game.) Oh ... And ... You know ... Actually being able to play what the musician composed.

On the NES, most artists work with a tracker named Famitracker. It has tons of features, and its own track format. The first step will be to convert tracks from the tracker's format to our driver's.


From tracker to driver

One distinctive point of Super Tilt Bro.'s driver is that it is extremely light while relying on a powerful import script. Other drivers generally have more features, but ask the composer not to use whatever they do not support. Super Tilt Bro. handles only the minimum, and what is not supported by the driver is converted to a simplified equivalent at import time.

From now on, I will take for granted that you are accustomed to trackers and have some chiptune knowledge. To everyone else, sorry: I can only suggest trying Famitracker or Famistudio yourself, it is a lot of fun.

I will also talk about "ticks." A tick is one call to the driver. Since the driver is called at regular intervals, a tick is the shortest rhythmic unit. In the examples, I'll use 6 ticks for a quarter note.

So, the driver's goal is to write values in the APU's registers. To do so, it reads some data and acts accordingly. Since this data describes the behavior expected from the driver, the simplest approach is to use a bytecode. A bytecode is a stream of instructions to execute, like a compiled program. Here are the instructions to play a scale:

play the note C
wait 6 ticks
play the note D
wait 6 ticks
play the note E
wait 6 ticks
play the note F
wait 6 ticks
play the note G
wait 6 ticks
play the note A
wait 6 ticks
play the note B
wait 6 ticks
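
To make this concrete, here is a minimal sketch in Python of how such a bytecode could be interpreted, one tick at a time. The opcode names (PLAY, WAIT) and the Channel class are invented for this illustration. The real driver runs on the NES and its actual opcodes are not shown in this article.

```python
# Hypothetical opcodes, invented for this sketch.
PLAY, WAIT = "play", "wait"

# The scale from the listing above: play a note, wait 6 ticks, repeat.
scale = []
for note in ["C", "D", "E", "F", "G", "A", "B"]:
    scale.append((PLAY, note))
    scale.append((WAIT, 6))


class Channel:
    def __init__(self, bytecode):
        self.bytecode = bytecode
        self.pc = 0       # index of the next instruction to execute
        self.wait = 0     # ticks left before executing more instructions
        self.note = None  # stands in for the values written to the APU

    def tick(self):
        """One call of the driver: run instructions until a wait is hit."""
        if self.wait > 0:
            self.wait -= 1
        while self.wait == 0 and self.pc < len(self.bytecode):
            opcode, arg = self.bytecode[self.pc]
            self.pc += 1
            if opcode == PLAY:
                self.note = arg  # a real driver would write APU registers here
            elif opcode == WAIT:
                self.wait = arg
        return self.note
```

Calling tick() 42 times on Channel(scale) goes through the whole scale, each note lasting 6 ticks.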

Of course, the NES has multiple channels (pulse1, pulse2, triangle, and noise.) We need a series of instructions per channel. To be able to compress things, Super Tilt Bro. introduces the concept of sample. A sample is a series of instructions, and each channel plays a series of samples. Now we can reuse some samples, for example by having the chorus in one sample and the verses in others.

CHANNEL pulse1:
    sample_verse1
    sample_chorus
    sample_verse2
    sample_chorus
CHANNEL pulse2:
    sample_silence
CHANNEL triangle:
    sample_silence
CHANNEL noise:
    sample_silence
sample_verse1:
    play the note C
    wait 6 ticks
    play the note D
    wait 6 ticks
    ...
    sample's end
sample_chorus:
    play the note C
    wait 5 ticks
    silence
    wait 1 tick
    play the note C
    wait 5 ticks
    silence
    wait 1 tick
    play the note G
    wait 5 ticks
    ...
    sample's end
sample_verse2:
    play the note B
    wait 6 ticks
    play the note A
    wait 6 ticks
    ...
    sample's end
sample_silence:
    play silence
    wait 255 ticks
    sample's end

We could improve the compression with smaller samples. Above we could have made a sample for the series "C during 5 ticks, silence during 1 tick," but you get the idea.
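
Here is a small sketch, in Python, of why reusing samples saves space. The data layout (a dict of instruction lists, a channel as a list of sample names) is an assumption made for the illustration, not the driver's actual ROM format, and the samples are truncated versions of the ones listed above.

```python
# Each sample is a list of instructions; names follow the channel listing above.
samples = {
    "sample_verse1": [("play", "C"), ("wait", 6), ("play", "D"), ("wait", 6)],
    "sample_chorus": [("play", "C"), ("wait", 5), ("silence", None), ("wait", 1)],
    "sample_verse2": [("play", "B"), ("wait", 6), ("play", "A"), ("wait", 6)],
}

# A channel is only a list of sample names.
pulse1 = ["sample_verse1", "sample_chorus", "sample_verse2", "sample_chorus"]


def flatten(channel, samples):
    """Expand a channel into the flat instruction stream that gets played."""
    stream = []
    for name in channel:
        stream.extend(samples[name])
    return stream


def stored_instructions(channel, samples):
    """Count the instructions actually stored: each used sample is stored once."""
    return sum(len(samples[name]) for name in set(channel))
```

With this data, the channel plays 16 instructions but only 12 are stored, since the chorus is stored once and played twice. The list of sample references also costs a few bytes, which this sketch ignores.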

While perfectly adapted to the driver's code, this notation rapidly becomes hard to read for a human being. We can use an intermediate format, closer to a tracker's notation but constrained to the capabilities of Super Tilt Bro.'s driver:

CHANNEL 00
00, 01
SAMPLE 00 (2a03-pulse)
#    note   frequency_adjust volume duty pitch_slide
00 : C-3    ...              ...    ...  ...
01 : ...    ...              ...    ...  ...
02 : ...    ...              ...    ...  ...
03 : ...    ...              ...    ...  ...
04 : ...    ...              ...    ...  ...
05 : ...    ...              ...    ...  ...
SAMPLE 01 (2a03-pulse)
#    note   frequency_adjust volume duty pitch_slide
00 : E-3    ...              ...    ...  ...
01 : ...    ...              ...    ...  ...
02 : ...    ...              ...    ...  ...
03 : ...    ...              ...    ...  ...
04 : ...    ...              ...    ...  ...
05 : ...    ...              ...    ...  ...

OK, that is easier to read if you are familiar with trackers. (If you are not, I warned you!) "CHANNEL 00" is pulse1, the others were left out of this example for brevity.

This format exposes the driver's features. There is no loosely defined "effect" column, each column has a specific role. There are also no instruments. "note", "volume", and "duty" should be self-explanatory. "pitch_slide" contains a signed number, it is equivalent to the 1xx and 2xx effects in Famitracker. "frequency_adjust" changes the currently played frequency by a number of pitch units, and it works as expected even if there is an active pitch_slide. It is unused when importing from Famitracker, but it makes it possible to combine an arpeggio with a slide, as Famistudio allows.

This format is readable and trivial to convert to the driver's bytecode. But how do we convert a full-featured Famitracker track to such a limited format?

Here is an extract from a real-world track by Tuï:

ROW 00 : C#3 12 . ... ... :
ROW 01 : ... .. . ... ... :
ROW 02 : ... .. . ... ... :
ROW 03 : ... .. . ... ... :
ROW 04 : ... .. . 4AA 103 :
ROW 05 : ... .. . ... ... :
ROW 06 : ... .. . ... ... :
ROW 07 : ... .. . ... ... :
ROW 08 : ... .. . ... ... :
ROW 09 : ... .. . ... ... :
ROW 0A : ... .. . ... ... :
ROW 0B : ... .. . ... ... :
ROW 0C : ... .. . ... ... :
ROW 0D : ... .. . ... ... :
ROW 0E : ... .. . ... ... :
ROW 0F : ... .. . ... ... :
ROW 10 : ... .. . ... ... :
ROW 11 : ... .. . ... ... :
ROW 12 : ... .. . ... ... :
ROW 13 : ... .. . ... ... :
ROW 14 : ... .. . ... ... :
ROW 15 : ... .. . ... ... :
ROW 16 : ... .. . ... ... :
ROW 17 : ... .. . ... ... :
ROW 18 : ... .. . ... ... :
ROW 19 : ... .. . ... ... :
ROW 1A : ... .. . ... ... :
ROW 1B : ... .. . ... ... :
ROW 1C : ... .. . ... ... :
ROW 1D : ... .. . ... ... :
ROW 1E : ... .. . ... ... :
ROW 1F : ... .. . ... ... :
ROW 20 : --- 00 . 400 100 :

Instrument 12 is simple, it just sets the duty to 2. The 4xx effect is a vibrato, and the 1xx effect is a pitch slide from low to high pitch. Everything together makes for a nice ascending vibrato.

We will simply transform the file step by step until it conforms to our limitations. The first step is to flatten the instrument into the pattern, and to convert the vibrato to a series of 1xx and 2xx effects. Here is the result:

ROW 00 : C#3 .. F ... ... ... V02 :
ROW 01 : ... .. F ... ... ... V02 :
ROW 02 : ... .. F ... ... ... V02 :
ROW 03 : ... .. F ... ... ... V02 :
ROW 04 : ... .. F 214 103 ... V02 :
ROW 05 : ... .. F ... ... ... V02 :
ROW 06 : ... .. F 114 ... ... V02 :
ROW 07 : ... .. F ... ... ... V02 :
ROW 08 : ... .. F ... ... ... V02 :
ROW 09 : ... .. F 214 ... ... V02 :
ROW 0A : ... .. F ... ... ... V02 :
ROW 0B : ... .. F ... ... ... V02 :
ROW 0C : ... .. F 114 ... ... V02 :
ROW 0D : ... .. F ... ... ... V02 :
ROW 0E : ... .. F ... ... ... V02 :
ROW 0F : ... .. F 214 ... ... V02 :
ROW 10 : ... .. F ... ... ... V02 :
ROW 11 : ... .. F ... ... ... V02 :
ROW 12 : ... .. F 114 ... ... V02 :
ROW 13 : ... .. F ... ... ... V02 :
ROW 14 : ... .. F ... ... ... V02 :
ROW 15 : ... .. F 214 ... ... V02 :
ROW 16 : ... .. F ... ... ... V02 :
ROW 17 : ... .. F ... ... ... V02 :
ROW 18 : ... .. F 114 ... ... V02 :
ROW 19 : ... .. F ... ... ... V02 :
ROW 1A : ... .. F ... ... ... V02 :
ROW 1B : ... .. F 214 ... ... V02 :
ROW 1C : ... .. F ... ... ... V02 :
ROW 1D : ... .. F ... ... ... V02 :
ROW 1E : ... .. F 114 ... ... V02 :
ROW 1F : ... .. F ... ... ... V02 :
ROW 20 : --- .. F 100 100 ... V00 :

Note that everything happens in the import script's memory, so we can add effect columns at will. It comes in handy to avoid conflicts.
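
The instrument part of that step could be sketched like this in Python. This is an illustration under assumptions, not the import script's actual code: rows are modelled as dicts, only the duty of an instrument is modelled, and the vibrato conversion is left out. Instrument 12 setting the duty to 2 comes from the example, while instrument 00 setting the duty to 0 is a guess based on the last row of the listing.

```python
# Hypothetical instrument table: only the duty is modelled here.
# Instrument 0x12 -> duty 2 is from the example; 0x00 -> duty 0 is assumed.
INSTRUMENTS = {0x12: {"duty": 2}, 0x00: {"duty": 0}}


def flatten_instruments(rows, instruments, default_volume=0xF):
    """Replace instrument references with explicit volume and duty on every row."""
    current_duty = None
    flattened = []
    for row in rows:
        new_row = dict(row)
        instrument = new_row.pop("instrument", None)
        if instrument is not None:
            current_duty = instruments[instrument]["duty"]
        # A row without a volume gets the default one, as in the listing above.
        if new_row.get("volume") is None:
            new_row["volume"] = default_volume
        # The duty becomes explicit on each row, like the Vxx column above.
        new_row["duty"] = current_duty
        flattened.append(new_row)
    return flattened
```

After this pass no row refers to an instrument anymore, so the following steps only have to deal with plain columns.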

As Super Tilt Bro.'s driver handles only one "pitch_slide", we now have to merge the slides present in different columns. That is done by adding the slide values, and here is the result:

ROW 00 : C#3 .. F ... ... ... V02 ... :
ROW 01 : ... .. F ... ... ... V02 ... :
ROW 02 : ... .. F ... ... ... V02 ... :
ROW 03 : ... .. F ... ... ... V02 ... :
ROW 04 : ... .. F ... ... ... V02 211 :
ROW 05 : ... .. F ... ... ... V02 ... :
ROW 06 : ... .. F ... ... ... V02 117 :
ROW 07 : ... .. F ... ... ... V02 ... :
ROW 08 : ... .. F ... ... ... V02 ... :
ROW 09 : ... .. F ... ... ... V02 211 :
ROW 0A : ... .. F ... ... ... V02 ... :
ROW 0B : ... .. F ... ... ... V02 ... :
ROW 0C : ... .. F ... ... ... V02 117 :
ROW 0D : ... .. F ... ... ... V02 ... :
ROW 0E : ... .. F ... ... ... V02 ... :
ROW 0F : ... .. F ... ... ... V02 211 :
ROW 10 : ... .. F ... ... ... V02 ... :
ROW 11 : ... .. F ... ... ... V02 ... :
ROW 12 : ... .. F ... ... ... V02 117 :
ROW 13 : ... .. F ... ... ... V02 ... :
ROW 14 : ... .. F ... ... ... V02 ... :
ROW 15 : ... .. F ... ... ... V02 211 :
ROW 16 : ... .. F ... ... ... V02 ... :
ROW 17 : ... .. F ... ... ... V02 ... :
ROW 18 : ... .. F ... ... ... V02 117 :
ROW 19 : ... .. F ... ... ... V02 ... :
ROW 1A : ... .. F ... ... ... V02 ... :
ROW 1B : ... .. F ... ... ... V02 211 :
ROW 1C : ... .. F ... ... ... V02 ... :
ROW 1D : ... .. F ... ... ... V02 ... :
ROW 1E : ... .. F ... ... ... V02 117 :
ROW 1F : ... .. F ... ... ... V02 ... :
ROW 20 : --- .. F ... ... ... V00 100 :
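
The arithmetic behind the merge can be sketched in a few lines of Python. The function names are invented for this illustration. The sign convention is the one of the internal format used in this article, where 211 becomes 17 and 117 becomes -23, so 2xx counts as positive and 1xx as negative.

```python
def effect_to_slide(effect):
    """Convert a '1xx'/'2xx' effect string to a signed slide value."""
    value = int(effect[1:], 16)
    if effect[0] == "2":
        return value
    if effect[0] == "1":
        return -value
    raise ValueError(f"not a slide effect: {effect}")


def slide_to_effect(slide):
    """Convert a signed slide value back to a single '1xx' or '2xx' effect."""
    if slide > 0:
        return f"2{slide:02X}"
    return f"1{-slide:02X}"


def merge_slides(*effects):
    """Merge simultaneous slide effects by adding their signed values."""
    return slide_to_effect(sum(effect_to_slide(e) for e in effects))
```

For example, merge_slides("214", "103") gives "211" and merge_slides("114", "103") gives "117", as in the listing above. The sketch does not handle sums that no longer fit in two hexadecimal digits.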

In a few small steps, we obtained a Famitracker module that sounds the same as the original while respecting our harsh constraints. Here is the result in the internal format:

SAMPLE 00 (2a03-pulse)
#    note  frequency_adjust    volume  duty    pitch_slide
00 : C#3   ...                 15      2       ...
01 : ...   ...                 ...     ...     ...
02 : ...   ...                 ...     ...     ...
03 : ...   ...                 ...     ...     ...
04 : ...   ...                 ...     ...     17
05 : ...   ...                 ...     ...     ...
06 : ...   ...                 ...     ...     -23
07 : ...   ...                 ...     ...     ...
08 : ...   ...                 ...     ...     ...
09 : ...   ...                 ...     ...     17
0A : ...   ...                 ...     ...     ...
0B : ...   ...                 ...     ...     ...
0C : ...   ...                 ...     ...     -23
0D : ...   ...                 ...     ...     ...
0E : ...   ...                 ...     ...     ...
0F : ...   ...                 ...     ...     17
10 : ...   ...                 ...     ...     ...
11 : ...   ...                 ...     ...     ...
12 : ...   ...                 ...     ...     -23
13 : ...   ...                 ...     ...     ...
14 : ...   ...                 ...     ...     ...
15 : ...   ...                 ...     ...     17
16 : ...   ...                 ...     ...     ...
17 : ...   ...                 ...     ...     ...
18 : ...   ...                 ...     ...     -23
19 : ...   ...                 ...     ...     ...
1A : ...   ...                 ...     ...     ...
1B : ...   ...                 ...     ...     17
1C : ...   ...                 ...     ...     ...
1D : ...   ...                 ...     ...     ...
1E : ...   ...                 ...     ...     -23
1F : ...   ...                 ...     ...     ...
20 : ---   ...                 ...     0       0

And the bytecode will look like this:

sample0:
    play the note C#3
    set volume to 15
    set duty to 2
    wait 4 ticks
    set pitch slide to 17 units per tick
    wait 2 ticks
    set pitch slide to -23 units per tick
    wait 3 ticks
    ...
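
Going from the internal format to this kind of bytecode is mostly a matter of counting empty rows. Here is a sketch in Python, assuming each row is a dict holding only the columns that change on that row, and that one row lasts one tick as in the listing above. The instruction names are invented for the illustration.

```python
def rows_to_bytecode(rows):
    """Turn sparse rows into instructions separated by waits."""
    bytecode = []
    pending_wait = 0  # rows elapsed since the last emitted instruction
    for row in rows:
        if row and pending_wait > 0:
            bytecode.append(("wait", pending_wait))
            pending_wait = 0
        for column in ("note", "volume", "duty", "pitch_slide"):
            if column in row:
                bytecode.append((column, row[column]))
        pending_wait += 1
    if pending_wait > 0:
        bytecode.append(("wait", pending_wait))
    return bytecode


# The first rows of the sample above: only rows 00, 04, 06 and 09 hold data.
rows = [{} for _ in range(10)]
rows[0] = {"note": "C#3", "volume": 15, "duty": 2}
rows[4] = {"pitch_slide": 17}
rows[6] = {"pitch_slide": -23}
rows[9] = {"pitch_slide": 17}
```

Running rows_to_bytecode(rows) produces the note, volume and duty, then the waits of 4, 2 and 3 separating the pitch slide changes.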

Actually, before generating the bytecode, the import script applies some dictionary compression: it looks for the optimal set of samples, so that the music is stored in as few bytes as possible.
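
To give an idea of what dictionary compression means here, this is a deliberately naive sketch in Python. It cuts a stream into fixed-size chunks and stores each distinct chunk only once. It is not the import script's algorithm, which searches for a better set of samples. The function names and the fixed chunk size are assumptions made for the illustration.

```python
def compress(stream, chunk_size):
    """Split a stream into chunks and store each distinct chunk only once."""
    samples = []    # distinct chunks, in order of first appearance
    index_of = {}   # chunk -> position in `samples`
    sequence = []   # the channel: a list of sample indexes
    for start in range(0, len(stream), chunk_size):
        chunk = tuple(stream[start:start + chunk_size])
        if chunk not in index_of:
            index_of[chunk] = len(samples)
            samples.append(chunk)
        sequence.append(index_of[chunk])
    return samples, sequence


def decompress(samples, sequence):
    """Rebuild the original stream from the samples and the sequence."""
    stream = []
    for index in sequence:
        stream.extend(samples[index])
    return stream
```

A stream repeating the same four instructions three times ends up with a single stored sample referenced three times. A real implementation would also try different sample boundaries and sizes, which is where the hard part is.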

How does this driver perform relative to others?

There are a lot of drivers out there, working more or less well. We will focus on the three cool kids: Famitracker, Famitone, and GGSound.

Famitracker is the driver that comes with the eponymous tracker. It has every feature needed to play the tracker's music. Despite that, this driver is often avoided by game developers because it can consume a lot of the CPU budget.

Famitone, by Shiru, is the lightest of the three. Missing notable features, it is typically used in small games, when ROM space is more precious than the composer's comfort.

GGSound, by Gradual Games, is a solid in-between, easier to write for while keeping performance under control.

Here is an overview of each one's implementation. (Except Famitracker, sorry I didn't do my homework on it.)

  • Famitone and GGSound implement instruments but no effects. Super Tilt Bro. does not support instruments, but handles the 1xx and 2xx effects.
  • Music formats
    • Famitone features a minimal bytecode: play a note, wait, select an instrument, and loop the track.
    • GGSound adds some control flow: call a subsection (like our samples), GOTO, ...
    • Super Tilt Bro. has no instrument-related data, instead its opcodes directly reflect the data to write in the APU registers.


[Figure: "After the rain", Famitone's demo track, and how it is handled by different drivers.]

Two points are obvious. First, Super Tilt Bro.'s driver is lighter than the others, which makes sense since it does not handle instruments. Second, its music takes more space in ROM. Since the other drivers have instruments, most of their data is a simple series of notes. Super Tilt Bro. makes up for this weakness with compression, but it is not yet enough to compete.

What to think about that?

That you read this far. Impressive! The topic is far from easy :)

Also, Super Tilt Bro.'s driver is a very lightweight engine that is able to play complex tracks that are not supported by others. This is possible thanks to a big transformation that happens before the game even runs. On the other hand, tracks are larger, which puts pressure on the ROM. In Super Tilt Bro.'s case it is not a big deal: CPU cycles are very precious when running the netcode, while there is a big fat ROM in the cart. That said, it is not a magical solution that fits everybody's needs.

In the future, we could gain some ROM space by optimizing the bytecode, and maybe publish the driver as a standalone project if somebody is interested. Also, there is a major reason I insisted on implementing a specific audio engine: everything is possible. It was put aside because of priorities, but one day it would be nice to feature dynamic music, meaning a track that becomes more intense when players are excited and calms down when they become more passive. That is one reason behind the concept of samples: if each sample has multiple variants, the sound engine can choose the most appropriate one at any time.

Having a powerful import script is really a good thing. Import scripts are too often minimal, simply rejecting features unsupported by the driver or, worse, generating invalid track data. GGSound could use a compression like the one in Super Tilt Bro., since it has an equivalent to samples. Also, with only the addition of a pitch slide effect, and by running Super Tilt Bro.'s simplifier, GGSound and Famitone could handle most effects, simplifying composers' lives.

One last thing: I noticed that I repeated the game's name all over this essay. This is because the audio driver does not have a name. Is anybody inspired to name it?
