Making own instruments

Veikko Sariola edited this page Oct 31, 2023 · 6 revisions

Sointu virtual machine

Let's start with a simple example of how one would make an instrument in a modular synthesizer, visualized as a graph:

graph LR;
    Envelope-->Multiply;
    Oscillator-->Multiply;
    Multiply-->Delay;
    Delay-->Out;

Sointu adopts terminology from 4klang and calls each of the nodes of this graph a "unit". For now, we will assume all unit inputs and outputs are mono signals and worry about stereo signals later. To synthesize such an instrument, Sointu uses a stack-based virtual machine, where the bytecode is executed once per sample. The instructions for the previous example:

                       // stack: <empty>
envelope               // stack: envelope
oscillator             // stack: oscillator, envelope
mulp                   // stack: envelope*oscillator
delay                  // stack: delay(envelope*oscillator)
out                    // stack: <empty>

The comments show the contents of the stack, with the top-most element on the left.

Notice "mulp"; there is another closely related instruction, "mul". The p at the end stands for pop: "mulp" pops the top of the stack and replaces the new top of the stack with the product of the two signals, while "mul" just replaces the top of the stack with the product of the two topmost signals, leaving the second signal in place.
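The difference between "mul" and "mulp" is easiest to see in a toy interpreter. The following is a minimal sketch in Python, not Sointu's actual implementation: the envelope, oscillator and delay are stubbed out with placeholder behaviour (a constant envelope value, a fixed phase, and a hypothetical one-sample delay), and the names run_sample and state are made up for illustration.

```python
import math

def run_sample(program, state):
    """Execute the whole program once and return one output sample."""
    stack = []   # the top of the stack is stack[-1]
    out = 0.0
    for op in program:
        if op == "envelope":
            stack.append(state["env"])            # placeholder: constant envelope
        elif op == "oscillator":
            stack.append(math.sin(2 * math.pi * state["phase"]))  # placeholder: fixed phase
        elif op == "mul":
            # replace the top with the product; the second element stays
            stack[-1] = stack[-1] * stack[-2]
        elif op == "mulp":
            # pop the top, then replace the new top with the product
            top = stack.pop()
            stack[-1] = top * stack[-1]
        elif op == "delay":
            # hypothetical one-sample delay: state survives between samples
            stack[-1], state["z1"] = state["z1"], stack[-1]
        elif op == "out":
            out += stack.pop()
    assert not stack, "the example program leaves the stack empty"
    return out

program = ["envelope", "oscillator", "mulp", "delay", "out"]
state = {"env": 0.5, "phase": 0.25, "z1": 0.0}
first = run_sample(program, state)    # the delay still holds its initial 0.0
second = run_sample(program, state)   # the delayed product appears one sample later
```

Replacing "mulp" with "mul" in this program would leave one extra signal on the stack after "out", which is why the stack depth shown after each unit is worth watching.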

In sointu, this example instrument looks like this:

image

Notice the small numbers after each unit: they indicate the number of signals on the stack.

Stereo signals

If you actually implement the previous instrument, you will discover that it only outputs sound to the left channel. To output a stereo signal, you would use the stereo version of "out". However, the stereo version of "out" expects two signals on the stack: left and right. Thus, we also need to convert the mono signal to a stereo signal, turning the one signal on top of the stack into two. To do that, we use the "pan" unit:

image

Almost every unit has a mono version and a stereo version. In general, mono versions modify/add/remove single signals and stereo versions modify/add/remove signals in pairs from the stack. For example, the stereo version of "envelope" just pushes the same envelope twice on the stack.

Unit parameters and bytecode

Each unit instruction also takes parameters (e.g. "envelope" has stereo, attack, decay, sustain, release and gain). The stereo parameter is only a single bit, while most other values are between 0 and 128. Unlike in the typical machine code that CPUs consume, the opcodes and the parameters (operands) of an instruction are not encoded in a single stream. Instead, the Sointu units (instructions) are encoded in two streams:

  1. Opcode stream, which has a single byte per unit, telling the type of the next unit
  2. Operand stream, which contains a variable number of bytes per unit, holding the parameters of the unit

In Sointu, the stereo bit is encoded in the least significant bit of each byte in the opcode stream, but all other parameters are in the operand stream.

The reason to split the instructions into two streams is that this compresses better: for many units, values 0, 64, 128 are good "off/neutral/on" values, so grouping them into one stream compresses best.

In the compiled version of the song, the meaning of the bytes in the opcode stream depends on the patch: only units that are used at least once are included in the jump table of the virtual machine. Thus, a simple way to optimize the size of your patch is to use as few different units as possible.
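To make the two-stream idea concrete, here is a rough Python sketch. It is an illustration under stated assumptions, not Sointu's real encoder: the function name encode, the exact bit layout (unit ID shifted left by one, stereo flag in the lowest bit), the ID numbering starting from 0, and the parameter values in the example patch are all hypothetical.

```python
def encode(units):
    """units: list of (name, stereo, [params]).

    Returns (opcode stream, operand stream, jump table order).
    """
    table = []                      # only units that are actually used get an ID
    opcodes, operands = bytearray(), bytearray()
    for name, stereo, params in units:
        if name not in table:
            table.append(name)
        unit_id = table.index(name)
        # assumed layout: unit ID in the upper bits, stereo flag in the lowest bit
        opcodes.append((unit_id << 1) | int(stereo))
        operands.extend(params)     # a variable number of bytes per unit
    return bytes(opcodes), bytes(operands), table

# Hypothetical patch; the parameter values are placeholders.
patch = [
    ("envelope", False, [64, 64, 64, 64, 128]),
    ("oscillator", False, [64, 64, 0, 128, 64, 128]),
    ("mulp", False, []),
    ("out", True, [128]),
]
opcodes, operands, table = encode(patch)
```

Because IDs are handed out only to units that appear in the patch, the same opcode byte can mean different units in different songs, and a patch with fewer distinct units needs a smaller jump table.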

Units

Stack manipulation and arithmetic

Unit | Stereo/mono | Stack before    | Stack after
-----|-------------|-----------------|--------------------------
add  | mono        | a, b, ...       | a + b, b, ...
add  | stereo      | a, b, c, d, ... | a + c, b + d, c, d, ...
addp | mono        | a, b, ...       | a + b, ...
addp | stereo      | a, b, c, d, ... | a + c, b + d, ...
mul  | mono        | a, b, ...       | a * b, b, ...
mul  | stereo      | a, b, c, d, ... | a * c, b * d, c, d, ...
mulp | mono        | a, b, ...       | a * b, ...
mulp | stereo      | a, b, c, d, ... | a * c, b * d, ...
pop  | mono        | a, ...          | ...
pop  | stereo      | a, b, ...       | ...
push | mono        | a, ...          | a, a, ...
push | stereo      | a, b, ...       | a, b, a, b, ...
xch  | mono        | a, b, ...       | b, a, ...
xch  | stereo      | a, b, c, d, ... | c, d, a, b, ...
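The stack effects listed above can be written down as one small Python function. This is only a sketch for checking one's understanding of the table; the function name apply and the list representation (index 0 is the top of the stack, matching the left-to-right notation of the table) are choices made for this example.

```python
def apply(unit, stereo, s):
    """Return the stack after applying a unit; s[0] is the top of the stack."""
    n = 2 if stereo else 1                 # stereo units work on pairs of signals
    x, y, rest = s[:n], s[n:2 * n], s[2 * n:]
    if unit == "add":
        return [p + q for p, q in zip(x, y)] + y + rest
    if unit == "addp":
        return [p + q for p, q in zip(x, y)] + rest
    if unit == "mul":
        return [p * q for p, q in zip(x, y)] + y + rest
    if unit == "mulp":
        return [p * q for p, q in zip(x, y)] + rest
    if unit == "pop":
        return s[n:]
    if unit == "push":
        return x + s                       # duplicate the top signal or pair
    if unit == "xch":
        return y + x + rest                # swap the two topmost signals or pairs
    raise ValueError(f"unknown unit: {unit}")

stereo_add = apply("add", True, [1, 2, 3, 4, 9])
mono_mulp = apply("mulp", False, [2, 3, 9])
stereo_xch = apply("xch", True, [1, 2, 3, 4])
```

The only difference between the mono and stereo rows is the group size n: every stereo unit does the mono operation on left and right independently.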

Signal sources

All the signal sources push new signals on the stack.

Envelope

The envelope unit pushes a linear Attack/Decay/Sustain/Release envelope on the stack. The stereo version of the unit pushes the same envelope value twice on the stack. The envelope is triggered by Note ON events and sustained until a Note OFF event occurs.

Oscillator

The oscillator unit is a simple oscillator, oscillating at the frequency defined by the current triggered note. Parameters:

  • Type. Sine, trisaw, pulse, gate or sample.
  • Transpose. Controls the transpose of the oscillator, in semitones. 64 = neutral.
  • Detune. Detuning of the oscillator. Note that in the stereo oscillator, the detune is reversed for the other channel, so if left channel is -0.1 semitones, the right channel is +0.1 semitones. 0 = -1 semitone, 64 = neutral, 128 = +1 semitone.
  • Phase. Starting phase of the oscillator. 0 = no phase offset, 128 = phase offset of a full period, which is equivalent to 0.
  • Color. The meaning of the color parameter depends on the oscillator type: 1) Sine: 128 = pure sine wave, < 128 = the sine wave is squeezed into part of the period, followed by 0s. 2) Trisaw: 64 = triangle wave, 0 = left sawtooth wave, 128 = right sawtooth wave. 3) Pulse: color controls the duty cycle of the pulse wave. 4) Gate: color is the bit pattern of 1s and 0s. 5) Sample: no effect.
  • Shape. The oscillator wave is distorted by a wave shaping function, where the amount of wave shaping depends on shape. The actual function is x*a/(1-a+(2*a-1)*abs(x)), where x is the signal and a is shape / 128.0. 64 = neutral (the equation reduces to just x).
  • Gain. 128 = 1.0, 0 = 0.0
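The wave shaping formula from the Shape parameter can be tried out directly. Below is a small Python sketch of that formula; the function name waveshape is made up for this example, and the sketch does not handle inputs where the denominator becomes zero (for instance shape 128 with x = 0), since the text does not say how Sointu treats them.

```python
def waveshape(x, shape):
    """Apply x*a/(1-a+(2*a-1)*abs(x)) with a = shape / 128.0."""
    a = shape / 128.0
    return x * a / (1 - a + (2 * a - 1) * abs(x))

neutral = waveshape(0.5, 64)    # a = 0.5, the formula reduces to x
pushed_up = waveshape(0.5, 96)  # a = 0.75, mid-level values are boosted
pulled_down = waveshape(0.5, 32)  # a = 0.25, mid-level values are attenuated
```

For shapes strictly between 0 and 128, the formula maps +1 to +1 and -1 to -1, so the shaping bends the curve between the endpoints without changing the peak level of a full-scale signal.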

When the oscillator type is sample, it produces one of the samples from the gm.dls file, which comes preinstalled with Windows. Sointu loads the entire file into memory. It contains 16-bit mono PCM sample values. Internally, Sointu reuses the color parameter as an index into a table, where each entry is 8 bytes: 4 bytes for the offset within the gm.dls file where the sample starts, 2 bytes for the loop start point and 2 bytes for the loop length of the sample.
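As an illustration of the 8-byte table entry, the following Python sketch reads one entry with the standard struct module. The byte order (little-endian), the use of unsigned fields, and the names ENTRY and read_entry are assumptions for this example; the table contents are made-up numbers, not real gm.dls offsets.

```python
import struct

# Assumed layout: little-endian, unsigned, packed without padding.
# 4-byte sample offset, 2-byte loop start, 2-byte loop length = 8 bytes.
ENTRY = struct.Struct("<IHH")

def read_entry(table, color):
    """Use the color parameter as an index into the sample table."""
    offset, loop_start, loop_length = ENTRY.unpack_from(table, color * ENTRY.size)
    return offset, loop_start, loop_length

# Hypothetical two-entry table with placeholder values.
table = ENTRY.pack(1000, 0, 0) + ENTRY.pack(52000, 120, 800)
entry = read_entry(table, 1)
```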

Noise

Produces noise using a central random number generator, so it won't produce exactly the same output every time a note is triggered.

  • Shape. The noise is distorted by a wave shaping function, where the amount of wave shaping depends on shape. The actual function is x*a/(1-a+(2*a-1)*abs(x)), where x is the signal and a is shape / 128.0. 64 = neutral (the equation reduces to just x).
  • Gain. 128 = 1.0, 0 = 0.0

Loadval

Pushes a given value in the range from -1.0 to 1.0 on the stack.

  • Value. 0 = -1.0, 64 = 0.0, 128 = 1.0
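Two parameter scalings recur throughout this page: a bipolar one (0 = -1.0, 64 = 0.0, 128 = 1.0), as in the Value parameter here, and a unipolar one (0 = 0.0, 128 = 1.0), as in the Gain parameters. A minimal Python sketch, assuming the mapping between the listed points is linear (the text only gives the endpoints and the midpoint), with made-up function names:

```python
def bipolar(v):
    """Map 0..128 to -1.0..1.0, with 64 as the neutral point."""
    return (v - 64) / 64.0

def unipolar(v):
    """Map 0..128 to 0.0..1.0."""
    return v / 128.0

loadval_examples = [bipolar(0), bipolar(64), bipolar(128)]
gain_examples = [unipolar(0), unipolar(128)]
```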

Receive

Can be targeted by a send to receive a signal from somewhere else in the patch. Has no parameters; pushes the received mono or stereo signal on the stack.

In

Reads and zeroes a global input/output port and pushes the value on the stack. There are 8 global ports, numbered 0 to 7: master L(eft), master R(ight), aux1 L, aux1 R, aux2 L, aux2 R, aux3 L, and aux3 R.

Signal sinks

All the signal sinks consume signals from the stack.

Send

Sends the signal to another location in the patch, to modulate other units using the signal.

  • Amount. A gain that is applied to the signal before adding it to the target. 0 = -1.0, 64 = 0.0, 128 = 1.0
  • Voice. Which voice of the targeted instrument should be modulated. 0 has a special meaning: for an instrument that modulates itself, 0 means that each voice of that instrument modulates itself. For cross-instruments modulations, 0 means "all" i.e. modulate all the voices of that instrument. Internally, "all" gets compiled into multiple sends, one per voice modulated.
  • Target. Which unit (of which instrument) is modulated.
  • Port. Which port of the targeted unit is modulated. Most ports correspond to the parameters of the unit, but some units have additional ports that affect their behavior.
  • Sendpop. If the signal should be removed from the stack after it is sent.

Out

Adds the current signal to the master output, then removes it from the stack.

  • Gain. 128 = 1.0, 0 = 0.0

Outaux

Adds the current signal to the master and aux1 outputs, then removes it from the stack.

  • Outgain. The gain for the master channel. 128 = 1.0, 0 = 0.0
  • Auxgain. The gain for the aux1 channel. 128 = 1.0, 0 = 0.0

Aux

Adds the current signal to the chosen global port (master, aux1, aux2 or aux3), then removes it from the stack.

  • Gain. 128 = 1.0, 0 = 0.0
  • Channel. Global port, numbered 0 to 7: master L(eft), master R(ight), aux1 L, aux1 R, aux2 L, aux2 R, aux3 L, and aux3 R.
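The interplay between "aux" (accumulate into a global port) and "in" (read and zero a global port) can be sketched in a few lines of Python. This covers only the mono case and is an illustration rather than Sointu's code; the names ports, aux and in_ are made up, and the stack is a plain list whose last element is the top.

```python
PORT_NAMES = ["master L", "master R", "aux1 L", "aux1 R",
              "aux2 L", "aux2 R", "aux3 L", "aux3 R"]
ports = [0.0] * len(PORT_NAMES)

def aux(stack, channel, gain):
    """Mono 'aux': add the top of the stack to a port, then pop it."""
    ports[channel] += stack.pop() * gain / 128.0

def in_(stack, channel):
    """Mono 'in': push the port value on the stack and reset the port to zero."""
    stack.append(ports[channel])
    ports[channel] = 0.0

stack = [0.25, 0.5]
aux(stack, 2, 128)   # 0.5 goes to aux1 L
aux(stack, 2, 128)   # 0.25 is added on top of it
in_(stack, 2)        # reads 0.75 and clears the port
```

Because ports accumulate, several instruments can write to the same aux port, and a later unit chain can pick up the sum with "in", for example to apply a shared effect.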

Effects

Gain

Mono version changes the stack as x -> x*g, stereo version l r -> l*g r*g

  • Gain. 128 = 1.0, 0 = 0.0

Invgain

Mono version changes the stack as x -> x/g, stereo version l r -> l/g r/g

  • Gain. 128 = 1.0, 0 = 0.0

Dbgain

Mono version changes the stack as x -> x*g, stereo version l r -> l*g r*g, where g is the linear gain corresponding to the decibel value.

  • Decibels. 128 = +40 dB (i.e. 100 x amplitude), 64 = neutral, 0 = -40 dB (i.e. 0.01 x amplitude)
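The decibel mapping can be checked with a short calculation. The Python sketch below assumes the parameter maps linearly to decibels between the three listed points and uses the standard amplitude conversion g = 10^(dB/20); the function name db_to_gain is made up for this example.

```python
def db_to_gain(decibels_param):
    """Convert the 0..128 Decibels parameter to a linear amplitude gain."""
    db = (decibels_param - 64) / 64.0 * 40.0   # 0 -> -40 dB, 64 -> 0 dB, 128 -> +40 dB
    return 10.0 ** (db / 20.0)

loudest = db_to_gain(128)   # +40 dB
neutral = db_to_gain(64)    # 0 dB
quietest = db_to_gain(0)    # -40 dB
```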

Crush

Mono version changes the stack as x -> e*int(x/e) and the stereo version as l r -> e*int(l/e) e*int(r/e), where e = 2**(-24*resolution), with resolution scaled so that the parameter value 128 corresponds to 1.0.

  • Resolution. 0 = 0 bits, 128 = 24 bits
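The quantization performed by "crush" can be sketched in Python as follows. The scaling of the parameter (value / 128) is inferred from "128 = 24 bits", and Python's int() truncates toward zero; whether Sointu truncates or rounds is not stated here, so treat this as an approximation. The function name crush and its argument names are chosen for this example.

```python
def crush(x, resolution_param):
    """Quantize x to steps of e = 2**(-24*resolution)."""
    resolution = resolution_param / 128.0   # assumed scaling: 128 -> 24 bits
    e = 2.0 ** (-24 * resolution)
    return e * int(x / e)                   # int() truncates toward zero

coarse = crush(0.7, 16)       # 3 bits: step size 0.125
coarse_neg = crush(-0.7, 16)  # negative values truncate toward zero as well
silent = crush(0.7, 0)        # 0 bits: step size 1.0, anything below 1.0 becomes 0
```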

Clip

Mono version changes the stack as x -> min(max(x,-1),1) and the stereo version as l r -> min(max(l,-1),1) min(max(r,-1),1).

Compressor

Mono version pushes g on the stack, where g is a suitable gain for the signal. You can either MULP to compress the signal or SEND it to a GAIN somewhere else for compressor side-chaining. The stereo version pushes g g on the stack, where g is calculated from the l^2 + r^2 signal, i.e. the sum of the energies of the left and right channels.