Noisetracker/Soundtracker/Protracker Module Format

4th Revision
Credits: Lars Hamre, Norman Lin, Kurt Kennett, Mark Cox, Peter Hanning, Steinar Midtskogen, Marc Espie, and Thomas Meyer

(All numbers below are given in decimal)

Module Format:
# Bytes   Description
-------   -----------
20        The module's title, padded with null (\0) bytes. Original
          Protracker wrote letters only in uppercase.

(Data repeated for each sample 1-15 or 1-31)
22        Sample's name, padded with null bytes. If a name begins with a
          '#', it is assumed not to be an instrument name, and is
          probably a message.
2         Sample length in words (1 word = 2 bytes). The first word of
          the sample is overwritten by the tracker, so a length of 1
          still means an empty sample. See below for sample format.
1         Lowest four bits represent a signed nibble (-8..7) which is
          the finetune value for the sample. Each finetune step changes
          the note 1/8th of a semitone. Implemented by switching to a
          different table of period-values for each finetune value.
1         Volume of sample. Legal values are 0..64. Volume scales the
          sample's amplitude linearly: 64 is full volume, and the
          change in decibels can be calculated with 20*log10(Vol/64)
2         Start of sample repeat offset in words. Once the sample has
          been played all of the way through, it will loop if the repeat
          length is greater than one. It repeats by jumping to this
          position in the sample and playing for the repeat length, then
          jumping back to this position, and playing for the repeat
          length, etc.
2         Length of sample repeat in words. Only loop if greater than 1.
(End of this sample's data.. each sample uses the same format and they
 are stored sequentially)
N.B. All 2 byte lengths are stored with the Hi-byte first, as is usual
     on the Amiga (big-endian format).
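
As a rough illustration of the sample header described above, the 30
bytes can be unpacked in Python with the standard struct module. This is
only a sketch: the name parse_sample_header and the dictionary keys are
made up for this example, and it assumes the 30 bytes have already been
sliced out of the file.

```python
import struct

def parse_sample_header(raw):
    """Decode one 30-byte sample header (hypothetical helper)."""
    # ">" selects big-endian: 22-byte name, word, byte, byte, word, word
    name, length, finetune, volume, rep_start, rep_len = struct.unpack(
        ">22sHBBHH", raw)
    finetune &= 0x0F              # only the lowest four bits are used
    if finetune > 7:              # reinterpret as a signed nibble -8..7
        finetune -= 16
    return {
        "name": name.rstrip(b"\0").decode("latin-1"),
        "length_bytes": length * 2,          # lengths are stored in words
        "finetune": finetune,
        "volume": volume,
        "repeat_start_bytes": rep_start * 2,
        "repeat_length_bytes": rep_len * 2,
        "loops": rep_len > 1,                # only loop if greater than 1
    }

# Example header: name "bass", 4096 words long, finetune -1, volume 64,
# repeat starting at word 2 for 2048 words.
header = b"bass".ljust(22, b"\0") + bytes(
    [0x10, 0x00, 0x0F, 64, 0x00, 0x02, 0x08, 0x00])
info = parse_sample_header(header)
```

The latin-1 decoding is an arbitrary choice made here so that any byte
value in a name can be decoded without error.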

1         Number of song positions (ie. number of patterns played
          throughout the song). Legal values are 1..128.
1         Historically set to 127, but can be safely ignored.
          Noisetracker uses this byte to indicate restart position -
          this has been made redundant by the 'Position Jump' effect.
128       Pattern table: patterns to play in each song position (0..127)
          Each byte has a legal value of 0..63 (note the Protracker
          exception below). The highest value in this table is the
          highest pattern stored, no patterns above this value are
          stored.
(4)       The four letters "M.K." These are the initials of
          Unknown/D.O.C. who changed the format so it could handle 31
          samples (sorry.. they were not inserted by Mahoney & Kaktus).
          Startrekker puts "FLT4" or "FLT8" here to indicate the # of
          channels. If there are more than 64 patterns, Protracker will
          put "M!K!" here. You might also find: "4CHN", "6CHN" or "8CHN"
          which indicates 4, 6 or 8 channels respectively. If no letters
          are here, then this is the start of the pattern data, and only
          15 samples were present.
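
One possible way to apply the rule above is sketched below in Python.
The offset 1080 follows from the sizes already listed (20 + 31*30 + 1 +
1 + 128). The function name identify_layout is hypothetical, and
treating every unrecognised value as "15 samples, 4 channels" is an
assumption of this sketch rather than something every tracker does.

```python
# Letters mentioned in this document, mapped to their channel counts.
KNOWN_TAGS = {
    b"M.K.": 4, b"M!K!": 4,
    b"FLT4": 4, b"FLT8": 8,
    b"4CHN": 4, b"6CHN": 6, b"8CHN": 8,
}

def identify_layout(module):
    """Return (sample_count, channel_count) guessed from offset 1080."""
    tag = module[1080:1084]
    if tag in KNOWN_TAGS:
        return 31, KNOWN_TAGS[tag]
    # No known letters: assume an old 15-sample, 4-channel module.
    return 15, 4

protracker = bytes(1080) + b"M.K."
six_channel = bytes(1080) + b"6CHN"
old_style = bytes(1084)
```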

(Data repeated for each pattern:)
1024      Pattern data for each pattern (starting at 0).
(Each pattern has same format and is stored in numerical order.
 See below for pattern format)
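
To locate the pattern and sample data, the sizes given so far can be
added up. The Python sketch below does this; pattern_layout is a made-up
name, and scaling the 1024-byte pattern size by the channel count for
6 and 8 channel modules is an assumption (this document only states the
4-channel size).

```python
def pattern_layout(pattern_table, sample_count=31, channels=4):
    """Return (pattern_count, pattern_offset, sample_data_offset)."""
    # The highest value in the table is the highest pattern stored.
    pattern_count = max(pattern_table) + 1
    # Title, sample headers, song length, restart byte, pattern table.
    offset = 20 + sample_count * 30 + 1 + 1 + 128
    if sample_count == 31:
        offset += 4                       # the four letters, eg. "M.K."
    pattern_size = 64 * channels * 4      # 1024 bytes for 4 channels
    return pattern_count, offset, offset + pattern_count * pattern_size

# A song that plays patterns 0, 1, 2, 1; the rest of the table is 0.
table = bytes([0, 1, 2, 1]) + bytes(124)
layout = pattern_layout(table)
```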

(Data repeated for each sample:)
xxxxxx    The maximum size of a sample is 65535 words. Each sample is
          stored as a collection of bytes (length of a sample was given
          previously in the module). Each byte is a signed value (-128
          ..127) which is the channel data. When a sample is played at a
          pitch of C2 (see below for pitches), about 8287 bytes of
          sample data are sent to the channel per second. Multiply the
          rate by the twelfth root of 2 (=1.0595) for each semitone
          increase in pitch eg. moving the pitch up 1 octave doubles the
          rate. The data is stored in the order it is played (eg. first
          byte is first byte played). The first word of the sample data
          is used to hold repeat information, and will overwrite any
          sample data that is there (but it is probably safer to set it
          to 0).
          The rate given above (8287) conveys an inaccurate picture of
          the module-format - in reality it is different for different
          Amigas. As the routines for playing were written to run off
          certain interrupts, for different Amiga computers the rate to
          send data to the channel will be different. For PAL machines
          the clock rate is 7093789.2 Hz and for NTSC machines it is
          7159090.5 Hz. When the clock rate is divided by twice the
          period number for the pitch it will give the rate to send the
          data to the channel, eg. for a PAL machine sending a note at
          C2 (period 428), the rate is 7093789.2/856 ~= 8287.1369
(Each sample is stored sequentially)
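
The clock/period relationship above can be written as a one-line
function. This is a small Python sketch using the two clock rates quoted
in the text; the names PAL_CLOCK, NTSC_CLOCK and playback_rate are
chosen here for illustration only.

```python
PAL_CLOCK = 7093789.2     # Hz, as quoted above
NTSC_CLOCK = 7159090.5    # Hz, as quoted above

def playback_rate(period, clock=PAL_CLOCK):
    """Bytes of sample data sent to the channel per second."""
    return clock / (2 * period)

c2_pal = playback_rate(428)                 # about 8287.14
c2_ntsc = playback_rate(428, NTSC_CLOCK)    # about 8363.42
c3_pal = playback_rate(214)                 # one octave up: twice c2_pal
```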

Pattern Format:
Each pattern is divided into 64 divisions. By allocating different
tempos for each pattern and spacing the notes across different amounts
of divisions, different bar sizes can be accommodated.

Each division contains the data for each channel (1..4) stored after
each other. Channels 1 and 4 are on the left, and channels 2 and 3 are
on the right. In the case of more channels: channels 5 and 8 are on the
left, and channels 6 and 7 are on the right, etc. Each channel's data in
the division has an identical format which consists of 2 words (4
bytes). Divisions are numbered 0..63. Each division may be divided into
a number of ticks (see 'set speed' effect below).

Channel Data:
                  (the four bytes of channel data in a pattern division)
7654-3210 7654-3210 7654-3210 7654-3210
wwww xxxxxxxxxxxxxx yyyy zzzzzzzzzzzzzz

    wwwwyyyy (8 bits) is the sample for this channel/division
xxxxxxxxxxxx (12 bits) is the sample's period (or effect parameter)
zzzzzzzzzzzz (12 bits) is the effect for this channel/division
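
The bit layout above can be unpacked with a few shifts and masks. The
following Python sketch shows one way to do it; decode_channel is a
hypothetical name and the four example bytes are invented.

```python
def decode_channel(cell):
    """Split 4 bytes of channel data into (sample, period, effect)."""
    b0, b1, b2, b3 = cell
    sample = (b0 & 0xF0) | (b2 >> 4)      # wwww and yyyy joined
    period = ((b0 & 0x0F) << 8) | b1      # 12-bit period
    effect = ((b2 & 0x0F) << 8) | b3      # 12-bit effect
    return sample, period, effect

# Sample 18, period 428 (C2), effect [12][4][0] (set volume to 64).
sample, period, effect = decode_channel(bytes([0x11, 0xAC, 0x2C, 0x40]))
```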

If there is to be no new sample to be played at this division on this
channel, then the old sample on this channel will continue, or at least
be "remembered" for any effects. If the sample is 0, then the previous
sample on that channel is used. Only one sample may play on a channel at
a time, so playing a new sample will cancel an old one - even if there
has been no data supplied for the new sample. Though, if you are using a
"silence" sample (ie. no data, only used to turn off other samples) it
is polite to set its default volume to 0.

To determine what pitch the sample is to be played on, look up the
period in a table, such as the one below (for finetune 0). If the period
is 0, then the previous period on that channel is used. Unfortunately,
some modules do not use these exact values. It is best to do a binary
search for the nearest period (unless you use the period as an index
into an array of notes, which is expensive), especially if you plan to
use periods outside the "standard" range. Periods are the internal
representation of the pitch, so effects that alter pitch (eg. sliding)
alter the period value (see "effects" below).

          C    C#   D    D#   E    F    F#   G    G#   A    A#   B
Octave 1: 856, 808, 762, 720, 678, 640, 604, 570, 538, 508, 480, 453
Octave 2: 428, 404, 381, 360, 339, 320, 302, 285, 269, 254, 240, 226
Octave 3: 214, 202, 190, 180, 170, 160, 151, 143, 135, 127, 120, 113

Octave 0:1712,1616,1525,1440,1357,1281,1209,1141,1077,1017, 961, 907
Octave 4: 107, 101,  95,  90,  85,  80,  76,  71,  67,  64,  60,  57

Octaves 0 and 4 are NOT standard, so don't rely on every tracker being
able to play them, or even on it not crashing when given them. If you
can code it, though, it is nice to allow them to be read.
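
Since some modules do not use the exact table values, a lookup that
accepts the nearest period is useful. The Python sketch below uses a
binary search (the standard bisect module) over the finetune 0 table for
octaves 1..3. The name nearest_note and the choice to snap out-of-range
periods to C1 or B3 are assumptions of this example.

```python
from bisect import bisect_left

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F",
              "F#", "G", "G#", "A", "A#", "B"]
PERIODS = [856, 808, 762, 720, 678, 640, 604, 570, 538, 508, 480, 453,
           428, 404, 381, 360, 339, 320, 302, 285, 269, 254, 240, 226,
           214, 202, 190, 180, 170, 160, 151, 143, 135, 127, 120, 113]
ASCENDING = PERIODS[::-1]     # bisect needs ascending order

def nearest_note(period):
    """Return the name of the note whose period is closest."""
    i = bisect_left(ASCENDING, period)
    lo = max(i - 1, 0)
    hi = min(i, len(ASCENDING) - 1)
    # Pick whichever neighbour is closer to the requested period.
    if abs(ASCENDING[lo] - period) <= abs(ASCENDING[hi] - period):
        j = lo
    else:
        j = hi
    index = len(ASCENDING) - 1 - j        # position in PERIODS
    return NOTE_NAMES[index % 12] + str(index // 12 + 1)
```

For example, nearest_note(430) gives "C2", because 428 is the closest
period in the table.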

Effects:
Effects are written as groups of 4 bits, eg. 1871 = 7 * 256 + 4 * 16 +
15 = [7][4][15]. The high nibble (4 bits) usually determines the effect,
but if it is [14], then the second nibble is used as well.
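
The nibble notation can be made concrete with a short Python sketch.
The name split_effect and the shape of its return value (a command
tuple plus a parameter) are choices made for this example only.

```python
def split_effect(effect):
    """Split a 12-bit effect into (command, parameter)."""
    high = (effect >> 8) & 0xF
    x = (effect >> 4) & 0xF
    y = effect & 0xF
    if high == 14:
        # Extended effect: the second nibble is part of the command.
        return (14, x), y
    return (high,), x * 16 + y

tremolo = split_effect(1871)      # [7][4][15]
retrigger = split_effect(0xE93)   # [14][9][3]
```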

[0]: Arpeggio
     Where [0][x][y] means "play note, note+x semitones, note+y
     semitones, then return to original note". The fluctuations are
     carried out evenly spaced in one pattern division. They are usually
     used to simulate chords, but this doesn't work too well. They are
     also used to produce heavy vibrato. A major chord is when x=4, y=7.
     A minor chord is when x=3, y=7.
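
A simple way to find the three pitches of an arpeggio is to step
through the period table by x and y semitones. The Python sketch below
assumes the note's period is an exact finetune 0 table value and that
steps past B3 are clamped to B3; both are assumptions of this example,
and arpeggio_periods is a made-up name.

```python
PERIODS = [856, 808, 762, 720, 678, 640, 604, 570, 538, 508, 480, 453,
           428, 404, 381, 360, 339, 320, 302, 285, 269, 254, 240, 226,
           214, 202, 190, 180, 170, 160, 151, 143, 135, 127, 120, 113]

def arpeggio_periods(period, x, y):
    """Periods for note, note+x semitones and note+y semitones."""
    base = PERIODS.index(period)      # raises ValueError if not in table
    last = len(PERIODS) - 1
    return [PERIODS[min(base + step, last)] for step in (0, x, y)]

major = arpeggio_periods(428, 4, 7)   # C2, E2, G2
minor = arpeggio_periods(428, 3, 7)   # C2, D#2, G2
```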

[1]: Slide up
     Where [1][x][y] means "smoothly decrease the period of current
     sample by x*16+y after each tick in the division". The
     ticks/division are set with the 'set speed' effect (see below). If
     the period of the note being played is z, then the final period
     will be z - (x*16 + y)*(ticks - 1). As the slide rate depends on
     the speed, changing the speed will change the slide. You cannot
     slide beyond the note B3 (period 113).
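
The per-tick behaviour can be sketched in Python as below. The name
slide_up is hypothetical, and clamping at period 113 on every tick is
one reading of "cannot slide beyond B3".

```python
def slide_up(period, x, y, ticks, limit=113):
    """Period at the end of a division after a [1][x][y] slide."""
    step = x * 16 + y
    # The period changes after each tick, ie. (ticks - 1) times.
    for _ in range(ticks - 1):
        period = max(period - step, limit)
    return period

end = slide_up(428, 0, 4, 6)        # 428 - 4*(6 - 1) = 408
clamped = slide_up(120, 0, 4, 6)    # stops at 113
```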

[2]: Slide down
     Where [2][x][y] means "smoothly increase the period of current
     sample by x*16+y after each tick in the division". Similar to [1],
     but lowers the pitch. You cannot slide beyond the note C1 (period
     856).

[3]: Slide to note
     Where [3][x][y] means "smoothly change the period of current sample
     by x*16+y after each tick in the division, never sliding beyond
     current period". Any note in this channel's division is not played,
     but changes the "remembered" note - it can be thought of as a
     parameter to this effect. Sliding to a note is similar to effects
     [1] and [2], but the slide will not go beyond the given period, and
     the direction is implied by that period. If x and y are both 0,
     then the old slide will continue.

[4]: Vibrato
     Where [4][x][y] means "oscillate the sample pitch using a
     particular waveform with amplitude y/16 semitones, such that (x *
     ticks)/64 cycles occur in the division". The waveform is set using
     effect [14][4]. By placing vibrato effects on consecutive
     divisions, the vibrato effect can be maintained. If either x or y
     is 0, then the old vibrato values will be used.

[5]: Continue 'Slide to note', but also do Volume slide
     Where [5][x][y] means "either slide the volume up x*(ticks - 1) or
     slide the volume down y*(ticks - 1), at the same time as continuing
     the last 'Slide to note'". It is illegal for both x and y to be
     non-zero. You cannot slide outside the volume range 0..64. The
     period-length in this channel's division is a parameter to this
     effect, and hence is not played.

[6]: Continue 'Vibrato', but also do Volume slide
     Where [6][x][y] means "either slide the volume up x*(ticks - 1) or
     slide the volume down y*(ticks - 1), at the same time as continuing
     the last 'Vibrato'". It is illegal for both x and y to be non-zero.
     You cannot slide outside the volume range 0..64.

[7]: Tremolo
     Where [7][x][y] means "oscillate the sample volume using a
     particular waveform with amplitude y*(ticks - 1), such that (x *
     ticks)/64 cycles occur in the division". If either x or y is 0,
     then the old tremolo values will be used. The waveform is set using
     effect [14][7]. Similar to [4].

[8]: (Set panning position)
     This command is unused by the vast majority of trackers, but one
     tracker for the PC (called DMP) uses this for setting the panning
     state of the channel. As this is very useful, I am documenting it
     here. The effect [8][x][y] means "set channel to panning position
     x*16 + y". Position 0 is left, 64 is centre, 128 is right.
     Interestingly, position 164 is defined as "surround". 

[9]: Set sample offset
     Where [9][x][y] means "play the sample from offset x*4096 + y*256".
     The offset is measured in words. If no sample is given, yet one is
     still playing on this channel, it should be retriggered to the new
     offset using the current volume.

[10]: Volume slide
     Where [10][x][y] means "either slide the volume up x*(ticks - 1) or
     slide the volume down y*(ticks - 1)". If both x and y are non-zero,
     then the y value is ignored (assumed to be 0). You cannot slide
     outside the volume range 0..64.
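
A minimal Python sketch of this rule follows. The name volume_slide is
invented for the example, and applying the change once per tick after
the first is taken from the (ticks - 1) factor above.

```python
def volume_slide(volume, x, y, ticks):
    """Volume at the end of a division after a [10][x][y] slide."""
    # If x is non-zero, y is ignored (assumed to be 0).
    delta = x if x else -y
    for _ in range(ticks - 1):
        volume = min(64, max(0, volume + delta))   # stay within 0..64
    return volume

louder = volume_slide(32, 2, 0, 6)     # 32 + 2*5 = 42
silent = volume_slide(32, 0, 8, 6)     # would be -8, clamped to 0
both = volume_slide(60, 3, 5, 6)       # y ignored, clamped to 64
```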

[11]: Position Jump
     Where [11][x][y] means "stop the pattern after this division, and
     continue the song at song-position x*16+y". This shifts the
     'pattern-cursor' in the pattern table (see above). Legal values for
     x*16+y are from 0 to 127.

[12]: Set volume
     Where [12][x][y] means "set current sample's volume to x*16+y".
     Legal volumes are 0..64.

[13]: Pattern Break
     Where [13][x][y] means "stop the pattern after this division, and
     continue the song at the next pattern at division x*10+y" (the 10
     is not a typo). Legal divisions are from 0 to 63.
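
To make the x*10+y rule concrete: the parameter byte reads like a
decimal number when written in hex. A short Python sketch, with a
made-up function name:

```python
def pattern_break_division(param):
    """Division to continue at, from the 8-bit parameter of [13]."""
    x = (param >> 4) & 0xF
    y = param & 0xF
    return x * 10 + y        # 10, not 16

target = pattern_break_division(0x32)    # division 32, not 50
```

Results above 63 are not legal divisions; how a tracker handles them is
not covered here.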

[14][0]: Set filter on/off
     Where [14][0][x] means "set sound filter ON if x is 0, and OFF if x
     is 1". This is a hardware command for some Amigas, so if you don't
     understand it, it is better not to use it.

[14][1]: Fineslide up
     Where [14][1][x] means "decrement the period of the current sample
     by x". The decrementing takes place at the beginning of the
     division, and hence there is no actual sliding. This type of
     sliding cannot be continued with effect [5]. You cannot slide
     beyond the note B3 (period 113).

[14][2]: Fineslide down
     Where [14][2][x] means "increment the period of the current sample
     by x". Similar to [14][1] but shifts the pitch down. You cannot
     slide beyond the note C1 (period 856).

[14][3]: Set glissando on/off
     Where [14][3][x] means "set glissando ON if x is 1, OFF if x is 0".
     Used in conjunction with [3] ('Slide to note'). If glissando is on,
     then 'Slide to note' will slide in semitones, otherwise will
     perform the default smooth slide.

[14][4]: Set vibrato waveform
     Where [14][4][x] means "set the waveform of succeeding 'vibrato'
     effects to wave #x". [4] is the 'vibrato' effect.  Possible values
     for x are:
          0 - sine (default)      /\    /\     (2 cycles shown)
          4  (without retrigger)     \/    \/

          1 - ramp down          | \   | \
          5  (without retrigger)     \ |   \ |

          2 - square             ,--,  ,--,
          6  (without retrigger)    '--'  '--'

          3 - random: a random choice of one of the above.
          7  (without retrigger)
     If the waveform is selected "without retrigger", then it will not
     be retriggered from the beginning at the start of each new note.

[14][5]: Set finetune value
     Where [14][5][x] means "sets the finetune value of the current
     sample to the signed nibble x". x has legal values of 0..15,
     corresponding to signed nibbles 0..7,-8..-1 (see start of text for
     more info on finetune values).

[14][6]: Loop pattern
     Where [14][6][x] means "set the start of a loop to this division if
     x is 0, otherwise after this division, jump back to the start of a
     loop and play it another x times before continuing". If the start
     of the loop was not set, it will default to the start of the
     current pattern. Hence 'loop pattern' cannot be performed across
     multiple patterns. Note that loops do not support nesting, and you
     may generate an infinite loop if you try to nest 'loop pattern's.

[14][7]: Set tremolo waveform
     Where [14][7][x] means "set the waveform of succeeding 'tremolo'
     effects to wave #x". Similar to [14][4], but alters effect [7] -
     the 'tremolo' effect.

[14][8]: -- Unused --

[14][9]: Retrigger sample
     Where [14][9][x] means "trigger current sample every x ticks in
     this division". If x is 0, then no retriggering is done (acts as if
     no effect was chosen), otherwise the retriggering begins on the
     first tick and then x ticks after that, etc.

[14][10]: Fine volume slide up
     Where [14][10][x] means "increment the volume of the current sample
     by x". The incrementing takes place at the beginning of the
     division, and hence there is no sliding. You cannot slide beyond
     volume 64.

[14][11]: Fine volume slide down
     Where [14][11][x] means "decrement the volume of the current sample
     by x". Similar to [14][10] but lowers volume. You cannot slide
     beyond volume 0.

[14][12]: Cut sample
     Where [14][12][x] means "after the current sample has been played
     for x ticks in this division, its volume will be set to 0". This
     implies that if x is 0, then you will not hear any of the sample.
     If you wish to insert "silence" in a pattern, it is better to use a
     "silence"-sample (see above) due to the lack of proper support for
     this effect.

[14][13]: Delay sample
     Where [14][13][x] means "do not start this division's sample for
     the first x ticks in this division, play the sample after this".
     This implies that if x is 0, then you will hear no delay, but
     actually there will be a VERY small delay. Note that this effect
     only influences a sample if it was started in this division.

[14][14]: Delay pattern
     Where [14][14][x] means "after this division there will be a delay
     equivalent to the time taken to play x divisions after which the
     pattern will be resumed". The delay only relates to the
     interpreting of new divisions, and all effects and previous notes
     continue during delay.

[14][15]: Invert loop
     Where [14][15][x] means "if x is greater than 0, then play the
     current sample's loop upside down at speed x". Each byte in the
     sample's loop will have its sign changed (negated). It will only
     work if the sample's loop (defined previously) is not too big. The
     speed is based on an internal table.

[15]: Set speed
     Where [15][x][y] means "set speed to x*16+y". Though it is nowhere
     near that simple. Let z = x*16+y. Depending on what values z takes,
     different units of speed are set, there being two: ticks/division
     and beats/minute (though this one is only a label and not strictly
     true). If z=0, then what should technically happen is that the
     module stops, but in practice it is treated as if z=1, because
     there is already a method for stopping the module (running out of
     patterns). If z<=32, then it means "set ticks/division to z"
     otherwise it means "set beats/minute to z" (convention says that
     this should read "If z<32.." but there are some composers out there
     that defy conventions). Default values are 6 ticks/division, and
     125 beats/minute (4 divisions = 1 beat). The beats/minute tag is
     only meaningful for 6 ticks/division. To get a more accurate view
     of how things work, use the following formula:
                             24 * beats/minute
          divisions/minute = -----------------
                              ticks/division
     Hence divisions/minute range from 24.75 to 6120, eg. to get a value
     of 2000 divisions/minute use 3 ticks/division and 250 beats/minute.
     If multiple "set speed" effects are performed in a single division,
     the ones on higher-numbered channels take precedence over the ones
     on lower-numbered channels. This effect has a large number of
     different implementations, but the one described here has the
     widest usage.
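
The interpretation described here, together with the formula, can be
sketched in Python as follows. The names apply_set_speed and
divisions_per_minute are hypothetical, and the sketch follows the
"z<=32" reading rather than the "z<32" convention.

```python
def apply_set_speed(z, ticks_per_division=6, beats_per_minute=125):
    """Return the new (ticks/division, beats/minute) after [15]."""
    if z == 0:
        z = 1                      # treated as z=1 in practice
    if z <= 32:
        ticks_per_division = z
    else:
        beats_per_minute = z
    return ticks_per_division, beats_per_minute

def divisions_per_minute(ticks_per_division, beats_per_minute):
    return 24 * beats_per_minute / ticks_per_division

default_rate = divisions_per_minute(6, 125)       # 500.0
ticks, bpm = apply_set_speed(3)                   # (3, 125)
ticks, bpm = apply_set_speed(250, ticks, bpm)     # (3, 250)
fast_rate = divisions_per_minute(ticks, bpm)      # 2000.0
```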

N.B. This document should be fairly accurate now, but as the module
format is more of an observation than a standard, a couple of effects
cannot be relied upon to act exactly the same from tracker to tracker
(especially if the tracker is not for the Amiga). It is probably better
to use this document as a guide rather than as a hard-and-fast
definition of the module format. Oh.. and yes, I would normally give
bytes as hex values, but it is easier to understand a consistent
notation.

Andrew Scott, Author of MIDIMOD (MOD to MIDI converter), PTMID (MIDI to MOD converter)