/------------------\
|  NSF Decompiler  |
\------------------/

==========Why 0cc 0.3.13?==========
0cc-Famitracker 0.3.13 introduced instrument interoperability.
(That means instruments made for one chip can work with other chips.)
This feature is the reason for the last major change made to the format of FT compiled data.
Starting from this version, compiled instruments start with a byte that indicates the instrument's chip.
This change makes it possible to read instruments independently.
For comparison, goluigi's decompiler has to read through pattern data first, then read each instrument differently based on where it appeared.
That approach is something I want to implement in the future, but I'm focused on the newer versions for now.

==========NES memory vs NSF payload==========
As stated in the FT source code:
    NSF compiled data assumes some switchable banks
        8000-8FFF: non-switched
        9000-9FFF: non-switched
        A000-AFFF: non-switched
        B000-BFFF: switchable
        C000-CFFF: switchable <- DPCM READS SAMPLES RELATIVE TO HERE
        D000-DFFF: switchable
        E000-EFFF: switchable
        F000-FFFF: non-switched
There is "global" data that must reside in the first 3 (non-switched) banks:
    The Driver (being code that must always be accessible)
    Header, Instruments, Grooves, Track Headers
If all of the global data can't fit in the first 3 banks (8000-AFFF), the export fails.
In a bankswitched NSF, frame and pattern data reside in the switchable bank B000-BFFF.
Their location is usually given as a combination of an offset (relative to the start of data) and a bank number.
Frames or patterns that are at "offset x, bank y" in NES memory will be at
"offset x (adjusted for load address and nsf header) + 0x1000*(y - 3)" in an NSF file.

==========what a driver is made of==========
The "driver" is the asm code necessary to read and play compiled FT data.
Each new version of the driver comes in 7 variants, one for each chip plus one for multichip.
Its size has grown steadily in the past, and by a big margin from Dn 0.4.0.1 to 0.5.0.0.
(TODO: it would be really cool if I could graph this)
Drivers usually start with a version tag:
    ascii bytes that read "FTDRV ", "0CCFT ", "DN-FT " or "Dn-FT "
    "FTDRV " only appeared in Vanilla 0.4.3; previous versions use the same format but without the tag
Other notable things near the start of the driver:
    init address (from NSF header) always points to the 8th byte of the driver
    play address (from NSF header) always points to the 11th byte of the driver
Drivers usually end with:
    vibrato table (256 bytes)
    data start address (2 bytes, little-endian)
        where "data" is the music data that I'm decompiling.

==========first step: finding the driver==========
A driver's length maps directly to its version (because lengths are generally different for each new version of each variant).
Because "init address" (from NSF header) always points to the 8th byte of the driver, you automatically know the starting position of the driver, but not necessarily its length.
The driver can be placed either before or after data. There are 3 possible configurations:

|--driver--|---base data---|--bankswitched data--||--(bankswitched) dpcm samples--|
    very common for big files:
    driver at the start, on non-switched banks
    instruments/sequences/headers on non-switched banks
    pattern data on switched bank (B000-BFFF)
    dpcm samples start at C000

|--driver--|---non-bankswitched music data---||--- dpcm samples ---|
    very rare non-bankswitched format, possible with very few dpcm samples
    driver at the start, all music data follows and possibly spills into bank C000

|---music data---|--driver--||--dpcm samples--|
    common for small files
    in this case the driver ends at C000, so you immediately know its length (and version)

In the first 2 cases, you need another means to find the driver's length.
Goluigi's decompiler (the 0cc part specifically) looks for a "signature" that always appears near the end of the driver.
That "signature" is actually the end of the vibrato table (see above).
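
Going back to the init/play observation from this section, here is a minimal Python sketch of how the driver start could be located and the version tag checked. It is only a sketch under these assumptions: "8th" and "11th" are counted from 1 (so driver offsets 7 and 10), the file is not bankswitched (so file offset = 0x80 + address - load address), and the NSF header keeps load/init/play as little-endian words at 0x08/0x0A/0x0C. The function names are made up for illustration and do not come from any existing decompiler.

```python
import struct
from typing import Optional, Tuple

NSF_HEADER_SIZE = 0x80
KNOWN_TAGS = (b"FTDRV ", b"0CCFT ", b"DN-FT ", b"Dn-FT ")


def read_nsf_addresses(nsf: bytes) -> Tuple[int, int, int]:
    """Return (load, init, play) addresses from the NSF header."""
    if nsf[:5] != b"NESM\x1a":
        raise ValueError("not an NSF file")
    load, init, play = struct.unpack_from("<HHH", nsf, 0x08)
    return load, init, play


def driver_start_address(init: int, play: int) -> int:
    """NES address of the first driver byte.

    Assumption: init points at the 8th byte and play at the 11th byte,
    counting from 1, i.e. driver offsets 7 and 10.
    """
    start = init - 7
    if play - 10 != start:
        raise ValueError("init/play do not match the expected driver layout")
    return start


def driver_tag(nsf: bytes, load: int, start: int) -> Optional[bytes]:
    """Return the 6-byte version tag, or None if no known tag is found.

    Assumption: non-bankswitched layout, so the payload is a flat copy of
    NES memory starting at the load address.
    """
    offset = NSF_HEADER_SIZE + (start - load)
    tag = nsf[offset:offset + 6]
    return tag if tag in KNOWN_TAGS else None
```

A None result from driver_tag does not mean "no driver": as noted above, drivers older than Vanilla 0.4.3 carry no tag, so the init/play consistency check is the more general test.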
Without goluigi's observation (however incomplete it may be) I wouldn't know where to start, so thanks a lot to goluigi for coming up with it.

==========rest of music data==========
All of this is derived from the source code of Famitracker (specifically Compiler.cpp and PatternCompiler.cpp).
Data follows this order:
    Header
    Sequences
    Instrument list (offsets to each instrument)
    Fds wave list
    N163 wave list
    Instruments
    Dpcm keys (3 bytes each)
    Dpcm pointers (3 bytes each)
    Grooves (always starts with a 0)
    Song/track list (offsets to each song/track)
    Song/Track headers, each contains offset and bank of own frame list
    Bankswitched data for each song:
        Frame list (offsets to each frame)
        Frames (offsets and banks of patterns for each channel of this frame)
        Pattern data

When describing the details of each section, I use these shorthands:
    "int" always means 16-bit (2 bytes) little-endian integer
    "offset" (unless specified) is relative to the "data start address" that appears at the end of the driver
    "index" (unless specified) is relative to fixed-size elements in a section, starting from 0

HEADER: at the start of music data
    int: offset to track list
    int: offset to instrument list
    int: offset to sample keys
    int: offset to sample pointers
    int: offset to grooves
    byte: flags (1=bankswitched, 2=old vibrato that nobody uses, 4=linear pitch introduced in 0cc 0.3.14.1)
    if the header is fds or multichip:
        int: offset to fds waves
    int: NTSC divider (need to look up what that was)
    int: PAL divider (need to look up what that was)
    if the header is n163 or multichip:
        byte: number of n163 channels

SEQUENCES = SEQUENCE[]
SEQUENCE:
    byte: sequence length (L)
    signed byte: loop point (-1 indicates no loop point)
    byte: release point + 1 (0 indicates no release point)
    byte: setting (meaning depends on sequence type)
    signed byte[L]: sequence values

INSTRUMENT LIST:
    int[]: offsets to instruments
    how many instruments?
        depends on where this section ends
    where does this section end?
        at the minimum of these offsets (written in order of how hard they are to find)
        * "fds wave list" section (if present) -> offset specified in header
        * "instruments" section -> assuming instrument list is in growing order, this is the first element of the instrument list
        * "n163 wave list" section (if present) -> offsets are only specified in each instrument so you're S O L, you gotta read instruments one by one

FDS WAVE LIST:
    fds waves have fixed length of 64 bytes each
    byte[][64]: fds waves
    how many waves?
        depends on where this section ends, see above

N163 WAVE LIST:
    n163 waves can be any size and any number
    a N163 instrument specifies a wave size and offset, but not the number of waves
    this means you need to collect all wave offsets, and infer wave count for each instrument from the boundaries of waves

INSTRUMENTS = INSTRUMENT[]
INSTRUMENT:
    byte: chip (0=2A03, 4=VRC6, 6=VRC7, 7=FDS, 9=N163, 10=S5B)
    FOR INSTRUMENTS THAT HAVE SEQUENCES (every type except VRC7)
        byte: mask of used sequences (1=volume, 2=arpeggio, 4=pitch, 8=coarse pitch, 16=duty/timbre)
        int[number of used sequences]: offsets to used sequences
    EXTRA DATA FOR FDS INSTRUMENTS
        byte[16]: packed mod table (each value of the mod table is 3 bits; each byte has 2 values in bits 0..2 and bits 3..5; bits 6 and 7 are unused)
        byte: modulation delay
        byte: modulation depth
        int: modulation rate
        byte: wave index (wave is found in "fds wave list" section; waves have fixed size of 64 bytes)
    EXTRA DATA FOR N163 INSTRUMENTS
        byte: wave size in bytes (each byte has 2 samples in bits 0..3 and 4..7)
        byte: wave position in n163 memory
        int: wave offset
    VRC7 INSTRUMENT
        byte: patch number * 16
        if patch number is 0:
            byte[8]: custom patch

DPCM KEYS = DPCM KEY[]
DPCM KEY:
    byte: pitch (low 4 bits) and loop (bit 6, instead of bit 7 like in FT module)
    signed byte: delta counter reset
    byte: dpcm pointer index * 3

DPCM POINTERS = DPCM POINTER[]
DPCM POINTER:
    byte: address/64
    byte: (size-1)/16
    byte: bank

GROOVES:
    the "grooves" section always starts with a byte (with value 0)
    even if there are no grooves, there is one empty byte
    GROOVE:
        bytes[until 0]: groove entries
        byte: offset (relative to GROOVES start) of the start of this groove
        rationale: grooves loop, so the driver must know where to go back to when a groove ends

TRACK LIST:
    int[]: offsets to track headers
    how many tracks?
        specified in nsf header

TRACK HEADER:
    int: offset to frame list
    byte: frames (0 means 256)
    byte: rows per frame (0 means 256)
    byte: speed (0 means use groove specified later)
    byte: tempo (0 means fixed tempo)
    byte: offset (relative to GROOVES start) to groove
    byte: bank where frame list is located (byte always present even if nsf is not bankswitched)
    -> rounds up to a nice fixed 8 bytes per track, which is rare in this format

FRAME LIST:
    int[]: offsets to frames
    how many frames?
        the number of frames specified in track header

FRAME:
    int[channel count]: pattern offset for each channel
    how many channels?
        depends on expansion chips and (if applicable) namco channel count
        channels follow a different order compared to FT, notably DPCM is always last
    if nsf is bankswitched:
        byte[channel count]: bank where the pattern is located

FRAME LIST and FRAMES for each track cannot cross bank boundaries.
0cc 0.3.14.5 once made that mistake on one of my files and made nsf players "crash".
(not the emulator, the emulated nes)

PATTERN DATA FORMAT:
    pattern is defined one row at a time (of course)
    each row is defined in a specific order different from the FT UI presentation:
        Gxx effect FIRST, always
        any other non-note commands (effects, volume, instrument) in any order
        note (terminates the row)
        number of empty rows before next defined row (unless the 0x82 command listed below is active)
    each note/volume/instrument/fx is defined as a command
    a COMMAND is:
        byte: specifies type and value of command
        optional extra byte (here marked with "+ byte" when present): command parameter

COMMAND TYPES
    commands 0x00 to 0x7F are NOTES:
        0x00: no note
        0x7F: note cut
        0x7E: note release
        0x70 to 0x73: echo buffer
        0x01 to 0x60: notes from C-0 to B-7 (or index of sample pointer for DPCM channel)
    commands 0xF0 to 0xFF are VOLUMES:
        low nibble is volume value
    commands 0xE0 to 0xEF are QUICK INSTRUMENT CHANGES:
        low nibble is instrument
    commands 0x80 to 0xB3 (potentially up to 0xDF) are other commands, mostly fx

FX COMMANDS:
    These numbers are relative to Dn 0.5.0.2 with all expansion chips.
    If any effects are missing due to expansion chip or driver version, all following numbers are pushed back.
    Drivers before Dn 0.3.1.0 had double commands: instead of 0x80,0x81,0x82, they would be 0x80,0x82,0x84...
    0x80 + byte: instrument change (full range, used for instruments 0x10 to 0x3F)
    ONLY AFTER 0CC 0.3.14.3
        0x81: "hold" instrument
    0x82 + byte: don't read the number of empty rows after note; instead skip this fixed number of rows
    0x83: resets the behaviour of 0x82
    0x84 + byte: speed
    0x85 + byte: tempo
    0x86 + byte: Bxx with xx increased by 1
    0x87 + byte: Dxx with xx increased by 1
    0x88 + byte: Cxx, which has an (unused) parameter and terminates the pattern immediately
    0x89 + byte: Exx (deprecated volume effect turned into 2a03 length counter effect)
    0x8A: reset mutually-exclusive pitch-sliding effects (0xy, 3xx, 1xx, 2xx, Qxy, Rxy)
    0x8B + byte: 1xx (2xx with linear pitch or in VRC7/FDS/N163 channels)
    0x8C + byte: 2xx (1xx with linear pitch or in VRC7/FDS/N163 channels)
    0x8D + byte: 3xx
    0x8E + byte: 0xy
    0x8F + byte: 4yx (xy nibbles are reversed compared to FT)
    0x90 + byte: 7yx (xy nibbles are reversed compared to FT)
    0x91 + byte: Pxx (value is reversed with linear pitch or in VRC7/FDS/N163 channels)
    0x92: P80
    0x93 + byte: Vxx (in Sunsoft channels, FT values [0,1,2,3,4,5,6,7] here are [0x00,0x40,0x80,0xC0,0x20,0x60,0xA0,0xE0])
    0x94 + byte: Gxx (if present, it's always the first command in a row)
    0x95 + byte: Hxy or Ixy for 2A03 channels (H if bit 3 is set)
    0x96 + byte: Zxx (DPCM)
    0x97 + byte: Yxx (DPCM)
    0x98 + byte: Qxy
    0x99 + byte: Rxy
    0x9A + byte: Axy
    0x9B + byte: Sxx (except when it's the triangle channel's linear counter)
    0x9C + byte: Xxx (DPCM) with xx increased by 1
    0x9D + byte: Wxx (DPCM)
    0x9E + byte: Lxx
    0x9F + byte: Sxx (only linear counter for triangle channel) with xx decreased by 0x80
    0xA0 + byte: Oxx where xx is the offset (relative to GROOVES start) to the groove
    0xA1 + byte: Mxy
    0xA2 + byte: Txy
    ONLY AFTER DN 0.5.0.0
        0xA3 + byte: =xx (phase reset for general channels)
        0xA4 + byte: =xx (phase reset for DPCM channel)
        0xA5 + byte: Kxx
        0xA6 + byte: Nxx
    ONLY IN DRIVERS USING VRC7 (the "vrc7" driver and the "all chips" driver)
        0xA7 + byte: VRC7 Vxx (patch select)
        0xA8 + byte: VRC7 Hxx (register select)
        0xA9 + byte: VRC7 Ixx (register write)
    ONLY IN DRIVERS USING FDS (the "fds" driver and the "all chips" driver)
        0xAA + byte: FDS Hxx
        0xAB + byte: FDS Ixx
        0xAC + byte: FDS Jxx
        ONLY AFTER 0CC 0.3.10
            0xAD + byte: FDS Exx
        ONLY AFTER 0CC 0.3.12
            0xAE + byte: FDS Zxx
    ONLY IN DRIVERS USING N163 (the "n163" driver and the "all chips" driver)
        0xAF + byte: N163 Zxx
    (only in drivers using S5B, but these are at the end and kind of ignored)
        0xB0 + byte: S5B Hxx
        0xB1 + byte: S5B Ixx
        0xB2 + byte: S5B Jxx
        0xB3 + byte: S5B Wxx

TODO: make a big table (csv?) showcasing how effect commands changed over time, down to the oldest version I've inspected (Vanilla 0.2.7)
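
The byte ranges from COMMAND TYPES can be turned into a small classifier. This is a minimal Python sketch, not the code of any actual decompiler: classify_command and NOTE_NAMES are hypothetical names, the DPCM channel case (where 0x01 to 0x60 is a sample pointer index rather than a note) is ignored, and fx commands are left undecoded because their numbering depends on driver version and expansion chips. Note values that fall outside the documented ranges are reported as "unknown" rather than guessed.

```python
NOTE_NAMES = ("C-", "C#", "D-", "D#", "E-", "F-", "F#", "G-", "G#", "A-", "A#", "B-")


def classify_command(value: int):
    """Return (kind, detail) for the first byte of a pattern command."""
    if not 0 <= value <= 0xFF:
        raise ValueError("command must be a single byte")
    if value >= 0xF0:
        return ("volume", value & 0x0F)            # low nibble is the volume
    if value >= 0xE0:
        return ("quick instrument", value & 0x0F)  # low nibble is the instrument
    if value >= 0x80:
        # Effects and other commands: meaning depends on the driver
        # version and expansion chips, so it is not decoded here.
        return ("other", value)
    if value == 0x00:
        return ("no note", None)
    if value == 0x7F:
        return ("note cut", None)
    if value == 0x7E:
        return ("note release", None)
    if 0x70 <= value <= 0x73:
        # Detail is just the position inside the 0x70..0x73 range.
        return ("echo buffer", value - 0x70)
    if 0x01 <= value <= 0x60:
        index = value - 1  # 0x01 is C-0, 0x60 is B-7
        return ("note", NOTE_NAMES[index % 12] + str(index // 12))
    return ("unknown", value)
```

For example, classify_command(0x60) gives ("note", "B-7") and classify_command(0xF8) gives ("volume", 8). A real row reader would also need to know which "other" commands take a parameter byte for the driver version at hand.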