[COFF] Dragon Quest Disassembly
segaloco via COFF
coff at tuhs.org
Wed Sep 6 02:56:58 AEST 2023
Howdy all, I've made some mention of this in the past, but it's finally at a point where I feel like formally plugging it. For the past few years I've been tinkering with a disassembly of Dragon Quest as originally released on the Famicom (NES) in Japan. The NA release had already gotten this treatment, but I wanted to produce something a little cleaner and more separated: https://gitlab.com/segaloco/dq1disasm
Another reason for this project is that I've long wanted an analysis of a scrollplane/sprite-driven console video game on par with Lions' Commentary on UNIX. I find analysis of this type of hardware platform generally lacking outside of chance blog posts and incredibly specialized communities where it takes forever to find the information you're seeking. In the longer term I intend to produce a commentary on this codebase covering all the details of how it makes the Famicom tick.
So a few of the difficulties involved:
The Famicom is 6502-based, with 32768 bytes of ROM address space directly accessible to the CPU and another 8192 bytes of ROM address space accessible through the Picture Processing Unit, or PPU. The latter is typically graphics ROM displayed directly by the PPU, but data can be copied to CPU memory space as well. To fit larger games, bank swapping is widely employed, with scores of different bank swapping configurations (mappers) in use. Dragon Quest presents an interesting scenario in that the Japanese and North American releases use *different* mapper schemes, so while much of the code is the same, some parts were apples to oranges between the existing disassembly and the new one.
Where this has been challenging (and it's a challenge many other Famicom disassemblers haven't taken on) is getting a build that resembles a typical software product, i.e. things nicely split out with little concern as to which bank they're going to wind up in. My choice of linker facilitates this by being able to define separate output binaries for different memory segments, so I can produce the banks as distinct binaries, and what goes into each is simply determined by segment. Moving an item between banks then becomes a matter of changing its segment and adding or removing a bank-in before calling out. I don't know that I've ever seen a Famicom disassembly do this, but I also can't say I've gone looking heavily.
Another challenge has been the language, and not just because it's Japanese and not my native language. Old Japanese computing is largely in Shift JIS, whereas we now have UTF-8. Dragon Quest uses neither: its string format is instead indices into a table of kana character tiles, in the order they sit in memory. What this means is that to manage strings effectively (and I haven't done this yet) one needs a two-way encoder that can convert between UTF-8 Japanese and the particular positions of the game's various characters.
This says nothing of the difficulty of then translating the game. Most Japanese games of this era used kana exclusively, for two reasons. First, these are games for children, so with too much complicated Kanji you've locked out your target audience. Second is space: it would be impossible to fit all the necessary Kanji in memory, even as 8x8 pixel tiles (the main graphic primitive on these chips). Plus, even if you could, an 8x8 pixel tile is hardly enough resolution to read many Kanji, so they'd likely need 16x16, quadrupling the space requirement unless you recycled quadrant radicals.
In any case, all of the strings are going to translate to strictly hiragana or katakana strings. These are Japanese, but not as easily intelligible, since groupings of kana often have to be interpreted by context clues rather than having an exact definition. The good news is that, again, these are games for children, so the vocabulary shouldn't be too complicated.
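The string encoder described above could look something like the following. This is only a minimal sketch: TILE_ORDER is a made-up table for illustration, not the game's actual tile layout, and the names encode and decode are hypothetical. The idea is just that a character's position in the tile table is its byte value in a string.

```python
# Hypothetical tile order: a character's position in this string is the
# tile index stored in ROM strings. The real Dragon Quest table differs.
TILE_ORDER = "0123456789あいうえおかきくけこ"

# Reverse lookup built once: character -> tile index.
TILE_INDEX = {ch: i for i, ch in enumerate(TILE_ORDER)}


def encode(text: str) -> bytes:
    """Convert Unicode text to tile-index bytes.

    Raises KeyError if a character has no tile in the table.
    """
    return bytes(TILE_INDEX[ch] for ch in text)


def decode(data: bytes) -> str:
    """Convert tile-index bytes back to Unicode text.

    Raises IndexError if a byte is beyond the end of the table.
    """
    return "".join(TILE_ORDER[b] for b in data)


if __name__ == "__main__":
    raw = encode("あおい")
    print(raw.hex(), decode(raw))
```

With the real table filled in, the same two functions would let strings live in the source as readable UTF-8 and be converted at build time.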
Endianness occasionally presents some problems, one of which I suspect exists because the chip designers didn't talk to each other. The PPU exposes a register to which you write two bytes, the high byte and then the low byte of the address where the next value written to the data register will be placed. Problem is, this is a 6502-driven system, so when grabbing a word from memory the high byte is the second one. This means that every word has to be sent to the PPU in reverse when selecting an address. To my knowledge this PPU draws on some arcade hardware but was otherwise designed strictly for this console, so why they didn't present an address register in the same endianness I'll never know. It tripped me up early on but I worked past it.
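To make the byte-order mismatch concrete, here is a small sketch in Python rather than 6502 assembly. The function names are hypothetical. It shows the same 16-bit address as the CPU stores it in memory (low byte first) and as the PPU address register (PPUADDR, at $2006 in CPU space) expects to receive it (high byte first).

```python
import struct


def cpu_word(addr: int) -> bytes:
    """The address as a 6502 stores a word in memory: low byte, then high."""
    return struct.pack("<H", addr)


def ppu_addr_writes(addr: int) -> bytes:
    """The two bytes in the order they must be written to the PPU
    address register: high byte, then low."""
    return struct.pack(">H", addr)


if __name__ == "__main__":
    addr = 0x2000  # start of the first nametable in PPU address space
    print("in CPU memory:", cpu_word(addr).hex())
    print("sent to PPU:  ", ppu_addr_writes(addr).hex())
```

In other words, a routine that reads an address word from a table has to send the second byte it fetched before the first.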
Finally, back to the mappers. Since a "binary" was really a gaggle of potentially overlapping ROM banks, there wasn't really a single ROM image format used by Nintendo back in the day. Rather, you sent along your banks, layout, and what mapper hardware you needed, and that last step was a function of cartridge fab, not some software step. The iNES format is an accommodation for that: a 16-byte header that starts with the string "NES" as a magic number and also contains a few bytes describing the physical cartridge, namely what mapper hardware, how many banks of what kind, and particular jumpers that influence things like nametable mirroring.
Countless disassemblies out there are built under the (incorrect) assumption that the iNES binary format is how the binary objects are supposed to build. This is simply not the case, and forcing this "false structure" actually makes analysis of the layout and organization of the original project more difficult. What I opted for instead is using the above-described mechanism of linker scripts defining individual binaries, in tandem with two nice old tools: printf(1) and cat(1). When all my individual ROM banks are built, I simply use printf to spit out the 16 bytes of an iNES header that match my memory particulars and then cat it all together, as the iNES format is simply that header followed by all of the PRG and then CHR ROM banks, in that order. The end result is a build that is agnostic (rightfully so) of the fact that it is becoming an iNES ROM, a format the community invented years after the Famicom was in active use.
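The header-then-banks assembly step can be sketched as follows. This is an illustration in Python, not the project's actual build step (which uses printf and cat), and the function names and example values are hypothetical rather than Dragon Quest's real configuration. It covers only the basic iNES fields: magic, PRG bank count in 16 KiB units, CHR bank count in 8 KiB units, and the two flag bytes carrying the mapper number and mirroring bit.

```python
def ines_header(prg_banks: int, chr_banks: int, mapper: int = 0,
                vertical_mirroring: bool = False) -> bytes:
    """Build a basic 16-byte iNES header.

    prg_banks counts 16 KiB PRG ROM banks, chr_banks counts 8 KiB CHR ROM banks.
    """
    # Flags 6: low nibble of the mapper number in bits 4-7, mirroring in bit 0.
    flags6 = ((mapper & 0x0F) << 4) | (1 if vertical_mirroring else 0)
    # Flags 7: high nibble of the mapper number in bits 4-7.
    flags7 = mapper & 0xF0
    # Magic "NES" + 0x1A, four descriptor bytes, then eight zero bytes.
    return b"NES\x1a" + bytes([prg_banks, chr_banks, flags6, flags7]) + bytes(8)


def build_rom(header: bytes, prg: list, chr_: list) -> bytes:
    """Concatenate header, PRG banks, then CHR banks, like cat(1) would."""
    return header + b"".join(prg) + b"".join(chr_)


if __name__ == "__main__":
    # Dummy banks filled with 0xFF, standing in for separately linked binaries.
    prg = [bytes([0xFF]) * 16384] * 2
    chr_ = [bytes([0xFF]) * 8192] * 4
    rom = build_rom(ines_header(len(prg), len(chr_), mapper=3), prg, chr_)
    print(len(rom))
```

The point of keeping this as a final packaging step is that nothing upstream of it needs to know the header exists.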
Note that this is still an ongoing project. I don't consider it "done" but it is more usable than not now. Some things to look out for:
The relative positioning of graphic data in CHR banks, outside of what has been extracted into specific files, is not dynamic, meaning that moving a tileset around in CHR will garble the display of the associated object.
A few songs (credits, dungeons, title screen) do not have every pointer massaged out of them yet, so if their addresses change, those songs at the very least won't sound right, and in extreme circumstances playing them with a shift in pointers could crash the game.
Much of the binary data is still BLOB-ified, which is why not each and every pointer has been fixed. This will be slow moving, and there may be a few parts that wind up involving some scripting to keep them all dynamic (for instance, maps include block IDs and such which are ultimately an index into a table, but the maps make more sense to keep as binary data, so perhaps a patcher is needed to "filter" a map to assign final block IDs, hard to say.)
Also, while some data is very heavily coupled to the engine, such as music, and has therefore been disassembled into audio engine commands, other data, like graphic tiles, is more generic to the hardware platform and has no particular dependencies on the code. These items can exist as pure BLOBs without worry about pointers buried within or ID values that need to track indices in a particular table. These flat binary tiles and other comparable components are *not* included in this repository, as they are copyright of their respective owners. However, I have provided a script that allows you to extract these binary components should you have a clean copy of Dragon Quest for the Famicom as an iNES ROM. The script is all dd(1) calls against the iNES ROM itself, but there should be enough notes contained within if a bank-based extraction is necessary. YMMV using the script on anything other than the commercial release: it is only intended to bootstrap things and will not properly extract the assets from arbitrary builds.
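For anyone who needs to adapt the extraction to a different layout, the arithmetic behind a dd-style pull from an iNES image can be sketched as below. This is not the repository's script, only an illustration in Python with hypothetical function names: it locates the PRG and CHR regions from the header counts, and slices a fixed-size chunk at a given offset the way dd with skip and count would.

```python
HEADER_SIZE = 16
TRAINER_SIZE = 512
PRG_BANK = 16384  # iNES counts PRG ROM in 16 KiB units
CHR_BANK = 8192   # iNES counts CHR ROM in 8 KiB units


def split_ines(rom: bytes):
    """Return (prg, chr) byte strings located via the iNES header."""
    if rom[:4] != b"NES\x1a":
        raise ValueError("not an iNES image")
    prg_size = rom[4] * PRG_BANK
    chr_size = rom[5] * CHR_BANK
    # Bit 2 of flags 6 marks an optional 512-byte trainer before PRG ROM.
    start = HEADER_SIZE + (TRAINER_SIZE if rom[6] & 0x04 else 0)
    prg = rom[start:start + prg_size]
    chr_ = rom[start + prg_size:start + prg_size + chr_size]
    if len(prg) != prg_size or len(chr_) != chr_size:
        raise ValueError("image is shorter than its header claims")
    return prg, chr_


def extract(data: bytes, offset: int, length: int) -> bytes:
    """Slice length bytes at offset, roughly dd bs=1 skip=offset count=length."""
    chunk = data[offset:offset + length]
    if len(chunk) != length:
        raise ValueError("requested range runs past the end of the data")
    return chunk
```

Offsets taken relative to the start of the CHR region, rather than the start of the file, are what a bank-based extraction would use.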
If anyone has any questions or comments, feel free to contact me off list, or on list if there is worthwhile group discussion to be had. Either way, enjoy!
- Matt G.
Existing NA release (Dragon Warrior) disassembly: https://github.com/nmikstas/dragon-warrior-disassembly