Charles Childers


blog | parable | retro | ios apps | fountain pens


November 3, 2016

A Decompiler for Retro 12

Being able to examine compiled code can help in identifying obscure bugs and gaining a deeper understanding of how things work. For this a decompiler is rather useful.

A simple binary dump is pretty easy:

````
:dump (an-)
  [ fetch-next over putn putn chr:LF putc ] times drop ;
````

This can be useful, but the output doesn't help a lot. Consider an example:

5162 2049 
5163 4593 
5164 1 
5165 4 
5166 1 
5167 5173 
5168 7 
5169 2049 
5170 4634 
5171 2049 
5172 4352 
5173 10 

The left column is the offset, the right is the stored value.

It'd be much more useful to map the stored values to instruction names. This is complicated by the fact that Nga allows for packing up to four instructions per cell. To decompile effectively we need a way to unpack them.

````
{{
  :mask #255 and ;
  :next #8 shift ;
  :reorder (abcd-dcba)
    rot push rot push swap pop pop swap ;
---reveal---
  :unpack (n-dcba)
    dup mask swap next
    dup mask swap next
    dup mask swap next
    reorder ;
}}
````

With this I can then proceed to write a quick and dirty function that maps opcodes to a symbolic short name. As with the Naje assembler, I use two characters for each (this is sufficient to identify all of the Nga instructions).

The NOP instruction is represented by two periods (I do this for readability purposes). Unrecognized values are rendered as two question marks.

````
:name-instruction
   #0 [ '.. ] case
   #1 [ 'LI ] case
   #2 [ 'DU ] case
   #3 [ 'DR ] case
   #4 [ 'SW ] case
   #5 [ 'PU ] case
   #6 [ 'PO ] case
   #7 [ 'JU ] case
   #8 [ 'CA ] case
   #9 [ 'CC ] case
  #10 [ 'RE ] case
  #11 [ 'EQ ] case
  #12 [ 'NE ] case
  #13 [ 'LT ] case
  #14 [ 'GT ] case
  #15 [ 'FE ] case
  #16 [ 'ST ] case
  #17 [ 'AD ] case
  #18 [ 'SU ] case
  #19 [ 'MU ] case
  #20 [ 'DI ] case
  #21 [ 'AN ] case
  #22 [ 'OR ] case
  #23 [ 'XO ] case
  #24 [ 'SH ] case
  #25 [ 'ZR ] case
  #26 [ 'EN ] case
  drop '?? ;
````

And tying together:

````
:render-packed (n-)
  unpack #4 [ name-instruction puts ] times ;

:disassemble (an-)
  [ fetch-next
    over putn         (addres)
    dup render-packed (inst)
    chr:SPACE putc
    putn              (opcode)
    chr:LF putc
  ] times drop ;
````

This gives an output like:

5120 LICA.... 2049 
5121 ??AD.... 4593 
5122 LI...... 1 
5123 SW...... 4 
5124 LI...... 1 
5125 EQDI.... 5131 
5126 JU...... 7 
5127 LICA.... 2049 
5128 ENSU.... 4634 
5129 LICA.... 2049 
5130 ..AD.... 4352 
5131 RE...... 10 

Still somewhat cryptic, but it's enough to let me identify instructions and data.