⌈LUNAR GARDEN⌋

toddgaunt@protonmail.ch

Muse ISA 16-bit Instruction Format

2018-09-23

The redesign of the Muse ISA, the underlying architecture used by MVM, from a fixed-width 32 bit format to a 16-bit fixed width format involved some tricky decisions in order to fit as much of the features of the 32-bit width ISA into the 16-bit width ISA. The new 16 bit format is still capable of addressing 64 bits of memory at a time, however this can be expanded up to 256 bytes for future register size expansion.

One of the reasons I decided to use a 32-bit width format was to allow for 21, 16, and 11 bit immediate integer values to be encoded in the unused bits of each instruction. For most use cases, a 16 bit immediate would serve most purposes for instructions such as ADD-IMMEDIATE, which often only ever worked with smaller values that could be encoded within 16 bits. However, this limited immediate size displayed problems when it came to loading addresses or, obviously, immediates that couldn't fit into 16 bits.

To address the shortcomings of a 21-bit or 16-bit immediate value, the instruction LOAD-UPPER-IMMEDIATE was borrowed from similar RISC architectures, which would load an immediate value into the upper bits of a register. Using a combination of LOAD-UPPER-IMMEDIATE and OR-IMMEDIATE, one could load up to a 32-bit constant in just two instructions. There was still a unpleasant imperfection with this, as loading a 64 bit address would require 5 instructions, two LOAD-UPPER-IMMEDIATES, two OR-IMMEDIATES, and one SHIFT-LEFT-LOGICAL. This would mean it would be better to steal memory from the global address space and just use a LOAD64 instruction to fetch a 64 bit constant from memory, but I would rather avoid doing that.

The following tables show the two instruction formats, using the same key. Each bit is represented by a single letter in the Bit Layout column: 'i' siginifies a bit used by an immediate, 'r' signifies a bit used by a register, 'o' signifies a bit used by an opcode. A '.' seperates fields of different letters to distinguish two contiguous but distinct register fields. As an example, a format of iiii.rr.rr.oo uses a 2-bit opcode, has two 2-bit register fields, and at the end has a 4-bit immediate field.

Figure 1: 32-bit width Instruction format
Format Bit Layout Immediate Size
1 iiiiiiiiiiiiiiiiiiiiiiiiii.oooooo 26 bits
2 iiiiiiiiiiiiiiiiiiiii.rrrrr.oooooo 21 bits
3 iiiiiiiiiiiiiiii.rrrrr.rrrrr.oooooo 16 bits
4 iiiiiiiiiii.rrrrr.rrrrr.rrrrr.oooooo
Figure 2: 16-bit width Instruction format
Format Bit Layout Immediate Size
TYPE-0S iiiiiiiiii.oo.00 12 bits
TYPE-0J iii.rrrr.rrrr.ooo.01 3 bits
TYPE-0L iii.rrrr.rrrr.ooo.10 3 bits
TYPE-0A rrrr.rrr.rrr.oooo.10 0 bits

The new 16-bit format solves the problem of loading constants larger than 4 bytes quite well, and infact overcompensates by allowing up to 256 byte constants. To do this, Instructions that require constants (TYPE-0J and TYPE-0L) have a small 3-bit immediate field. Of course, a 3 bit field is not suficiently large enough to hold the actual constant value the operation needs. This 3-bit immediate field instead provides the instruction a way to encode how much immediate space is needed for the constant. These three bits encode how many bytes following the instruction should be loaded, used as a constant, and then skipped over by incrementing the pc by the instruction. The number of bytes to be loaded is computed by shifting the value 1 left by whatever the value of the 3-bit immediate value is. For example if the left-most bits of the ADD-IMMEDIATE instruction (TYPE-0L) are set to the value 3, then the next 1 << 3, or 8 bytes are loaded and added to the second register argument, and the result is stored in the first register argument. All instructions that use these 3 left-most bits to load following bytes as constant values also automatically increment the program counter to skip those bytes to the next instruction.

Downsides of the 16-bit format

While the 16-bit width format can produce much smaller code than the 32-bit format, and allows constants of up to 256 bytes to be encoded, it isn't without fault. Because of the need to first read the right-most two bit prefix to known which format an instruction is before being able to read the actual opcode, decoding an instruction has become more complex and slightly slower than the 32-bit format. Another compromise, which isn't really that detrimental but worth mentioning, is that the number of registers has been reduced from 32 to 16 appropriately, since the maximum register index that can be addressed by the 16-bit format is the 16th register. Its worth noting that while the TYPE-0A format cannot adress all 16 register with the first two register fields, the convention of the new design puts all 7 registers designated for temporary values within the first 8 registers, the first register being hardwired to the value 0. Since the third register argument of TYPE-0A can still address all 16 registers, it is possible to do arithmetic on all 16 integers still using MOV to shuffle values in and out of the the first 8 registers that TYPE-0A can save results to.

The new instruction set is described in fig3, and can be compared to the one in my previous article.

Figure 3: The new 16-bit Instruction Listing
Opcode Format HEX Binary Description
NOP TYPE-0S 0x00 0000 Do nothing
SYSCALL TYPE-0S 0x04 0100 Call the OS routine specified with im
MODE TYPE-0S 0x08 1000 Set the mode to im (0 halts the vm)
JRL TYPE-0J 0x01 00001 r1 := pc, pc := r2
JAL TYPE-0J 0x05 00101 r1 := pc, pc := pc + mem[pc, pc + im]
JEQ TYPE-0J 0x09 01001 if r1 == r2, pc := pc + mem[pc, pc + im]
JNE TYPE-0J 0x0D 01101 if r1 != r2, pc := pc + mem[pc, pc + im]
JLT TYPE-0J 0x11 10001 if r1 < r2, pc := pc + mem[pc, pc + im]
JGT TYPE-0J 0x15 10101 if r1 > r2, pc := pc + mem[pc, pc + im]
JLE TYPE-0J 0x19 11001 if r1 <= r2, pc := pc + mem[pc, pc + im]
JGE TYPE-0J 0x1D 11101 if r1 >= r2, pc := pc + mem[pc, pc + im]
LDR TYPE-0L 0x02 00010 r1 := mem[r2 + mem[pc, pc + 4], r2 + mem[pc, pc + 4] + im]
STR TYPE-0L 0x06 00110 mem[r2 + mem[pc, pc + 4], r2 + mem[pc, pc + 4] + im] := r1
MOV TYPE-0L 0x0A 01010 Work in Progress
--- TYPE-0L 0x0E 01110 Reserved Instruction
ADDI TYPE-0L 0x12 10010 r1 := r2 + mem[pc, pc + im]
ANDI TYPE-0L 0x16 10110 r1 := r2 & mem[pc, pc + im]
ORI TYPE-0L 0x1A 11010 r1 := r2 || mem[pc, pc + im]
XORI TYPE-0L 0x1E 11110 r1 := r2 ^ mem[pc, pc + im]
ADD TYPE-0A 0x03 000011 r1 := r2 + r3
SUB TYPE-0A 0x07 000111 r1 := r2 - r3
SLL TYPE-0A 0x0B 001011 r1 := r2 << r3
SRL TYPE-0A 0x0F 001111 r1 := r2 >> r3
NAND TYPE-0A 0x13 010011 r1 := !(r2 & r3)
AND TYPE-0A 0x17 010111 r1 := r2 & r3
OR TYPE-0A 0x1B 011011 r1 := r2 || r3
XOR TYPE-0A 0x1F 011111 r1 := r2 ^ r3
MUL TYPE-0A 0x23 100011 r1 := r2 * *r3
DIV TYPE-0A 0x27 100111 r1 := r2 / r3
MOD TYPE-0A 0x2B 101011 r1 := r2 % r3
ADDF TYPE-0A 0x2F 101111 r1 := r2 + r3
SUBF TYPE-0A 0x33 110011 r1 := r2 - r3
MULF TYPE-0A 0x37 110111 r1 := r2 * *r3
DIVF TYPE-0A 0x3B 111011 r1 := r2 / r3
CMPF TYPE-0A 0x3F 111111 r1 := cmpf(r2, r3)
Figure 4: The New 16-bit Standard Directives
Directive Format Transformation Description
LI LI %reg integer-literal ADDI %reg, %zero, im Load a signed immediate value up to 64 bits
LA LA %reg, label ADDI %reg, %zero, label Load the absolute address of a label into a register
ZERO ZERO n NOP * n + x % 2 Fill the next n bytes aligned to 2 bytes with zero
I8 I8 integer-literal integer-literal Fill the next byte with a signed integer
I16 I16 integer-literal integer-literal Fill the next 2 bytes with a signed integer
I32 I32 integer-literal integer-literal Fill the next 4 bytes with a signed integer
I64 I64 integer-literal integer-literal Fill the next 8 bytes with a signed integer
F32 F32 float-literal float-literal Fill the next 4 bytes with a IEEE 32-bit float
F64 F64 float-literal float-literal Fill the next 8 bytes with a IEEE 64-bit float
UTF-8 UTF-8 string-literal sl + (len(string-literal) % 2) Fill the next len(string-literal) bytes with str aligned to 2 bytes
-Todd Gaunt
• return •