After writing an experimental virtual machine I named Dynamic Virtual Machine (Abbreviated DVM), and learning a lot about virtual machine development along the way, It was time to write a virtual machine that countered the design decisions made in DVM to learn about the paths not taken. This meant a register machine instead of a stack machine, an integer based instruction set rather than a complex object based instruction set, fully byte-addressable memory rather than object-addressable memory, and no object tags to facilitate machine level dynamic type checking. This post, however, is not about describing the details of DVM in detail. Rather it is about the Muse Virtual Machine (Abbreviated MVM) ISA and its design. DVM may be brought up to compare and contrast to, however in such a case the feature being compared in DVM will be fully explained.
MVM is a virtual machine implementing the Muse ISA, a RISC load/store instruction set architecture. It is meant to be a target for compilers, while being designed to be run similar to the JVM, utilizing the underlying operating system-calls underneath with little overhead on a host operating system, or run as completely virtualized hardware with no access the underlying system. The goal of the Muse ISA was to allow for a virtual machine to be written easily in very few lines of code, and to keep instructions simple and easy to translate to native machine instructions.
The machine specified by Muse ISA is to have 32 addressable integer registers, 32 addressable floating point registers, a program counter, an instruction register, and a fixed-size stack for data/code to be addressed directly following the section of memory used for program data. Each instruction is fixed 32bits wide, and can be one of four formats.
A register machine model was chosen over a stack machine model for MVM, for a few opinionated reasons. The ideal machine to work with is a memory-to-memory machine, that is a machine where instructions directly address and operate on memory, rather than having some intermediate location such as registers or a stack. The problem with this kind of model is instruction size, as each address operand in an instruction will require 64 bits, so a fixed-width length instruction set would waste memory. This means some sort of variable sized instruction set must be used, or multiple instructions would be required to be able to address all addresses.
Since one of the goals of Muse ISA was simplicity, both in encoding, decoding, and execution, this means that a variable sized instruction set was out of the question. So now the option left is to consider using multiple smaller instructions to be able to address all memory addresses. Well, in order to use multiple instructions to address large memory ranges, the previous instruction's immediate value would need to be stored somewhere, perhaps in a memory cell that can be addressed by a small immediate value. We could call it a register... and that is why Muse ISA is designed as a register machine. It is the closest architecture to a memory-to-memory machine, and the instruction size is kept to a reasonable fixed-width of 32 bits.
Well what about a stack machine? Surely that could be simpler and have even fewer bits in the instruction width. Indeed, a stack machine can. DVM, the virtual machine I made before MVM, has an instruction size of 16 bits in its stack-based architecture. However, one problem with a 16 bit instruction width is that there is no room to store immediate values over 16 bits, especially if there are opcodes in those instructions. So there is no reason to use a stack machine because of instruction size, but there is a more definitive reason why a stack machine model was not used for Musa ISA. Since Muse ISA is about simplicity, and stack machine instructions become more complex as they must manage a stack pointer, a stack machine model would not have done the job.
In MVM, memory is byte-addressable. When executing a program the program counter starts at address 0, and increments in values of 4 after decoding, but before execution of, each instruction. So while the program counter is in units of bytes, it is always divisible by 4 bytes, or 32 bits. No instruction can change this alignment, guaranteeing program counter alignment with instructions at all times. As a side-note, this means misaligned instructions cannot be executed by MVM.
The rationale behind keeping this alignment is that doing so will never cause misalignment performance penalties, and prevent instructions from being split across two memory cells. Keeping 32-bit alignment from address 0 also extends the minimum and maximum of the range of addresses that a jump instruction can reach. Jumps are the only way to modify the program counter, are relative to the program counter, and are always done in units of 4 bytes. This maintains the property of an aligned program counter, and extends changes the max/min value a JMP can use from +-33,553,332, to +-134,217,728, or 25 bits to 27 bits for format 1, and 15 to 17 bits for format 2.
For the registers, there are a few registers reserved for important tasks. These are namely the stack pointer sp, and the memory-mapped I/O pointer io among others. These pointers are set to come directly after the program data. The stack pointer points a distance away from the end of the program data, where that distance is however much user of the virtual machine specifies (by default it is 1MB). This leaves 1MB for the stack to grow downwards into. Above that lies the I/O pointer, which has a similar configuration. The distance the I/O pointer lies above the stack is dependent on the number and kind of I/O devices supported by the system. When running MVM as a slim virtual machine, relying on the underlying OS syscalls, the I/O pointer is set to the same address as the stack pointer, and shouldn't be used. However, when running the VM with the intent of virtualizing hardware and running a guest OS, the I/O pointer will first point to data that describes the devices connected to the VM. The first of these devices will always be a 4-byte-mapped terminal device for the most basic I/O.
MVM uses a fairly standard-looking assembler syntax. Each line can contain up to four grammatical objects. A Label, which is optional, can be followed by an Operation, which is followed by at least one operand but potentially many. Finally the line is ended with an optional comment. Omitting everything except for the comment is also acceptable:
[ LABEL : ] SYMBOL OPERAND [ , OPERAND ]* [ # STRING ]"
There are no required sigils to signal certain literals or symbols, however all registers are prefixed with '%' as a visual convention. For example, to refer to register ra, one would write it as "%ra" in MVM assembler language file. Other symbols, such as opcodes, directives, and labels, do not have any prefix convention, nor require a prefix from the syntax.
L32 %t0, x loop: SUBI %t0, %t0, 1 JEZ %t0, exit JMP loop exit: HALT x: i32 1000000
LI %a0, 1 # Load the stdout file descriptor LI %a1, 12 # Load the number of characters to be LI %a2, x # Load the address of the string to be transmitted SYSCALL 1 exit: HALT x: str "hello world\n"
The rest of this article consists of some reference tables for the Muse ISA. This article serves as an introduction to Muse ISA, not as the full documentation. Although this is an incomplete description of the Muse ISA, as the design has been not been finalized, the most stable design decisions of the machine have been described. The remaining work that needs to be done is decide on floating point comparison instructions, figure out what function the registers between a7 ra should perform, the way in which global data is stored, and finally the method that the I/O section of memory specifies where each device is located to the programmer.
|Register Index||Name||Description||Volatile between calls?|
|r0||zero||Hardwired to zero||No|
|r4||io||Memory mapped I/O pointer||No|
|r16-r23||a0-a7||Function Argument/Return Values||Yes|
|Format||Immediate||Register 3||Register 2||Register 1||Opcode|
|1||26 bits||0 bits||0 bits||0 bits||6 bits|
|2||21 bits||0 bits||0 bits||5 bits||6 bits|
|3||16 bits||0 bits||5 bits||5 bits||6 bits|
|4||11 bits||5 bits||5 bits||5 bits||6 bits|
|HALT||0x01||1||Halt machine execution|
|SYSCALL||0x02||1||Call the operating system routine specified with im|
|JMP||0x03||1||Add im to pc|
|JAL||0x04||1||Store pc in rx, then adds im to pc|
|JR||0x05||2||Set pc to the value of register r1|
|JRL||0x06||2||Store pc in rx, then set pc to value in r1|
|JEZ||0x07||2||If the value in r1 equals 0, add im to pc|
|JNZ||0x08||2||If the value in r1 does not equals 0, add im to pc|
|JLZ||0x09||2||If the value in r1 is less than 0, add im to pc|
|JGZ||0x0a||2||If the value in r1 is greater than 0, add im to pc|
|LUI||0x0b||2||Load im and store it in bits 17-42 of r1|
|L8||0x0c||3||Load the next 8 bits of the value in r2 + im into r1|
|L16||0x0d||3||Load the next 16 bits of the value in r2 + im into r1|
|L32||0x0e||3||Load the next 32 bits of the value in r2 + im into r1|
|L64||0x0f||3||Load the next 64 bits of the value in r2 + im into r1|
|S8||0x10||3||Store the lower 8 bits of the value in r1 into the memory addressed by r2 + im|
|S16||0x11||3||Store the lower 16 bits of the value in r1 into the memory addressed by r2 + im|
|S32||0x12||3||Store the lower 32 bits of the value in r1 into the memory addressed by r2 + im|
|S64||0x13||3||Store the lower 64 bits of the value in r1 into the memory addressed by r2 + im|
|NOT||0x14||3||Invert the bytes of the value in r2, then store the result in r1|
|ADDI||0x15||3||Add im to the value in register r2, then store the result in r1|
|SUBI||0x16||3||Subtract im to the value in r2, then store the result in r1|
|ANDI||0x17||3||Bitwise AND im to the value in r2, then store the result in r1|
|ORI||0x18||3||Bitwise OR im to the value in r2, then store the result in r1|
|XORI||0x19||3||Bitwise XOR im to the value in r2, then store the result in r1|
|ADD||0x1a||4||Add the value in r2 to the value in r3, then store the result in r1|
|ADDU||0x1b||4||Add the unsigned value in r2 to the value in r3, then store the result in r1|
|SUB||0x1c||4||Subtract the value in r3 from the value in r2, then store the result in r1|
|SUBU||0x1d||4||Subtract the unsigned value in r3 from the value in r2, then store the result in r1|
|MUL||0x1e||4||Multiply the value in r2 to the value in r3, then store the result in r1|
|MULU||0x1f||4||Multiply the unsigned value in r2 by the value in r3, then store the result in r1|
|DIV||0x20||4||Divide the value in r2 by the value in r3, then store the result in r1|
|DIVU||0x21||4||Divide the unsigned value in r2 to the value in r3, then store the result in r1|
|MOD||0x22||4||Modulo the value in register r2 by the value in r3, then store the result in r1|
|SLL||0x23||4||Shift left logically the value in r2 by the value in r3, then store the result in r1|
|SRL||0x24||4||Shift right logically the value in r2 by the value in r3, then store the result in r1|
|SLA||0x25||4||Shift left arithmetically the value in r2 by the value in r3, then store the result in r1|
|SRA||0x26||4||Shift right arithmetically the value in r2 by the value in r3, then store the result in r1|
|AND||0x27||4||Bitwise AND the value in r2 by the value in r3, then store the result in r1|
|OR||0x28||4||Bitwise OR the value in r2 by the value in r3, then store the result in r1|
|XOR||0x29||4||Bitwise XOR the value in r2 by the value in r3, then store the result in r1|
|LCF64||0x2a||3||Load the next 64 bits of the value addressed by integer register r2 + im into floating point register r1|
|SCF64||0x2b||3||Store the lower 64 bits of floating point register r1 into the location addressed by integer register r2 + im|
|ADDF||0x2c||4||Add the value of floating point register r2 to the value of floating point register r3, then store the result in floating point register r1|
|SUBF||0x2d||4||Subtract the value of floating point register r2 to the value of floating point register r3, then store the result in floating point register r1|
|MULF||0x2e||4||Multiply the value of floating point register r2 to the value of floating point register r3, then store the result in floating point register r1|
|DIVF||0x2f||4||Divide the value of floating point register r2 to the value of floating point register r3, then store the result in floating point register r1|
|li||r1, im||Load a signed immediate value up to 64 bits|
|zero||n||Fill the next n bytes aligned to 32bits with the value 0|
|i32||x||Store the integer x into the next 32 bits|
|i64||x||Store the integer x into the next 64 bits|
|f32||x||Store the float x into the next 32 bits|
|f64||x||Store the float x into the next 64 bits|
|str||s||Stores the string s into the next n bytes, where n is the length of s aligned to 32bits|