⌈LUNAR GARDEN⌋

toddgaunt@protonmail.ch

Muse Virtual Machine ISA

2018-09-20

Introduction

After writing an experimental virtual machine I named Dynamic Virtual Machine (Abbreviated DVM), and learning a lot about virtual machine development along the way, It was time to write a virtual machine that countered the design decisions made in DVM to learn about the paths not taken. This meant a register machine instead of a stack machine, an integer based instruction set rather than a complex object based instruction set, fully byte-addressable memory rather than object-addressable memory, and no object tags to facilitate machine level dynamic type checking. This post, however, is not about describing the details of DVM in detail. Rather it is about the Muse Virtual Machine (Abbreviated MVM) ISA and its design. DVM may be brought up to compare and contrast to, however in such a case the feature being compared in DVM will be fully explained.

MVM is a virtual machine implementing the Muse ISA, a RISC load/store instruction set architecture. It is meant to be a target for compilers, while being designed to be run similar to the JVM, utilizing the underlying operating system-calls underneath with little overhead on a host operating system, or run as completely virtualized hardware with no access the underlying system. The goal of the Muse ISA was to allow for a virtual machine to be written easily in very few lines of code, and to keep instructions simple and easy to translate to native machine instructions.

Muse ISA

The machine specified by Muse ISA is to have 32 addressable integer registers, 32 addressable floating point registers, a program counter, an instruction register, and a fixed-size stack for data/code to be addressed directly following the section of memory used for program data. Each instruction is fixed 32bits wide, and can be one of four formats.

A register machine model was chosen over a stack machine model for MVM, for a few opinionated reasons. The ideal machine to work with is a memory-to-memory machine, that is a machine where instructions directly address and operate on memory, rather than having some intermediate location such as registers or a stack. The problem with this kind of model is instruction size, as each address operand in an instruction will require 64 bits, so a fixed-width length instruction set would waste memory. This means some sort of variable sized instruction set must be used, or multiple instructions would be required to be able to address all addresses.

Since one of the goals of Muse ISA was simplicity, both in encoding, decoding, and execution, this means that a variable sized instruction set was out of the question. So now the option left is to consider using multiple smaller instructions to be able to address all memory addresses. Well, in order to use multiple instructions to address large memory ranges, the previous instruction's immediate value would need to be stored somewhere, perhaps in a memory cell that can be addressed by a small immediate value. We could call it a register... and that is why Muse ISA is designed as a register machine. It is the closest architecture to a memory-to-memory machine, and the instruction size is kept to a reasonable fixed-width of 32 bits.

Well what about a stack machine? Surely that could be simpler and have even fewer bits in the instruction width. Indeed, a stack machine can. DVM, the virtual machine I made before MVM, has an instruction size of 16 bits in its stack-based architecture. However, one problem with a 16 bit instruction width is that there is no room to store immediate values over 16 bits, especially if there are opcodes in those instructions. So there is no reason to use a stack machine because of instruction size, but there is a more definitive reason why a stack machine model was not used for Musa ISA. Since Muse ISA is about simplicity, and stack machine instructions become more complex as they must manage a stack pointer, a stack machine model would not have done the job.

In MVM, memory is byte-addressable. When executing a program the program counter starts at address 0, and increments in values of 4 after decoding, but before execution of, each instruction. So while the program counter is in units of bytes, it is always divisible by 4 bytes, or 32 bits. No instruction can change this alignment, guaranteeing program counter alignment with instructions at all times. As a side-note, this means misaligned instructions cannot be executed by MVM.

The rationale behind keeping this alignment is that doing so will never cause misalignment performance penalties, and prevent instructions from being split across two memory cells. Keeping 32-bit alignment from address 0 also extends the minimum and maximum of the range of addresses that a jump instruction can reach. Jumps are the only way to modify the program counter, are relative to the program counter, and are always done in units of 4 bytes. This maintains the property of an aligned program counter, and extends changes the max/min value a JMP can use from +-33,553,332, to +-134,217,728, or 25 bits to 27 bits for format 1, and 15 to 17 bits for format 2.

For the registers, there are a few registers reserved for important tasks. These are namely the stack pointer sp, and the memory-mapped I/O pointer io among others. These pointers are set to come directly after the program data. The stack pointer points a distance away from the end of the program data, where that distance is however much user of the virtual machine specifies (by default it is 1MB). This leaves 1MB for the stack to grow downwards into. Above that lies the I/O pointer, which has a similar configuration. The distance the I/O pointer lies above the stack is dependent on the number and kind of I/O devices supported by the system. When running MVM as a slim virtual machine, relying on the underlying OS syscalls, the I/O pointer is set to the same address as the stack pointer, and shouldn't be used. However, when running the VM with the intent of virtualizing hardware and running a guest OS, the I/O pointer will first point to data that describes the devices connected to the VM. The first of these devices will always be a 4-byte-mapped terminal device for the most basic I/O.

Syntax and Some Programs

MVM uses a fairly standard-looking assembler syntax. Each line can contain up to four grammatical objects. A Label, which is optional, can be followed by an Operation, which is followed by at least one operand but potentially many. Finally the line is ended with an optional comment. Omitting everything except for the comment is also acceptable:

[ LABEL : ] SYMBOL OPERAND [ , OPERAND ]* [ # STRING ]"

There are no required sigils to signal certain literals or symbols, however all registers are prefixed with '%' as a visual convention. For example, to refer to register ra, one would write it as "%ra" in MVM assembler language file. Other symbols, such as opcodes, directives, and labels, do not have any prefix convention, nor require a prefix from the syntax.

A program that loops 1,000,000 times

                L32  %t0, x
        loop:   SUBI %t0, %t0, 1
                JEZ  %t0, exit
                JMP loop
        exit:   HALT
        x: i32 1000000

A program that uses the operating system to print "hello world"

                LI %a0, 1         # Load the stdout file descriptor
                LI %a1, 12        # Load the number of characters to be
                LI %a2, x         # Load the address of the string to be transmitted
                SYSCALL 1
        exit:   HALT
        x: str "hello world\n"

The rest of this article consists of some reference tables for the Muse ISA. This article serves as an introduction to Muse ISA, not as the full documentation. Although this is an incomplete description of the Muse ISA, as the design has been not been finalized, the most stable design decisions of the machine have been described. The remaining work that needs to be done is decide on floating point comparison instructions, figure out what function the registers between a7 ra should perform, the way in which global data is stored, and finally the method that the I/O section of memory specifies where each device is located to the programmer.

Muse ISA Reference Tables

Registers and Calling Conventions:
Register Index Name Description Volatile between calls?
r0 zero Hardwired to zero No
r1 sp Stack Pointer No
r2 gp Global Pointer No
r3 tp Thread Pointer No
r4 io Memory mapped I/O pointer No
r5 fp Frame Pointer No
r6-r15 t0-t9 Temporaries Yes
r16-r23 a0-a7 Function Argument/Return Values Yes
r31 ra Return Address No
Instruction Formats:
Format Immediate Register 3 Register 2 Register 1 Opcode
1 26 bits 0 bits 0 bits 0 bits 6 bits
2 21 bits 0 bits 0 bits 5 bits 6 bits
3 16 bits 0 bits 5 bits 5 bits 6 bits
4 11 bits 5 bits 5 bits 5 bits 6 bits
Standard Instructions:
Opcode Encoding Format Description
NOP 0x00 1 Do nothing
HALT 0x01 1 Halt machine execution
SYSCALL 0x02 1 Call the operating system routine specified with im
JMP 0x03 1 Add im to pc
JAL 0x04 1 Store pc in rx, then adds im to pc
JR 0x05 2 Set pc to the value of register r1
JRL 0x06 2 Store pc in rx, then set pc to value in r1
JEZ 0x07 2 If the value in r1 equals 0, add im to pc
JNZ 0x08 2 If the value in r1 does not equals 0, add im to pc
JLZ 0x09 2 If the value in r1 is less than 0, add im to pc
JGZ 0x0a 2 If the value in r1 is greater than 0, add im to pc
LUI 0x0b 2 Load im and store it in bits 17-42 of r1
L8 0x0c 3 Load the next 8 bits of the value in r2 + im into r1
L16 0x0d 3 Load the next 16 bits of the value in r2 + im into r1
L32 0x0e 3 Load the next 32 bits of the value in r2 + im into r1
L64 0x0f 3 Load the next 64 bits of the value in r2 + im into r1
S8 0x10 3 Store the lower 8 bits of the value in r1 into the memory addressed by r2 + im
S16 0x11 3 Store the lower 16 bits of the value in r1 into the memory addressed by r2 + im
S32 0x12 3 Store the lower 32 bits of the value in r1 into the memory addressed by r2 + im
S64 0x13 3 Store the lower 64 bits of the value in r1 into the memory addressed by r2 + im
NOT 0x14 3 Invert the bytes of the value in r2, then store the result in r1
ADDI 0x15 3 Add im to the value in register r2, then store the result in r1
SUBI 0x16 3 Subtract im to the value in r2, then store the result in r1
ANDI 0x17 3 Bitwise AND im to the value in r2, then store the result in r1
ORI 0x18 3 Bitwise OR im to the value in r2, then store the result in r1
XORI 0x19 3 Bitwise XOR im to the value in r2, then store the result in r1
ADD 0x1a 4 Add the value in r2 to the value in r3, then store the result in r1
ADDU 0x1b 4 Add the unsigned value in r2 to the value in r3, then store the result in r1
SUB 0x1c 4 Subtract the value in r3 from the value in r2, then store the result in r1
SUBU 0x1d 4 Subtract the unsigned value in r3 from the value in r2, then store the result in r1
MUL 0x1e 4 Multiply the value in r2 to the value in r3, then store the result in r1
MULU 0x1f 4 Multiply the unsigned value in r2 by the value in r3, then store the result in r1
DIV 0x20 4 Divide the value in r2 by the value in r3, then store the result in r1
DIVU 0x21 4 Divide the unsigned value in r2 to the value in r3, then store the result in r1
MOD 0x22 4 Modulo the value in register r2 by the value in r3, then store the result in r1
SLL 0x23 4 Shift left logically the value in r2 by the value in r3, then store the result in r1
SRL 0x24 4 Shift right logically the value in r2 by the value in r3, then store the result in r1
SLA 0x25 4 Shift left arithmetically the value in r2 by the value in r3, then store the result in r1
SRA 0x26 4 Shift right arithmetically the value in r2 by the value in r3, then store the result in r1
AND 0x27 4 Bitwise AND the value in r2 by the value in r3, then store the result in r1
OR 0x28 4 Bitwise OR the value in r2 by the value in r3, then store the result in r1
XOR 0x29 4 Bitwise XOR the value in r2 by the value in r3, then store the result in r1
Floating Point Unit Instructions:
Opcode Encoding Format Description
LCF64 0x2a 3 Load the next 64 bits of the value addressed by integer register r2 + im into floating point register r1
SCF64 0x2b 3 Store the lower 64 bits of floating point register r1 into the location addressed by integer register r2 + im
ADDF 0x2c 4 Add the value of floating point register r2 to the value of floating point register r3, then store the result in floating point register r1
SUBF 0x2d 4 Subtract the value of floating point register r2 to the value of floating point register r3, then store the result in floating point register r1
MULF 0x2e 4 Multiply the value of floating point register r2 to the value of floating point register r3, then store the result in floating point register r1
DIVF 0x2f 4 Divide the value of floating point register r2 to the value of floating point register r3, then store the result in floating point register r1
Standard Assembler Directives:
Directive Operands Description
li r1, im Load a signed immediate value up to 64 bits
zero n Fill the next n bytes aligned to 32bits with the value 0
i32 x Store the integer x into the next 32 bits
i64 x Store the integer x into the next 64 bits
f32 x Store the float x into the next 32 bits
f64 x Store the float x into the next 64 bits
str s Stores the string s into the next n bytes, where n is the length of s aligned to 32bits
-Todd Gaunt
• return •