Operands

An operand is some data that an instruction operates on. As such, they are similar to parameters to functions in a higher level language. Each operand has a size associated with it. The size of an operand affects the selection of instructions issued to the CPU. In most cases, the Intermediate Language will verify that all operands to the same instruction have the same size. The Intermediate Language supports four sizes of operands in general: bytes (1 byte), integers (4 bytes), longs (8 bytes), and pointers (4 or 8 bytes, depending on the platform). Note that the pointer size is considered to be a type distinct from both integers and words, since the size of a pointer may vary across platforms.

In the string representations, operands are always prefixed by their size. The following prefixes are used:

Operands are represented by the core.asm.Operand type. It can store the the types of operands described in the remainder of this page.

In the remainder of this page, we will use the following conventions:

Constants (Literals)

A constant is a literal value that is used by an instruction. There are two types of constants. The simplest type has the same value for all platforms. These are created by functions named <type>Const. For example, intConst(5) creates the integer constant i5. Additionally, ptrConst(Nat) can be used to create a pointer-sized constant from a Nat. To dynamically select the size of the constant, the xConst function can be used.

The second type is called a dual constant. It is a constant that has two separate values, one that is used on 32-bit systems and another that is used on 64-bit systems. This is useful to store Size and Offsets that may be different on different platforms. For this, the overloads of intConst, natConst, and ptrConst may be used.

It is also possible to store pointers to objects as constants in the code. Such constants are created with the objPtr function.

References

A lightweight reference can be implicitly converted into an operand. Since references are pointer sized by default, the operands are treated as a pointer-sized constant.

References are represented as @<title>, where <title> is what the title member of the associated RefSource returns.

Registers

The Intermediate Language provides three registers that are available for general purpose usage. These are called a, b, and c. By convention, the register a is used to store the return value from functions. Since the calling convention is typically handled automatically by the Intermediate Representation, the only implication of this fact is that the system is able to generate slightly more efficient code if the return value is already located in the a register.

As is the case in many assembler dialects, each register has multiple names so that it is possible to refer to different sizes of the same register easily. The Intermediate Language uses a naming convention similar to the Intel CPU:s. Byte-sized registers are named <r>l (the low byte of register <r>), integer sized registers are named e<r>x, and long sized registers are named r<r>x. Pointer sized registers are named ptr<R>. The names are summarized below:

Size:ByteIntegerLongPointer
Register a
al
eax
rax
ptrA
Register b
bl
ebx
rbx
ptrB
Register c
cl
ecx
rcx
ptrC

The above-mentioned names are available as constants in Basic Storm.

Note: it is generally not possible to make assumptions about the contents of the parts of a register that is not written to. For example, if an instruction writes to al, the lowest byte of register a, then it is not possible to assume anything about the first three bytes of register eax, for example. Different architectures behave differently. For example, on 32-bit X86 systems, the old contents of the top 3 bytes remains unchanged, while they are zeroed on ARM. As such, use a suitable cast when converting to a wider type. Casts in the other direction is, however, well-defined. That is, writing to eax and then reading from al will yield the low byte of the value stored to eax.

There are also two special-purpose registers available in the Intermediate Representation: ptrStack and ptrFrame. The system uses them to refer to the current top of the stack (ptrStack), and the start of the current stack frame (ptrFrame). These are used internally to access local variables. While it is possible to modify them, it is not a good idea as it will make variable access and exception handling to break.

The Intermediate Language does support many more registers in order to successively lower code from platform independent code into platform dependent code. Using registers other than the ones mentioned above will, however, produce code that is not platform independent.

Registers are represented by their name. Registers are the only operands that are not prefixed by b, i, l, or p as is the case for all other operands.

Variables

Variables can be used as an operand. The Operand type contains a type-cast constructor that automatically converts variables into corresponding operands. Manual type-conversions are thus typically not necessary.

In some cases, it is useful to refer to a particular part inside a variable. This can be achieved using any of the <size>Rel overloads. This overload creates an operand that refers to a chunk of the variable that is <size> bytes large, and starts at the supplied offset. There is also an overload xRel for supplying the size as a parameter. It is worth noting that the supplied offset needs to be aligned to the natural alignment of the underlying type to ensure that it can be accessed properly on all platforms (e.g. ARM does not always support unaligned memory access).

Variables are represented as [Var<id>], or [Var<id>+<offset>] in the printed representation.

Blocks

As with variables, blocks can be implicitly converted into an operand. Blocks are only used as operands to the instructions that deal with blocks. It is not possible to otherwise access or manipulate blocks in the generated code, as the concept of a block only exists at compile-time.

Blocks are represented as [Block<id>] in the printed representation.

Labels

As with variables and blocks, labels can also be implicitly converted to an operand. Labels are typically only used with jump instructions to direct program flow. The Operand does, however, support some more advanced features, such as offsets from labels, to simplify the implementation of code generation backends. These features are, however, not universally supported.

Labels are represented as [Label<id>] in the printed representation.

Pointers

Operands may also refer to memory by using the value in a pointer-sized register as an address. The address may also optionally be offset by a constant offset. This type of operands are created by the <size>Rel overloads that accept a register as their first parameter. For example, accessing the 4 bytes that ptrA points to can be done by the operand intRel(ptrA). The next four bytes can be accessed by intRel(ptrA, Offset(sInt)). There is also a function xRel that accepts a Size instance as its first parameter that allows selecting the size dynamically.

As with offsets into variables, it is important to ensure that the final address is properly aligned. Otherwise, some systems will be unable to perform the memory access.

Pointers are represented as [<reg>+<offset>] in the output, where <reg> is the name of the register, and <offset> is the offset from the pointer.

Condition Flags

Finally, an operand can contain a condition flag (core.asm.CondFlag). The condition flag is only used as an operand to conditional operations to determine what to check for. The CondFlag enum contains the following comparisions:

Generic comparisons

Unsigned comparisons

These comparisons treat the compared numbers as unsigned integers.

Signed comparisons

These comparisons treat the compared numbers as signed integers.

Floating point comparisons

Source Position

It is also possible to store a SrcPos object inside an operand. This is used to emit the location pseudo-operation to allow including metadata for connecting the generated machine code to the corresponding source code.