Chapter 5

The Text Section

This chapter describes Sassy’s instruction syntax and its facilities for controlling the flow of computation. If you only want to write traditional-looking assembly programs with Sassy, consisting of just labels and instructions, then Sassy can handle that just fine.

(text <text-top-level> ...)

text-top-level = (label <label>)                             ; "empty" label definition
               | (label <label> <text-item> ...)             ; label definition
               | (locals (<label-name> ...) <text-item> ...) ; locals declaration
               | <text-item>                                 ; anonymous text
               | (align <amount>)
               | (text <text-item> ...)                      ; splice
               | (data <data-item> ...)                      ; one time section switch
               | (heap <heap-item> ...)                      ; ditto

text-item      = (label <label>)
               | (label <label> <text-item> ...)
               | (locals (<label-name> ...) <text-item> ...)
               | <instruction>
               | <assertion>
               | <control-primitive>
               | (bytes  <any-data> ...)                    ; raw data in a text section
               | (words  <any-data> ...)
               | (dwords <any-data> ...)

Note: in the text section, the align special form may only appear at a text directive’s top level.

5.1 Instructions

Sassy’s instruction syntax is based on Intel’s, except that each instruction looks like a Scheme function call. Thus there are no commas. Sassy uses the same instruction and register names and recognizes the same operands as those listed in the Intel manuals. The order of the operands is also identical, and so, like the set! idiom of Scheme, the destination comes first, the source second.

intel: add eax, 4 
intel: mov cx, 3 
 
sassy: (add eax 4) 
sassy: (mov cx 3)

5.1.1 Immediates

Immediates are usually integers of the appropriate size. If an operand is a dword-sized operand and you write a float, sassy converts it to its little-endian IEEE-754 single-precision representation.

You may also write characters and strings. Sassy places the byte value of a character in the lowest-order byte and fills the rest with 0. A string can have no more characters than the operand has bytes; Sassy pads the remainder with 0.

(mov eax (dword #\a))   => (mov eax #x61)
(mov eax (dword "abc")) => (mov eax #x00636261)
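
The float and character rules can be checked outside Sassy. The following Python sketch is only a model of the conversions described above, not Sassy's own code; the helper names float_to_dword and char_to_immediate are hypothetical.

```python
import struct

def float_to_dword(x):
    """Return the 32-bit integer whose little-endian bytes are the
    IEEE-754 single-precision encoding of x."""
    return int.from_bytes(struct.pack("<f", x), "little")

def char_to_immediate(ch):
    """Put the character's byte value in the lowest-order byte.
    The remaining bytes of the operand stay 0."""
    code = ord(ch)
    if code > 0xFF:
        raise ValueError("character does not fit in one byte")
    return code

print(hex(float_to_dword(1.0)))        # 0x3f800000
print(struct.pack("<f", 1.0).hex())    # 0000803f (bytes in emitted order)
print(hex(char_to_immediate("a")))     # 0x61
```

The second print shows why the byte order matters: the value #x3f800000 is emitted lowest byte first.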

5.1.2 Addressing

Sassy currently only understands 32-bit addressing syntax, regardless of the current setting of the bits directive. In 16-bit mode, sassy emits an extra prefix byte (#x67) to signal to the processor that the following instruction is using 32-bit addressing syntax.

Write effective addresses using the following form:

(& <items> ...)

Supply at least one item. The items may appear in any order, subject to the limits listed below. The effective address is the implied sum of all the items.

Any number of integers (displacements)
Zero or one 32-bit general purpose registers (base)
Zero or one labels or custom relocations (displacements)
Zero or one indexes and scales, written as follows, where <scale> is 1, 2, 4, or 8:
```
(* <32-bit-reg> <scale>)
(* <scale> <32-bit-reg>)
```

Some examples:

(add ecx (& edx))
(mov edx (& (* 8 ecx)))
(add eax (& #x64))
(mov eax (& foo (* ebx 4) edx 1000))
(add eax (& -1 2 -3 4 ebx -5 6 -7 8))

Sassy understands these idioms as well:

(& edx ebx)   => (& edx (* ebx 1)) ; If two registers are supplied,
                                   ; the second is assembled as an index with
                                   ; a scale of 1
(& (* eax 2)) => (& eax (* eax 1)) ; If only an index with a scale of 2 is
                                   ; given, it is assembled as base + index*1,
                                   ; using the same register for both
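
The item rules and the two idioms can be summarized in a small model. This is a Python sketch under stated assumptions, not Sassy's implementation: the function name normalize_address is hypothetical, registers and labels are represented as strings, an index is represented as a ("*", a, b) tuple, and the scale-2 idiom is applied only when nothing else is present in the address.

```python
REGS32 = {"eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"}

def normalize_address(items):
    """Model of (& item ...): collect base, index, scale, label and the
    summed integer displacement."""
    ea = {"base": None, "index": None, "scale": None, "label": None, "disp": 0}
    for item in items:
        if isinstance(item, int):
            ea["disp"] += item                    # any number of integers
        elif isinstance(item, tuple):             # ("*", reg, scale) or ("*", scale, reg)
            _, a, b = item
            reg, scale = (a, b) if isinstance(a, str) else (b, a)
            if reg not in REGS32 or scale not in (1, 2, 4, 8):
                raise ValueError("bad index or scale")
            if ea["index"] is not None:
                raise ValueError("more than one index")
            ea["index"], ea["scale"] = reg, scale
        elif item in REGS32:
            if ea["base"] is None:
                ea["base"] = item
            elif ea["index"] is None:
                ea["index"], ea["scale"] = item, 1   # second register: index*1
            else:
                raise ValueError("too many registers")
        else:                                     # assumption: anything else is a label
            if ea["label"] is not None:
                raise ValueError("more than one label")
            ea["label"] = item
    # Idiom: a lone index with a scale of 2 becomes base + index*1.
    if (ea["scale"] == 2 and ea["base"] is None
            and ea["label"] is None and ea["disp"] == 0):
        ea["base"], ea["scale"] = ea["index"], 1
    return ea

print(normalize_address(["edx", "ebx"]))
print(normalize_address([("*", "eax", 2)]))
print(normalize_address([-1, 2, -3, 4, "ebx", -5, 6, -7, 8])["disp"])  # 4
```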

Finally, to tell Sassy to emit a segment override prefix for a particular memory operand, use one of the following syntaxes for the addressing operand. (To generate branch taken/branch not taken prefixes, which share their prefix bytes with cs and ds, see section 5.1.5.)

(cs (& edx))
(ds (& (* 8 ecx)))
(ss (& #x64))
(es (& edx))
(fs (& (* 8 ecx)))
(gs (& #x64))

Sassy also includes some macros that translate into the above:

(cs: edx)       => (cs (& edx))
(ds: (* 8 ecx)) => (ds (& (* 8 ecx)))
(ss: #x64)      => (ss (& #x64))
(es: edx)       => (es (& edx))
(fs: (* 8 ecx)) => (fs (& (* 8 ecx)))
(gs: #x64)      => (gs (& #x64))

5.1.3 Operand Sizes

Many of the x86’s instructions are overloaded: the same instruction can accept operands of various sizes in various orders, and assembles to different opcode sequences accordingly. Sassy therefore has to infer each operand’s size from the context in which the operand appears. To do so, it uses the opcode of the instruction itself, the other operands in the instruction, and the current setting of the bits directive (the default is 32).

If instead you would like to be explicit, you may use the supplied hinting mechanism to specify an operand size for immediates and memory addresses (registers always have an implied size):

(byte   <operand>) => 8-bit
(word   <operand>) => 16-bit
(dword  <operand>) => 32-bit
(qword  <operand>) => 64-bit
(tword  <operand>) => 80-bit
(dqword <operand>) => 128-bit

If you don’t use the hinting mechanism, Sassy tries, with one exception (see below), to match an ambiguous operand size to the size of another operand in the instruction. Any hint you supply to one operand will be used to infer the size of the other:

(mov ebx 4)      => (mov ebx (dword 4))
(mov cx (& foo)) => (mov cx (word (& foo)))
(mov al 100)     => (mov al (byte 100))
(mov (& foo) (byte 100)) => (mov (byte (& foo)) (byte 100))

If that’s not possible, Sassy examines the current bits setting and uses that size for the operands:

(bits 32)
(mov (& foo) 10) => (mov (dword (& foo)) (dword 10))

(bits 16)
(mov (& foo) 10) => (mov (word (& foo)) (word 10))

The exception to the above is that certain instructions have a shorter opcode sequence when their source operand is a byte-sized immediate rather than a word or dword. For these instructions, Sassy uses the shorter form whenever the source operand is in fact a byte. This applies to the following instructions: adc add and cmp or sbb sub xor push imul.

For example:

(add ecx 4)         => Sassy assumes the default of (add ecx (byte 4))
(add ecx (dword 4)) => Sassy uses the long form

Finally, for any floating-point, mmx, or sse instruction that can accept memory operands of different sizes, the default is always a dword-sized operand. In these cases, other operand sizes of memory addresses must be explicitly specified:

(fst (& foo))         => Sassy assumes the default of (fst (dword (& foo)))
(fst (qword (& foo))) => Explicit qword memory operand
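
The inference order for an immediate source operand can be sketched as a small decision function. This Python model is an illustration only: the name infer_immediate_bits is hypothetical, and treating "is in fact a byte" as "fits in a signed byte (-128 to 127)" is an assumption, not something the text above states.

```python
SHORT_IMM_OPS = {"adc", "add", "and", "cmp", "or", "sbb", "sub", "xor",
                 "push", "imul"}

def infer_immediate_bits(op, value, other_bits=None, hint=None, bits=32):
    """Return the size in bits chosen for an immediate source operand.

    op         -- instruction name
    value      -- the immediate itself
    other_bits -- size of the other operand if it is known (e.g. a register)
    hint       -- explicit size from (byte ...), (word ...), (dword ...)
    bits       -- current setting of the bits directive
    """
    if hint is not None:                       # an explicit hint always wins
        return hint
    if op in SHORT_IMM_OPS and -128 <= value <= 127:
        return 8                               # the short-form exception
    if other_bits is not None:                 # match the other operand
        return other_bits
    return bits                                # fall back to the bits setting

print(infer_immediate_bits("mov", 4, other_bits=32))           # 32
print(infer_immediate_bits("add", 4, other_bits=32))           # 8
print(infer_immediate_bits("add", 4, other_bits=32, hint=32))  # 32
print(infer_immediate_bits("mov", 10, bits=16))                # 16
```

The four calls correspond to (mov ebx 4), (add ecx 4), (add ecx (dword 4)), and (mov (& foo) 10) under (bits 16).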

5.1.4 Jumps and Calls

The normal syntax for writing direct branches or conditional branches is (jmp foo) or (jnz bar). Sassy assumes that the direct branches you write are near branches, and thus generates a 2-byte or 4-byte relative address, depending on the current setting of the bits directive. You always write the branch target you want, not the relative distance; Sassy computes that.

Some special forms exist for designating explicit short, near, and far versions of jmp, call, and the jcc-family of instructions. For branches that you write (not Sassy’s internally generated branches; see below), if you write a “short” branch, Sassy assembles a short branch provided the branch target is within range. Otherwise an “out of range” error will be signalled.

(jnz short foo)
(jnz near  foo)

(jmp short foo)
(jmp near  foo)
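
To make "within range" concrete: on x86 a short branch carries a signed 8-bit displacement measured from the end of the branch instruction, and a short jmp or jcc is 2 bytes long. The Python sketch below models only that range check; the function name short_displacement is hypothetical and this is not Sassy's code.

```python
def short_displacement(branch_addr, target_addr, branch_len=2):
    """Return the rel8 byte for a short branch at branch_addr that
    targets target_addr, or raise if the target is out of range."""
    disp = target_addr - (branch_addr + branch_len)
    if not -128 <= disp <= 127:
        raise ValueError("out of range")
    return disp & 0xFF          # two's-complement byte

print(hex(short_displacement(0x100, 0x100)))  # 0xfe (a jump to itself)
print(hex(short_displacement(0, 129)))        # 0x7f (farthest forward target)
```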

For far jumps and calls to other segments, if you want to write a direct call, you specify a far pointer with two operands:

(jmp  <imm16>  <imm32>) ; jmp #x1234:12345678
(jmp  <imm16>  <imm16>) ; jmp #x1234:1234
(call <imm16>  <imm32>) ; call #x1234:12345678
(call <imm16>  <imm16>) ; call #x1234:1234

The first operand specifies the segment, and the second the offset into that segment. For either operand, you can specify an operand size of word or dword, to be explicit.
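
Note that the written order (segment, then offset) is the reverse of the encoded order. As an illustration of the standard x86 encoding in 32-bit mode, with a 32-bit offset only, the sketch below emits the opcode, then the offset, then the segment, each little-endian. The helper name encode_far_direct is hypothetical, and the sketch does not cover the 16-bit offset forms or any prefix handling Sassy may perform.

```python
import struct

FAR_OPCODES = {"jmp": 0xEA, "call": 0x9A}

def encode_far_direct(op, segment, offset):
    """Encode a direct far jmp/call (16-bit segment, 32-bit offset):
    opcode byte, 4-byte offset, 2-byte segment."""
    if not 0 <= segment <= 0xFFFF:
        raise ValueError("segment must fit in 16 bits")
    if not 0 <= offset <= 0xFFFFFFFF:
        raise ValueError("offset must fit in 32 bits")
    return struct.pack("<BIH", FAR_OPCODES[op], offset, segment)

# Corresponds to (jmp #x1234 #x12345678)
print(encode_far_direct("jmp", 0x1234, 0x12345678).hex())  # ea785634123412
```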

To write an indirect far call, where the segment and offset are specified at a memory address, you use the keyword far in the instruction:

(jmp far <mem32>)
(jmp far (word <mem32>))
(call far <mem32>)
(call far (word <mem32>))

5.1.5 Prefixes

Sassy knows the prefixes lock, rep, repe, repne, repz, and repnz. Write them in the following manner:

(<prefix> <instruction>)
e.g.
(lock (inc (& my-guard)))

Sassy also knows about the branch hint prefixes used to control the processor’s default branch-prediction behavior. Sassy uses brt to generate a “branch taken” prefix, and brnt to generate a “branch not taken” prefix. Use these prefixes with a jcc instruction, as above:

(brt  (jnz foo))
(brnt (jz foo))
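
For reference, each of these prefixes is a single byte placed in front of the instruction's encoding. The table below lists the standard x86 prefix byte values and shows the overlap between the branch hints and the cs and ds segment overrides. It is a Python sketch for illustration; the names PREFIX_BYTES, SEGMENT_BYTES, and with_prefix are hypothetical and do not come from Sassy.

```python
PREFIX_BYTES = {
    "lock": 0xF0,
    "repne": 0xF2, "repnz": 0xF2,
    "rep": 0xF3, "repe": 0xF3, "repz": 0xF3,
    "brt": 0x3E,    # branch taken: same byte as the ds override
    "brnt": 0x2E,   # branch not taken: same byte as the cs override
}

SEGMENT_BYTES = {"es": 0x26, "cs": 0x2E, "ss": 0x36,
                 "ds": 0x3E, "fs": 0x64, "gs": 0x65}

def with_prefix(prefix, encoded):
    """Prepend the prefix byte to an already encoded instruction."""
    return bytes([PREFIX_BYTES[prefix]]) + bytes(encoded)

# movsb encodes as the single byte A4, so (rep (movsb)) is F3 A4.
print(with_prefix("rep", [0xA4]).hex())  # f3a4
```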

5.2 Assertions

You can control the flow of computation by using “assertions” and “control primitives”.

Assertions check whether or not particular flags are set in x86’s “eflags” register, and alter the flow of computation accordingly by inserting conditional and unconditional branches. Exactly how the flow of computation is altered depends on their contextual use within a particular control primitive.

You write the assertions by writing the cc-code for the jcc-family of instructions followed by an exclamation point.

o!               => assert overflow
no!              => assert not overflow
b!  / c!  / nae! => assert carry
ae! / nb! / nc!  => assert not carry
e!  / z!         => assert zero
ne! / nz!        => assert not zero
be! / na!        => assert either carry or zero
a!  / nbe!       => assert neither carry nor zero
s!               => assert sign
ns!              => assert not sign
p!  / pe!        => assert parity
np! / po!        => assert not parity
l!  / nge!       => assert less than
ge! / nl!        => assert greater than or equal to
le! / ng!        => assert less than or equal to
g!  / nle!       => assert greater than

Since assertions may succeed or fail, there are always two possible paths to take, called the “win” and “lose” continuations. In addition, control primitives themselves may also “win” or “lose” depending upon whether they succeed or fail, but instructions always succeed or “win”. By saying “something wins” I mean that the computation immediately proceeds with the “win” continuation, possibly via a branch, and when “something loses”, computation immediately proceeds with the “lose” continuation, also possibly by branching.

5.3 Control Primitives

In the following, item refers to a <text-item>. For illustrative examples (and the code they compile down to), please have a look at the Scheme files in the tests/prims directory in Sassy’s distribution directory.

5.3.1 The COMFY core

The following implement Baker’s semantics:

(seq item ...) tries to execute each item in order. As soon as any of them fail, then the whole seq immediately loses. If they all succeed, then the seq wins.

(inv item) is equivalent to Baker’s “not”. This exchanges the win and lose continuations of the item. That is, if it would normally win, it loses, and vice-versa.

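(if test conseq altern) Each of the arguments to if is a <text-item>. This form executes test with a win of conseq and a lose of altern. Under Baker’s semantics, conseq and altern each inherit the win and lose continuations of the whole if, so the if wins or loses according to the arm that runs.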

(alt item ...) tries to execute each item in order. As soon as one of them succeeds, the whole alt wins. If an item fails, the next item is tried. If every item fails, the whole alt loses.

(times amount item) “unrolls loops”. That is, the item will be executed (and compiled) a number of times equal to amount. The copies are wrapped in a begin.

(iter item) is equivalent to Baker’s “loop”. The item is executed ad infinitum, meaning iter can never win. However, if the item fails then the whole iter loses.

(while test body) is another looping construct. Each time through the loop, test is tried. If it succeeds, the body is executed. If it fails, then the whole while wins. On the other hand, if the body fails, then the whole while loses.
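
The win/lose rules for seq, inv, alt, iter, and while can be tried out with a toy model in which every item is a Python callable that returns True when it wins and False when it loses. This is only a sketch of the semantics described above: Sassy compiles these forms into branches rather than interpreting them, and the helper names (instr, while_, iter_) are hypothetical. The if and times forms are left out.

```python
def instr(effect):
    """An instruction: performs its effect and always wins."""
    def run():
        effect()
        return True
    return run

def seq(*items):
    """Loses as soon as one item loses; wins if all of them win."""
    return lambda: all(item() for item in items)

def alt(*items):
    """Wins as soon as one item wins; loses if all of them lose."""
    return lambda: any(item() for item in items)

def inv(item):
    """Exchanges winning and losing."""
    return lambda: not item()

def while_(test, body):
    """Wins when test loses; loses if body ever loses."""
    def run():
        while test():
            if not body():
                return False
        return True
    return run

def iter_(item):
    """Repeats item forever; can only lose, when item loses."""
    def run():
        while item():
            pass
        return False
    return run

state = {"ecx": 0}
inc = instr(lambda: state.update(ecx=state["ecx"] + 1))
below_3 = lambda: state["ecx"] < 3        # stands in for a cmp plus assertion

print(while_(below_3, inc)(), state["ecx"])   # True 3
```

Because all and any short-circuit, items after the first loser in a seq (or the first winner in an alt) are never run, matching the "as soon as" wording above.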

5.3.2 Sassy Extensions

(begin item ... tail) executes each item with both a win and lose continuation of the next item. The exception is the tail, which is executed with the win and lose continuations of the whole begin. So if tail succeeds, the begin wins. Otherwise it loses.

At the top level of a text directive (and indeed, in between text directives), Sassy implicitly wraps all of the <text-items> in a begin. Likewise, following a <label> declaration, all of the <text-items> at the label’s top level are implicitly wrapped in a begin.

(until test body) is like while, except that the test is subjected to an inv. So each time through the loop, if test fails, the body is executed, but if it succeeds, then the whole until wins. Like while, if the body fails, the whole until loses.

The following forms provide a means of “capturing” and overriding the continuations.

(with-win k-win [item])
(with-lose k-lose [item])
(with-win-lose k-win k-lose [item])

(The square brackets around item are meant to indicate that it is optional. See below)

Each of these compiles item with an explicit win or lose continuation (or both) of k-win or k-lose, effectively overriding the particular default or implicit continuation Sassy would normally supply to the item. The continuation may be a text-item or one of the special symbols $win or $lose. Thus it is possible to express the semantics of many of Sassy’s primitives in an explicit continuation-passing style. Some examples:

(with-win bar
 (if (seq (cmp eax 3)
          e!)
     (push eax)   ; after the push jmp to bar
     (push ebx))) ; after the push jmp to bar

(with-win-lose (jmp 1000) (call foo)
  (seq
   (push eax)
   (= ecx 4))) ; if ecx is 4, then (jmp 1000), else (call foo)

(label and-some-blocks
  (with-win (begin (push eax)
                   (push ebx))
    (with-win (zero? ebx)
      (zero? eax))))

==

(label and-some-blocks
  (seq (zero? eax)
       (zero? ebx)
       (begin (push eax)
              (push ebx))))

Sassy places the win or lose continuations after the items. If you use with-win-lose, the item comes first, the win continuation second, and the lose continuation last.

If an explicit continuation is either an unconditional branch (jmp ...) or the instruction (ret), sassy does not emit an extra branch to the contextual continuation of the “jmp” or “ret”, since these imply that the actual continuation of the thread of computation is the target of these branches.

In addition, sometimes you may want sassy to emit a “single instruction” as a continuation, but nothing else. This might occur in the succeed or fail arm of an if, for instance. In this case you can write either (seq) or (begin) (an “empty” sequence or block) for the item. This triggers the continuation generators without emitting anything else into the instruction stream. (The empty sequence and block are actually valid syntax anywhere.) Or you may simply elide the item, and the compiler will insert the extra (seq) automatically.

(text (mov eax 10)

      (label foo 
        (if (= eax 3)
            (with-win (ret))          ; if eax is 3, just (ret)
            (with-win foo             ; otherwise loop to foo
              (sub eax 1)))))

$win and $lose are two special symbols that Sassy reserves for itself so that you may explicitly refer to the values (the addresses) of the current win and lose continuations. They always refer to the exact win or lose continuation in effect at the point of their usage, including explicit continuations given by with-win etc. (Sassy records relocations for every usage of these).

(seq (add eax 1)
     (push $win)  ; pushes the address of (add ebx 2)
     (add ebx 2)
     (push $lose) ; pushes the address of the
                  ; lose continuation of the enclosing seq
     (add ecx 3))

$eip is a special symbol that Sassy reserves for itself so that you may refer to the address of the next instruction. It always refers to the instruction that follows the one in which it appears.

(esc (instruction ...) item) “turns off” Sassy’s continuation tracking for a moment so that you may explicitly store the value of a continuation (which is just an address). Sassy compiles item in the normal way, but it places each instruction in order just before the item, and each instruction is compiled with the item’s win and lose continuations. Thus, if any of the instructions utilize the special symbols $win and $lose, they will represent the win and lose addresses of the item.

This is useful, for instance, in the following “multiple-dispatch” situation, where the calling convention consists of pushing the return address first, and the arguments second. Assume the functions “foo” and “bar” pop their arguments, do their thing, and end with a (ret) (or a pop and a branch).

(esc ((push $win))
     (if (seq (cmp eax 10)
              z!)
         (with-win foo (push ebx))
         (with-win bar (push ecx))))

The functions “foo” and “bar” will both return to the win continuation of the if, rather than into an arm of the if itself, from which they would immediately branch out again (“branch tensioning”, in other words).

(leap item-with-mark)
(mark item)

These two forms work together to allow you to write a branch into the middle of an otherwise nested structure. At the desired entry point to the structure use mark, and wrap the whole thing in a leap. If leap can’t find a mark it does nothing. This is useful, for instance, for entering a loop at an arbitrary point.

(leap (iter (seq (add esp 8)
                 (mark (pop ecx))
                 (= ecx 3))))

5.3.3 A Note on Branch Optimization

Sassy currently optimizes all of its internally generated branches for size: whenever it can assemble the “short” form of an internally generated conditional or unconditional branch, it does so (provided the branch is not to an explicit continuation that is a label), regardless of the branch’s direction. This comes at a small cost, because Sassy has to make at least two passes, and possibly several more, over its looping forms (iter, while, and until). This is the only time Sassy makes more than one pass. If this cost becomes unbearable, I may in the future provide a compiler option for strict one-pass assembly of these forms, using Baker’s techniques (see section D.4).
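
To see why choosing short branches can take several passes, consider a generic relaxation loop: assume every branch is short, lay out the code, widen any branch whose target turns out to be out of range, and repeat until nothing changes. The Python sketch below is that textbook technique under simplifying assumptions (only unconditional jmps, 2 bytes short and 5 bytes near in 32-bit mode). It is not Sassy's actual algorithm, and the name relax and the item representation are hypothetical.

```python
SHORT_LEN, NEAR_LEN = 2, 5   # jmp rel8 and jmp rel32 in 32-bit mode

def relax(items):
    """items is a list of ("code", nbytes), ("label", name) or
    ("jmp", name). Returns (size of each jmp in order, passes made)."""
    sizes = {i: SHORT_LEN for i, item in enumerate(items) if item[0] == "jmp"}
    passes = 0
    while True:
        passes += 1
        # Lay out everything with the current branch sizes.
        addr, labels, starts = 0, {}, {}
        for i, (kind, arg) in enumerate(items):
            if kind == "label":
                labels[arg] = addr
            elif kind == "code":
                addr += arg
            else:
                starts[i] = addr
                addr += sizes[i]
        # Widen every short branch that cannot reach its target.
        changed = False
        for i, start in starts.items():
            if sizes[i] == SHORT_LEN:
                disp = labels[items[i][1]] - (start + SHORT_LEN)
                if not -128 <= disp <= 127:
                    sizes[i] = NEAR_LEN
                    changed = True
        if not changed:
            return [sizes[i] for i in sorted(sizes)], passes

program = [("jmp", "end"), ("code", 124), ("jmp", "far"),
           ("label", "end"), ("code", 200), ("label", "far")]
print(relax(program))  # ([5, 5], 3)
```

In the example, widening the second jmp pushes the label end just out of reach of the first jmp, so a further pass is needed. Branches only ever grow, which guarantees the loop terminates.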

5.4 Raw text data

If you want to place raw data into a text section that isn’t the generated output of an instruction or control primitive, you may use the following “pseudo-opcodes” to do so:

(bytes  <any-data> ...)
(words  <any-data> ...)
(dwords <any-data> ...)

The <any-datas> may be numbers, characters, strings, or labels (including custom relocs), and follow the same conventions for writing data in a data section, including the zero-filling of strings. For the purposes of flow control, sassy considers each occurrence of the above as an indivisible "opcode" that always wins.