Directives

sassy processes a series of “directives” that describe various aspects of the object that sassy is assembling. Some, none, or all of the directives described here may be present, and they usually may occur in any order (there are some exceptions; see below). Directives may occur multiple times; sassy either appends the results of the later usages to the results of the earlier usages, or simply updates the object it is assembling with the new information from the later usages.

sassy processes the directives in the order that you wrote them. It’s possible to include additional sources of input. In this case, sassy does not continue with the current source until it has completely processed the directives of each additional source specified by include, in order.

4.1  Label Definitions and Lexical Scoping

4.1.1  Defining Labels

To “define” a label or symbol, meaning to assign an actual location or address to a label so that you can reference that address by using the label, write the following in a heap, data, or text directive:

(label <label-name> <item> ...)

As mentioned above, <label-name> is a Scheme symbol, and <item> is the appropriate syntax for an item in the enclosing directive. In a text directive, if there is more than one <item>, all the <items> are wrapped in an implicit begin, for the purposes of continuations. See below.

Sassy also records, for each label definition, the total size of all the <items> that it encloses, for use by Sassy’s output API.

You may also omit the <items> in a label definition. In that case, the label is defined at that point, but sassy records its size as zero, and processing continues. Thus you may define multiple labels to be the same address:

(text (label foo)
      (label bar)
      (label wizo)
      (nop))

No label may be defined (assigned an address) more than once in any given lexical scope. No label may be both defined in the top-level scope and imported. The top-level scope consists of every label not shadowed by any locals declaration.

4.1.2  Lexical Scoping

Sassy has a mechanism to allow you to write lexically scoped blocks of arbitrary depth so that you can re-use names of labels without fear of referring to the wrong address. This mechanism is usable in the text and data directives.

To open up a new scope, you write:

(locals (<label-name> ...) <item> ...)

The locals form declares that, within its <items>, every reference to one of the <label-names> resolves to the definition of that label inside the <items>, rather than to a definition of the same name in an enclosing scope.

(text
  (label loop        ; the outer 'loop'
    
    (locals (loop)   ; open up a new scope and declare 'loop' to be shadowed
      (push loop)    ; push the address of the inner 'loop' defined later in this scope
      (nop)
      (label loop    ; the definition of the inner 'loop'
        (jmp loop))  ; jmp to the inner 'loop'
      (nop))
    
    (push loop)))    ; this is outside the previously declared scope,
                     ; and so refers to the outer 'loop'

Or in a data section:

(data
  (label foo (dwords bar))    ; The outer 'foo' contains the address of the outer 'bar'

  (locals (foo bar)           ; Declare a new scope
    (label foo (dwords bar))  ; The inner 'foo' contains the address of the inner 'bar'
    (label bar (dwords foo))) ; The inner 'bar' contains the address of the inner 'foo'
  
  (label bar (dwords foo)))   ; The outer 'bar' contains the address of the outer 'foo'
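
One way to picture the shadowing in the two examples above is as a lookup through a chain of scopes, innermost first. The following is a minimal sketch in plain Scheme, not Sassy's actual implementation; the procedure lookup-label, the association-list representation, and the addresses are all assumptions made for illustration.

```scheme
;; Hypothetical model: each scope is an association list of
;; (label-name . address); the innermost scope comes first.
(define (lookup-label name scopes)
  (cond ((null? scopes) #f)                       ; not defined anywhere
        ((assq name (car scopes)) => cdr)         ; found in this scope
        (else (lookup-label name (cdr scopes))))) ; try the enclosing scope

;; Made-up addresses for the two definitions of 'loop'.
(define outer-scope '((loop . 0)))
(define inner-scope '((loop . 7)))

(lookup-label 'loop (list inner-scope outer-scope)) ; => 7, inside the locals form
(lookup-label 'loop (list outer-scope))             ; => 0, outside it
```

A name that the locals form does not declare is simply absent from the inner list, so the lookup falls through to the enclosing scope.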

Please keep in mind:

4.2  The align special form

You may insert an align special form in a heap, data, or text directive:

(align <amount>)

In a text directive, any uses of align must occur at the text directive’s top level.

The align special form tells sassy to place the next item in the section at the next available address (counting up from the current address in the section) such that

(zero? (modulo <the-new-address> <amount>))

is true.

The <amount> must be a positive power of two. If you use align within a data directive, sassy pads the data section with the byte 0. If you use it within a text directive, sassy pads the text section with the (nop) instruction.
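
The address arithmetic described above can be sketched in plain Scheme. The procedure names align-up and padding-needed are hypothetical, and the sketch assumes that an address which is already aligned is left unchanged.

```scheme
;; Smallest address >= addr that is a multiple of amount.
(define (align-up addr amount)
  (let ((r (modulo addr amount)))
    (if (zero? r)
        addr
        (+ addr (- amount r)))))

;; Number of padding bytes needed to reach that address.
(define (padding-needed addr amount)
  (- (align-up addr amount) addr))

(align-up 7 32)        ; => 32
(align-up 32 32)       ; => 32, already aligned
(padding-needed 45 4)  ; => 3
```

In each case the result satisfies (zero? (modulo (align-up addr amount) amount)).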

4.3  Sections

You declare new sections by using the text, data, or heap directives. Any number of these may appear, and in any order. sassy concatenates the result of assembling later invocations of these to the existing assembled contents for that section type, but does not intermingle the contents of sections with different types.

You may write nested section directives, but not arbitrarily. Any number of section directives with the same name may be written inside a top-level section directive with that name. In this case sassy splices the contents of the inner directive into the outer.

(text (nop)
      (text (push eax)
            (label foo)
            (pop ebx))
      (nop))

==>

(text (nop)
      (push eax)
      (label foo)
      (pop ebx)
      (nop))

In addition, you may write a nested section directive with a different name than that of the enclosing section directive. In this case the assembled result of the inner section directive is appended to the output for the specified section, as if the nested section directive had appeared at the top level. Note, however, that once you have entered a nested section directive with a different name, it is an error to write a second nested section directive inside the first with a name different from that of the first. You may only temporarily switch sections once before returning to the parent.

For example:

(text
  (label publish-private-box
    (locals (my-private-box)               

      (data                                ; temporarily switch sections
        (label my-private-box (dwords 1))) 

      (mov (& my-private-box) 0)           ; zero the contents
      (mov eax my-private-box)
      (ret))))

But this is an error:

(text
  (label publish-private-box
    (locals (my-private-box)               

      (data                                ; temporarily switch sections
        (label my-private-box (dwords 1))
        (text (mov (& my-private-box) 0))) ; WRONG - too much nesting

      (mov eax my-private-box)
      (ret))))
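
The nesting rule can be modelled as a small checker over the quoted directive. This is only a sketch of the rule as stated above, not Sassy's own validation code. The name nesting-ok? is hypothetical, and the sketch naively treats any list whose first element is text, data, or heap as a section directive, so it would be confused by a label with one of those names.

```scheme
;; Returns #t if the directive obeys the "switch sections only once" rule.
(define (nesting-ok? directive)
  (define (check-items items current switched?)
    (or (not (pair? items))
        (and (check-item (car items) current switched?)
             (check-items (cdr items) current switched?))))
  (define (check-item item current switched?)
    (cond ((not (pair? item)) #t)
          ((memq (car item) '(text data heap))
           (cond ((eq? (car item) current)        ; same name: a splice
                  (check-items (cdr item) current switched?))
                 (switched? #f)                   ; second switch: error
                 (else                            ; first switch: allowed
                  (check-items (cdr item) (car item) #t))))
          (else (check-items (cdr item) current switched?))))
  (check-items (cdr directive) (car directive) #f))

(nesting-ok? '(text (label f
                      (locals (box)
                        (data (label box (dwords 1)))
                        (mov eax box)
                        (ret)))))                 ; => #t

(nesting-ok? '(text (data (label box (dwords 1))
                          (text (nop)))))         ; => #f
```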

4.4  Descriptions of directives

4.4.1  heap

(heap <heap-item> ...)

heap-item      = (label <label-name>)                 ; "empty" label definition
               | (label <label-name> <heap-item> ...) ; label definition
               | <heap-sizer>                         ; anonymous heap "space"
               | (align <amount>)
               | (begin <heap-item> ...)              ; useful for macro defs.
               | (heap  <heap-item> ...)              ; splice
               | (text  <text-item> ...)              ; one time section switch
               | (data  <data-item> ...)              ; ditto

heap-sizer     = (bytes  <integer>)
               | (words  <integer>)
               | (dwords <integer>)
               | (qwords <integer>)

Use the heap directive to reserve uninitialized storage space. sassy computes the number of bytes to reserve by multiplying the <integer> by 1, 2, 4, or 8, depending on whether you wrote bytes, words, dwords, or qwords, respectively, in the <heap-sizer>. The <integer> may not be negative.

(heap (bytes 7)           ; reserve 7 bytes
      (align 32)          ; align the next item to 32
      (label foo (dwords 100))) ; reserve 400 bytes and call it "foo"
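
The size computation for a <heap-sizer> can be sketched in plain Scheme as follows. The names unit-sizes and heap-sizer-bytes are hypothetical and are not part of Sassy.

```scheme
;; Bytes per unit for each <heap-sizer> keyword.
(define unit-sizes '((bytes . 1) (words . 2) (dwords . 4) (qwords . 8)))

;; sizer is a quoted form such as (dwords 100).
(define (heap-sizer-bytes sizer)
  (* (cdr (assq (car sizer) unit-sizes))
     (cadr sizer)))

(heap-sizer-bytes '(bytes 7))     ; => 7
(heap-sizer-bytes '(dwords 100))  ; => 400
```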

4.4.2  data

(data <data-item> ...)

data-item      = (label <label>)                             ; "empty" label definition
               | (label <label> <data-item> ...)             ; label definition
               | (locals (<label-name> ...) <data-item> ...) ; local labels declaration
               | <data-contents>                             ; anonymous data
               | (align <amount>)
               | (begin <data-item> ...)                     ; useful for macro defs.
               | (data <data-item> ...)                      ; splice
               | (text <text-item> ...)                      ; one time section switch
               | (heap <heap-item> ...)                      ; ditto

data-contents  = (bytes  <any-data> ...)
               | (words  <any-data> ...)
               | (dwords <any-data> ...)
               | (qwords <any-data> ...)
               | (asciiz <any-data>)

any-data       = <integer>
               | <character>
               | <string>
               | <label-name>        ; data-contents size must match current bits setting
               | <custom-relocation> ; ditto
               | <float>             ; data contents size must be dwords or qwords

With the data directive you may specify initialized data consisting of numbers, byte-sized characters, or strings of bytes, and in some cases addresses. sassy writes the data into the assembled object’s data section in a series of fields of specific widths. You specify the width of the fields to be 1, 2, 4, or 8 bytes with the bytes, words, dwords, and qwords forms of <data-contents>, respectively.

sassy places integers into the field in little-endian order. Integers may be unsigned, or signed, in which case sassy uses the two’s-complement representation. In either case the number must be able to fit in the specified field size.
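
As an illustration of the encoding just described, here is a minimal sketch in plain Scheme. The procedure integer->le-bytes is hypothetical; it returns the bytes of a field as a list, low address first, or #f when the number does not fit.

```scheme
;; Encode n into a field of `width` bytes, little-endian.
;; Negative numbers use two's complement.
(define (integer->le-bytes n width)
  (let ((limit (expt 2 (* 8 width))))
    (if (or (>= n limit) (< n (- (quotient limit 2))))
        #f                                  ; does not fit in the field
        (let loop ((v (modulo n limit)) (i 0) (acc '()))
          (if (= i width)
              (reverse acc)
              (loop (quotient v 256)
                    (+ i 1)
                    (cons (modulo v 256) acc)))))))

(integer->le-bytes 1000 4)  ; => (232 3 0 0), that is, e8 03 00 00
(integer->le-bytes -2 2)    ; => (254 255),   that is, fe ff
(integer->le-bytes 256 1)   ; => #f
```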

If you write a float, it should be in the context of a dwords or qwords field, where it produces the little-endian IEEE-754 single-precision or double-precision representation, respectively. (Sassy currently has no mechanism for specifying double-extended-precision, meaning 80-bit, floats.)

sassy places characters at the low address of the field and pads the rest of the unused field (if any) with the byte 0. If you write a string, sassy places each character into a byte, and pads any unused bytes of the last field with the byte 0. If you specify a field size of bytes, however, no zero padding occurs at the end.
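
The padding rule for strings can be sketched in plain Scheme. The procedures zeros and string->field-bytes are hypothetical, and the sketch assumes every character fits in one byte.

```scheme
;; A list of n zero bytes.
(define (zeros n)
  (if (= n 0) '() (cons 0 (zeros (- n 1)))))

;; Bytes emitted for str in fields of `width` bytes: one byte per
;; character, then zero bytes up to the end of the last field.
(define (string->field-bytes str width)
  (let* ((chars (map char->integer (string->list str)))
         (len   (length chars))
         (total (* width (quotient (+ len (- width 1)) width))))
    (append chars (zeros (- total len)))))

(string->field-bytes "abcde" 4)  ; => (97 98 99 100 101 0 0 0)
(string->field-bytes "abcde" 1)  ; => (97 98 99 100 101), no padding
```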

The included (asciiz <any-data>) macro places the characters starting at the low address and proceeding toward higher addresses, and then places a 0 byte at the end. It is equivalent to (bytes <any-data> 0).

If you want to use the address of a <label> or a <custom-relocation> as data, you have to place it in a field whose size matches the current setting of the bits directive: a dwords field under the default of (bits 32), or a words field under (bits 16).

(data (label foo (dwords "abcde"))   ; #x61 62 63 64  65 00 00 00
      (align 16)                     ; #x00 00 00 00  00 00 00 00
      (dwords -3242.52 1000)         ; #x52 a8 4a c5  e8 03 00 00
      (qwords -84930284902.48392048) ; #xe2 7b 66 4d  3d c6 33 c2
      (label quux (dwords #\a #\b))  ; #x61 00 00 00  62 00 00 00
      (bytes #\a #\b #\c #\d #\e)    ; #x61 62 63 64  65
      (align 4)                      ;                   00 00 00
      (dwords foo quux))             ; #x00 00 00 00  20 00 00 00
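
To see why the final field contains #x20 for quux, it can help to track the offset at which each item in the example starts. The following plain-Scheme sketch does so over a simplified description of the example, in which each item is either (size n) or (align n). The procedures align-up and layout and this item representation are assumptions made for illustration.

```scheme
;; Smallest address >= addr that is a multiple of amount.
(define (align-up addr amount)
  (let ((r (modulo addr amount)))
    (if (zero? r) addr (+ addr (- amount r)))))

;; Returns the starting offset of every item, in order.
(define (layout items)
  (let loop ((items items) (addr 0) (acc '()))
    (if (null? items)
        (reverse acc)
        (let* ((item (car items))
               (next (if (eq? (car item) 'align)
                         (align-up addr (cadr item))
                         (+ addr (cadr item)))))
          (loop (cdr items) next (cons addr acc))))))

;; foo, align 16, two dwords, one qword, quux, five bytes, align 4, two dwords
(layout '((size 8) (align 16) (size 8) (size 8)
          (size 8) (size 5) (align 4) (size 8)))
; => (0 8 16 24 32 40 45 48)
```

The fifth item, quux, starts at offset 32, which is #x20, and foo starts at offset 0.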

4.4.3  text

(text <top-level-text> ...)

top-level-text = (label <label>)                             ; "empty" label definition
               | (label <label> <text-item> ...)             ; label definition
               | (locals (<label-name> ...) <text-item> ...) ; local labels declaration
               | <text-item>                                 ; anonymous text
               | (align <amount>)
               | (text <text-item> ...)                      ; splice
               | (data <data-item> ...)                      ; one time section switch
               | (heap <heap-item> ...)                      ; ditto

text-item      = (label <label>)
               | (label <label> <text-item> ...)
               | (locals (<label-name> ...) <text-item> ...)
               | <instruction>
               | <assertion>
               | <control-primitive>
               | (bytes  <any-data> ...)                    ; raw data in a text section
               | (words  <any-data> ...)
               | (dwords <any-data> ...)
               

The above is the basic grammar for the text directive. For a full explanation, please refer to Instructions and The Text Section.

4.4.4  include

(include <input> ...)

include lets you specify (by name) additional input sources, such as libraries of code, to add to the object sassy is assembling. Each <input> should be either a symbol that is bound to a list of Sassy’s directives in the host Scheme’s “interaction-environment”, or a file of directives. sassy processes each additional <input> in its entirety before proceeding to the next <input>, or the directive following include. An <input> may in turn use include, and thus sassy processes all of the sources in depth-first order. This means that if you want to include a file of Sassy macros, for instance, you must include that file before using any of the macros in it.
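
The depth-first order can be modelled in plain Scheme by flattening a table of named sources. This is a sketch only: the table sources, the source names, and the procedures expand-includes and expand-inputs are hypothetical, and no attempt is made to detect sources that include each other in a cycle.

```scheme
;; Hypothetical sources: each name maps to a list of directives.
(define sources
  '((main    . ((include macros lib) (text (nop))))
    (macros  . ((macro set! mov)))
    (lib     . ((include helpers) (data (bytes 1))))
    (helpers . ((heap (bytes 4))))))

;; Expand each named input fully, left to right.
(define (expand-inputs names)
  (if (null? names)
      '()
      (append (expand-includes (cdr (assq (car names) sources)))
              (expand-inputs (cdr names)))))

;; Replace every include directive by the directives it brings in.
(define (expand-includes directives)
  (cond ((null? directives) '())
        ((and (pair? (car directives))
              (eq? (car (car directives)) 'include))
         (append (expand-inputs (cdr (car directives)))
                 (expand-includes (cdr directives))))
        (else (cons (car directives)
                    (expand-includes (cdr directives))))))

(expand-includes (cdr (assq 'main sources)))
; => ((macro set! mov) (heap (bytes 4)) (data (bytes 1)) (text (nop)))
```

The macro from the first input appears before anything that follows it, which is why a file of macros has to be included before they are used.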

4.4.5  import

(import <label> ...)

Use import to declare that each <label> is not defined in the current object sassy is assembling, and that the output module or linker should link references in the current object to their definitions in other objects. This is similar to, for instance, NASM’s EXTERN directive.

4.4.6  export

(export <label> ...)

The export directive declares that each <label> should be made available to other assembled objects for linking, similar to, for instance, NASM’s GLOBAL directive.

4.4.7  entry

(entry <label>)

The entry directive declares the <label> as the main entry point for the assembled object. The <label> is also exported.

4.4.8  org

(org <integer>)

The org directive must appear before any text directives, and tells sassy to assemble the contents of all the text directives such that the resulting text section will be loaded at the “absolute” address given by <integer>. The <integer> may not be negative.

4.4.9  bits

(bits <16|32>)

The bits directive makes sassy assemble the text directives that follow it for either 16-bit or 32-bit programming. 32-bit programming is fully supported, but 16-bit programming currently uses 32-bit addressing. If you write no bits directive, sassy defaults to 32-bit.

4.4.10  macro

(macro <name> <transformer>)

transformer = <constant-like>
            | <lambda-expression>

Sassy has a simple internal macro system so that you may write macros with names that would otherwise re-define or be confused with Scheme bindings, syntactic or otherwise. Sassy comes with a few such macros installed internally. Please refer to Appendix A for their expansions. Since the procedure sassy accepts lists as input, it’s always possible to use quasiquotation. As well, you may write escapes in order to use the host Scheme’s macro system(s).

In the above, <name> should be a Scheme symbol, and it should be different from any labels that you plan to use. You may bind a macro <name> to a constant, such as a number, string, or character, or, for instance, to an expression or a piece of an expression to be inserted wholesale. To use such a macro just write the <name>, without parentheses.

(sassy
 '((macro true "TRUE")
   (macro set! mov)
   (macro return-true (set! eax true))
   (text (seq (= eax 3)
              return-true))))

Or you may bind a name to a <lambda-expression> that composes a piece of valid syntax from its arguments. To use the macro you “call” it in the normal way.

(sassy
 '((macro push-some (lambda regs
                      `(begin ,@(map (lambda (reg)
                                       `(push ,reg))
                                     regs))))
   (text (push-some eax ebx ecx))))
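
To see what such a transformer produces, you can apply the same lambda expression in plain Scheme, outside of Sassy. In this sketch the transformer is bound with define and the register names are passed as quoted symbols, which is only an approximation of how sassy supplies the arguments.

```scheme
;; The transformer from the example above, as an ordinary procedure.
(define push-some
  (lambda regs
    `(begin ,@(map (lambda (reg)
                     `(push ,reg))
                   regs))))

(push-some 'eax 'ebx 'ecx)
; => (begin (push eax) (push ebx) (push ecx))
```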

Please be aware of the following:

The expander is available to the user for debugging purposes:

—procedure: sassy-expand hash-table or something to expand

If the argument is a hash-table, sassy-expand installs the hash-table as its current user syntactic environment. See the file macros.scm for an example of such a hash-table. If the argument is something to expand, sassy-expand returns the expansion within the context of its current user syntactic environment. All user macros are lost in between invocations of sassy, since every invocation of sassy calls sassy-expand with an empty hash-table. However, after calling sassy you may call sassy-expand to see how something was expanded.

(sassy '((macro my-register 3)
         (macro foo (lambda (x) `(mov ,x eax)))
         (text (foo my-register)))) 
=> an error is signalled

(sassy-expand '(text (foo my-register))) 
=> (text (mov 3 eax))

4.4.11  begin

(begin <item or directive> ...)

Sometimes you may want to write a macro that returns a sequence of several directives, or a sequence of several items in a heap or data directive. Using begin in these contexts allows you to do this. This form is also usable in a text directive, where it has certain additional semantics regarding continuations.

(sassy
 '((macro define-cell
          (lambda (name init)
            `(begin (macro cell-tag "CELL")
                    (data (label ,name (dwords cell-tag ,init)))
                    (macro ,(string->symbol
                             (string-append (symbol->string name) "-ref"))
                           (& ,name 4)))))
   (define-cell foo 100)

   (text (mov ebx foo-ref))))
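
The sequence of directives that define-cell returns can be inspected by applying the same lambda expression in plain Scheme. As a sketch, the transformer is bound here with define and the name is passed as a quoted symbol; this shows only the form the transformer builds, not any further expansion sassy performs on it.

```scheme
;; The transformer from the example above, as an ordinary procedure.
(define define-cell
  (lambda (name init)
    `(begin (macro cell-tag "CELL")
            (data (label ,name (dwords cell-tag ,init)))
            (macro ,(string->symbol
                     (string-append (symbol->string name) "-ref"))
                   (& ,name 4)))))

(define-cell 'foo 100)
; => (begin (macro cell-tag "CELL")
;           (data (label foo (dwords cell-tag 100)))
;           (macro foo-ref (& foo 4)))
```

The begin wrapper is what lets a single macro use stand for two macro directives and a data directive.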