My first domain-specific language with Racket
Step 2: Code generation
Guillaume SavatonIn step 1, I have written the full adder example in Racket, following a few mapping rules to express the concepts of the Tiny-HDL language. The next step consists in writing a code generator that implements these mapping rules so that we can generate Racket code automatically.
In a typical compiler, the code generator transforms an intermediate representation of a program, such as an abstract syntax tree (AST), into an executable form. The intermediate representation is a data structure that results from the syntax and semantic analysis steps. The executable form depends on the target execution platform: it can be machine code for a processor, bytecode for a virtual machine, or source code in an interpreted language. In some cases, a code generator can be composed of several stages, involving lower-level source code generation that is fed into another compiler.
Code generation from domain-specific languages
Before discovering Racket, I was familiar with the concepts of standalone (or external) and embedded (or internal) domain-specific languages.
An embedded DSL is built on top of another programming language. It uses the syntax and the compilation infrastructure of the underlying language, extending it with libraries and usage guidelines to support domain-specific concepts. As an example, Clash is a hardware description language built on Haskell: a valid circuit description in Clash is also a valid Haskell program whose execution constitutes a simulation of the circuit.
On the opposite side, a standalone DSL has its own syntax and semantics, and requires a specific compilation infrastructure. VHDL and Verilog are examples of standalone DSLs for digital circuit development. If you want to simulate a circuit described in one of these languages, you will need a dedicated simulator or compiler, such as GHDL or Verilator.
As a third way, Racket developers promote the concept of hosted DSL. If my understanding is correct, a hosted DSL extends a host language with custom syntactic forms. At compile time, these custom syntactic forms are translated into the host language by executing rewriting rules. When Racket is the host language, rewriting rules are typically specified as procedural macros written in Racket itself.
Each of these three types of DSLs handles code generation in a different way: an embedded DSL uses a pre-existing code generator for its underlying language; a standalone DSL needs a dedicated code generator written in another language; a hosted DSL specifies code generation rules that can be written as macros in the host language itself.
From Tiny-HDL to Racket using macros
In this project, Tiny-HDL is clearly intended as a standalone language. However, since the syntax of Tiny-HDL is based on S-expressions, implementing it as a hosted language in Racket felt more natural to me at first. This is the approach that we will follow in steps 2 to 5. As a consequence, in step 2, the full-adder example will look like this:
#lang racket
(require tiny-hdl)
(entity half-adder ([input a] [input b] [output s] [output co]))
(architecture half-adder-arch half-adder
(assign (port-ref half-adder s) (xor (port-ref half-adder a) (port-ref half-adder b)))
(assign (port-ref half-adder co) (and (port-ref half-adder a) (port-ref half-adder b))))
...
You can notice the #lang racket
directive that shows that this file is still
considered as Racket source code.
The (require tiny-hdl)
form imports the macros provided by the tiny-hdl
package,
so that the entity
, architecture
, and other Tiny-HDL forms are expanded into
Racket code.
Finally, you may have noticed the port-ref
forms that were absent from
the language in the previous posts.
These will be explained in the following sections.
Entities
Based on the example from step 1, we can write a Racket macro that generalizes the translation of an entity into a structure type:
(define-simple-macro (entity ent-name ([_ port-name] ...))
(begin
(provide (struct-out ent-name))
(struct ent-name ([port-name #:auto] ...) #:mutable)))
The only new element here is the provide
form that makes the structure type
available to other modules.
Architectures
Here again, a generalization of the example from step 1 leads to this macro:
(define-simple-macro (architecture arch-name ent-name body ...)
(begin
(provide arch-name)
(define (arch-name)
(define self (ent-name))
body ...
self)))
In the following sections, we will modify this macro slightly to provide additional information needed during the expansion of the architecture
body
.
Instances
An instantiation statement expands to a variable that receives the result of a call to an architecture constructor:
(define-simple-macro (instance inst-name arch-name)
(define inst-name (arch-name)))
Assignments and expressions
Boolean literals (#t
, #f
) are already valid Racket expressions
that won’t need any specific treatment.
The same applies to boolean operations (not
, and
, or
, xor
)
after their operands have been expanded.
Reading and writing ports is not as straightforward.
Let’s analyze an example from the architectures half-adder-arch
and full-adder-arch
:
(assign s (xor a b))
(assign (h2 a) (h1 s))
These statements should be translated into:
(set-half-adder-s! self (λ () (xor ((half-adder-a self)) ((half-adder-b self)))))
(set-half-adder-a! h2 (λ () ((half-adder-s h1))))
As you can see, the entity name half-adder
is present in the generated code
but not in the source code.
In the first statement, since we are inside half-adder-arch
, we could use
a syntax parameter
to store the name of the current entity and make it
available in the expansion of the architecture body.
But this technique is not usable in the second statement because it
needs to retrieve the name of the entity from the architecture
of h1
and h2
.
In this step, I have chosen to make a dumb translator where each rewriting
rule operates only with the information immediately available.
For this reason, I introduce a port-ref
form that associates the port name
with its entity name, along with an optional instance name.
The semantic checker, in the name resolution step, will be responsible for
creating these port-ref
expressions.
In the code generation step, the two statements above will take the following form:
(assign (port-ref half-adder s) (xor (port-ref half-adder a) (port-ref half-adder b)))
(assign (port-ref half-adder a h2) (port-ref half-adder s h1))
The corresponding macros will use the function format-id
to concatenate the
entity and port names into setter and getter function names.
The first argument of the setter and getter will depend on the presence or
absence of an instance name in the port-ref
form.
If no instance name is present, it should use the self
variable defined in
the current architecture constructor (but it will not work).
(define-simple-macro (assign ((~literal port-ref) ent-name port-name (~optional inst-name)) expr)
#:with setter-name (format-id #'port-name "set-~a-~a!" #'ent-name #'port-name)
#:with arg-name (if (attribute inst-name) #'inst-name #'self)
(setter-name arg-name (λ () expr)))
(define-simple-macro (port-ref ent-name port-name (~optional inst-name))
#:with getter-name (format-id #'port-name "~a-~a" #'ent-name #'port-name)
#:with arg-name (if (attribute inst-name) #'inst-name #'self)
((getter-name arg-name)))
Working around macro hygiene
If you try to run the full adder example with the above macros, you will get
the following error message in macros assign
and port-ref
:
self: unbound identifier
The reason is that the symbol self
introduced in the architecture
macro
is bound to a compile-time lexical scope, to avoid conflicts with
other bindings of the same name at runtime.
As a consequence, in the code generated by macros assign
and port-ref
the variable self
is out of scope.
To fix this, we can use a syntax parameter
and a rename transformer
that will provide an alias of the self
variable in the body of an architecture’s
constructor.
Here are fixed versions of the architecture
, assign
, and port-ref
macros:
(define-syntax-parameter current-instance
(λ (stx)
(raise-syntax-error (syntax-e stx) "can only be used inside an architecture")))
(define-simple-macro (architecture arch-name ent-name body ...)
(begin
(provide arch-name)
(define (arch-name)
(define self (ent-name))
(syntax-parameterize ([current-instance (make-rename-transformer #'self)])
body ...)
self)))
(define-simple-macro (assign ((~literal port-ref) ent-name port-name (~optional inst-name)) expr)
#:with setter-name (format-id #'port-name "set-~a-~a!" #'ent-name #'port-name)
#:with arg-name (if (attribute inst-name) #'inst-name #'current-instance)
(setter-name arg-name (λ () expr)))
(define-simple-macro (port-ref ent-name port-name (~optional inst-name))
#:with getter-name (format-id #'port-name "~a-~a" #'ent-name #'port-name)
#:with arg-name (if (attribute inst-name) #'inst-name #'current-instance)
((getter-name arg-name)))
Getting the source code and running the example
The source code for this step can be found in branch step-02 of the git repository for this project. You will find the following new files:
- main.rkt: the main module of the Tiny-HDL package.
- lib/expander.rkt: the code generator module.
- examples/full-adder-step-02.rkt: the full adder example written with macros in Racket.
- examples/full-adder-step-02-test.rkt: the main test program for this step.
Getting the source code for step 2
Assuming you have already cloned the git repository,
switch to branch step-02
:
git checkout step-02
Running the example
Register the current folder as a Racket collection:
raco link -n tiny-hdl .
Run full-adder-step-02-test.rkt
with Racket:
racket examples/full-adder-step-02-test.rkt
Hopefully, you will get the same result as in step 1:
a b ci s co
#f #f #f -> #f #f
#f #f #t -> #t #f
#f #t #f -> #t #f
#f #t #t -> #f #t
#t #f #f -> #t #f
#t #f #t -> #f #t
#t #t #f -> #f #t
#t #t #t -> #t #t