Commit a191f38

Overhaul of notes from Abscond to Hustle.
1 parent d8d53d4 commit a191f38

9 files changed

Lines changed: 1442 additions & 968 deletions

www/notes/abscond.scrbl

Lines changed: 62 additions & 51 deletions
@@ -15,7 +15,7 @@
 
 @(ev '(require rackunit a86))
 @(for-each (λ (f) (ev `(require (file ,(path->string (build-path langs "abscond" f))))))
-'("interp.rkt" "ast.rkt" "compile.rkt"))
+'("main.rkt" "correct.rkt"))
 
 @(define (shellbox . s)
 (parameterize ([current-directory (build-path langs "abscond")])
@@ -80,7 +80,15 @@ implementation artifacts. Formal definitions balance precision while
 allowing for under-specification, but require detailed definitions and
 training to understand.
 
-We will use a combination of each.
+For the purposes of this course, we will use interpreters to specify
+the meaning of programs. The interpreters provide a specification for
+the compilers we write and make precise what it means for a compiler to
+be @emph{correct}. Any time the compiler produces code that, when
+run, produces a different result than the interpreter produces for the
+same program, the compiler is broken (or the specification is wrong).
+Interpreters are useful for specifying what the compiler should do and
+sometimes writing interpreters is also useful for informing @emph{how}
+it should do it.
 
 
 To begin, let's start with a dead simple programming language called
@@ -151,13 +159,6 @@ While not terribly useful for a language as overly simplistic as Abscond, we use
 an AST datatype for representing expressions and another syntactic categories.
 For each category, we will have an appropriate constructor. In the case of Abscond
 all expressions are integers, so we have a single constructor, @racket[Lit].
-
-@(define-language A-concrete
-(e ::= (Lit i))
-(i ::= integer))
-
-@centered{@render-language[A-concrete]}
-
 A datatype for representing expressions can be defined as:
 
 @codeblock-include["abscond/ast.rkt"]
@@ -168,6 +169,12 @@ an integer and constructs an instance of the AST datatype if
 it is, otherwise it signals an error:
 @codeblock-include["abscond/parse.rkt"]
 
+@ex[
+(parse 5)
+(parse 42)
+(eval:error (parse #t))]
+
+
 @section{Meaning of Abscond programs}
 
 The meaning of an Abscond program is simply the number itself. So
@@ -184,6 +191,14 @@ produces it's meaning:
 (interp (Lit -8))
 )
 
+The @racket[interp] function specifies the meaning of expressions,
+i.e. elements of the type @tt{Expr}. This language is so simple, the
+@racket[interp] function really doesn't @emph{do} much of anything,
+but this will change as the language grows.
+
+
+
+
 We can add a command line wrapper program for interpreting Abscond
 programs from stdin:
 
@@ -198,6 +213,7 @@ the result.
 For example, interpreting the program @tt{42.rkt} shown above:
 @shellbox["cat 42.rkt | racket -t interp-stdin.rkt -m"]
 
+@;{
 Even though the semantics is obvious, we can provide a formal
 definition of Abscond using @bold{operational semantics}.
 
@@ -249,6 +265,7 @@ and integers @racket[i], if (@racket[e],@racket[i]) in @render-term[A
 We now have a complete (if overly simple) programming language with an
 operational semantics and an interpreter, which is (obviously)
 correct. Now let's write a compiler.
+}
 
 @section{Toward a Compiler for Abscond}
 
@@ -283,7 +300,7 @@ computer; it's interpreter is implemented in hardware on your
 computer's CPU,}
 
 @item{it is one of the two dominant computing architectures (the other
-being ARM), and}
+being ARM) in use today, and}
 
 @item{it is a mature technology with good tools and materials.}
 ]
@@ -303,7 +320,7 @@ as follows:
 
 Separating out @tt{print_result}, which at this point is just a simple
 @tt{printf} statement, seems like overkill, but it will be useful in
-the future as the language gets more complicated.
+the future as the language and its set of values gets more complicated.
 
 The runtime must be linked against an object file that provides the
 definition of @tt{entry}; this is the code our compiler will emit.
@@ -496,75 +513,69 @@ Moreover, we can compare our compiled code to code compiled by Racket:
 @section{But is it @emph{Correct}?}
 
 At this point, we have a compiler for Abscond. But is it correct?
+What does that even mean, to be correct?
 
-Here is a statement of compiler correctness:
+First, let's formulate an alternative implementation of
+@racket[interp] that composes our compiler and a86 interpreter to define
+a (hopefully!) equivalent function to @racket[interp]:
 
-@bold{Compiler Correctness}: @emph{For all expressions @racket[e] and
-integers @racket[i], if (@racket[e],@racket[i]) in @render-term[A
-𝑨], then @racket[(asm-interp (compile e))] equals
-@racket[i].}
+@codeblock-include["abscond/exec.rkt"]
 
-Ultimately, we want the compiler to capture the operational semantics
-of our language (the ground truth of what programs mean). However,
-from a practical stand-point, relating the compiler to the intepreter
-may be more straightforward. What's nice about the interpreter is we
-can run it, so we can @emph{test} the compiler against the
-interpreter. Moreover, since we claimed the interpreter is correct
-(w.r.t. to the semantics), testing the compiler against the interpreter
-is a way of testing it against the semantics, indirectly. If the
-compiler and interpreter agree on all possible inputs, then the
-compiler is correct with respect to the semantics since it is
-equivalent to the interpreter, and the interpreter is correct.
+This function can be used as a drop-in replacement to @racket[interp]:
 
-So, in this setting, means we have the following equivaluence:
+@ex[
+(exec (Lit 42))
+(exec (Lit 19))]
 
-@verbatim{
-(interp e) @emph{equals} (asm-interp (compile e))
-}
+It captures the idea of a phase-distinction in that you can first
+compile a program into a program in another language---in this case
+a86---and can then interpret @emph{that} program to get the result.
+If the compiler is correct, the result should be the same:
+
+@bold{Compiler Correctness}: @emph{For all @racket[e] @math{∈}
+@tt{Expr} and @racket[i] @math{∈} @tt{Integer}, if @racket[(interp e)]
+equals @racket[i], then @racket[(exec e)] equals
+@racket[i].}
+
+One thing that is nice about specifying our language with an
+interpreter is that we can run it. So we can @emph{test} the compiler
+against the interpreter. If the compiler and interpreter agree on all
+possible inputs, then the compiler is correct.
 
-But we don't actually have @racket[asm-interp], a function that
-interprets the Asm code we generate. Instead we printed the code and
-had @tt{gcc} assembly and link it into an executable, which the OS
-could run. But this is a minor distinction. We can use
-@racket[asm-interp] to interact with the OS to do all of these steps.
 
 This is actually a handy tool to have for experimenting with
 compilation within Racket:
 
 
-@examples[#:eval ev
-(asm-interp (compile (Lit 42)))
-(asm-interp (compile (Lit 37)))
-(asm-interp (compile (Lit -8)))
-]
+@ex[
+(exec (Lit 42))
+(exec (Lit 37))
+(exec (Lit -8))]
 
 This of course agrees with what we will get from the interpreter:
 
-@examples[#:eval ev
+@ex[
 (interp (Lit 42))
 (interp (Lit 37))
-(interp (Lit -8))
-]
+(interp (Lit -8))]
 
 We can turn this in a @bold{property-based test}, i.e. a function that
 computes a test expressing a single instance of our compiler
 correctness claim:
-@examples[#:eval ev
-(define (check-compiler e)
-(check-eqv? (interp e)
-(asm-interp (compile e))))
 
+@codeblock-include["abscond/correct.rkt"]
+
+@ex[
 (check-compiler (Lit 42))
 (check-compiler (Lit 37))
-(check-compiler (Lit -8))
-]
+(check-compiler (Lit -8))]
 
 This is a powerful testing technique when combined with random
 generation. Since our correctness claim should hold for @emph{all}
 Abscond programs, we can randomly generate @emph{any} Abscond program
 and check that it holds.
 
-@examples[#:eval ev
+@ex[
 (check-compiler (Lit (random 100)))
 
 ; test 10 random programs
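
Note on the last hunk: the diff includes abscond/exec.rkt and abscond/correct.rkt by reference only, so their contents are not visible here. The following is a minimal, self-contained sketch of the idea the new prose describes (compile, run the result, compare with interp), not the code from those files. The names toy-compile, toy-run, toy-exec and check-toy-compiler are hypothetical stand-ins: the instruction list is plain symbolic data rather than a86 structures, and toy-run is a tiny evaluator standing in for the a86 interpreter.

```racket
#lang racket
(require rackunit)

;; AST: the only Abscond expression form is an integer literal.
(struct Lit (i) #:prefab)

;; Expr -> Integer
;; The specification: a literal means the integer it carries.
(define (interp e)
  (match e
    [(Lit i) i]))

;; Expr -> (Listof instruction)
;; Hypothetical stand-in for the compiler: emits symbolic instructions
;; as plain lists, NOT real a86 instruction structures.
(define (toy-compile e)
  (match e
    [(Lit i) (list (list 'mov 'rax i) (list 'ret))]))

;; (Listof instruction) -> Integer
;; Hypothetical stand-in for the a86 interpreter: tracks a single
;; register and returns its value when it reaches 'ret.
(define (toy-run instrs)
  (let loop ([is instrs] [rax 0])
    (match is
      [(cons (list 'mov 'rax n) more) (loop more n)]
      [(cons (list 'ret) _) rax])))

;; Expr -> Integer
;; Compile first, then run the compiled program (the phase distinction).
(define (toy-exec e)
  (toy-run (toy-compile e)))

;; Expr -> Void
;; One instance of the correctness claim: both routes must agree.
(define (check-toy-compiler e)
  (check-eqv? (interp e) (toy-exec e)))

;; Test 10 randomly generated programs, including negative literals.
(for ([k (in-range 10)])
  (check-toy-compiler (Lit (- (random 200) 100))))
```

In this sketch a failing check would point at either toy-compile or toy-run; in the notes' setting the same disagreement would indicate a broken compiler or a wrong specification.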
