Commit 55504ae

committed
Start of rework of Fraud.
1 parent 8d2d156 commit 55504ae

4 files changed

Lines changed: 228 additions & 74 deletions

www/notes/fraud.scrbl

Lines changed: 181 additions & 53 deletions
Original file line numberDiff line numberDiff line change
@@ -32,25 +32,25 @@ We'll call it @bold{Fraud}.
3232

3333
We will use the following syntax to bind local variables:
3434

35-
@verbatim{
36-
(let ((@math{id_0} @math{e_0}))
37-
@math{e})
38-
}
35+
@racketblock[
36+
(let ((_id0 _e0))
37+
_e)
38+
]
3939

40-
This form binds the identifier @math{i_0} to value of @math{e_0}
41-
within the scope of @math{e}.
40+
This form binds the identifier @racket[_id0] to the value of @racket[_e0]
41+
within the scope of @racket[_e].
4242

4343
This is a specialization of Racket's own local binding form, which
4444
allows for any number of bindings to be made with @racket[let]:
4545

46-
@verbatim{
47-
(let ((@math{id_0} @math{e_0}) ...)
48-
@math{e})
49-
}
46+
@racketblock[
47+
(let ((_id0 _e0) ...)
48+
_e)
49+
]
5050

51-
We adopt this specialization of Racket's let syntax so that you can
52-
always take a Fraud program and run it in Racket to confirm what it
53-
should produce.
51+
We adopt this specialization of Racket's @racket[let] syntax so that
52+
you can always take a Fraud program and run it in Racket to confirm
53+
what it should produce.
5454

5555
Adding a notion of variable binding also means we need to add
5656
variables to the syntax of expressions.
@@ -68,10 +68,6 @@ let's return to this after considering the semantics and interpreter.
6868

6969
@section{Meaning of Fraud programs}
7070

71-
@;(declare-exporting ,`(file ,(path->string (build-path notes "fraud/interp.rkt"))))
72-
73-
74-
7571
The meaning of Fraud programs depends on the form of the expression and
7672
in the case of integers, increments, and decrements, the meaning is
7773
the same as in the prior languages.
@@ -80,59 +76,63 @@ The two new forms are let-expressions and variables.
8076

8177
@itemlist[
8278

83-
@item{the meaning of a let expression @tt{(let ((@math{x} @math{e_0}))
84-
@math{e})} is the meaning of @math{e} (the @bold{body} of the let)
85-
when @math{x} means the value of @math{e_0} (the @bold{right hand
86-
side} of the let),}
79+
@item{the meaning of a let expression @racket[(let ((_x _e0))
80+
_e)] is the meaning of @racket[_e] (the @bold{body} of the @racket[let])
81+
when @racket[_x] means the value of @racket[_e0] (the @bold{right hand
82+
side} of the @racket[let]),}
8783

88-
@item{the meaning of a variable @math{x} depends on the context in
84+
@item{the meaning of a variable @racket[_x] depends on the context in
8985
which it is bound. It means the value of the right-hand side of the
90-
nearest enclosing let expression that binds @math{x}. If there is no
91-
such enclosing let expression, the variable is meaningless.}
86+
nearest enclosing @racket[let] expression that binds @racket[_x]. If
87+
there is no such enclosing @racket[let] expression, the variable is
88+
meaningless.}
9289

9390
]
9491

9592
Let's consider some examples:
9693

9794
@itemlist[
9895

99-
@item{@tt{x}: this expression is meaningless on its own.}
96+
@item{@racket[x]: this expression is meaningless on its own.}
10097

101-
@item{@tt{(let ((x 7)) x)}: this means 7, since the body
102-
expression, @tt{x}, means 7 because the nearest enclosing binding for
103-
@tt{x} is to @tt{7}, which means 7.}
98+
@item{@racket[(let ((x 7)) x)]: this means @racket[7], since the body
99+
expression, @racket[x], means @racket[7] because the nearest enclosing binding for
100+
@racket[x] is to @racket[7], which means @racket[7].}
104101

105-
@item{@tt{(let ((x 7)) 2)}: this means @tt{2} since the body
106-
expression, @tt{2}, means 2.}
102+
@item{@racket[(let ((x 7)) 2)]: this means @racket[2] since the body
103+
expression, @racket[2], means @racket[2].}
107104

108-
@item{@tt{(let ((x 7)) (add1 x))}: this means 8 since the body
109-
expression, @tt{(add1 x)}, means one more than @tt{x} and @tt{x} means
110-
7 because the nearest enclosing binding for @tt{x} is to @tt{7}.}
105+
@item{@racket[(let ((x 7)) (add1 x))]: this means @racket[8] since the
106+
body expression, @racket[(add1 x)], means one more than @racket[x] and @racket[x]
107+
means @racket[7] because the nearest enclosing binding for @racket[x] is to
108+
@racket[7].}
111109

112-
@item{@tt{(let ((x (add1 7))) x)}: this means 8 since the body
113-
expression, @tt{x}, means 8 because the nearest enclosing binding for
114-
@tt{x} is to @tt{(add1 7)} which means 8.}
110+
@item{@racket[(let ((x (add1 7))) x)]: this means @racket[8] since the
111+
body expression, @racket[x], means @racket[8] because the nearest
112+
enclosing binding for @racket[x] is to @racket[(add1 7)] which means
113+
@racket[8].}
115114

116-
@item{@tt{(let ((x 7)) (let ((y 2)) x))}: this means 7 since the body
117-
expression, @tt{(let ((y 2)) x)}, means 2 since the body expression,
118-
@tt{x}, means 7 since the nearest enclosing binding for @tt{x} is to
119-
@tt{7}.}
115+
@item{@racket[(let ((x 7)) (let ((y 2)) x))]: this means @racket[7] since the body
116+
expression, @racket[(let ((y 2)) x)], means @racket[7] since the body expression,
117+
@racket[x], means @racket[7] since the nearest enclosing binding for @racket[x] is to
118+
@racket[7].}
120119

121-
@item{@tt{(let ((x 7)) (let ((x 2)) x))}: this means 2 since the body
122-
expression, @tt{(let ((x 2)) x)}, means 2 since the body expression,
123-
@tt{x}, means 7 since the nearest enclosing binding for @tt{x} is to
124-
@tt{2}.}
120+
@item{@racket[(let ((x 7)) (let ((x 2)) x))]: this means @racket[2] since the
121+
body expression, @racket[(let ((x 2)) x)], means @racket[2] since the
122+
body expression, @racket[x], means @racket[2] since the nearest
123+
enclosing binding for @racket[x] is to @racket[2].}
125124

126-
@item{@tt{(let ((x (add1 x))) x)}: this is meaningless, since the
127-
right-hand side expression, @tt{(add1 x)} is meaningless because
128-
@tt{x} has no enclosing let that binds it.}
125+
@item{@racket[(let ((x (add1 x))) x)]: this is meaningless, since the
126+
right-hand side expression, @racket[(add1 x)], is meaningless because
127+
@racket[x] has no enclosing @racket[let] that binds it.}
129128

130-
@item{@tt{(let ((x 7)) (let ((x (add1 x))) x))}: this means 8 because
131-
the body expression @tt{(let ((x (add1 x))) x)} means 8 because the
132-
body expression, @tt{x}, is bound to @tt{(add1 x)} is in the nearest
133-
enclosing let expression that binds @tt{x} and @tt{(add1 x)} means 8
134-
because it is one more than @tt{x} where @tt{x} is bound to @tt{7} in
135-
the nearest enclosing let that binds it.}
129+
@item{@racket[(let ((x 7)) (let ((x (add1 x))) x))]: this means
130+
@racket[8] because the body expression @racket[(let ((x (add1 x))) x)]
131+
means @racket[8] because the body expression, @racket[x], is bound to
132+
@racket[(add1 x)] in the nearest enclosing @racket[let] expression
133+
that binds @racket[x] and @racket[(add1 x)] means @racket[8] because
134+
it is one more than @racket[x] where @racket[x] is bound to @racket[7]
135+
in the nearest enclosing @racket[let] that binds it.}
136136

137137
]
138138

@@ -249,6 +249,134 @@ examples given earlier:
249249
@render-term[F 𝑭], then @racket[(interp e)] equals
250250
@racket[v].}
251251

252+
@section{Lexical Addressing}
253+
254+
Just as we did with @seclink["Dupe"], the best way of understanding
255+
the forthcoming compiler is to write a ``low-level'' interpreter that
256+
explains some of the ideas used in the compiler without getting bogged
257+
down in code generation details.
258+
259+
At first glance at @racket[interp], it would seem we will need to
260+
generate code for implementing the @tt{REnv} data structure and its
261+
associated operations: @racket[lookup] and @racket[ext]. @tt{REnv} is
262+
an inductively defined data type, represented in the interpreter as a
263+
list of lists. Interpreting a variable involves recursively scanning
264+
the environment until the appropriate binding is found. This would
265+
take some doing to accomplish in x86.
266+
267+
However...
268+
269+
It is worth noting some invariants about environments created during
270+
the running of @racket[interp]. Consider some subexpression
271+
@racket[_e] of the program. What environment will be used whenever
272+
@racket[_e] is interpreted? Well, it will be a mapping of every
273+
variable bound outside of @racket[_e]. It's not so easy to figure out
274+
@emph{what} these variables will be bound to, but the skeleton of the
275+
environment can be read off from the program structure.
276+
277+
For example:
278+
279+
@racketblock[
280+
(let ((x ...))
281+
(let ((y ...))
282+
(let ((z ...))
283+
_e)))
284+
]
285+
286+
The subexpression @racket[_e] will @emph{always} be evaluated in an
287+
environment that looks like:
288+
@racketblock[
289+
'((z ...) (y ...) (x ...))
290+
]
291+
292+
Moreover, every free occurrence of @racket[x] in @racket[_e] will
293+
resolve to the value in the third element of the environment; every
294+
free occurrence of @racket[y] in @racket[_e] will resolve to the
295+
second element; etc.
296+
297+
This suggests that variable locations can be resolved
298+
@emph{statically} using @bold{lexical addresses}. The lexical address
299+
of a variable is the number of @racket[let]-bindings between the
300+
variable occurrence and the @racket[let] that binds it.
301+
302+
So for example:
303+
304+
@itemlist[
305+
306+
@item{@racket[(let ((x ...)) x)]: the occurrence of @racket[x] has a
307+
lexical address of @racket[0]; there are no bindings between the
308+
@racket[let] that binds @racket[x] and its occurrence.}
309+
310+
@item{@racket[(let ((x ...)) (let ((y ...)) x))]: the occurrence of
311+
@racket[x] has a lexical address of @racket[1] since the
312+
@racket[let]-binding of @racket[y] sits between the
313+
@racket[let]-binding of @racket[x] and its occurrence.}
314+
315+
]
316+
317+
We can view a variable @emph{name} as just a placeholder for the
318+
lexical address; it tells us which binder the variable comes from.
319+
320+
Using this idea, let's build an alternative interpreter that operates
321+
over an intermediate form of expression that has no notion of
322+
variables, but just lexical addresses:
323+
324+
@#reader scribble/comment-reader
325+
(racketblock
326+
;; type IExpr =
327+
;; | Integer
328+
;; | Boolean
329+
;; | `(address ,Natural)
330+
;; | `(add1 ,IExpr)
331+
;; | `(sub1 ,IExpr)
332+
;; | `(zero? ,IExpr)
333+
;; | `(if ,IExpr ,IExpr ,IExpr)
334+
;; | `(let ((_ ,IExpr)) ,IExpr)
335+
)
336+
337+
Notice that variables have gone away, replaced by a @racket[`(address
338+
,Natural)] form. The @racket[let] binding form no longer binds a
339+
variable name either.
340+
341+
The idea is that we will translate expressions (@tt{Expr}) like:
342+
343+
@racketblock[
344+
(let ((x ...)) x)]
345+
346+
into intermediate expressions (@tt{IExpr}) like:
347+
348+
@racketblock[
349+
(let ((_ ...)) (address 0))
350+
]
351+
352+
And:
353+
354+
@racketblock[
355+
(let ((x ...)) (let ((y ...)) x))
356+
]
357+
358+
into:
359+
360+
@racketblock[
361+
(let ((_ ...)) (let ((_ ...)) (address 1)))
362+
]
363+
364+
365+
Similar to how @racket[interp] is defined in terms of a helper
366+
function that takes an environment mapping variables to their value,
367+
the @racket[translate] function will be defined in terms of a helper
368+
function that takes an environment mapping variables to their lexical
369+
address.
370+
371+
The lexical environment will just be a list of variable names. The
372+
address of a variable occurrence is the count of variable names that occur
373+
before it in the list. When a variable is bound (via @racket[let])
374+
the list grows:
375+
376+
@codeblock-include["fraud/interp-lexical.rkt"]
377+
378+
379+
252380
@section{An Example of Fraud compilation}
253381

254382
Suppose we want to compile @racket['(let ((x 7)) (add1 x))]. There

www/notes/fraud/interp-lexical.rkt

Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,32 @@
1+
#lang racket
2+
(provide (all-defined-out))
3+
(require (only-in "interp.rkt" prim? value? interp-prim))
4+
5+
;; Expr -> IExpr
6+
(define (translate e)
7+
(translate-e e '()))
8+
9+
;; Expr LEnv -> IExpr
10+
(define (translate-e e r)
11+
(match e
12+
[(? value? v) v]
13+
[(list (? prim? p) e)
14+
(list p (translate-e e r))]
15+
[`(if ,e0 ,e1 ,e2)
16+
`(if ,(translate-e e0 r)
17+
,(translate-e e1 r)
18+
,(translate-e e2 r))]
19+
[(? symbol? x)
20+
`(address ,(lexical-address x r))
21+
[`(let ((,x ,e0)) ,e1)
22+
`(let ((_ ,(translate-e e0 r)))
23+
,(translate-e e1 (cons x r)))]))
24+
25+
;; Variable LEnv -> Natural
26+
(define (lexical-address x r)
27+
(match r
28+
['() (error "unbound variable")]
29+
[(cons y r)
30+
(match (symbol=? x y)
31+
[#t 0]
32+
[#f (add1 (lexical-address x r))])]))
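
The notes define the address of a variable occurrence as the count of names that occur before it in the lexical environment. As a language-neutral cross-check, here is a minimal sketch of that lookup in C, the language of the runtime in `main.c`. The function name `lexical_address`, the array representation of the environment, and the `-1` result for unbound variables are assumptions made for illustration; none of them appear in this commit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the commit.
   env lists variable names with the innermost binding first, like the
   list built by (cons x r) in translate-e.
   Returns the number of names skipped before the first match, or -1 if
   x is unbound. */
int lexical_address(const char *x, const char *const env[], size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (strcmp(x, env[i]) == 0) {
            return (int)i; /* i bindings sit between the occurrence and its binder */
        }
    }
    return -1; /* no enclosing let binds x */
}
```

With the environment `{"z", "y", "x"}` from the nested `let` example in the notes, `z` resolves to address 0 and `x` to address 2. Stopping at the first match is what makes an inner binding shadow an outer one of the same name.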

www/notes/fraud/interp.rkt

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -48,7 +48,7 @@
4848
[(list 'zero? (? integer? i0)) (zero? i0)]
4949
[_ 'err]))
5050

51-
;; Env Variable -> Answer
51+
;; REnv Variable -> Answer
5252
(define (lookup env x)
5353
(match env
5454
['() 'err]
@@ -57,6 +57,6 @@
5757
[#t v]
5858
[#f (lookup env x)])]))
5959

60-
;; Env Variable Integer -> Integer
61-
(define (ext r x i)
62-
(cons (list x i) r))
60+
;; REnv Variable Value -> REnv
61+
(define (ext r x v)
62+
(cons (list x v) r))

www/notes/fraud/main.c

Lines changed: 11 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -2,28 +2,22 @@
22
#include <stdlib.h>
33
#include <inttypes.h>
44

5-
#define fixnum_mask 3
6-
#define fixnum_tag 0
7-
#define fixnum_shift 2
8-
9-
#define boolean_tag 31
10-
#define boolean_mask 127
11-
#define boolean_shift 7
5+
#define typeof_mask 1
6+
#define val_shift 1
7+
#define type_fixnum 0
8+
#define type_bool 1
129

1310
int64_t entry();
1411

1512
int main(int argc, char** argv) {
1613
int64_t result = entry();
17-
if ((result & fixnum_mask) == fixnum_tag) {
18-
printf("%" PRId64 "\n", result >> fixnum_shift);
19-
} else if ((result & boolean_mask) == boolean_tag) {
20-
if (result >> boolean_shift) {
21-
printf("#t\n");
22-
} else {
23-
printf("#f\n");
24-
}
25-
} else {
26-
printf("unknown value\n");
14+
switch (typeof_mask & result) {
15+
case type_fixnum:
16+
printf("%" PRId64 "\n", result >> val_shift);
17+
break;
18+
case type_bool:
19+
printf("#%c\n", result >> val_shift ? 't' : 'f');
20+
break;
2721
}
2822
return 0;
2923
}
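
The new `main.c` reads the low bit of the result as a type tag and shifts it away to recover the value. The sketch below exercises that decoding on hand-built inputs. The encoders `encode_fixnum` and `encode_bool` and the buffer-based `format_value` are hypothetical helpers, not part of this commit; they assume the compiler produces values in the layout the new `#define`s imply, which the diff itself does not show.

```c
#include <inttypes.h>
#include <stdio.h>

#define typeof_mask 1
#define val_shift 1
#define type_fixnum 0
#define type_bool 1

/* Hypothetical encoder: fixnum n becomes n * 2 with a low tag bit of 0. */
int64_t encode_fixnum(int64_t n) {
    return n * ((int64_t)1 << val_shift) | type_fixnum;
}

/* Hypothetical encoder: #f becomes 1 and #t becomes 3 (low tag bit of 1). */
int64_t encode_bool(int b) {
    return ((int64_t)(b != 0) << val_shift) | type_bool;
}

/* Same dispatch as the switch in main.c, but writing into buf instead of
   stdout. Returns the snprintf result, or -1 if no case matched (which
   cannot happen with a one-bit mask). */
int format_value(int64_t result, char *buf, size_t len) {
    switch (typeof_mask & result) {
    case type_fixnum:
        return snprintf(buf, len, "%" PRId64, result >> val_shift);
    case type_bool:
        return snprintf(buf, len, "#%c", result >> val_shift ? 't' : 'f');
    }
    return -1;
}
```

Because the mask is a single bit, the two cases are exhaustive, which is presumably why the old "unknown value" branch was dropped. Note that `>>` on a negative `int64_t` is implementation-defined in C; both this sketch and `main.c` rely on the arithmetic shift that common compilers provide.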
