Commit a3626f4

Revise Grift notes.
1 parent d6eff2a commit a3626f4

1 file changed

Lines changed: 94 additions & 1 deletion

File tree

www/notes/grift.scrbl

@@ -109,7 +109,8 @@ Binary expressions are easy to deal with at the level of the semantics
109109
and interpreter. However things are more complicated at the level of
110110
the compiler.
111111

112-
To see the problem consider blindly following the pattern we used:
112+
To see the problem consider blindly following the pattern we used (and
113+
ignoring type errors for the moment):
113114

114115
@#reader scribble/comment-reader
115116
(racketblock
@@ -143,8 +144,100 @@ the first subexpression:
143144

144145
Can you think of how this could go wrong?
145146

147+
To come up with a general solution to this problem, we need to save
148+
the result of @racket[_e0] and then retrieve it after computing
149+
@racket[_e1], when it is time to sum.
146150

151+
Note that this issue only comes up when @racket[_e0] is a
152+
@bold{serious} expression, i.e. an expression that must do some
153+
computation. If @racket[_e0] were a literal integer or a variable, we
154+
could emit working code. For example:
147155

148156

157+
@#reader scribble/comment-reader
158+
(racketblock
159+
;; Integer Expr CEnv -> Asm
160+
;; A special case for compiling (+ i0 e1)
161+
(define (compile-+-int i0 e1 c)
162+
(let ((c1 (compile-e e1 c)))
163+
`(,@c1 ; result in rax
164+
(add rax ,(arithmetic-shift i0 imm-shift)))))
165+
166+
;; Variable Expr CEnv -> Asm
167+
;; A special case for compiling (+ x0 e1)
168+
(define (compile-+-var x0 e1 c)
169+
(let ((c1 (compile-e e1 c))
170+
(i (lookup x0 c)))
171+
`(,@c1
172+
(add rax (offset rsp ,(- (add1 i)))))))
173+
)
174+
175+
The latter suggests a general solution could be to transform binary
176+
primitive applications into a @racket[let] form that binds the first
177+
subexpression to a variable and then uses the @racket[compile-+-var]
178+
function above. The idea is that every time the compiler encounters
179+
@racket[(+ _e0 _e1)], we transform it to @racket[(let ((_x _e0)) (+ _x
180+
_e1))]. For this to work out, @racket[_x] needs to be some variable
181+
that doesn't appear free in @racket[_e1]. This transformation is
182+
what's called @bold{ANF} (administrative normal form) and is a widely
183+
used intermediate representation for compilers.
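
To make the rewrite concrete, here is a minimal sketch of the transformation for a single top-level addition. The name @racket[anf-+] is hypothetical and is not part of the Grift compiler; the sketch assumes expressions are plain S-expressions, does not recurse into subexpressions, and leaves literal integers and variables alone since they are not serious expressions.

```racket
#lang racket
;; A sketch only: `anf-+` is a hypothetical helper, not code from the notes.

;; Expr -> Expr
;; Rewrite (+ e0 e1) into (let ((x e0)) (+ x e1)) when e0 is serious.
(define (anf-+ e)
  (match e
    [(list '+ e0 e1)
     (if (or (integer? e0) (symbol? e0))
         e                                  ; e0 is not serious: nothing to save
         (let ((x (gensym 'tmp)))           ; fresh, so x cannot be free in e1
           `(let ((,x ,e0)) (+ ,x ,e1))))]
    [_ e]))

(anf-+ '(+ 1 y))          ; unchanged: (+ 1 y)
(anf-+ '(+ (add1 2) y))   ; a let binding a fresh name to (add1 2)
```

Using @racket[gensym] is one simple way to satisfy the freshness requirement; a full ANF pass would also have to handle nested expressions and the other primitives.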
184+
185+
186+
But, we can also solve the problem more directly by considering the
187+
code that is generated for the ANF style expression above.
188+
189+
Consider the lexical address of @racket[_x] in the transformed code
190+
above. It is @emph{always} 0 because the transformation puts the
191+
@racket[let] immediately around the occurrence of @racket[_x]. So if
192+
we're compiling @racket[(+ _e0 _e1)] in environment @racket[_c] using
193+
this approach, we know the value of @racket[_e0] will live at
194+
@racket[`(offset rsp ,(- (add1 (length c))))]. There's no need for a
195+
@racket[let] binding or a fresh variable name. And this observation
196+
enables us to write a general purpose compiler for binary primitives
197+
that doesn't require any program transformation: we simply push the
198+
value of @racket[_e0] on the top of the stack and retrieve it later.
199+
200+
Here is a first cut:
201+
202+
@#reader scribble/comment-reader
203+
(racketblock
204+
;; Expr Expr CEnv -> Asm
205+
(define (compile-+ e0 e1 c)
206+
(let ((x (gensym))) ; generate a fresh variable
207+
(let ((c0 (compile-e e0 c))
208+
(c1 (compile-e e1 (cons x c))))
209+
`(,@c0
210+
(mov (offset rsp ,(- (add1 (length c)))) rax)
211+
,@c1
212+
(add rax (offset rsp ,(- (add1 (lookup x (cons x c))))))))))
213+
)
214+
215+
There are a couple things to notice. First: the @racket[(lookup x
216+
(cons x c))] just produces @racket[(length c)]. Second, when
217+
compiling @racket[_e1] in environment @racket[(cons x c)], we know
218+
that no variable in @racket[_e1] resolves to @racket[x] because
219+
@racket[x] is a freshly @racket[gensym]'d symbol. Putting (an
220+
unreferenced) @racket[x] in the environment serves only to ``bump up''
221+
by one the offset of any variable bound after @racket[x] so as to not
222+
overwrite the spot where @racket[_e0]'s value lives. We can accomplish
223+
the same thing by sticking in something that no variable is equal to:
224+
@racket[#f]:
225+
226+
@#reader scribble/comment-reader
227+
(racketblock
228+
;; Expr Expr CEnv -> Asm
229+
(define (compile-+ e0 e1 c)
230+
(let ((c0 (compile-e e0 c))
231+
(c1 (compile-e e1 (cons #f c))))
232+
`(,@c0
233+
(mov (offset rsp ,(- (add1 (length c)))) rax)
234+
,@c1
235+
(add rax (offset rsp ,(- (add1 (length c))))))))
236+
)
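
The effect of the @racket[#f] placeholder on stack offsets can be checked in isolation. The definition of @racket[lookup] below is an assumption, reconstructed from the claim that @racket[(lookup x (cons x c))] produces @racket[(length c)]; the real definition is in the compiler source and may differ in detail. The environment @racket[c] and the variable names are made up for illustration.

```racket
#lang racket
;; Assumed definition of lookup (see the lead-in); not copied from compile.rkt.

;; Variable CEnv -> Natural
;; Position of x in c, counted from the oldest binding.
(define (lookup x c)
  (match c
    ['() (error "unbound variable" x)]
    [(cons y more)
     (if (eq? x y)
         (length more)
         (lookup x more))]))

(define c '(y z))                     ; hypothetical environment, y bound last

(lookup 'y c)                         ; 1
(lookup 'y (cons #f c))               ; still 1: existing variables do not move
(lookup 'w (cons 'w (cons #f c)))     ; 3: w skips slot 2, which holds e0's value
```

Since no variable is @racket[eq?] to @racket[#f], the placeholder can never be looked up; it only reserves slot @racket[(length c)] so that bindings introduced while compiling @racket[_e1] land one slot further down.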
237+
238+
239+
240+
The complete code for the compiler is:
241+
149242

150243
@codeblock-include["grift/compile.rkt"]
