|
| 1 | +#lang scribble/manual |
| 2 | +@title[#:tag "Assignment 3" #:style 'unnumbered]{Assignment 3: Conditional forms and parsing} |
| 3 | + |
| 4 | +@(require (for-label (except-in racket ...))) |
| 5 | +@(require "../notes/con-plus/semantics.rkt") |
| 6 | +@(require redex/pict) |
| 7 | + |
| 8 | +@bold{Due: Tues, Sept 17, 11:59PM} |
| 9 | + |
| 10 | +@(define repo "https://classroom.github.com/a/pxQO5YFd") |
| 11 | + |
| 12 | +The goal of this assignment is to extend a compiler with some simple |
| 13 | +unary numeric operations and conditional expressions, and to write a |
| 14 | +parser. |
| 15 | + |
| 16 | +Assignment repository: |
| 17 | +@centered{@link[repo repo]} |
| 18 | + |
| 19 | +You are given a repository with a starter compiler for the Con |
| 20 | +language we studied in class. You are tasked with: |
| 21 | + |
| 22 | +@itemlist[ |
| 23 | +@item{extending the language in a number of ways and} |
| 24 | +@item{implementing a parser for the language} |
| 25 | +] |
| 26 | + |
| 27 | +@section[#:tag-prefix "a3-" #:style 'unnumbered]{More primitives} |
| 28 | + |
| 29 | +Add the following forms of expression to the Con language: |
| 30 | + |
| 31 | +@itemlist[ |
| 32 | +@item{@racket[(abs _e)]: compute the absolute value of @racket[_e], and} |
| 33 | +@item{@racket[(- _e)]: flip the sign of @racket[_e], i.e., compute @math{0-@racket[_e]}.} |
| 34 | +] |
| 35 | + |
| 36 | +Here's one possible way to compute the absolute value of the value in @tt{rax}: |
| 37 | + |
| 38 | +@verbatim{ |
| 39 | +mov rbx, rax |
| 40 | +neg rax |
| 41 | +cmovl rax, rbx |
| 42 | +} |
| 43 | + |
| 44 | +To do this, you should: |
| 45 | +@itemlist[ |
| 46 | +@item{Update @tt{ast.rkt} to include these new forms of expression.} |
| 47 | +@item{Update @tt{syntax.rkt} to recognize s-expressions that represent valid programs.} |
| 48 | +@item{Update @tt{interp.rkt} to correctly interpret these expressions.} |
| 49 | +@item{Update @tt{compile.rkt} to correctly compile these expressions.} |
| 50 | +] |
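
As a rough illustration of the intended meaning of the two new primitives, here is a minimal sketch of an interpreter helper. It assumes primitives are represented as plain symbols; the name interp-prim is hypothetical and the starter ast.rkt and interp.rkt may organize things differently.

```racket
#lang racket
;; Hypothetical helper, not part of the starter code.
;; interp-prim : Symbol Integer -> Integer
;; Apply a unary primitive (named by a symbol) to an integer.
(define (interp-prim p v)
  (match p
    ['add1 (add1 v)]
    ['sub1 (sub1 v)]
    ['abs  (if (negative? v) (- v) v)] ; absolute value
    ['-    (- 0 v)]))                  ; unary minus, i.e. 0 - v
```

The compiled code should agree with this behavior on every input, which is what the random tests check.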
| 51 | + |
| 52 | +The @tt{neg} and @tt{cmovl} instructions have been included in the |
| 53 | +given @tt{asm} code. If you need other x86 instructions, you will |
| 54 | +need to modify the @tt{asm/*} code. |
| 55 | + |
| 56 | + |
| 57 | +@section[#:tag-prefix "a3-" #:style 'unnumbered]{Con with Cond} |
| 58 | + |
| 59 | +The Con language we studied added a simple form of conditional |
| 60 | +evaluation of sub-expressions: |
| 61 | + |
| 62 | +@racketblock[ |
| 63 | +(if (zero? _e0) _e1 _e2) |
| 64 | +] |
| 65 | + |
| 66 | +However, in the original paper on Lisp, |
| 67 | +@link["http://jmc.stanford.edu/articles/recursive.html"]{@emph{Recursive |
| 68 | +Functions of Symbolic Expressions and Their Computation by Machine, |
| 69 | +Part I}}, John McCarthy introduced a generalization of @racket[if] |
| 70 | +called ``conditional expressions,'' which we could add to our |
| 71 | +language with the following syntax: |
| 72 | + |
| 73 | +@racketblock[ |
| 74 | +(cond [(zero? _e-p1) _e-a1] |
| 75 | + ... |
| 76 | + [else _e-an]) |
| 77 | +] |
| 78 | + |
| 79 | +A @racket[cond] expression has any number of clauses @racket[[(zero? |
| 80 | +_e-pi) _e-ai] ...], followed by an ``else'' clause @racket[[else |
| 81 | +_e-an]]. (We are using @racket[zero?] here to avoid having to |
| 82 | +introduce booleans as a distinct data type, a restriction we |
| 83 | +will remove later.) |
| 84 | + |
| 85 | +The meaning of a @racket[cond] expression is computed by evaluating |
| 86 | +each @racket[(zero? _e-pi)] in order until the first one that is true |
| 87 | +is found, in which case, the corresponding @racket[_e-ai] is evaluated |
| 88 | +and its value is the value of the @racket[cond] expression. If no |
| 89 | +such @racket[_e-pi] exists, @racket[_e-an]'s value is the value of the |
| 90 | +@racket[cond]. |
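
The evaluation order described above can be sketched as a small interpreter. This is only an illustration under the assumption that programs are interpreted directly as s-expressions; the starter code may use a separate AST, and the function name interp here is not taken from the repository.

```racket
#lang racket
;; Sketch only: works on raw s-expressions, not on the starter AST.
;; interp : S-Expr -> Integer
(define (interp e)
  (match e
    [(? integer? i) i]
    [`(add1 ,e0) (add1 (interp e0))]
    [`(sub1 ,e0) (sub1 (interp e0))]
    [`(if (zero? ,e0) ,e1 ,e2)
     (if (zero? (interp e0)) (interp e1) (interp e2))]
    ;; Only the else clause is left: its value is the result.
    [`(cond [else ,en]) (interp en)]
    ;; Otherwise test the first clause; on failure, continue with the rest.
    [`(cond [(zero? ,ep) ,ea] ,cs ...)
     (if (zero? (interp ep))
         (interp ea)
         (interp `(cond ,@cs)))]))
```

For example, in @racket[(cond [(zero? 1) 2] [(zero? 0) 3] [else 4])] the first question is false and the second is true, so the result is 3 and the else clause is never evaluated.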
| 91 | + |
| 92 | +The formal semantics can be defined as: |
| 93 | + |
| 94 | +@(define ((rewrite s) lws) |
| 95 | + (define lhs (list-ref lws 2)) |
| 96 | + (define rhs (list-ref lws 3)) |
| 97 | + (list "" lhs (string-append " " (symbol->string s) " ") rhs "")) |
| 98 | + |
| 99 | +@(require (only-in racket add-between)) |
| 100 | +@(define-syntax-rule (show-judgment name i j) |
| 101 | + (with-unquote-rewriter |
| 102 | + (lambda (lw) |
| 103 | + (build-lw (lw-e lw) (lw-line lw) (lw-line-span lw) (lw-column lw) (lw-column-span lw))) |
| 104 | + (with-compound-rewriters (['+ (rewrite '+)] |
| 105 | + ['- (rewrite '–)] |
| 106 | + ['= (rewrite '=)] |
| 107 | + ['!= (rewrite '≠)]) |
| 108 | + (apply centered |
| 109 | + (add-between |
| 110 | + (build-list (- j i) |
| 111 | + (λ (n) (begin (judgment-form-cases (list (+ n i))) |
| 112 | + (render-judgment-form name)))) |
| 113 | + (hspace 4)))))) |
| 114 | + |
| 115 | +@(show-judgment 𝑪 0 1) |
| 116 | +@(show-judgment 𝑪 1 2) |
| 117 | + |
| 118 | + |
| 119 | +Your task is to extend Con with this (restricted) form of @racket[cond]. |
| 120 | + |
| 121 | +To do this, you should: |
| 122 | + |
| 123 | +@itemlist[ |
| 124 | +@item{Update @tt{ast.rkt} to include @racket[cond] expressions.} |
| 125 | +@item{Update @tt{syntax.rkt} to recognize s-expressions that represent valid programs.} |
| 126 | +@item{Update @tt{interp.rkt} to correctly interpret @racket[cond] expressions.} |
| 127 | +@item{Update @tt{compile.rkt} to correctly compile @racket[cond] expressions.} |
| 128 | +] |
| 129 | + |
| 130 | +The subset of x86 needed to compile this extension of Con should not |
| 131 | +require anything more than what was used for Con without |
| 132 | +@racket[cond], so you should not need to make changes to @tt{asm/*}. |
| 133 | + |
| 134 | +@section[#:tag-prefix "a3-" #:style 'unnumbered]{Reading is Overrated} |
| 135 | + |
| 136 | +We have so far side-stepped the issue of parsing by (1) relying on |
| 137 | +s-expression notation for the concrete syntax of programs and (2) |
| 138 | +using the built-in @racket[read] function for parsing s-expressions. |
| 139 | + |
| 140 | +Your task is to design and implement a parser for the extended Con |
| 141 | +language based on the following grammar: |
| 142 | + |
| 143 | +@verbatim{ |
| 144 | +<expr> ::= integer |
| 145 | + | ( <compound> ) |
| 146 | + | [ <compound> ] |
| 147 | + |
| 148 | +<compound> ::= <prim> <expr> |
| 149 | + | if <question> <expr> <expr> |
| 150 | + | cond <clause>* <else> |
| 151 | + |
| 152 | +<prim> ::= add1 | sub1 | abs | - |
| 153 | + |
| 154 | +<clause> ::= ( <question> <expr> ) |
| 155 | + | [ <question> <expr> ] |
| 156 | + |
| 157 | +<question> ::= ( zero? <expr> ) |
| 158 | + | [ zero? <expr> ] |
| 159 | + |
| 160 | +<else> ::= ( else <expr> ) |
| 161 | + | [ else <expr> ] |
| 162 | +} |
| 163 | + |
| 164 | +There is a lexer given to you in @tt{lex.rkt}, which provides two |
| 165 | +functions: @racket[lex-string] and @racket[lex-port], which consume a |
| 166 | +string or an input port, respectively, and produce a list of tokens, |
| 167 | +which are defined as: |
| 168 | + |
| 169 | +@#reader scribble/comment-reader |
| 170 | +(racketblock |
| 171 | +;; type Token = |
| 172 | +;; | Integer |
| 173 | +;; | 'add1 |
| 174 | +;; | 'sub1 |
| 175 | +;; | 'zero? |
| 176 | +;; | 'cond |
| 177 | +;; | 'else |
| 178 | +;; | 'abs |
| 179 | +;; | '- |
| 180 | +;; | 'lparen ;; ( |
| 181 | +;; | 'rparen ;; ) |
| 182 | +;; | 'lsquare ;; [ |
| 183 | +;; | 'rsquare ;; ] |
| 184 | +;; | 'eof ;; end of file |
| 185 | +) |
| 186 | + |
| 187 | +The lexer will take care of reading the @tt{#lang racket} header and |
| 188 | +removing any whitespace. |
| 189 | + |
| 190 | +You must complete the code in @tt{parse.rkt} to implement the parser |
| 191 | +which constructs an s-expression representing a valid (extended) Con |
| 192 | +expression, if possible, from a list of tokens. The @racket[parse] |
| 193 | +function should have the following signature and must be provided by |
| 194 | +the module: |
| 195 | + |
| 196 | +@#reader scribble/comment-reader |
| 197 | +(racketblock |
| 198 | +;; parse : [Listof Token] -> Expr |
| 199 | +) |
| 200 | + |
| 201 | +As an example, @racket[parse] should produce @racket['(add1 (sub1 7))] |
| 202 | +if given @racket['(lparen add1 lparen sub1 7 rparen rparen eof)]. |
| 203 | + |
| 204 | +You should not need to make any changes to @tt{lex.rkt}. |
| 205 | + |
| 206 | +You may use any approach you'd like to write the parser, but following |
| 207 | +the recursive-descent predictive parsing approach studied in CMSC 330 is |
| 208 | +recommended. See the |
| 209 | +@link["http://www.cs.umd.edu/class/spring2019/cmsc330/lectures/04-parsing.pdf"]{slides} |
| 210 | +if you need a refresher. |
| 211 | + |
| 212 | +If you want to set things up as done in 330, you can do the following: |
| 213 | + |
| 214 | +@#reader scribble/comment-reader |
| 215 | +(racketblock |
| 216 | +(define *input* (box '())) |
| 217 | + |
| 218 | +;; [Listof Token] -> Expr |
| 219 | +(define (parse lot) |
| 220 | + (set-box! *input* lot) |
| 221 | + (let ((e (parse-expr!)) |
| 222 | + (_ (match-tok! 'eof))) |
| 223 | + e)) |
| 224 | + |
| 225 | +;; -> Expr |
| 226 | +;; EFFECT: consume one expression's worth of tokens |
| 227 | +(define (parse-expr!) |
| 228 | + (match (look-ahead) |
| 229 | + [... ...])) |
| 230 | + |
| 231 | +;; -> Token |
| 232 | +;; Produce (but don't consume) the next token |
| 233 | +(define (look-ahead) |
| 234 | + (match (unbox *input*) |
| 235 | + ['() (error "no look ahead available")] |
| 236 | + [(cons t _) t])) |
| 237 | + |
| 238 | +;; Token -> Token |
| 239 | +;; EFFECT: consumes one token of input |
| 240 | +(define (match-tok! t) |
| 241 | + (match (unbox *input*) |
| 242 | + ['() (error "no token available")] |
| 243 | + [(cons next ts) |
| 244 | + (set-box! *input* ts) |
| 245 | + (unless (equal? t next) |
| 246 | + (error "parse error")) |
| 247 | + t])) |
| 248 | +) |
| 249 | + |
| 250 | +The @racket[box], @racket[unbox], and @racket[set-box!] functions |
| 251 | +correspond to OCaml's @tt{ref}, @tt{!}, and @tt{:=} operators, |
| 252 | +respectively. |
| 253 | + |
| 254 | +The @tt{bang!} naming convention is a Scheme convention for marking |
| 255 | +effectful functions (but it's just a naming convention). |
| 256 | + |
| 257 | +This construction closely follows the 330 notes. |
| 258 | + |
| 259 | +There is one complication, which is that the grammar requires 2 tokens |
| 260 | +of look-ahead when parsing a @racket[cond] in order to determine if |
| 261 | +the next thing to parse is a @tt{<clause>} or an @tt{<else>}. |
| 262 | + |
| 263 | +The simplest solution is just to add a @racket[look-ahead2] function |
| 264 | +that lets you peek at the second token in the input stream. |
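
One possible shape for such a function is sketched below. It assumes, as in the set-up above, that @racket[*input*] is a box holding the list of remaining tokens; the error message is made up for illustration.

```racket
#lang racket
;; Assumed to mirror the imperative set-up: a box of remaining tokens.
(define *input* (box '()))

;; -> Token
;; Produce (but don't consume) the second token of the input
(define (look-ahead2)
  (match (unbox *input*)
    [(cons _ (cons t _)) t]
    [_ (error "no second look ahead available")]))
```

When parsing the clauses of a @racket[cond], the first token is always a left paren or left square bracket, so it is the second token (@racket['else] or not) that tells a @tt{<clause>} from an @tt{<else>}.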
| 265 | + |
| 266 | +As an alternative to the 330 design, you could try to do things |
| 267 | +functionally with the following set-up: |
| 268 | + |
| 269 | +@#reader scribble/comment-reader |
| 270 | +(racketblock |
| 271 | +;; [Listof Token] -> Expr |
| 272 | +(define (parse lot) |
| 273 | + (match (parse-expr lot) |
| 274 | + [(cons '(eof) e) e] |
| 275 | + [_ (error "parse error")])) |
| 276 | + |
| 277 | +;; [Listof Token] -> (Pairof [Listof Token] Expr) |
| 278 | +(define (parse-expr lot) |
| 279 | + (match lot |
| 280 | + [... ...])) |
| 281 | +) |
| 282 | + |
| 283 | +Here the idea is that each function that corresponds to a non-terminal |
| 284 | +is given the list of tokens to parse. It produces a pair of things: |
| 285 | +the remaining tokens after parsing and the thing it parsed. (The |
| 286 | +functional approach is much easier to test, IMO.) |
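
To make the functional style concrete, here is a deliberately incomplete sketch that handles only integers and parenthesized @racket[add1] and @racket[sub1] applications. Square brackets, the other primitives, @racket[if], and @racket[cond] are left out, and the way results are threaded through is just one option.

```racket
#lang racket
;; [Listof Token] -> Expr
(define (parse lot)
  (match (parse-expr lot)
    [(cons '(eof) e) e]
    [_ (error "parse error")]))

;; [Listof Token] -> (Pairof [Listof Token] Expr)
;; Illustrative fragment: integers and (add1 e) / (sub1 e) only.
(define (parse-expr lot)
  (match lot
    ;; An integer token is an expression on its own.
    [(cons (? integer? i) ts) (cons ts i)]
    ;; "(" prim: parse the argument, then require the closing ")".
    [(cons 'lparen (cons (and p (or 'add1 'sub1)) ts))
     (match (parse-expr ts)
       [(cons (cons 'rparen ts2) e) (cons ts2 (list p e))]
       [_ (error "parse error")])]
    [_ (error "parse error")]))
```

Because each function returns the leftover tokens instead of mutating shared state, a test can call @racket[parse-expr] on any token list and compare the resulting pair directly.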
| 287 | + |
| 288 | + |
| 289 | +Once your parser is complete, you can make the noted changes in |
| 290 | +@tt{compile-file.rkt} and @tt{interp-file.rkt} to make use of your own |
| 291 | +parser and remove the dependence on Racket's @racket[read] function. |
| 292 | + |
| 293 | +@section[#:tag-prefix "a3-" #:style 'unnumbered]{Testing} |
| 294 | + |
| 295 | +You can test your code in several ways: |
| 296 | + |
| 297 | +@itemlist[ |
| 298 | + |
| 299 | + @item{Using the command line @tt{raco test .} from |
| 300 | + the directory containing the repository to test everything.} |
| 301 | + |
| 302 | + @item{Using the command line @tt{raco test <file>} to |
| 303 | + test only @tt{<file>}.} |
| 304 | + |
| 305 | + @item{Pushing to GitHub. You can |
| 306 | + see test reports at: |
| 307 | + @centered{@link["https://travis-ci.com/cmsc430/"]{ |
| 308 | + https://travis-ci.com/cmsc430/}} |
| 309 | + |
| 310 | + (You will need to be signed in in order to see results for your private repo.)}] |
| 311 | + |
| 312 | +Note that only a small number of tests are given to you, so you should |
| 313 | +write additional test cases. |
| 314 | + |
| 315 | +The @tt{random.rkt} module provides a @racket[random-expr] function |
| 316 | +for generating random (extended) Con expressions. It is used in the |
| 317 | +@tt{test/compile-rand.rkt} file to randomly test compiler correctness. |
| 318 | + |
| 319 | +There is a property-based random tester for the compiler in |
| 320 | +@tt{test/compile-rand.rkt} that compiles and runs 500 random programs. |
| 321 | + |
| 322 | +@section[#:tag-prefix "a3-" #:style 'unnumbered]{Submitting} |
| 323 | + |
| 324 | +Pushing your local repository to GitHub ``submits'' your work. We |
| 325 | +will grade the latest submission that occurs before the deadline. |
| 326 | + |