Commit 753d1cf

Add graphviz notes
1 parent 99db851 commit 753d1cf

1 file changed

Lines changed: 248 additions & 0 deletions

www/notes/graphviz.scrbl

@@ -0,0 +1,248 @@
1+
#lang scribble/manual
2+
3+
@(require (for-label (except-in racket ...)))
4+
@(require redex/pict
5+
racket/runtime-path
6+
scribble/examples
7+
"utils.rkt"
8+
"ev.rkt"
9+
"../utils.rkt")
10+
11+
@(define codeblock-include (make-codeblock-include #'h))
12+
13+
@(for-each (λ (f) (ev `(require (file ,(path->string (build-path notes "knock" f))))))
14+
'("interp.rkt" "compile.rkt" "ast.rkt" "syntax.rkt" "asm/interp.rkt" "asm/printer.rkt"))
15+
16+
@title[#:tag "Graphviz"]{Using Graphviz/dot to visualize our AST}
17+
18+
@table-of-contents[]
19+
20+
@section[#:tag-prefix "graphviz"]{Visualizing ASTs}
21+
22+
Abstract Syntax Trees (ASTs) are a useful abstraction when dealing with
23+
programming languages as an object for analysis or manipulation (e.g.
24+
compilation). At the same time, these structures can quickly become
25+
too large to reason about just by looking at them. For example, in Knock,
26+
our AST for @racket[(if (zero? x) (add1 (add1 1)) (sub1 x))] looks
27+
like the following:
28+
29+
@#reader scribble/comment-reader
30+
(racketblock
31+
(if-e
32+
(prim-e 'zero? (list (var-e 'x)))
33+
(prim-e 'add1 (list (prim-e 'add1 (list (int-e 1)))))
34+
(prim-e 'sub1 (list (var-e 'x))))
35+
)
36+
37+
This has all the information necessary for manipulating our program (and more),
38+
but it's a bit unwieldy to look at. Particularly when debugging, it can
39+
be useful to see the overall @emph{shape} of the AST. This is particularly
40+
true when speaking about program transformations.
41+
42+
Take, for example, the program transformation from the first Midterm. Applying
43+
that transformation (updated for Knock) to the above program results in
44+
the following AST:
45+
46+
@#reader scribble/comment-reader
47+
(racketblock
48+
(if-e
49+
(prim-e 'zero? (list (var-e 'x)))
50+
(let-e
51+
(list (binding 'g387 (prim-e 'add1 (list (int-e 1)))))
52+
(prim-e 'add1 (list (var-e 'g387))))
53+
(prim-e 'sub1 (list (var-e 'x)))))
54+
55+
Was the program transformation done correctly? If you study the AST
56+
carefully, you can determine that it was. However, it would be easier
57+
if we could, at a glance, answer the question ``Are primitive operations
58+
only applied to simple (i.e. not nested) expressions?''
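
That question can also be answered mechanically with a small predicate over
the AST. The sketch below is only an illustration: the struct definitions are
assumptions that mirror the constructors shown above (the actual definitions
in @tt{ast.rkt} may differ), and @tt{simple?} and @tt{prims-simple?} are
hypothetical helper names.

@codeblock|{
#lang racket
;; Hypothetical struct definitions mirroring the constructors used above;
;; the course's ast.rkt may define them differently.
(struct int-e (i) #:transparent)
(struct var-e (v) #:transparent)
(struct prim-e (p es) #:transparent)
(struct if-e (e t f) #:transparent)
(struct let-e (bs b) #:transparent)
(struct binding (v e) #:transparent)

;; simple? : is this an expression with no sub-expressions?
(define (simple? e)
  (or (int-e? e) (var-e? e)))

;; prims-simple? : is every primitive applied only to simple arguments?
(define (prims-simple? e)
  (match e
    [(prim-e _ es) (andmap simple? es)]
    [(if-e c t f)
     (and (prims-simple? c) (prims-simple? t) (prims-simple? f))]
    [(let-e bs b)
     (and (andmap (λ (bd) (prims-simple? (binding-e bd))) bs)
          (prims-simple? b))]
    [_ #t]))

;; The AST before the transformation (nested add1).
(define before
  (if-e (prim-e 'zero? (list (var-e 'x)))
        (prim-e 'add1 (list (prim-e 'add1 (list (int-e 1)))))
        (prim-e 'sub1 (list (var-e 'x)))))

;; The AST after the transformation (inner add1 bound by a let).
(define after
  (if-e (prim-e 'zero? (list (var-e 'x)))
        (let-e (list (binding 'g387 (prim-e 'add1 (list (int-e 1)))))
               (prim-e 'add1 (list (var-e 'g387))))
        (prim-e 'sub1 (list (var-e 'x)))))

(prims-simple? before) ; #f
(prims-simple? after)  ; #t
}|

A predicate like this gives a yes-or-no answer, but it does not show
@emph{where} a violation occurs, which is what a diagram is good at.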
59+
60+
Using diagrams makes answering this question marginally easier:
61+
62+
Before transformation:
63+
64+
@image{img/initial.png}
65+
66+
After transformation:
67+
68+
@image{img/transformed.png}
69+
70+
The diagram above helps us visualize the transformed AST, but we still
71+
have to study the diagram carefully to know which nodes correspond
72+
to primitive operations (which are the subject of the transformation).
73+
This is easily remedied by coloring these nodes differently:
74+
75+
Before transformation:
76+
77+
@image{img/initial-v2.png}
78+
79+
After transformation:
80+
81+
@image{img/transformed-v2.png}
82+
83+
These diagrams were made using @tt{dot}, a tool provided by
84+
@link["https://graphviz.org/"]{Graphviz}, which is a set of software components
85+
for visualizing graphs.
86+
87+
@section[#:tag-prefix "graphviz"]{Using dot}
88+
89+
Graphviz has many components, but we will focus on @tt{dot}, which is
90+
the tool for laying out directed graphs. The full manual for @tt{dot}
91+
can be found on the Graphviz website:
92+
@link["https://www.graphviz.org/pdf/dotguide.pdf"]{https://www.graphviz.org/pdf/dotguide.pdf}.
93+
94+
Instructions for downloading Graphviz (and therefore @tt{dot}) can be found on
95+
their website as well:
96+
@link["https://www.graphviz.org/download/"]{https://www.graphviz.org/download/}
97+
98+
The syntax for @tt{dot} files is fairly straightforward: you first declare the
99+
type of graph, and give it a name. For our purposes the type will always be
100+
@tt{digraph} (i.e. directed graph), and the name can be whatever you choose
101+
(though it will likely not matter much). For example:
102+
103+
@verbatim|{
104+
digraph CMSC430 {
105+
...
106+
}
107+
}|
108+
109+
The ellipsis is where you describe the graph you'd like to visualize. The
110+
designers of Graphviz provide a grammar describing the language accepted by
111+
their tools (I wish all system designers provided a grammar!). This can be
112+
found on the Graphviz website:
113+
@link["https://graphviz.org/doc/info/lang.html"]{https://graphviz.org/doc/info/lang.html}
114+
115+
Most of the time you will not need to consult the grammar, as most of the
116+
simple rules are straightforward for those who have programmed in C or Java.
117+
118+
In short, the description of a graph is a list of statements. Statements can
119+
take many forms, but for this course (and most likely for any uses beyond this
120+
course), you can basically just use the following three types of statements:
121+
122+
123+
@itemlist[
124+
125+
@item{Node statements}
126+
@item{Edge statements}
127+
@item{Attribute statements}
128+
129+
]
130+
131+
132+
Node statements are just an ASCII string (representing a Node ID) and an
133+
optional list of attributes for that node. For example:
134+
135+
@verbatim|{
136+
digraph CMSC430 {
137+
lexer;
138+
parser [shape=box];
139+
code_gen [color=red];
140+
}
141+
}|
142+
143+
Using the @tt{dot} tool on a file with the above as its contents produces the
144+
following diagram:
145+
146+
@image{img/nodes.png}
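
If you would rather not leave Racket to run the tool, one option is to invoke
it as a subprocess. The sketch below assumes that the @tt{dot} executable is
on your @tt{PATH}; @tt{render-dot} and the file names in the usage note are
hypothetical. The @tt{-Tpng} flag selects PNG output and @tt{-o} names the
output file.

@codeblock|{
#lang racket
;; render-dot : path-string path-string -> boolean
;; Runs `dot -Tpng <dot-file> -o <png-file>` and returns #t if dot
;; exits successfully. Assumes the dot executable is on the PATH.
(define (render-dot dot-file png-file)
  (define dot (find-executable-path "dot"))
  (unless dot
    (error 'render-dot "could not find dot on the PATH"))
  (system* dot "-Tpng" dot-file "-o" png-file))
}|

For example, @tt{(render-dot "nodes.dot" "nodes.png")} would render a file
named @tt{nodes.dot} in the current directory, if one exists.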
147+
148+
Edge statements connect nodes in our graph, for example:
149+
150+
@verbatim|{
151+
digraph CMSC430 {
152+
lexer -> parser -> code_gen;
153+
parser [shape=box];
154+
code_gen [color=red];
155+
}
156+
}|
157+
158+
This produces the following diagram:
159+
160+
@image{img/edges1.png}
161+
162+
You may wonder if the order matters here. While the @emph{horizontal} order
163+
matters when specifying the edges in an edge statement, the @emph{vertical}
164+
order does not matter in this case. The following produces the same diagram:
165+
166+
@verbatim|{
167+
digraph CMSC430 {
168+
parser [shape=box];
169+
code_gen [color=red];
170+
lexer -> parser -> code_gen;
171+
}
172+
}|
173+
174+
Notice that @tt{lexer} does not have its own `declaration'; this is because it
175+
is unnecessary unless you want to attach attributes to a node (as we do
176+
with @tt{parser} and @tt{code_gen}).
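
Because these statements are plain text, they are easy to produce from a
program. The following is a minimal sketch, not the library described later in
these notes; @tt{edge-stmt} and @tt{digraph} are hypothetical helper names.

@codeblock|{
#lang racket
;; edge-stmt : (listof symbol) -> string
;; Joins node IDs with "->" to form a single edge statement.
(define (edge-stmt ids)
  (string-append
   (string-join (map symbol->string ids) " -> ")
   ";"))

;; digraph : string (listof string) -> string
;; Wraps a list of statements in a named digraph.
(define (digraph name stmts)
  (string-append "digraph " name " {\n"
                 (string-join stmts "\n")
                 "\n}\n"))

(edge-stmt '(lexer parser code_gen)) ; "lexer -> parser -> code_gen;"

;; Prints a description equivalent to the example above.
(display
 (digraph "CMSC430"
          (list "parser [shape=box];"
                "code_gen [color=red];"
                (edge-stmt '(lexer parser code_gen)))))
}|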
177+
178+
Edge statements also support an optional list of attributes. The following
179+
produces a similar diagram except that both edges are shaded ``deeppink2'' (for
180+
the full list of supported colors, see the official documentation).
181+
182+
@verbatim|{
183+
digraph CMSC430 {
184+
lexer -> parser -> code_gen [color=deeppink2];
185+
parser [shape=box];
186+
code_gen [color=red];
187+
}
188+
}|
189+
190+
Attribute statements describe a set of attributes that apply to all subsequent
191+
statements (which means that vertical order @emph{does} matter here!). Unless
192+
overridden by a specific attribute, all statements following an attribute
193+
statement will `default' to the attributes specified in the statement.
194+
195+
Here we add three attribute statements. Take a minute to study the example
196+
below and see how each attribute statement affects the output.
197+
198+
199+
@verbatim|{
200+
digraph CMSC430 {
201+
edge [color=blue];
202+
lexer -> parser;
203+
edge [color=deeppink2];
204+
node [shape=triangle];
205+
parser -> optimizer;
206+
parser [shape=box];
207+
code_gen [color=red];
208+
optimizer -> code_gen;
209+
}
210+
}|
211+
212+
@image{img/edges3.png}
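
To make the ordering rule concrete, here is a small sketch that models only
the @tt{edge} attribute statements of the example: it walks the statements
top to bottom and records which default color is in effect when each edge is
created. The list encoding of statements and the name @tt{edge-colors} are
made up for this illustration; the starting color is black because that is
Graphviz's default edge color.

@codeblock|{
#lang racket
;; A statement is one of (hypothetical encoding, for illustration only):
;;   (list 'edge-default color)   models  edge [color=...];
;;   (list 'edge from to)         models  from -> to;
;; edge-colors : (listof statement) -> (listof (list from to color))
(define (edge-colors stmts)
  (define-values (result final-color)
    (for/fold ([acc '()] [color "black"])
              ([s (in-list stmts)])
      (match s
        ;; A new default replaces the old one for later edges only.
        [(list 'edge-default c) (values acc c)]
        ;; An edge picks up whatever default is currently in effect.
        [(list 'edge from to)
         (values (cons (list from to color) acc) color)])))
  (reverse result))

(edge-colors '((edge-default "blue")
               (edge lexer parser)
               (edge-default "deeppink2")
               (edge parser optimizer)
               (edge optimizer code_gen)))
;; '((lexer parser "blue")
;;   (parser optimizer "deeppink2")
;;   (optimizer code_gen "deeppink2"))
}|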
213+
214+
215+
@section[#:tag-prefix "graphviz"]{Using graphviz programmatically}
216+
217+
What we've done is write a small Racket library that abstracts away some of the
218+
details of making @tt{dot} diagrams so that we can automatically generate
219+
diagrams from our AST. One such detail is that we have to generate unique node
220+
IDs for each node in our AST (we do this using @tt{gensym}), but then add
221+
attributes that label our nodes with the relevant information (e.g. that it's
222+
an @tt{if} node).
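
The idea can be sketched in a few lines. This is not the code from the
library included at the end of these notes: the structs below are hypothetical
stand-ins for the ones in @tt{ast.rkt}, only four kinds of node are handled,
and @tt{ast->stmts} is a made-up name. Each call generates a fresh ID with
@tt{gensym}, emits a labelled node statement, and adds one edge statement per
child.

@codeblock|{
#lang racket
;; Hypothetical AST structs; not the definitions from the course's ast.rkt.
(struct int-e (i))
(struct var-e (v))
(struct prim-e (p es))
(struct if-e (e t f))

;; ast->stmts : expr -> (values symbol (listof string))
;; Returns the fresh ID of the root node together with the node and
;; edge statements describing the whole tree.
(define (ast->stmts e)
  (define id (gensym))
  ;; Node statement for this node; attrs are extra "key=value," strings.
  (define (node label . attrs)
    (format "~a [~alabel=\"~a\"];" id (apply string-append attrs) label))
  ;; Adds each child's statements plus an edge from this node to it.
  (define (with-children self kids)
    (for/fold ([stmts (list self)])
              ([k (in-list kids)])
      (define-values (kid-id kid-stmts) (ast->stmts k))
      (append stmts kid-stmts (list (format "~a -> ~a;" id kid-id)))))
  (values
   id
   (match e
     [(int-e i) (list (node i))]
     [(var-e v) (list (node v))]
     ;; Primitive operations are colored red, as in the diagrams above.
     [(prim-e p es) (with-children (node p "color=red,") es)]
     [(if-e c t f) (with-children (node "if") (list c t f))])))

(define-values (root stmts)
  (ast->stmts (if-e (prim-e 'zero? (list (var-e 'x)))
                    (int-e 1)
                    (int-e 2))))

(displayln (string-append "digraph prog {\n"
                          (string-join stmts "\n")
                          "\n}"))
}|

The generated IDs differ from run to run, which is fine: only the labels are
shown in the rendered diagram.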
223+
224+
Here is an example of a @tt{dot} description made using our library on the program
225+
@racket[(if (zero? x) 1 2)]:
226+
227+
@verbatim|{
228+
digraph prog {
229+
g850 [ label=" x " ];
230+
g849 [ color=red,label=" (zero? ...) " ];
231+
g849 -> g850 ;
232+
g851 [ label=" 1 " ];
233+
g852 [ label=" 2 " ];
234+
g848 [ label=" if " ];
235+
g848 -> g849 ;
236+
g848 -> g851 ;
237+
g848 -> g852 ;
238+
}
239+
}|
240+
241+
Not super nice to read, but we had a program write it for us!
242+
243+
244+
The complete library (three files):
245+
246+
@codeblock-include["knock/dot.rkt"]
247+
@codeblock-include["knock/render-ast.rkt"]
248+
@codeblock-include["knock/pretty-printer.rkt"]
