Showing posts with label visitor pattern. Show all posts
Showing posts with label visitor pattern. Show all posts

Monday, May 05, 2008

A functional visitor pattern

When your data definitions get big enough, it becomes Tedious and Error-Prone (oh my!) to write multiple operations over the same datatype, so it's useful to define a generic traversal. Here's a pattern I used recently for the ECMAScript Edition 4 reference implementation.

Lots of languages have mutually recursive AST definitions. For example, take a language with mutually recursive expressions Ast.expr, statements Ast.stmt, and declarations Ast.decl. Define an interface VISITOR:
signature VISITOR = sig
datatype cmd = Stop | Skip | Cont

type ('a, 'z) method = 'a * 'z -> 'z * cmd

type 'z visitor = { visitExpr : (Ast.expr, 'z) method,
visitStmt : (Ast.stmt, 'z) method,
visitDecl : (Ast.decl, 'z) method }

val foldExpr : 'z visitor * Ast.expr * 'z -> 'z
val foldStmt : 'z visitor * Ast.stmt * 'z -> 'z
val foldDecl : 'z visitor * Ast.decl * 'z -> 'z
end;
A visitor is a record of operations, one for each variant of AST nodes. (Alternatively, we could have defined a single type datatype ast = Expr of ... | Stmt of ... | Decl of ...;. But there's no need for the extra indirection.) Each operation takes a node and an intermediate result, and produces a new intermediate result along with a "traversal command" cmd.

This simple traversal language instructs the fold either to continue (Cont), to skip all descendants of the current node (Skip), or to abort the traversal entirely (Stop).

The implementation of the fold uses a few auxiliary functions to compute a list of child nodes for each node:
type children = Ast.expr list * Ast.stmt list * Ast.decl list

fun exprChildren (e : Ast.expr) : children = ...
fun stmtChildren (e : Ast.stmt) : children = ...
fun declChildren (e : Ast.decl) : children = ...
The fold itself is (internally) in CPS. At each node, there are three potential continuations: the empty continuation for aborting, the given continuation for skipping the current node's children, or a new continuation that first traverses the current node's children and then uses the given continuation.
type 'z cont = 'z -> 'z

fun cont (result : 'z, cmd : cmd)
(skipk : 'z cont)
(contk : 'z cont)
: 'z =
case cmd of
Stop => result
| Skip => skipk result
| Cont => contk result

fun foldExprK (v as { visitExpr, ... }, expr, init) k =
cont (visitExpr (expr, init))
(fn x => foldAllK v (exprChildren expr) x k)
k

and foldStmtK (v as { visitStmt, ... }, stmt, init) k =
cont (visitStmt (stmt, init))
(fn x => foldAllK v (stmtChildren stmt) x k)
k

and foldDeclK (v as { visitDecl, ... }, decl, init) k =
cont (visitDecl (decl, init))
(fn x => foldAllK v (declChildren decl) x k)
k

and foldAllK (v : 'z visitor)
((exprs, stmts, decls) : children)
(init : 'z)
(k : 'z cont)
: 'z =
let
fun foldList f (v, asts, init) k =
case asts of
[] => k init
| [ast] => f (v, ast, init) k
| ast::asts => f (v, ast, init)
(fn x => foldList f (v, asts, x) k)
in
foldList foldExprK (v, exprs, init) (fn result =>
foldList foldStmtK (v, stmts, result) (fn result =>
foldList foldDeclK (v, decls, result) k))
end
I originally had foldList defined at the top-level, in the same mutual recursion group with foldExprK et al, and got smacked by SML because that would require polymorphic recursion. Luckily I was able to work around it by placing its definition inside the body of foldAllK, but since I'd never run into SML's lack of polymorphic recursion before, it took quite a while to decipher the type error (with help from #sml).

Finally, the top-level functions seed the CPS functions with an initial continuation:
fun id x = x

fun foldExpr x = foldExprK x id
fun foldStmt x = foldStmtK x id
fun foldDecl x = foldDeclK x id
(Note that because of the value restriction, there's no possibility of refactoring these into point-free val-declarations.)