SML/NJ includes a quasiquotation facility, but you have to write your own parser for the object language. In this case, my object language was ML itself. So I wanted to reuse the ML parsing libraries that come with SML/NJ. But a quasiquoted source expression is a list of "fragments"--either a (quoted) string or an (unquoted) term:

    datatype 'a frag = QUOTE of string | ANTIQUOTE of 'a

So I needed a way to extend the existing parser to be able to parse not just raw ML source, but ML with holes in it.
My solution is to generate fresh variables at the holes (analogous to "Skolem constants" in a type inference algorithm) to 1) generate a single, parseable string; 2) parse that string using the existing parser; and then 3) replace instances of the fresh variables in the resulting AST with their original unquoted terms. Luckily the parser that comes with SML/NJ (from the MLRISC library) has a generic term-rewriting library so this last step was easy.
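The three steps above can be sketched end to end on a toy object language. This is only an illustration under stated assumptions: the exp datatype, parseExp, subst, and parseQuasi are hypothetical stand-ins invented here for the real ML AST, parser, and rewriter, and gensym is a simple counter-based guess at the helper the post relies on.

```sml
datatype 'a frag = QUOTE of string | ANTIQUOTE of 'a

(* Toy object language (hypothetical): sums of integers and variables. *)
datatype exp = Num of int | Var of string | Add of exp * exp

(* Toy parser for hole-free strings such as "1 + x + 2". *)
fun parseExp (s : string) : exp =
  let
    fun atom tok =
      case Int.fromString tok of
          SOME n => Num n
        | NONE => Var tok
    val toks = String.tokens (fn c => c = #"+" orelse Char.isSpace c) s
  in
    case map atom toks of
        [] => raise Fail "empty expression"
      | e :: es => foldl (fn (r, acc) => Add (acc, r)) e es
  end

(* Fresh-name generator; an assumed stand-in for the post's gensym. *)
val counter = ref 0
fun gensym prefix =
  (counter := !counter + 1; prefix ^ Int.toString (!counter))

(* Step 1: replace each hole with a fresh variable name. *)
fun skolemize [] = ([], [])
  | skolemize (QUOTE s :: rest) =
      let val (ss, sk) = skolemize rest in (s :: ss, sk) end
  | skolemize (ANTIQUOTE x :: rest) =
      let
        val name = gensym "qqSkolem"
        val (ss, sk) = skolemize rest
      in
        (name :: ss, (name, x) :: sk)
      end

(* Step 3: put the original terms back in place of the skolem variables. *)
fun subst sk (Var x) =
      (case List.find (fn (k, _) => k = x) sk of
           SOME (_, e) => e
         | NONE => Var x)
  | subst sk (Add (a, b)) = Add (subst sk a, subst sk b)
  | subst _ e = e

(* Steps 1-3 together; step 2 is the call to the unmodified parser. *)
fun parseQuasi frags =
  let val (ss, sk) = skolemize frags
  in subst sk (parseExp (String.concat ss)) end

val hole = Add (Num 2, Num 3)
val result = parseQuasi [QUOTE "1 + ", ANTIQUOTE hole, QUOTE " + y"]
(* result = Add (Add (Num 1, Add (Num 2, Num 3)), Var "y") *)
```

Note that the unquoted term comes back as a single subtree: it is never printed and reparsed, so it cannot be regrouped by the object language's precedence rules.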
Here's the entire parser:

    fun parseQuasiML frags =
        let fun skolemize [] = ([], [])
              | skolemize ((QUOTE s)::frags) =
                let val (ss, skolems) = skolemize frags
                in
                    (s::ss, skolems)
                end
              | skolemize ((ANTIQUOTE x)::frags) =
                let val skolem = gensym "qqSkolem"
                    val (ss, skolems) = skolemize frags
                in
                    (skolem::ss, (skolem, x)::skolems)
                end
            val (ss, skolems) = skolemize frags
            fun exp _ (e as (IDexp (IDENT ([], x)))) =
                (case lookup (x, skolems) of
                     NONE => e
                   | SOME e' => e')
              | exp _ e = e
            val NIL = MLAstRewriter.noRewrite
            val rw = #exp(MLAstRewriter.rewrite
                              {exp=exp,pat=NIL,decl=NIL,ty=NIL,sexp=NIL})
        in
            rw (parseMLExp (String.concat ss))
        end

The skolemize function traverses the list of fragments, generating fresh "skolem variables" for each unquoted term. The ML parser then parses the resulting string. The last part (the exp function and MLAstRewriter.rewrite) simply uses the rewriter module to replace any instances of IDexp (an AST representing an ML variable reference) that contain skolem variables with their original, unquoted terms, and otherwise leave the expressions intact.
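The parser calls two helpers, gensym and lookup, whose definitions are not shown in the post. A minimal sketch of what they might look like follows; both definitions are assumptions inferred from how the parser uses them, not the original code.

```sml
(* Assumed helper: returns a new name on every call by appending a
   counter to the given prefix. *)
local
  val counter = ref 0
in
  fun gensym prefix =
    let val n = !counter
    in counter := n + 1; prefix ^ Int.toString n end
end

(* Assumed helper: association-list lookup keyed by variable name. *)
fun lookup (_ : string, []) = NONE
  | lookup (key, (k, v) :: rest) =
      if key = k then SOME v else lookup (key, rest)
```

A generator like this only guarantees that the names differ from one another. It does not guarantee that they differ from identifiers already present in the quoted source, so the prefix should be one that is unlikely to occur there.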
Update: Bawden suggests that quasiquotation (or at least the term "quasiquotation") is indeed due to Quine.
3 comments:
This has an interesting relationship to macros, since the places you can unquote in your system are precisely the places that you could write an expression macro in Scheme.
Interestingly enough, there are two axes on which you can extend this, and you get something more expressive (loosely speaking) than Scheme expression macros: the types of holes and the types of fragments.
For the first axis, you create a datatype of holes with multiple variants representing multiple types of holes:

    datatype hole = EXPhole of exp | PAThole of pat

Then you extend parseQuasiML to substitute unquoted EXPholes in expression positions and unquoted PATholes in pattern positions, raising a "dynamic type error" for ill-typed fragments (e.g., ANTIQUOTE (PAThole _) in an expression position).
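The "dynamic type error" could be sketched with a pair of projection functions. The exp and pat types below are toy stand-ins for the real AST types, and HoleType, expOfHole, and patOfHole are hypothetical names introduced only for this illustration.

```sml
(* Toy stand-ins for the real expression and pattern AST types. *)
datatype exp = Var of string | App of exp * exp
datatype pat = WildPat | VarPat of string

datatype hole = EXPhole of exp | PAThole of pat

(* Raised when a hole is unquoted in a position of the wrong syntactic class. *)
exception HoleType of string

(* Used when a skolem variable is found in an expression position. *)
fun expOfHole (EXPhole e) = e
  | expOfHole (PAThole _) =
      raise HoleType "pattern unquoted in expression position"

(* Used when a skolem variable is found in a pattern position. *)
fun patOfHole (PAThole p) = p
  | patOfHole (EXPhole _) =
      raise HoleType "expression unquoted in pattern position"
```

The expression rewriter would call expOfHole on whatever it looks up, and a corresponding pattern rewriter would call patOfHole.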
For the second axis, you extend parseQuasiML to take an extra argument representing which parser you want to use: an expression parser, pattern parser, declaration parser, etc. That way you can quasiquote different types of program terms.
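One way to picture the extra parser argument is a driver that is generic in both the parser and the substitution step. This is a sketch under assumptions: parseQuasiWith, its two parameters, and the substWords example are hypothetical, and holes are numbered by position here rather than with gensym.

```sml
datatype 'a frag = QUOTE of string | ANTIQUOTE of 'a

(* parse turns a hole-free string into some AST; subst replaces skolem
   names inside that AST. Passing a different pair selects a different
   syntactic class (expressions, patterns, declarations, ...). *)
fun parseQuasiWith (parse : string -> 'ast,
                    subst : (string * 'h) list -> 'ast -> 'ast)
                   (frags : 'h frag list) : 'ast =
  let
    fun go (_, []) = ([], [])
      | go (n, QUOTE s :: rest) =
          let val (ss, sk) = go (n, rest) in (s :: ss, sk) end
      | go (n, ANTIQUOTE x :: rest) =
          let
            val name = "qqSkolem" ^ Int.toString n
            val (ss, sk) = go (n + 1, rest)
          in
            (name :: ss, (name, x) :: sk)
          end
    val (ss, sk) = go (0, frags)
  in
    subst sk (parse (String.concat ss))
  end

(* Trivial instance: the "AST" is just a list of whitespace-separated words. *)
fun substWords sk ws =
  map (fn w => case List.find (fn (k, _) => k = w) sk of
                   SOME (_, v) => v
                 | NONE => w) ws

val words =
  parseQuasiWith (String.tokens Char.isSpace, substWords)
                 [QUOTE "a ", ANTIQUOTE "B", QUOTE " c"]
(* words = ["a", "B", "c"] *)
```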
Thank you. Recently, I had to deal with quotation/antiquotation and your posting helped me a lot. Cheers, Wonseok