Tuesday, March 04, 2008

Intentional capture

Procedural hygienic macro systems like the syntax-case system make it possible to write capturing macros--macros which, depending on your philosophy, you might call "non-hygienic." The classic example is the "anaphoric" conditional form if-it, which implicitly binds a variable it to the result of the test expression:
(if-it 42 (+ it 1) #f) ; => 43
The difficulty in getting such a macro right comes when you try to write another macro that expands into if-it. To quote the mzscheme manual's section on macros, "macros that expand into non-hygienic macros rarely work as intended."

Andre van Tonder's SRFI 72 document contains a perfect and concise example, due to Kent Dybvig, of two different ways a macro might expand into a capturing macro. On the one hand, we might want to write when-it, a simple "one-armed" conditional that implicitly binds it in the same way as if-it:
(when-it 42 (+ it 1)) ; => 43
On the other hand, we might want to use if-it to implement the hygienic or macro, which shouldn't capture any variables:
(let ([it 10]) (or #f it)) ; => 10
First, here's the implementation of if-it: we create an identifier for it with the same lexical context as the operator of the expression:
(define-syntax (if-it stx)
  (syntax-case stx ()
    [(op e1 e2 e3)
     (with-syntax ([it (datum->syntax #'op 'it)])
       #'(let ([it e1])
           (if it e2 e3)))]))
The references that will be captured by the introduced binding of it are the ones that were introduced into the program in the same expansion step as the occurrence of if-it in the macro call. In particular, if the occurrence of if-it was in the original program (i.e., written explicitly by the programmer), it captures references to it that were in the original program; if the occurrence of if-it is the result of a macro expansion, it captures only those references to it that were generated in that same expansion step.
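As a quick check of this rule for the direct case, here is a minimal sketch. It assumes a current Racket (#lang racket) rather than the mzscheme of 2008, and the three test expressions and their names are illustrative additions, not part of the original example.

```racket
#lang racket

;; if-it as defined above: `it` is created with the lexical context of
;; the macro keyword as it appears at the use site.
(define-syntax (if-it stx)
  (syntax-case stx ()
    [(op e1 e2 e3)
     (with-syntax ([it (datum->syntax #'op 'it)])
       #'(let ([it e1])
           (if it e2 e3)))]))

;; if-it is written directly in the program, so the programmer's own
;; reference to `it` is captured.
(define direct (if-it 42 (+ it 1) #f))               ; 43

;; An enclosing binding of `it` is shadowed inside the branches.
(define shadowed (let ([it 10]) (if-it 42 it #f)))   ; 42

;; When the test is false, the else branch runs with `it` bound to #f.
(define else-branch (if-it #f 'then (list it 'else))) ; '(#f else)
```

In all three uses both if-it and it come from the same place (the program text itself), which is why the capture succeeds.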

This means that a hygienic macro that expands into if-it will work as expected:
(define-syntax or
  (syntax-rules ()
    [(op e1 e2)
     (if-it e1 it e2)]))
Since the reference to it appears in the same expansion step as the occurrence of if-it, that reference is captured, but no references to it within subexpressions e1 or e2 (which had to have already been there before this expansion step) are captured.
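To see the hygienic case run end to end, here is a small sketch, again assuming a current Racket. The macro is named my-or (a renaming made only for this sketch) so that it does not shadow Racket's built-in or; the test expressions are likewise illustrative.

```racket
#lang racket

;; if-it exactly as in the post.
(define-syntax (if-it stx)
  (syntax-case stx ()
    [(op e1 e2 e3)
     (with-syntax ([it (datum->syntax #'op 'it)])
       #'(let ([it e1])
           (if it e2 e3)))]))

;; The hygienic two-argument `or`, renamed my-or for this sketch.
;; The `it` in the template is introduced together with `if-it`,
;; so that is the only reference if-it captures.
(define-syntax my-or
  (syntax-rules ()
    [(_ e1 e2)
     (if-it e1 it e2)]))

(define first-true (my-or 5 99))                      ; 5
(define second-arm (my-or #f 7))                      ; 7

;; The programmer's `it` is part of e2, not of the template,
;; so it still refers to the outer binding.
(define not-captured (let ([it 10]) (my-or #f it)))   ; 10
```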

If you want to write another capturing macro that expands into if-it, it's a little more work. Essentially, you have to capture it all over again. The moral of the story is that you always have to ask explicitly for a macro to capture an introduced identifier.
(define-syntax (when-it stx)
  (syntax-case stx ()
    [(op e1 e2)
     (with-syntax ([it* (datum->syntax #'op 'it)])
       #'(if-it e1 (let ([it* it]) e2) (void)))]))
Here we once again create an identifier with the same lexical context as the operator (it* in the template), and we bind that identifier to the occurrence of it introduced by if-it.
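The two definitions can be combined into one module and exercised as follows. This is a sketch assuming a current Racket; the names hit and miss are made up for the illustration.

```racket
#lang racket

;; if-it exactly as in the post.
(define-syntax (if-it stx)
  (syntax-case stx ()
    [(op e1 e2 e3)
     (with-syntax ([it (datum->syntax #'op 'it)])
       #'(let ([it e1])
           (if it e2 e3)))]))

;; when-it re-captures `it`: it* is an identifier named `it` with the
;; lexical context of the when-it use site, bound to the `it` that
;; if-it introduces for the template.
(define-syntax (when-it stx)
  (syntax-case stx ()
    [(op e1 e2)
     (with-syntax ([it* (datum->syntax #'op 'it)])
       #'(if-it e1 (let ([it* it]) e2) (void)))]))

;; The programmer's `it` in the body now sees the test value.
(define hit (when-it 42 (+ it 1)))    ; 43

;; A false test skips the body and yields the void value.
(define miss (when-it #f (+ it 1)))
```

The inner let is the extra step: without it, the it in the body would come from the caller while if-it would only capture the it written in when-it's own template.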

These are good defaults for a hygienic macro system: it's easier to write hygienic macros but still possible (albeit a little harder) to write macros that capture. This is even true when you abstract over capturing macros: macros that expand into capturing macros are hygienic by default, but with a little more work again, you can create capturing macros that abstract over other capturing macros.

5 comments:

Unknown said...

What about introducing if-it itself with the scope of the caller? (so the when-it macro would expand into if-it as if it were written in the original program)

Dave Herman said...

You mean like:

(define-syntax (when-it stx)
  (syntax-case stx ()
    [(op e1 e2)
     (with-syntax ([if-it (datum->syntax #'op 'if-it)])
       #'(if-it e1 e2 (void)))]))

? Yes, I think that works, although it's a bit more subtle. I kind of like being explicit about exactly what names are being captured.

OTOH, your suggestion has the benefit of not requiring an additional binding; for example, my approach wouldn't deal correctly with set! (unless, instead of let-binding it* to it, you use let-alias).

Dave Herman said...

Also, your suggestion is a little more brittle. For example, in mzscheme or Chez, if the implementation of if-it used the entire syntax object of the expression (instead of just the if-it operator) as the lexical context of the captured variable, your implementation of when-it would break and mine would still work.

Anonymous said...

Here is an alternative definition of if-it that makes the definition of when-it easier:


(define-syntax (if-it stx)
  (syntax-case stx ()
    [(if-it e1 e2 e3)
     (with-syntax ([it (syntax-local-introduce
                        (syntax-local-get-shadower #'it))])
       #'(let ([it e1])
           (if it e2 e3)))]))

(define-syntax (when-it stx)
  (syntax-case stx ()
    [(op e1 e2)
     #'(if-it e1 e2 (void))]))


See http://scheme.dk/blog/2006/05/how-to-write-unhygienic-macro.html .

Dave Herman said...

Jens--

Thanks for the comment! My issue with the implementation you suggest is that (I contend) it gets the defaults wrong. Any macro that expands into your if-it gets "infected" with its non-hygiene. So if I write

(define-syntax or
  (syntax-rules ()
    [(or e1 e2)
     (if-it e1 it e2)]))

With your implementation, my or macro will capture it.

The implementation I suggested makes it so that if you want to write another non-hygienic macro, you have to take that extra step to make it non-hygienic; but if you want to write a hygienic macro that expands into if-it, it just works by default.