Since the early days, Redex has come with a library for building capture-avoiding substitution functions called subst. It's a little awkward to work with, though. Here's a definition of substitution using subst:(define lambda-calculus

(language

[e (e e) x v]

[v (lambda (x) e)]

[x (variable-except lambda)]))

The subst special form relies on the subform all-vars to list the bound variables of an expression. The build subform reconstructs an expression from its pieces, including the variables (potentially automatically freshened to avoid capture) and the subexpressions. Then the subterm subform identifies each subexpression and lists the bound variables in scope for the subexpression.;; lc-subst : variable × expression × expression → expression

(define lc-subst

(subst

[`(lambda (,x) ,body)

(all-vars (list x))

(build (lambda (vars body) `(lambda ,vars ,body)))

(subterm (list x) body)]

[(? symbol?) (variable)]

[`(,e1 ,e2)

(all-vars '())

(build (lambda (vars e1 e2) `(,e1 ,e2)))

(subterm '() e1)

(subterm '() e2)]))

This requires a fair amount of work for the client to coax subst into automatically substituting or freshening bound variables. These days, the authors recommend directly implementing capture-avoiding substitution. The example on the Redex web site gives a definition of capture-avoiding substitution as a metafunction:

This implements capture-avoiding substitution with Redex's new define-metafunction form. Redex's term macro is a quasiquote-like data constructor that produces S-expressions; define-metafunction defines implicitly unquoted functions that can be invoked within a term form. The subst metafunction substitutes an expression e_1 for a variable x_1 within an expression e_2 by applying;; subst : variable × expression × expression → expression

(define-metafunction subst

lambda-calculus

;; 1. x_1 bound, so don't continue in lambda body

[(x_1 e_1 (lambda (x_1) e_2))

(lambda (x_1) e_2)]

;; 2. in this case, we know that no capture can occur,

;; so it is safe to recur normally.

[(x_1 e_1 (lambda (x_2) e_2))

(lambda (x_2) (subst (x_1 e_1 e_2)))

(side-condition

(equal? (variable-not-in (term e_1) (term x_2))

(term x_2)))]

;; 3. general purpose capture avoiding case

[(x_1 e_1 (lambda (x_2) e_2))

,(term-let ([x_new (variable-not-in (term (x_1 e_1 e_2))

(term x_2))])

(term

(lambda (x_new)

(subst (x_1 e_1 (subst (x_2 x_new e_2)))))))]

;; 4. replace x_1 with e_1

[(x_1 e_1 x_1) e_1]

;; 5. x_1 and x_2 are different, so don't replace

[(x_1 e_1 x_2) x_2]

;; 6. ordinary recursive case

[(x_1 e_1 (e_2 e_3))

((subst (x_1 e_1 e_2)) (subst (x_1 e_1 e_3)))])

within a term form.(subst (x_1 e_1 e_2))

This metafunction is correct, but I find its definition a little subtle. I think it's clearer to separate concerns a little more and divide its definition into two pieces. Following Gunter's definition of capture-avoiding substitution we define a separate "change of variable" function:

This function replaces a variable name x_1 with a new name x_2 within an expression e_1. The subst metafunction uses change-variable for renaming a bound variable with a fresh name.;; change-variable : variable × variable × expression → expression

(define-metafunction change-variable

lambda-calculus

[(x_1 x_2 x_1) x_2]

[(x_1 x_2 x_3) x_3]

[(x_1 x_2 (lambda (x_1) e_1))

(lambda (x_1) e_1)]

[(x_1 x_2 (lambda (x_3) e_1))

(lambda (x_3) (change-variable (x_1 x_2 e_1)))]

[(x_1 x_2 (e_1 e_2))

((change-variable (x_1 x_2 e_1)) (change-variable (x_1 x_2 e_2)))])

I prefer this definition of capture-avoiding substitution for several reasons. First, it corresponds directly to Gunter's definition. Furthermore, its runtime efficiency is a little clearer. The first definition of the metafunction recursively calls itself twice on the same subexpression in case #3; the reason why this doesn't cause exponential growth is because its behavior in one of the two cases is equivalent to change-variable (because it substitutes a variable) and consequently more efficient. But this took me a while to figure out. Finally, the types are a little tighter. For example, if we were just defining call-by-value substitution, subst could take a value for its second argument, rather than arbitrary expressions.;; subst : variable × expression × expression → expression

(define-metafunction subst

lambda-calculus

[(x_1 e_1 (lambda (x_1) e_2))

(lambda (x_1) e_2)]

[(x_1 e_1 (lambda (x_2) e_2))

,(term-let ([x_new (variable-not-in (term e_1) (term x_2))])

(term

(lambda (x_new)

(subst (x_1 e_1 (change-variable (x_2 x_new e_2)))))))]

[(x_1 e_1 x_1) e_1]

[(x_1 e_1 x_2) x_2]

[(x_1 e_1 (e_2 e_3))

((subst (x_1 e_1 e_2)) (subst (x_1 e_1 e_3)))])

## 2 comments:

Hey, that's a nice way to write it! And, I agree, clearer.

Interesting post!

But the second case of your `subst' should avoid picking a new name that can capture free variables in e2:

> (term (subst (x y (lambda (y) (y y1)))))

(lambda (y1) (y1 y1))

Post a Comment