There are lots of ways to implement substitution in

PLT Redex (an embedded DSL in PLT Scheme for defining operational semantics with rewrite rules and evaluation contexts). I'll demonstrate with the lambda calculus, of course:

(define lambda-calculus

(language

[e (e e) x v]

[v (lambda (x) e)]

[x (variable-except lambda)]))

Since the early days, Redex has come with a library for building capture-avoiding substitution functions called

subst. It's a little awkward to work with, though. Here's a definition of substitution using

subst:

;; lc-subst : variable × expression × expression → expression

(define lc-subst

(subst

[`(lambda (,x) ,body)

(all-vars (list x))

(build (lambda (vars body) `(lambda ,vars ,body)))

(subterm (list x) body)]

[(? symbol?) (variable)]

[`(,e1 ,e2)

(all-vars '())

(build (lambda (vars e1 e2) `(,e1 ,e2)))

(subterm '() e1)

(subterm '() e2)]))

The

subst special form relies on the subform

all-vars to list the bound variables of an expression. The

build subform reconstructs an expression from its pieces, including the variables (potentially automatically freshened to avoid capture) and the subexpressions. Then the

subterm subform identifies each subexpression and lists the bound variables in scope for the subexpression.

This requires a fair amount of work for the client to coax

subst into automatically substituting or freshening bound variables. These days, the authors recommend directly implementing capture-avoiding substitution. The

example on the Redex web site gives a definition of capture-avoiding substitution as a

metafunction:

;; subst : variable × expression × expression → expression

(define-metafunction subst

lambda-calculus

;; 1. x_1 bound, so don't continue in lambda body

[(x_1 e_1 (lambda (x_1) e_2))

(lambda (x_1) e_2)]

;; 2. in this case, we know that no capture can occur,

;; so it is safe to recur normally.

[(x_1 e_1 (lambda (x_2) e_2))

(lambda (x_2) (subst (x_1 e_1 e_2)))

(side-condition

(equal? (variable-not-in (term e_1) (term x_2))

(term x_2)))]

;; 3. general purpose capture avoiding case

[(x_1 e_1 (lambda (x_2) e_2))

,(term-let ([x_new (variable-not-in (term (x_1 e_1 e_2))

(term x_2))])

(term

(lambda (x_new)

(subst (x_1 e_1 (subst (x_2 x_new e_2)))))))]

;; 4. replace x_1 with e_1

[(x_1 e_1 x_1) e_1]

;; 5. x_1 and x_2 are different, so don't replace

[(x_1 e_1 x_2) x_2]

;; 6. ordinary recursive case

[(x_1 e_1 (e_2 e_3))

((subst (x_1 e_1 e_2)) (subst (x_1 e_1 e_3)))])

This implements capture-avoiding substitution with Redex's new

define-metafunction form. Redex's

term macro is a quasiquote-like data constructor that produces S-expressions;

define-metafunction defines implicitly unquoted functions that can be invoked within a

term form. The

subst metafunction substitutes an expression

e_1 for a variable

x_1 within an expression

e_2 by applying

(subst (x_1 e_1 e_2))

within a

term form.

This metafunction is correct, but I find its definition a little subtle. I think it's clearer to separate concerns a little more and divide its definition into two pieces. Following

Gunter's definition of capture-avoiding substitution we define a separate "change of variable" function:

;; change-variable : variable × variable × expression → expression

(define-metafunction change-variable

lambda-calculus

[(x_1 x_2 x_1) x_2]

[(x_1 x_2 x_3) x_3]

[(x_1 x_2 (lambda (x_1) e_1))

(lambda (x_1) e_1)]

[(x_1 x_2 (lambda (x_3) e_1))

(lambda (x_3) (change-variable (x_1 x_2 e_1)))]

[(x_1 x_2 (e_1 e_2))

((change-variable (x_1 x_2 e_1)) (change-variable (x_1 x_2 e_2)))])

This function replaces a variable name

x_1 with a new name

x_2 within an expression

e_1. The subst metafunction uses

change-variable for renaming a bound variable with a fresh name.

;; subst : variable × expression × expression → expression

(define-metafunction subst

lambda-calculus

[(x_1 e_1 (lambda (x_1) e_2))

(lambda (x_1) e_2)]

[(x_1 e_1 (lambda (x_2) e_2))

,(term-let ([x_new (variable-not-in (term e_1) (term x_2))])

(term

(lambda (x_new)

(subst (x_1 e_1 (change-variable (x_2 x_new e_2)))))))]

[(x_1 e_1 x_1) e_1]

[(x_1 e_1 x_2) x_2]

[(x_1 e_1 (e_2 e_3))

((subst (x_1 e_1 e_2)) (subst (x_1 e_1 e_3)))])

I prefer this definition of capture-avoiding substitution for several reasons. First, it corresponds directly to Gunter's definition. Furthermore, its runtime efficiency is a little clearer. The first definition of the metafunction recursively calls itself twice on the same subexpression in case #3; the reason why this doesn't cause exponential growth is because its behavior in one of the two cases is equivalent to

change-variable (because it substitutes a variable) and consequently more efficient. But this took me a while to figure out. Finally, the types are a little tighter. For example, if we were just defining call-by-value substitution,

subst could take a value for its second argument, rather than arbitrary expressions.