## Sunday, March 19, 2006

### Observable differences between reduction strategies

It's easy to misunderstand the difference between lazy and eager evaluation. You might be tempted to think that 1) lazy evaluation does reductions from the "outside in," meaning that finds the outermost redex and reduces it before reducing its subterms, and 2) eager evaluation, by contrast, reduces from the "inside out," not performing an outer reduction until it has reduced all its subterms.

But this isn't the case. The difference between lazy and eager evaluation is not a difference in evaluation strategy, but rather a difference in the definition of what is reducible. In ordinary evaluation (as opposed to, say, compiler optimizations using partial evaluation), we always use the exact same strategy when reducing terms--typically, always reduce the leftmost, outermost redex. The difference is in the definition of redexes.

In call-by-name lambda calculus, the definition of a reducible expression is an application whose operator is a value. The operand can be any arbitrary expression. As a result, non-values end up being passed as arguments to functions. In call-by-value, we likewise only perform reduction when the operator is a value, but we also require that the operand be a value as well.

The combination of an outermost reduction strategy with the call-by-value rule may appear to be the same as an innermost reduction strategy; after all, it forces the evaluation of subterms before reducing a compound term. But it doesn't force the evaluation of all subterms, specifically, underneath abstractions. This is the key distinction.

Here's an example that distinguishes the three semantics. Imagine an untyped lambda calculus with two effects: we can ask Alice a question, say by sending her computer a message over the network and receiving a response, or we can similarly ask Bob a question. Each of these operations results in a distinct observable effect. Now take the following simple program:
`(λx.(ask bob)) (ask alice)`
We'll annotate each reduction step with a note recording the side effect it produces, if any, so we can see when the effects happen.

In call-by-name lambda calculus with a leftmost-outermost reduction strategy:
`      (λx.(ask bob)) (ask alice)   -> (ask bob)b! -> 22`
In call-by-value lambda calculus with leftmost-outermost:
`      (λx.(ask bob)) (ask alice)a! -> (λx.(ask bob)) 11   -> (ask bob)b! -> 22`
And finally, the weirdest of them all, an innermost reduction strategy:
`      (λx.(ask bob)) (ask alice)b! -> (λx.22) (ask alice)a! -> (λx.22) 44   -> 22`
Intuitively, this program should always ask Bob a question and return its answer. Each of these programs preserves that meaning. But in outermost call-by-value it asks Alice a question first and then ignores it, in an innermost evaluation strategy it asks Alice a question second and then ignores it, and in call-by-name it never asks Alice at all.

I added a couple simple effects to the lambda calculus to demonstrate the idea. But in pure lambda calculus, we can still observe differences between these systems because we can demonstrate side effects through non-termination. An expression that leads to non-termination results in different evaluation behavior of a system when the system makes different decisions about when and whether to evaluate the expression.