Tuesday, March 14, 2006

Proper tail recursion and space efficiency

I just read through Will Clinger's Proper Tail Recursion and Space Efficiency. I'd been scared of it for a few years because I had the impression it was a very technical and subtle paper. There are perhaps a few subtleties but I actually found it very clear and well-written. Here are some interesting points made in the paper:

1. Proper tail recursion is a property of the asymptotic space complexity of a language's runtime behavior. That is, in improperly tail recursive languages, control can consume unbounded amounts of space for programs that, when run in properly tail recursive languages, only require a constant amount of space.

2. In order to measure the space complexity of a language, you have to measure the entire system, not just, say, the stack. Otherwise, someone could always cheat and hide away its representation of the stack in the heap to make it look like the control is more space-efficient than it really is.

3. But when you measure everything including the heap, you have to deal with the fact that the language doesn't ever explicitly deallocate storage. This means that you have to add garbage collection into your model or else the model always increases its space usage unrealistically.

4. Proper tail recursion is totally natural in the lambda calculus. An application is always replaced by the body of the applied function, regardless of whether it's a tail call. It's just languages with that freaky return construct that accidentally make the mistake of leaving around residue of the application expression until it's completed evaluating. In order to model improper tail calls in Scheme, Will actually has to add a silly return context frame, whose sole purpose is to degrade the space efficiency of the language.

5. By leaving the expression language the same but changing the definition of the runtime system (in this case, the set of context frames), you can compare apples to apples and look at the same space consumption of the exact same program in one system or another, assuming the systems always produce the same results from the programs. Then we can actually construct space complexity classes to classify the different systems.


Sam Tobin-Hochstadt said...

On your first point, I note that two languages could be in different space complexity classes, even if no program consumes constant space in one and unbounded space in another. There would just have to be some program where they both consumed unbounded space, but one nonetheless consumed asymptotically more.

However, I can't think of a runtime system that would have this property.

Sam Tobin-Hochstadt said...

To record for posterity the answer to my question (due to Dave):

Evaluator 1: normal, non-proper-tail-calling evaluator.

Evaluator 2: Like evaluator 1, but pushes n frames at every tail call (where n is the height of the stack at that point).

There is no program in which one uses a bounded amount of space, and the other an unbounded amount, since to have a non-bounded difference, you need a program that makes an unbounded number of tail calls, which would cause both to consume an unbounded amount of space. But the trivial infinite loop consumes O(n) space after n iterations in the first one, and O(n^2) space in the second.

MarkM said...

I can't follow your link to Clinger's paper, but did find it here.