Wednesday, March 01, 2006

The call stack is not a history

I had a fascinating conversation with John Clements yesterday about continuation marks, and he gave me a nice way of clarifying a common misconception about the call stack. There are two major pieces of information you might like to inspect about the control flow of a program: what the program has already done (its history), and what it has left to do (its future).

The call stack might appear to be giving you information about the history of the program execution, but in fact, it's an incomplete history. For example, any functions that have already returned are not recorded; they've already popped off the stack. Languages that implement proper tail calls also remove information about function calls that have completed just about everything but may not yet have returned from their tail operations. Now, if this is information you need, it doesn't mean you can't record it through some other means. It just means the default behavior of the runtime is not to record it--and for good reason, too. Not accumulating that information affects the asymptotic space complexity of the runtime. You wouldn't want your runtime collecting complete information about every iteration of a for-loop by default, either, just on the off-chance that information might be useful!

The evaluation context, i.e. call stack, is not about the history of the program execution. It contains a complete record of the remainder of the program execution, the work the program has left to do. That's the whole reason for maintaining stack frames, so that when a function returns, the runtime knows what to do next. So stack inspection is fundamentally about inspecting the future of the computation, not about recording its history.


Chui Tey said...

Awww but it's so convenient to think of it as history. I wouldn't be able to debug if I thought that the call stack represented future computation.

Dave Herman said...

Well, maybe I should have made a somewhat softer statement: "the call stack is not a complete history." Of course the stack is useful for debugging, but you can't rely on it for the entire history.

It's true that it's useful to think of the stack trace as the history, but it's only that part of the computation history that had not yet been completed. Functions that have returned, and functions that have made tail calls are all dead computations, and are thus subject to removal from the evaluation context.

Ryan Culpepper said...

One idea that I talked to John about is a variant of continuation marks (co-marks?) that have the opposite "collapsing" rule.

Here's the normal rule:

(wcm A (wcm B e)) = (wcm B e)

Here's another possibility:

(wccm A (wccm B e)) = (wccm A e)

So the first mark to get to a frame wins. I argued that this might be more useful for debugging since it has more historical-type information. You could also throw both into the same system and get kind of an "interval" for each frame: it started out in this procedure, and it's now in this other procedure waiting for an answer.

(In other news, no pre tags in comments makes me sad.)

Dave Herman said...

That's funny, I wondered about the same thing! I guess it's only natural, given that it's the other of only two options. :) I didn't think about the "both" option, though.

John and I also talked about having options to turn on tail mark accumulation. I've also wondered about having fixed, bounded accumulation (most recent n marks) for debugging's sake, which would still exhibit the same asymptotic behavior.

No pre tags does suck. No code or blockquote tags sucks too. Blogger is decidedly mediocre.

anton van straaten said...

The call stack is not a complete history, but in languages/implementations without TCO, the call stack does serve a well-defined historic purpose which is lost when TCO is performed: it provides a complete history of every procedure call that directly led to the current execution point. This is not an arbitrarily incomplete history: it can provide answers to the question "how did execution get here?" that a TCO'd call stack often cannot.

Even if you assume that this common property of call stacks is completely accidental, e.g. because the language implementors didn't know about TCO, the fact is that it's relied on enough by people in the real world that it's perfectly reasonable and accurate to say that call stacks in many language implementations serve two overlapping purposes: to record a particularly useful aspect of the history of the computation, and to store the future of the computation.

A theoretical model that restricts the call stack to only being about the future of the computation does not model what many real languages actually do. Further, in languages that don't rely heavily on recursion, using the call stack for this dual purpose is actually an excellent bit of optimal engineering.

That said, I still want you to make sure they add TCO to Javascript, m'kay? :)

Dave Herman said...

I still maintain that what we consider a valid entry in this history is a somewhat arbitrary distinction.

Consider again the point that you don't keep in the stack trace the set of for-loops that led to the current stack. Yes, you're right that imperative languages make a nice distinction between primitive iterative control constructs and functions, but by forcing this distinction, it prevents programmers from implementing new iterative control constructs using functions.

Unless you're content to commit to a fixed set of iterative forms in your language, distinguishing between iteration implemented with primitive loops and iteration implemented with functions simply to claim the dubious distinction that your stack contains a complete history of all "interesting" control events (for this arbitrary definition of "interesting") is a pyrrhic win.

Furthermore, it's perfectly possible for users to define particular tail calls as "interesting" control events all on their own by converting them to non-tail calls. This can either be done by applying the identity function, or by adding a "non-tail return" construct to the language. So it's still possible to allow the programmer to decide on a case-by-case basis what control events they want to record, admittedly with some amount of extra work.

anton van straaten said...

My point is not to defend the design decisions that mainstream languages have made, but rather that having made those decisions, they are using the call stack for a dual purpose. This means that theoretically constraining the purpose of the call stack to representing only the future of the computation results in a mismatch between theory and practice in those particular cases. The view of the call stack as being only about the future of the computation makes perfect sense in the functional world, but not so much in the imperative one [but see my caveat in the penultimate paragraph].

IOW, if you simply add TCO to a typical imperative language (say Javascript), there will be a loss of information that programmers will notice, while the theory seems to say that nothing has been lost.

Of course, the relevant information can be recovered by tail mark accumulation, etc., but as long as you're operating within the imperative paradigm, you may as well just use the call stack for this purpose, as long as you're willing to live without TCO (rather them than me!)

[Caveat: the issue is more subtle than the way I've been describing it: without TCO, the future of the computation is exactly equivalent to the history that imperative programmers are interested in. So you could look at this equivalence and conclude that the call stack is, indeed, only about the future of the computation, in all cases, both imperative & functional. But this breaks down as soon as you decide that it's possible to optimize that representation of the future by performing TCO -- then you discover that the imperative call stack contains information that's redundant from the perspective of the future of the computation, but that's useful to the programmer using a debugger or examining an error trace. So the stack-is-the-future theory, and the apparent imperative/functional equivalence, only holds in the imperative world as long as the call stack is not TCO'd.]

BTW, the reason I'm bothering you about this is that I had wanted to post almost exactly the same observation in this LtU thread: that the stack represents the future, and shouldn't be confused with a history. However, in writing that post, I came to the conclusion that there was more to it, and you're now suffering the consequences! :)

Anonymous said...

Great post!

John Cowan said...

Sorry for the belated comment, but it's a big Internet. In Chicken, of course, the stack is a call history, though it's truncated to the last N calls. And that's exactly what you get when Chicken prints an error traceback: the last N calls, where N depends on the time since the last minor GC.

simplecall said...

I am looking for International calling service provider that provide full Online access of my account including Top Up, Registering Phones, Call History, Current Balance etc.