Monday, December 19, 2005

Gotcha gotcha

In "a huge gotcha with Javascript closures," Keith Lea describes an example function written in JavaScript with surprising behavior. But he misattributes the unexpected results to the language spec's discussion of "joining closures." The real culprit, rather, is JavaScript's rules for variable scope. Let me explain.

Here's Keith's example:
function loadme() {
    var arr = ["a", "b", "c"];
    var fs = [];
    for (var i in arr) {
        var x = arr[i];
        var f = function() { alert(x) };
        f();
        fs.push(f);
    }
    for (var j in fs) {
        fs[j]();
    }
}
We might expect the function to produce "a", "b", "c", "a", "b", "c", but surprisingly, it displays "a", "b", "c", "c", "c", "c"! Wha' happened?

The answer is that, in JavaScript, all var declarations are hoisted to the top of the nearest enclosing function definition. So while x appears to be local to the for-loop, it is in fact allocated once at the top of the loadme function and in scope throughout the function body. In other words, the above function is equivalent to:
function loadme() {
    var x;
    var arr = ["a", "b", "c"];
    var fs = [];
    for (var i in arr) {
        x = arr[i];
        var f = function() { alert(x) };
        f();
        fs.push(f);
    }
    for (var j in fs) {
        fs[j]();
    }
}
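In this version, it's clear why the function behaves as it does: every closure refers to the same variable x, and the loop overwrites that one variable on each iteration. By the time the second loop calls the saved functions, x holds "c".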
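The hoisting rule can also be seen in isolation. The following is a minimal sketch, not part of Keith's example: the names hoistDemo and sharedDemo are hypothetical, and values are collected into arrays rather than shown with alert so the snippet can run outside a browser.

```javascript
// Hypothetical demo: x is declared with var inside the loop body,
// yet it is in scope for the whole function.
function hoistDemo() {
    var seen = [];
    // No ReferenceError here: the declaration of x has been hoisted,
    // so x already exists but still holds undefined.
    seen.push(x);
    for (var i = 0; i < 3; i++) {
        var x = i * 10;
    }
    // x is still in scope after the loop and holds the last value assigned.
    seen.push(x);
    return seen;
}

// Hypothetical demo: every closure created in the loop shares that single x.
function sharedDemo() {
    var getters = [];
    for (var i = 0; i < 3; i++) {
        var x = i * 10;
        getters.push(function () { return x; });
    }
    // All three closures read the same variable, which now holds 20.
    return getters.map(function (g) { return g(); });
}

var hoistResult = hoistDemo();   // [undefined, 20]
var sharedResult = sharedDemo(); // [20, 20, 20]
```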

To get the desired behavior, you need to use nested functions instead of loops, e.g.:
function loadme() {
    var arr = ["a", "b", "c"];
    var fs = [];
    (function loop(i) {
        if (i < arr.length) {
            var x = arr[i];
            var f = function() { alert(x) };
            f();
            fs.push(f);
            loop(i + 1);
        }
    })(0);
    for (var j in fs) {
        fs[j]();
    }
}
Now x is truly local to the loop body, and the function produces "a", "b", "c", "a", "b", "c" as expected.

Update: My suggested fix is really bad advice for ECMAScript Edition 3! I should never recommend using tail recursion in a language that is not properly tail recursive. If all goes according to plan for proper tail calls in Edition 4, my suggestion would be fine for that language. But today, you can't rely on tail recursion. Instead, you should use any of the existing lexical scope mechanisms in ECMAScript to bind a fresh variable to the current value on each iteration of the loop.
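To make the problem with the recursive version concrete, here is a rough sketch under stated assumptions: countByRecursion is a hypothetical helper that mirrors the shape of the recursive loop function, and the depth of one million is an arbitrary figure chosen to exceed typical stack limits.

```javascript
// Hypothetical helper: visits indices 0..n-1 by recursion instead of a loop.
function countByRecursion(n) {
    var visited = 0;
    (function loop(i) {
        if (i < n) {
            visited++;
            // A tail call, but without proper tail calls the engine
            // still allocates a new stack frame for it.
            loop(i + 1);
        }
    })(0);
    return visited;
}

var smallCount = countByRecursion(3); // fine for short arrays

var overflowed = false;
try {
    countByRecursion(1000000);
} catch (e) {
    // Engines without proper tail calls are expected to run out of stack
    // here (V8-based engines report this as a RangeError).
    overflowed = true;
}
```

In an engine with proper tail calls the second call would simply return, which is why the recursive fix is only safe where that guarantee exists.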

Solution 1 - wrap the closure in another closure that gets immediately applied:
function loadme() {
    var arr = ["a", "b", "c"];
    var fs = [];
    for (var i in arr) {
        var x = arr[i];
        var f = (function() {
            var y = x;
            return function() { alert(y) };
        })();
        f();
        fs.push(f);
    }
    for (var j in fs) {
        fs[j]();
    }
}
Solution 2 - wrap the loop body in a closure that gets immediately applied (suggested in the comments by Lars):
function loadme() {
    var arr = ["a", "b", "c"];
    var fs = [];
    for (var i in arr) {
        (function(i) {
            var x = arr[i];
            var f = function() { alert(x) };
            f();
            fs.push(f);
        })(i);
    }
    for (var j in fs) {
        fs[j]();
    }
}
Solution 3 - wrap the closure in a let-binding (in JavaScript 1.7, currently implemented only in Firefox 2):
function loadme() {
    var arr = ["a", "b", "c"];
    var fs = [];
    for (var i in arr) {
        var x = arr[i];
        var f = let (y = x) function() { alert(y) };
        f();
        fs.push(f);
    }
    for (var j in fs) {
        fs[j]();
    }
}
Solution 4 - ditch var entirely and only use let (again, currently specific to Firefox 2):
function loadme() {
    let arr = ["a", "b", "c"];
    let fs = [];
    for (let i in arr) {
        let x = arr[i];
        let f = function() { alert(x) };
        f();
        fs.push(f);
    }
    for (let j in fs) {
        fs[j]();
    }
}
The moral of the story is that in ECMAScript Edition 4, let is a "better var", behaving more like you'd expect. But until you can rely on let as a cross-browser feature, you'll have to use solutions like 1 and 2.
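For a side-by-side comparison of the two declaration forms, here is a small sketch. It assumes an engine with block-scoped let (the form later standardized in ECMAScript 2015), and the names collectWithVar and collectWithLet are hypothetical. The closures return their value instead of calling alert so the results can be inspected directly.

```javascript
// Hypothetical comparison: the same loop written with var and with let.
function collectWithVar() {
    var arr = ["a", "b", "c"];
    var fs = [];
    for (var i = 0; i < arr.length; i++) {
        var x = arr[i]; // one x for the whole function
        fs.push(function () { return x; });
    }
    return fs.map(function (f) { return f(); });
}

function collectWithLet() {
    let arr = ["a", "b", "c"];
    let fs = [];
    for (let i = 0; i < arr.length; i++) {
        let x = arr[i]; // a fresh x for each iteration
        fs.push(function () { return x; });
    }
    return fs.map(function (f) { return f(); });
}

var varResult = collectWithVar(); // ["c", "c", "c"]
var letResult = collectWithLet(); // ["a", "b", "c"]
```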

3 comments:

Lars said...

Dave, I appreciate your post on this. The issue of what exactly is captured by a javascript closure has been confusing me. I used to be a student of Scheme but it's been 15 years! So thanks to you and others who are explaining the nuances.

Help me clarify one thing. You mention in your explanation that the gotcha in Keith's example is that the "var x" gets hoisted to the top of its enclosing function definition. However, your solution differs from Keith's in two ways: not only do you put a function definition around the (body of the) loop, but you also turn the loop from an iterative "for" into a recursive function call. As a result, each instantiation (is that the term?) of the function call is saved on the stack at a different level (unless there is tail-recursion optimization). Thus the various var x's are "forced" to coexist at different places on the stack.

This second difference muddies the water for me. Would your solution still work without making the loop recursive? Why or why not? (And conversely, if you made the loop recursive but didn't use an enclosing function definition [which sounds impossible, but my brain's fuzzy enough on this that I would be curious to hear your answer], what would be the result?)

I'm going to test one or more of those variations in my browser, but given that some aspects of closures in JavaScript are implementation-dependent, I won't be able to take the results as gospel. I would like to know what is correct in theory as well as what happens to work in practice.

Regards,
Lars

Lars said...

As a followup to my previous comment: yes, your solution still works (for me in Fx 2.0) without a recursive loop. I.e. the following code yielded 'abcabc' instead of 'abcccc':

function loadme3() {
var arr = ["a", "b", "c"];
var fs = [];
for (var i in arr) {
(function loopbody(i) {
var x = arr[i];
var f = function() { alert(x) };
f();
fs.push(f);
})(i);
}
for (var j in fs) {
fs[j]();
}
}

(Sorry, I couldn't figure out how to preserve indentation. The blog software rejected 'pre' and 'code' tags.)

I would still like to hear your take on whether (and why) this should work in theory (for all compliant ECMAScript implementations) as well as in practice. Am I right in understanding that because "var x" is now truly local to the loop body, each closure captured in fs[] has x bound to a different variable (a different "box" in memory)?

Thanks,
Lars

P.S. The application that brought this up for me was AJAX, specifically, geocoding in Google Maps. You send out an XML http request with a callback function to receive the response. How do you preserve the context of each call so that it can be used in processing the result? A handy way would be to capture it in the closure of the callback function. Seems like there must be a simpler way to handle this though. There are an awful lot of GMaps mashups out there! Do they all use an extra function definition to capture state? None of the tutorials I've read talk about this.

Dave Herman said...

I've been meaning to post a follow-up for a while -- it's very bad to recommend writing tail-recursive programs in ECMAScript Edition 3. So ignore my original recommended fix. Yours is fine, and so are some of the other suggestions I've mentioned in the update.