Thursday, December 22, 2005

Expressions and statements

This post explaining the I/O monad to a newbie on the Haskell mailing list motivates it in terms of expressions and statements:
We have a[n] "evaluation machine" and a[n] "execution machine". The former evaluates expressions, the latter executes I/O actions. When the program is started, the execution machine wants to execute the actions from the "main to do list".
Interesting. Imperative languages like C distinguish expressions and statements syntactically, but with various implicit conversions between them, e.g., expression-statements and procedures that execute statements but then return values. Haskell separates statements from expressions with the type system, and keeps them apart more strictly (no pun intended). But in a sense they're both making the same distinction.
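To make the C-family half of that concrete, here's a small JavaScript sketch of my own (not from the mailing list post): an expression can stand alone as a statement by discarding its value, and a procedure can execute statements internally while still being usable inside a larger expression.

function logAndReturn(x) {
  alert("logging " + x);    // statements executed for their effects...
  return x;                 // ...but the call itself also yields a value
}

logAndReturn(42);             // an expression used as a statement; the value is discarded
var y = logAndReturn(42) + 1; // the same call used as a subexpression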

Tuesday, December 20, 2005

I dare you

Read these three posts about Real's business practices and try not to get angry.

12 Weeks With Geeks

This looks interesting:
http://www.projectaardvark.com/movie/
It features some of your favorite outspoken software evangelists, Joel Spolsky and Paul Graham.

I'm not sure what I'll think of it. On the one hand, they're billing it as a more honest look into the software development process, which should be informative; on the other hand, the preview hints at a number of the myths computer people like to believe about themselves.

I find myself increasingly upset at the way people in software are portrayed, both by the media and by themselves--as idiot savants, socially inept geniuses, lone-ranger superheroes, and revenge-of-the-nerds underdogs. It's dangerously delusional: it fosters extreme anti-social behavior, allows for all sorts of irresponsibility and incredibly bad software, continually dissuades women from entering the field, and perpetuates the belief that each hacker can solve any problem alone, with complete disregard and disdain for a century of progress and collective knowledge.

How long can we continue duping the world into relying on us to design the heart of every automated mechanism known to man? How long will people continue believing in the myth of the 15-year-old hacker genius while simultaneously decrying the unreliability of software before the cognitive dissonance finally cracks?

At any rate, I don't know if that's where this movie is going. You can only tell so much from a preview. But it should be very interesting, especially about the product lifecycle and management perspectives.

Try-catch-finally in Scheme

I need a little break from all this parsing crap. Let's have some macro fun. Here's a PLT Scheme version of Java's try-catch-finally. Note the use of symbolic literals, rather than the usual syntax-rules/syntax-case literals list. This is due to my philosophy on macro keywords.
;; Compare macro keywords symbolically, by name rather than by binding.
(define-for-syntax (symbolic-identifier=? id1 id2)
  (eq? (syntax-object->datum id1)
       (syntax-object->datum id2)))

(define-syntax (try stx)
  (syntax-case* stx (catch finally) symbolic-identifier=?
    ;; try/catch/finally
    [(_ e (catch ([pred handler] ...)) (finally e0 e1 ...))
     #'(dynamic-wind void
                     (lambda ()
                       (with-handlers ([pred handler] ...)
                         e))
                     (lambda () e0 e1 ...))]
    ;; try/catch
    [(_ e (catch ([pred handler] ...)))
     #'(with-handlers ([pred handler] ...) e)]
    ;; try/finally
    [(_ e (finally e0 e1 ...))
     #'(dynamic-wind void
                     (lambda () e)
                     (lambda () e0 e1 ...))]))
Example use:
(try (begin (printf "hi")
            (printf " there")
            (printf ", world")
            (newline)
            (error 'blah))
     (catch ([exn? (lambda (exn) 3)]))
     (finally (printf "finally!~n")))
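For reference, here's roughly the JavaScript analogue of that example (my own sketch, not part of the macro). Note one difference: JavaScript's try is a statement rather than an expression, so the handler stores its result in a variable instead of producing a value the way the with-handlers clause does.

var result;
try {
  alert("hi there, world");
  throw new Error("blah");
} catch (exn) {
  result = 3;          // corresponds to the [exn? (lambda (exn) 3)] handler
} finally {
  alert("finally!");   // runs whether or not an exception was thrown
}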

Monday, December 19, 2005

Dynamic scope

Now I'm confused by another aspect of JavaScript scope:
var f = function() { alert("default") };
function test(b) {
  if (b) {
    function f() { alert("shadowed") }
  }
  f()
}
Evaluate test(true), and you'll see "shadowed". Evaluate test(false), and you'll see "default". Apparently function declarations are another instance where JavaScript is not lexically scoped (in addition to this, arguments, and the behavior of references to unbound variables).

Update: Oh, I forgot to mention the with construct -- another instance of dynamic scope.
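Here's a quick sketch of what I mean (my own example, not from the spec): a reference inside a with block may resolve to a property of the supplied object or to the lexically enclosing binding, and you can't tell which until runtime.

var x = "lexical";
function lookup(obj) {
  with (obj) {
    return x;  // obj.x if the property exists, otherwise the outer x
  }
}
lookup({});               // "lexical"
lookup({ x: "dynamic" }); // "dynamic"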

Update: Also note that this is not the same thing as the behavior of var. If we write this function instead:
var f = function() { alert("default") };
function test(b) {
  if (b) {
    var f = function() { alert("overridden") };
  }
  f()
}
Then with test(true) we still get "overridden", but with test(false) we get an error from trying to apply undefined as a function: the var declaration is hoisted, so the local f shadows the outer one throughout the body, but the assignment to it only happens when b is true. In other words, in this example, we have normal lexical scope with hoisting.

Update: I think this settles it--it's pretty much what it looks like. In chapter 13 of ECMA-262, the dynamic semantics of a FunctionDeclaration is to create a new binding in the "variable object" (ECMA-262 terminology for an environment frame) upon evaluation of the declaration.

If you imagine an environment as a list of frames, then in a lexically scoped language, this is a statically computable list. In a language with dynamic scope, frames in the environment may be generated or mutated at runtime, so it's not statically computable. JavaScript has some of both, so there are some things you can know statically about a variable reference, and some things you can't.

Update: The saga continues... I realized that I couldn't actually derive the above example from the official ECMAScript grammar! I see that others have made this surprising discovery as well: you can't nest FunctionDeclarations beneath Statements. Furthermore, you can't begin an ExpressionStatement with a FunctionExpression. As a result, it's impossible to derive the above program.

However, in practice, JavaScript engines do appear to accept the program, and use the dynamic semantics from chapter 13 as I describe above. But for strict ECMAScript, I can't seem to produce an example. I thought I could still force the issue by placing the nested function at the top-level of the enclosing function body and conditionally returning. But I was thwarted by two more interesting phenomena. First of all, when a var declaration and a function declaration compete to bind the same identifier, the var always wins, apparently. So this always produces "default", regardless of the value of b:
function test(b) {
  var f = function() { alert("default") };
  if (!b) {
    f();
    return;
  }
  function f() { alert("overridden") }
  f()
}
Second, nested function declarations appear to be hoisted to the top of the function body. So the following example always produces "overridden":
var f = function() { alert("default") };
function test3(b) {
  if (!b) {
    f();
    return;
  }
  function f() { alert("overridden") }
  f()
}
Curious. At any rate, the behavior of popular JavaScript engines (Firefox and Rhino) seems to be a generalization of ECMAScript (by allowing FunctionDeclarations to appear nested within Statements) that is consistent with the spec (by obeying the behavior specified by section 13).

Now I need to confirm these two additional properties of the language against the spec.

Gotcha gotcha

In "a huge gotcha with Javascript closures," Keith Lea describes an example function written in JavaScript with surprising behavior. But he misattributes the unexpected results to the language spec's discussion of "joining closures." The real culprit, rather, is JavaScript's rules for variable scope. Let me explain.

Here's Keith's example:
function loadme() {
  var arr = ["a", "b", "c"];
  var fs = [];
  for (var i in arr) {
    var x = arr[i];
    var f = function() { alert(x) };
    f();
    fs.push(f);
  }
  for (var j in fs) {
    fs[j]();
  }
}
We might expect the function to produce "a", "b", "c", "a", "b", "c", but surprisingly, it displays "a", "b", "c", "c", "c", "c"! Wha' happened?

The answer is that, in JavaScript, all var declarations are hoisted to the top of the nearest enclosing function definition. So while x appears to be local to the for-loop, it is in fact allocated once at the top of the loadme function and in scope throughout the function body. In other words, the above function is equivalent to:
function loadme() {
  var x;
  var arr = ["a", "b", "c"];
  var fs = [];
  for (var i in arr) {
    x = arr[i];
    var f = function() { alert(x) };
    f();
    fs.push(f);
  }
  for (var j in fs) {
    fs[j]();
  }
}
In this version, it's clear why the function behaves as it does: every closure captures the same variable x, and the loop mutates it on each iteration, so by the time the saved closures are called, x holds "c"!

To get the desired behavior, you need to use nested functions instead of loops, e.g.:
function loadme() {
  var arr = ["a", "b", "c"];
  var fs = [];
  (function loop(i) {
    if (i < arr.length) {
      var x = arr[i];
      var f = function() { alert(x) };
      f();
      fs.push(f);
      loop(i + 1);
    }
  })(0);
  for (var j in fs) {
    fs[j]();
  }
}
Now x is truly local to the loop body, and the function produces "a", "b", "c", "a", "b", "c" as expected.

Update: My suggested fix is really bad advice for ECMAScript Edition 3! I should never recommend using tail recursion in a language that is not properly tail recursive. If all goes according to plan for proper tail calls in Edition 4, my suggestion would be fine for that language. But today, you can't use tail recursion. Instead, you should use any of the existing lexical scope mechanisms in ECMAScript for binding another variable to the loop variable i.

Solution 1 - wrap the closure in another closure that gets immediately applied:
function loadme() {
  var arr = ["a", "b", "c"];
  var fs = [];
  for (var i in arr) {
    var x = arr[i];
    var f = (function() {
      var y = x;
      return function() { alert(y) }
    })();
    f();
    fs.push(f);
  }
  for (var j in fs) {
    fs[j]();
  }
}
Solution 2 - wrap the loop body in a closure that gets immediately applied (suggested in the comments by Lars):
function loadme() {
  var arr = ["a", "b", "c"];
  var fs = [];
  for (var i in arr) {
    (function(i) {
      var x = arr[i];
      var f = function() { alert(x) };
      f();
      fs.push(f);
    })(i);
  }
  for (var j in fs) {
    fs[j]();
  }
}
Solution 3 - wrap the closure in a let-binding (in JavaScript 1.7, currently implemented only in Firefox 2):
function loadme() {
  var arr = ["a", "b", "c"];
  var fs = [];
  for (var i in arr) {
    var x = arr[i];
    var f = let (y = x) function() { alert(y) };
    f();
    fs.push(f);
  }
  for (var j in fs) {
    fs[j]();
  }
}
Solution 4 - ditch var entirely and only use let (again, currently specific to Firefox 2):
function loadme() {
  let arr = ["a", "b", "c"];
  let fs = [];
  for (let i in arr) {
    let x = arr[i];
    let f = function() { alert(x) };
    f();
    fs.push(f);
  }
  for (let j in fs) {
    fs[j]();
  }
}
The moral of the story is that in ECMAScript Edition 4, let is a "better var", behaving more like you'd expect. But until you can rely on let as a cross-browser feature, you'll have to use solutions like 1 and 2.

Tuesday, December 13, 2005

Semanticist's cookbook

Is this a dumb idea? I'd like to create a semantic modeling wiki where people can describe various techniques for modeling different kinds of programs, programming languages, and programming language features mathematically. Sort of a clearinghouse or cookbook for semantics. It could feature links to literature, and various categorizations so that people could find recipes by following different paths. For example, a reader might discover the exception monad by alternatively searching for control features, or error handling, or denotational modeling, or monads, etc.

Perhaps a better-designed database would be more organized, but I think wikis are good for continuously reorganizing, as well as for community contribution.

The Semanticist's Cookbook. I like it!

Update: Stubbed at http://semanticscookbook.org.

Update: Maybe I'll install a wiki some other time, but I don't have time for it right now.