Friday, January 30, 2009

Quiz answer

Reader Mitch (not my advisor, apparently) comes closest to the answer to my quiz question:
“Some kind of alignment thing? I can't think of how code would be affected by that, though.”
As it happens, changing the code affected the size of the activation record allocated by the GCC-generated code. Somehow the extra code ended up with a slightly smaller activation record. And then it turned out that the slight difference (something like 8 or 16 bytes) was pushing the activation record over the edge of a page boundary. Boom!

It's still a mystery to me exactly how throwing in an asm("nop\n") can affect the size of the activation record. Such are the arcane mysteries of GCC code generation. I had to catch my flight home before Dave had worked out all the details, but by now I bet he has an even better understanding of it. It was fun and inspiring to watch Dave's detective skills in action.

Quiz question

Here's a fun puzzle I watched Dave Mandelin track down yesterday. Why would adding a guaranteed no-op such as
if (FALSE) { printf("can't happen\n"); }
or even
asm("nop\n");
to the body of a procedure cause a program to run measurably and consistently faster? I'll post the answer later today. (Or as much as I understand, anyway.)

Tuesday, January 13, 2009

Java, we don't care about you

Joel Spolsky pretty much nails it.

I actually grabbed a screen shot of this very same dialog box some time last year because it was so absurd. The part that always amazes me is why people think customers are going to care about their product as much as they do. It seems to me that I'm more likely to be loyal to a technology brand that shows a little humility and restraint.

Friday, January 09, 2009

A page of the dissertation

Ron Garcia suggested that I should get in the practice of writing a little bit for my dissertation each day, so that when crunch time comes, I'll already have a lot of material to work with. This being New Year's, I should also go to the gym, eat healthier, clean my home and my office, make a budget, fix more bugs, and organize my time.

But I just wrote my first page of content for the dissertation, and that feels pretty good. There's even a lemma!

Thursday, January 08, 2009

Fexprs? in Scheme?

“The top level is hopeless.”
     -- Matthew Flatt [1] [2] [3] [4] [5] [6] [7] [8] [9]
Scheme's top level allows for interactive evaluation such as at the REPL, where global definitions can be dynamically evaluated one by one. To allow for forward references, free variables are not a static error at the top level. But global definitions may be macros, and there's no way to know whether a forward reference is eventually going to turn out to be a value binding or a macro binding. So one simple semantics is just to assume that forward references are always value bindings and compile them as such. But then you get unsatisfying interactions like this:
> (define (f) (display (g 42)))
> (define-syntax g (syntax-rules () ((g x) (quote x))))
> (f)
reference to an identifier before its definition: g
It's worse with macro-defining macros. For example, a define/inline form would bind its definition as a macro, but the user would generally treat it as a value binding. Either way, you'd run into the same problem:
> (define/inline (f) (g 42))
> (define/inline (g n) (add1 n))
> (f)
reference to an identifier before its definition: g
What's in conflict in Scheme is the desire for macros to be a static, compile-time entity and the REPL to allow dynamic, incremental update to the global environment. In fact, the top level seems to be the one place in Scheme where programmers expect behavior something like Lisp's antiquated fexprs. Bear with me...

Imagine if the compiler treated any form (x . s) where x is a top-level variable reference--bound or unbound--as an uninterpreted syntax object waiting to be parsed. Then only when the form is evaluated would x be looked up in the global environment to determine how to compile the form. If x is unbound, it's a dynamic error; if it's a value, it's compiled and evaluated as a function application; if it's a macro, it's macro-expanded, compiled, and evaluated. This would accommodate any sequence of definitions and redefinitions of top-level bindings, as value bindings, macro bindings, or both. For example:
> (define (foo x) (f x))
> (define (f x) (add1 x))
> (foo 41)
42
> (define-syntax f (syntax-rules () ((f e) (quote e))))
> (foo 41)
x
This has a lot in common with fexprs: a function like foo is subject to dynamic reparsing; at one point it calls a function on x, and at another point it applies a macro. But whereas with fexprs any application expression may be dynamically reparsed, here that would only be the case for applications of top-level variables.

Fexprs are bad for two reasons: they make the language hard to compile efficiently, and they make programs hard to understand by subjecting the basic program definition to dynamic reinterpretation. To be sure, making Scheme's top level fexpr-like would make efficient compilation harder. But the top level is the part of Scheme where programs are dynamically defined. Users are constantly surprised at the behavior of top-level forward references and macro definitions. What they seem to naturally expect is, in some sense, fexprs.