Sunday, March 27, 2005

XML in Scheme

There has been some incredible work on XML in Scheme, with several wonderful languages and libraries available. I'll probably describe some of them in the future, and I'm thinking about writing a tutorial. For now, here's the high-level description.

Abstract: XML is a framework for defining data languages, particularly tree-shaped data.

What do you do with a data language?
  1. Construction: you create data.
  2. Validation: you describe the valid types of data in the language and perform static analysis of data to test for conformance.
  3. Serialization: you write the data in some convenient representation format, and read data back in from that serialized format.
  4. Queries: given a set of data, you mine it for some subset of information.
  5. Transformation: given data in one language, you transform it into data in another language.
All of these are supported by the various Scheme libraries for XML. Some of the most breathtaking examples are the transformation languages of SXSLT and WebIt!, and Neil van Dyke's astonishing permissive HTML parsing library HTMLPrag. There are also some cool query languages, like SXPath and Jim Bender's query library.

With a little taste and some fluency with standard tools of language specification, these authors have managed to describe data manipulation languages that are as powerful as their mainstream counterparts, with specs a fraction the size of the monstrosities produced by the W3C.

Friday, March 25, 2005

April

Reasons to look forward to April:
  • Ryan will be done with his papers and we can start talking about research projects
  • the new Hitchhiker's Guide movie
  • spring!
  • nice weather means I can bike to work
  • I'll be done with the musical and can get my life back in order
  • read my lips, no new taxes
Sorry for the personal post. Back to the regularly scheduled brilliant flashes of insight...

Update: How could I forget OS X 10.4?

Hygiene is weirder than referential transparency

Referential transparency (despite its damned awful, confusing, misleading name) is a simple concept: whatever existing bindings a macro definition makes reference to should never be violated, no matter how the macro is used. End of story.
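For instance, here is a minimal sketch (with a hypothetical pair-up macro) of the property: the macro's reference to list means the list that was in scope where the macro was defined, even if the call site shadows it.

(define-syntax pair-up
  (syntax-rules ()
    [(_ x y) (list x y)]))

(let ([list +])     ; shadow list at the call site
  (pair-up 1 2))    ; => (1 2), still the definition-site list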

But hygiene is weirder, for two reasons. First, new bindings introduced by a macro don't really exist until the macro is used, and then they should be totally new every time it's used. Because the macro template is really just that--a template, not actual code--it represents any number of instantiations of itself, and consequently any number of new bindings. And hygiene means that when these new--as of yet non-existent--bindings are created, they can't violate existing bindings in the call site.
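The classic swap! example sketches the first point: the tmp introduced by the template is freshly created at every use, so it cannot collide with a tmp at the call site.

(define-syntax swap!
  (syntax-rules ()
    [(_ a b)
     (let ([tmp a])    ; a brand-new tmp at every expansion
       (set! a b)
       (set! b tmp))]))

(let ([tmp 1] [other 2])
  (swap! tmp other)
  (list tmp other))    ; => (2 1), unperturbed by the macro's tmp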

Second, hygiene in Scheme is particularly weird because we don't even know which identifiers are binding occurrences until we fully expand a macro, so various macro expansion algorithms have to retroactively apply or revert renamings once expansion is complete! This is what Ryan refers to as the "decay" model, where hygienic renamings are invertible.

Sources of binding polymorphism

As an afterthought, Sam's example leads us to recognize a second family of sources of binding polymorphism. Binding polymorphism, then, is a result of:
  1. Higher-order macros, where the operator passed to a macro may be either an ordinary expression or a macro keyword; when the macro generates code that applies the operator, it may or may not bind identifiers in the other subexpressions (a sketch follows this list).
  2. Local expansion contexts, where a macro may force the expansion of a particular subexpression (via, e.g., local-expand) to allow new macros to extend the behavior of the macro, potentially binding identifiers in different ways.
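Here is the sketch promised above for the first source, using a hypothetical apply-op macro: depending on whether the operator it receives is a procedure or a binding form, the generated code may or may not bind identifiers in the remaining subexpressions.

(define-syntax apply-op
  (syntax-rules ()
    [(_ op arg body) (op arg body)]))

(apply-op + 1 2)                     ; op is a procedure: nothing is bound, => 3
((apply-op lambda (x) (* x x)) 5)    ; op is lambda: (x) binds x in the body, => 25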
Although lazy macro expansion is important and powerful, it turns out that being able to force eager expansion is important, too. It allows programmers to specify new kinds of expansion contexts that serve as points of extension within a macro. When macros become sufficiently large, the need to extend them with new subforms is inevitable. Examples in mzscheme where this either exists, desperately needs to exist, or will soon exist:
  • the class macro
  • the require and provide macros
  • the match library

Binding abstractions, binding polymorphism

I think I understand better what Clinger and Rees meant in Macros that Work about binding abstractions. Here's their quote again:
The distinction between binding occurrences and other uses of identifiers should be determined by a particular syntactic abstraction, not predetermined by the parser.
I thought they were claiming that the binding properties of identifiers in a macro could never be determined statically. As I've said, I believe that in the vast majority of cases, the binding properties of identifiers are a part of the static interface between a macro definition and its clients, so I would disagree with such a claim. (It's worth noting that you probably couldn't infer binding properties in general, since syntax-rules is Turing-complete. But you could certainly imagine a type system for macros that captured these properties, and/or a restricted sub-language of syntax-rules where inference is possible.)

But that's not their claim. They are simply motivating the need for delaying the renaming decisions. Because they don't have any static mechanism for determining the binding properties of identifiers, and because macros can abstract over other macros that bind identifiers, the algorithm is simply incapable of determining the binding properties until the very end of expansion.

Incidentally, Sam has given me a perfect example of a useful macro that is binding polymorphic. He has recently enhanced the PLT Scheme match library, which provides ML-like pattern matching, to allow arbitrary extensions to the pattern matching clauses. Because various subexpressions of a matching clause may or may not contain binding occurrences of variables, the match macro must be polymorphic with respect to the binding properties of its arguments. Beautiful!
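A small sketch of why (written in the current Racket match syntax rather than whatever the 2005 library accepted): in one clause a sub-pattern position binds variables, while in another the same position holds a literal that binds nothing, so the macro cannot assume fixed binding properties for its sub-terms.

(define (describe v)
  (match v
    [(list 'point x y) (+ x y)]   ; x and y are binding occurrences
    [(list 'origin)    0]))       ; 'origin is a literal; nothing is bound

(describe '(point 3 4))   ; => 7
(describe '(origin))      ; => 0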

Tuesday, March 22, 2005

Abstract syntax isn't?

Abstract syntax is so called because it eliminates the elements of syntactic sugar that are irrelevant to the meaning of an expression. For example, in the following expression:
let x = 2 in (x + 3) end
the = symbol has no informational content; it's only there for legibility. So an AST is more abstract than a concrete syntax in the sense that it strips out details that are irrelevant to the meaning of a program.

But in another sense, abstract syntax is less abstract than so-called concrete syntax: in meta-programming, for example, when we write programs that pattern match on the surface syntax, we treat the surface syntax as the interface to the AST, and our code is oblivious to the particular representation of the code. As I have pointed out before, pattern matching is more abstract than code with explicit selectors.

I'm surprised at how infrequently the software engineering papers I read talk explicitly about abstraction. Bravenboer and Visser have written a couple of papers on embedding languages with concrete syntax. I like their work, and it might strengthen their point to discuss the notion of concrete syntax as an interface for the AST representation.

S-expressions are a sweet spot in these two aspects of the abstractness of syntax: there is very little syntactic sugar, so going from the surface syntax of a Lisp or Scheme program to an AST is trivial; on the other hand, macros can operate by pattern matching directly on surface syntax and splicing back into templates written in surface syntax. In a sense, it's a pun between object- and meta-language. But it allows for very powerful macros. Otherwise, when your surface syntax distinguishes similar types of expressions, you have to write multiple, near-identical macros to operate on the different forms.
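A tiny sketch of that pun: the macro pattern matches the surface syntax directly and splices the pieces back into a surface-syntax template, with no explicit AST in sight.

(define-syntax my-let
  (syntax-rules ()
    [(_ ([name val] ...) body ...)
     ((lambda (name ...) body ...) val ...)]))

(my-let ([x 2]) (+ x 3))   ; => 5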

Monday, March 21, 2005

"Everything is an object"

When you've been raised under the aegis of object-oriented programming and design, the only building blocks of the universe you can imagine are data structures that respond to messages. As a result, the word "object" becomes synonymous with "value." I remember being disgusted as an undergrad with Sun's capitulation to the bit-twiddlers in making Java primitives non-objects. Indeed, it was a capitulation, and there are techniques like automatic boxing/unboxing and tagged fixnum representations that allow for fast common cases. But my implicit assumption was that every value should be an object.

I don't personally believe anymore that it's a natural or even useful metaphor to think of 17 as an autonomous agent interacting with other entities by sending and receiving messages. But others much smarter than I am have wrestled with this question, and the jury is still out. But at least now I'm aware of my assumption.

It's useful to keep this assumption in mind when talking to OO people, though. Because they will often assume that the only element of data is the object, they will use the word interchangeably with "value." This leads to confusing claims like this one from a promotional video for Alan Kay's latest endeavor, the Viewpoints Research Institute:
Because everything in Squeak is an object, students can write a simple script to control their experiment.
This has become a running joke in the office, because what on earth does being an object have to do with being scriptable? John Clements made it make sense by translating "object" to "first-class value" -- by making the elements of their workspace values, students were able to manipulate them programmatically. Whether the values are agents, processes, algebraic datatypes, or numbers is irrelevant, but Squeakers think of everything in terms of objects, so that's the terminology they use.

Thursday, March 17, 2005

Mean Java

public class Bitchy {
    public final Object clone() {
        return new Bitchy();
    }
    ...
}

Wednesday, March 16, 2005

The Story of Denotational Semantics

My Cliff Notes for the story of denotational semantics:
  1. We want to model programming language elements with their natural analogs in mathematical sets: numbers as numbers, functions as functions, etc.
  2. We want recursive functions in our language. But recursive definitions aren't really definitions, but rather constraint specifications. For example, × should be the least function such that 0 × n = 0 and (m + 1) × n = n + (m × n).
  3. So we want a fixpoint operator fix that takes a constructor function f, which tells us how to build successive approximations of the recursive function, and gives us the actual recursive function.
  4. The constructor function tells us how to improve an approximation of the solution to the constraint. Thus, the actual solution, which should be the limit of the sequence of approximations, must be a fixed point of the constructor: applying f to the solution gives back the solution itself. Hence the specification of the fixpoint operator: fix f = f (fix f). (A small Scheme sketch follows this list.)
  5. The fixpoint operator should have the property that it will always give us a unique solution for any reasonable recursive function definition.
  6. It so happens that if we restrict our model to allow only continuous functions, this fixpoint operator exists. The continuous functions are those that preserve the limits of these sequences of approximations: the function's behavior at the limit is consistent with its behavior on each of the approximations. This is natural: in order for the partial results to be approximations of the real thing, we expect them to be getting closer to the result.
  7. So because the existence of the fixpoint operator depends on our functions being continuous, we always have a proof obligation that any constructor in the denotational semantics that produces a function in fact produces a continuous function.
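Here is the small Scheme sketch promised in step 4: a constructor function that improves approximations of multiplication, and an explicit fix satisfying fix f = f (fix f), eta-expanded so it terminates under call-by-value.

(define (times-step approx)            ; given an approximation, build a better one
  (lambda (m n)
    (if (zero? m) 0 (+ n (approx (- m 1) n)))))

(define (fix f)
  (lambda args (apply (f (fix f)) args)))   ; fix f = f (fix f)

(define times (fix times-step))
(times 3 4)   ; => 12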
Sam pointed out that all this shows the importance of Turing's proof of the undecidability of the halting problem: all of these problems arise when we have to worry about the behavior of functions at infinity. If we could simply disallow ⊥ from the language, and static analysis could always prove a function to be terminating, then we wouldn't have to deal with any of this.

Tuesday, March 15, 2005

Dreams of PLaneT

I submitted a feature request to have a mzscheme command-line option to load a PLaneT package. Then I could do cool one-liners, for example:
mzscheme -P dherman proxy.plt 1 -e '(http-proxy 9000)'
would run an HTTP proxy at the command line. Wouldn't that be cool?

Lazy Macro Expansion

Macro expansion could be called "lazy" when macro invocations are expanded without expanding their arguments, analogous to call-by-name or call-by-need lambda calculus. In CBN, arbitrary expressions can be passed as arguments to functions, not just values, which means that it isn't necessary to evaluate an argument to a value before the function can be invoked. Similarly with macros: if we define values in macro expansion as expressions that have no macro invocations left, then a strict, or "expand-by-value" macro semantics would be forced to fully expand a subexpression in a macro invocation before expanding the macro. A non-strict macro semantics passes subexpressions to a macro untouched.

We can demonstrate the non-strictness property in the usual way:
(define-syntax Ω
  (syntax-rules ()
    [(_) (Ω)]))

(define-syntax constant-3
  (syntax-rules ()
    [(_ exp) 3]))

(constant-3 (Ω))
=>
3

This non-strictness is critical to Scheme in a couple of ways--and the reason is always that we need to see the original syntactic structure of an expression. The first case is for uses of quote: if (foo exp) expands to (quote exp), then (foo (or a b)) should expand to (quote (or a b)), not (quote (let ((tmp a)) (if tmp tmp b))). The second is for macro-defining macros. If a macro expands into a definition of a new macro that needs to pattern match on one of the original macro's arguments, then the new macro should see that argument as it originally appeared, not some expansion thereof.
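A sketch of the quote case, with a hypothetical foo macro: the macro must receive its argument in its original, unexpanded form.

(define-syntax foo
  (syntax-rules ()
    [(_ exp) (quote exp)]))

(foo (or a b))   ; => (or a b), not the expansion of or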

I believe that the latter would not be an issue in so-called generative macro systems, where macros are not allowed to deconstruct their argument expressions.

Sunday, March 13, 2005

Multiple Dispatch as Pattern Matching

I've always found explanations of multiple dispatch confusing for some reason. I like to divide it into two concepts: pattern matching and static vs. dynamic type information.

Static dispatch resolves a reference to a method name at compile-time using nothing but static type information. Haskell's type classes are an example. Dynamic (single) dispatch uses runtime "type" information to resolve the reference at runtime and select which method to apply based on the most specific "type" of the receiver.

If we want to make this description more consistent with the type theorists' view that types are static entities (roughly--let's ignore dependent types) and that so-called runtime type information is really just tags, then we can instead describe overriding as a sort of distributed syntax for pattern matching with implicit predicates checking the runtime tag of the receiver. The clauses of the pattern match are automatically ordered by the language implementation in order of specificity, so the most specific tag that matches wins. (Of course, this matching can be implemented by crawling up the class hierarchy, but with certain static guarantees like absence of class loaders, you could optimize this to a case dispatch on only those classes that implement a version of the method.)

If we extend pattern matching to compare the runtime tags of both the receiver and all the arguments, and pick the most specific match of all the arguments, we have multiple dispatch.
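A minimal sketch (with hypothetical asteroid/spaceship structs) of multiple dispatch rendered as pattern matching on the runtime tags of all the arguments, most specific clause first.

(define-struct asteroid ())
(define-struct spaceship ())

(define (collide a b)
  (cond
    [(and (asteroid? a)  (spaceship? b)) 'asteroid-hits-ship]
    [(and (spaceship? a) (asteroid? b))  'ship-hits-asteroid]
    [else                                'generic-collision]))

(collide (make-asteroid) (make-spaceship))   ; => asteroid-hits-ship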

Although Java's overloading looks like multiple dispatch, it's a restricted form, because method references are resolved at compile-time based only on static type information. We might call it "static multiple dispatch." But apparently according to the standard terminology, "multiple dispatch" is assumed to be dynamic.

Saturday, March 12, 2005

Aspects vs. Macros

Both aspects and macros are a kind of metaprogramming that allow programmers to specify "hooks" in a program: "whenever you see X, insert the code Y." There is naturally some overlap in the way they are used. But there are key differences.

Aspects are added separately, on top of already-meaningful code. By contrast, since macro invocations are explicit and local, they are meaningless without their corresponding definitions. This difference is a consequence of Filman and Friedman's characterization of aspects as quantification + obliviousness: in AOP, programmers specify join points and advice separately from the code they are advising, which changes the behavior of the advised code without modifying the code itself. Macros, on the other hand, must explicitly ask for code to be inserted at their point of insertion.

It's easy for macrologists to claim the moral high ground: obliviousness is the controversial part of AOP. It violates abstraction boundaries and thwarts local reasoning. And I've now heard more than one researcher advocate a view of aspects as modular units of conjunctive partial specification. I'm game, but I'm also sympathetic to the view that if you go too far in this direction you don't have aspects anymore.

And sometimes, this lack of obliviousness leaves macros wanting. I'm not sure if the following example is truly illustrative, but it just came up and it's what got me thinking about all this. Consider a macro called define-struct-hierarchy, which allows a programmer to define a collection of record types and subtypes arranged in a hierarchy by expanding to corresponding define-struct invocations:
(define-struct-hierarchy
  (parent (foo bar)
    (child-1 (grunch))
    (child-2 (frotz))))
=>
(define-struct parent (foo bar))
(define-struct (child-1 parent) (grunch))
(define-struct (child-2 parent) (frotz))
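A minimal sketch of how such a macro might be written, assuming only the two-level shape shown above (no deeper nesting, no error checking):

(define-syntax define-struct-hierarchy
  (syntax-rules ()
    [(_ (parent (parent-field ...)
          (child (child-field ...)) ...))
     (begin
       (define-struct parent (parent-field ...))
       (define-struct (child parent) (child-field ...)) ...)]))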
Now if we plan to document these types, we will almost certainly want to automate the document generation. (Trust me.) But how? The define-struct-hierarchy macro shouldn't be generating documentation every time we perform an expansion, and besides, that functionality is totally irrelevant to the main functionality of defining the datatypes. We'd like to be able to write a separate program that finds the declaration inside the program and generates the corresponding documentation. And we don't want to have to walk the entire program. We should be able to say, "for all macro invocations of the form (define-struct-hierarchy etc), generate the documentation." Otherwise, we end up performing painful contortions like putting the definitions in a separate file and textually including them in the program.

This is a simple and tiny example. Maybe obliviousness just doesn't scale. I sure don't want to go on record as an enemy of abstraction. But I'd also like to understand the middle ground.

Thursday, March 10, 2005

Perlis: Lexical Scope

Alan Perlis on lexical scope:
One man's constant is another man's variable.

Making something variable is easy. Controlling duration of constancy is the trick.

As Will Rogers would have said, "There is no such thing as a free variable."
I remember trying to understand this very idea in high school calculus, but my teacher had no idea what I was getting at. Who knew all I needed was the lambda calculus?

There is no difference between a variable and a constant except the lexical context in which you are looking at it. In this function:
(lambda (n)
  (+ n k))
the variable k is a constant, right?
(define (make-adder k)
  (lambda (n)
    (+ n k)))

Perlis

Wow. I've seen so many Perlis epigrams quoted in various contexts, but I hadn't taken the time to read through the entire SIGPLAN Notices article from 1982.
Epigrams scorn detail and make a point: They are a superb high-level documentation.
The best teachers know what not to say.

Wednesday, March 09, 2005

On Second Thought (Binding Polymorphism)

Maybe I spoke too soon. While Shriram's automaton language does in fact provide an example of binding abstraction, in the sense that the client doesn't know what the binding properties are, it is not an example of "binding polymorphism." That is, the macro writer still knows exactly what the binding properties of each subexpression are.

So as far as I know, the ability to delay the decision about the binding properties of names to the last possible moment is not in fact necessary for the automaton language example, which means that it has not in fact answered my earlier question.

I think what led to my confusion is that Macros that Work claims that the ability to delay the discovery of binders until the last possible moment is necessary to allow binding abstractions. But I don't think that's quite true. Rather, it's only necessary to allow the property I've been calling binding polymorphism, such as my contrived apply-syntax example.

Shriram's Automaton Language

Shriram's educational pearl describing the construction of a macro for an automaton language has the answer to my question last week of whether binding abstractions are really an important property of macros.

Matthias has long promoted the four primary families of applications for macros as
  1. providing cosmetics
  2. introducing binding constructs
  3. implementing control operators (i.e., altering the order of evaluation)
  4. defining data languages
Shriram's paper mentions these as well, and I'm glad he does. Macros are so universally misunderstood, even by their proponents, that it's important to know just what is and isn't important about them. (In particular, inlining code for performance gain is not a killer app of macros!)

His paper deals particularly with the fourth use, defining data languages. And he makes the following insight:
The sub-terms of a macro invocation can contain arbitrary phrases, some of which may not even be legitimate Scheme code. In such cases, the macro construct is essentially creating an abstraction over an implementation choice. For instance, suppose a particular datum can be represented either as a procedure or as a structure.
When an embedded language is implemented as a macro, the user may in fact have no idea which parts are implemented as binding forms in the implementation. The automaton example is a perfect demonstration: you might choose to implement the names of states as symbols in an association list, or you might choose to bind them as mutually (tail-)recursive local procedures. In this example, your user shouldn't know what your implementation decision was, and certainly shouldn't have to know.
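A hypothetical sketch of the second implementation choice: the state names become mutually (tail-)recursive local procedures, i.e., binding occurrences in the expansion. The alternative keeps them as symbols in an association list, where nothing is bound at all; the client can't tell the difference.

(define (accepts-ab* input)               ; accepts one 'a followed by any number of 'b
  (letrec ([init (lambda (in)
                   (cond [(null? in) #f]
                         [(eq? (car in) 'a) (more (cdr in))]
                         [else #f]))]
           [more (lambda (in)
                   (cond [(null? in) #t]
                         [(eq? (car in) 'b) (more (cdr in))]
                         [else #f]))])
    (init input)))

(accepts-ab* '(a b b))   ; => #t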

This is one place where the distinction between macros as language extensions and macros as embedded languages becomes significant. When you add a feature to Scheme, your user needs to know about its binding properties. By contrast, when you implement a different language in Scheme, your user thinks about the semantics of the embedded language, rather than the semantics of its implementation language.

Tuesday, March 08, 2005

set-car! by example

Some remedial semantics while I'm feeling sick as a dog...

What's happening in this program?
(define x (cons 'a 'b))
(define y (cons 'c 'd))
(define z (cons x y))
(set-car! z y)
(set-car! y 'e)
z
; => ((e . d) e . d)
First remember that expressed values are the values the expressions can evaluate to, and denoted values are the values that are stored in environments--in Scheme, denoted values are locations in the store, and expressed values include mutable pairs, which are pairs of store locations.

Let's be the interpreter. After the first two definitions, we have the following configuration:
env: [x -> 2][y -> 5]
store:
0 : 'a
1 : 'b
2 : (0, 1)
3 : 'c
4 : 'd
5 : (3, 4)
Next we define the pair (cons x y). The evaluation of x and y gives us expressed values (0, 1) and (3, 4). Evaluation of the cons duplicates these pair values, and the new pair points to these two new copies:
env: [x -> 2][y -> 5][z -> 8]
store:
0 : 'a
1 : 'b
2 : (0, 1)
3 : 'c
4 : 'd
5 : (3, 4)
6 : (0, 1)
7 : (3, 4)
8 : (6, 7)
Now, for the first set-car!, the first argument evaluates to the pair associated with z, i.e., the expressed value (6, 7). The second argument evaluates to the expressed value (3, 4). Because the pair is really a pair of pointers, we don't touch its value but rather change the value in the cell pointed to by the car of the pair, i.e. cell 6:
0 : 'a
1 : 'b
2 : (0, 1)
3 : 'c
4 : 'd
5 : (3, 4)
6 : (3, 4)
7 : (3, 4)
8 : (6, 7)
Now the second set-car! evaluates its first argument to the pair (3, 4) and again mutates the cell pointed to by the car of the pair:
0 : 'a
1 : 'b
2 : (0, 1)
3 : 'e
4 : 'd
5 : (3, 4)
6 : (3, 4)
7 : (3, 4)
8 : (6, 7)
By the end, we have multiple copies of y, but they all point to the same location 3, which has been mutated.

My kingdom for ref-cells!

Monday, March 07, 2005

Pattern Matching Macros

If I were to teach macros, I would want to separate some of the pieces and introduce them one at a time. Consequently, I think syntax-rules would not be the best way to start. I think the first conceptual hurdle is understanding the phase distinction, and learning to read a program that contains embedded meta-programming. But pattern matching is a whole different topic, and one that people may not be familiar with. For ML and Haskell programmers, it might be no problem. But I remember being terrified by those mysterious ellipses when I first saw Scheme macros.

I might even start with defmacro, except I'm not sure how I feel about starting people with a broken macro system. At any rate, I'd want a system that simply allowed the novice to get used to metaprogramming without having to learn pattern matching at the same time.

But pattern matching has a clear software engineering advantage: it's more abstract than explicit selectors. When an expression contains 5 subexpressions, and you add another one at the beginning, you have to change every single selector for all subsequent subexpressions; with pattern matching, by giving names to the subexpressions and referring to them by name from then on, only the matching code needs to be changed. In other words, patterns are abstract with respect to the positions of their elements.
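A sketch of the contrast in macro terms, with a hypothetical my-when macro written both ways. In the pattern version the subexpressions are named once, in the pattern; in the selector version every access hard-codes a position, so inserting a new subform before test means renumbering each selector.

;; pattern style: positions appear only in the pattern
(define-syntax my-when
  (syntax-rules ()
    [(_ test body0 body ...) (if test (begin body0 body ...) (void))]))

;; selector style: positions appear at every access
(define-syntax (my-when/selectors stx)
  (let ([parts (syntax->list stx)])
    (with-syntax ([test (list-ref parts 1)]
                  [(body0 body ...) (list-tail parts 2)])
      #'(if test (begin body0 body ...) (void)))))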

This reminds me of one of my favorite properties of abstraction. Traditionally people talk about being able to change the implementation without affecting the clients. But another way to put this is making software more amenable to change. The more abstract your code, the greater your ability to refactor.

Sunday, March 06, 2005

Tests and Assertions

The other day, Will claimed that you should have one assertion for every test case in a test suite. Ryan protested: what if you have an imperative sequence that you want to check for various properties at different stages of the computation? Shouldn't this be conceptually a single test with multiple assertions? According to Will, this should be several tests, each with one assertion; the parts of the script that are common to each of the tests should of course be abstracted (with macros where necessary), but each of these properties is a separate test.

We all agree that this should be abstracted. But the question is, where does the abstraction belong? Should it be custom-made in each case by the user, or should there be a library?

It seems like a common pattern that you'd like to be able to say
(define-script
(do-something-1 ...)
(assert property-1 ...)
(do-something-2 ...)
(assert property-2 ...)
(do-something-3 ...)
(assert property-3 ...)
...)
Now, according to Will, this should be generating multiple test cases, with one test for each assertion. In that case, the above pseudocode would ostensibly produce a test-suite value, with multiple test-case values embedded within it.
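A rough sketch of that reading, with hypothetical make-suite and make-test constructors standing in for whatever SchemeUnit actually provides: each assertion becomes its own test case, and the script steps leading up to it are replayed for that case.

(define-syntax define-script
  (syntax-rules ()
    [(_ form ...) (script->tests () () form ...)]))

(define-syntax script->tests
  (syntax-rules (assert)
    ;; no forms left: wrap the accumulated test cases in a suite
    [(_ (test ...) (step ...))
     (make-suite (list test ...))]
    ;; an assertion: emit a test replaying the steps so far, then continue
    [(_ (test ...) (step ...) (assert prop ...) form ...)
     (script->tests (test ... (make-test (lambda () step ... (and prop ...))))
                    (step ...)
                    form ...)]
    ;; an ordinary step: extend the replayed prefix and continue
    [(_ (test ...) (step ...) other form ...)
     (script->tests (test ...) (step ... other) form ...)]))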

Another way to look at it, which Ryan suggested, is that you could have multiple assertions per test case. This brings up other questions: what happens if you never reach an assertion because an earlier part of the test raised an exception? Do you count it as a failure? (If so, do you need a static analysis that can predict all the places where assertions might be made?) Do you just ignore it and say it doesn't exist?

Either way, I still think this is an important enough abstraction to belong in the SchemeUnit library. I'm just not quite sure how it should look.