The Little Calculist: February 2005

Sunday, February 27, 2005

The idea of topology

A topology is defined by one thing: the definition of openness. Formally, a topology is a set X (the "universe") and a collection T of subsets of X (the open sets). A simple enough definition, but what's the intuition?

The most straightforward topologies to use are the real topologies, where the space is Rⁿ (often R² or R³) and the open sets are the open intervals (x, y) or open solids like discs -- which have blurry edges, as opposed to circles which have hard edges. These serve as a good visualization for the basic concept that an open set doesn't contain hard boundaries; for any particular point x in the set, there is a little more "wiggle room" left between x and the edge of the set.

Topology deals with transformations that preserve certain aspects of a set; you'll often hear people use the intuition that you can stretch, twist, and bend an object, but you can't tear or glue it. Homeomorphisms are transformations that preserve the relative "nearness" of points; when you stretch something, you might make two points farther from each other, but they'll still be closer to each other than points that have been stretched even farther away.

In particular, the definition of continuity in topology is a nice generalization of the definition in calculus: a function is (topologically) continuous if the inverse image of an open set is open. Restricting this to R², for example, it fits with the intuition from calculus that there are no "edge points" or "breaks" in the curve of the function wherever there are no breaks in the domain. In other words, a function on R² takes open intervals on the x-axis and "stretches" them to the shape of the function's curve, but never breaks them.

I've never been very good at the geometric visualizations of topological concepts, but what's cool about it is that abstract constraints about the preservation of structure of an object lead to striking visualizations with two very different objects that yet have similarity that you can still make out. To see that a coffee mug and donut are somehow "the same" requires what the philosopher Walter Benjamin would call the human "mimetic faculty". But this really isn't surprising if you realize that abstract concepts like homeomorphism ignore many of the properties that distinguish the two objects, and only focus on those properties they both share.

(I love reading math entries in Wikipedia. Peter has pointed out glaring mistakes to me before, but I remain loyal. They tend to strike a great balance between formality and intuition, and I often find their entries more useful than Mathworld.)

Wednesday, February 23, 2005

Point-free style

There's a lot of argument about point-free style. I think it's a red herring -- the style is not good or bad per se. The problem is with a conflation of two different aspects of abstraction:

generalization -- a function is an abstraction in the sense that it's a generalization of an algorithm; n + 1 will give you the successor of n when n is 17, but it will in fact give you the successor of n for any value of n (under certain restrictions, of course). So λ is a universal quantifier, meaning &lambda n → n + 1 is a successor function for all integers.
setting the level of detail -- abstraction is also about forgetting irrelevant details and focusing on important details. In terms of engineering, we eliminate the irrelevant details so that they can be changed without having any observable effect on the parts that depend on them, and we keep in the important details so that all the configuration options are available. But in terms of communication, we set the level of abstraction at an intuitive level to make sense to the reader.

There may not be a single right way to set the appropriate level of abstraction, but there are also plenty of ways to set an inappropriate level of abstraction. While there's a lot of attention given to setting a high enough level of abstraction, it's important to think about the ramifications of setting too high a level of abstraction.

Since, as we all know, programs are written for human readers, the level of abstraction is the most important tool for communicating a system effectively to another person. Putting in too much detail (i.e., a low level of abstraction) overwhelms the reader with excessive amounts of detail. People are stupid, so that's no good. But it's also well-known that people understand in terms of examples, and they can generalize much better once they have particular examples to start with. Setting too high a level of abstraction is a sure way to confuse the reader, make them feel stupid, and make them dislike you. (But hey, if that's your thing..)

How does this relate to point-free style? Well, for one thing, an argument name gives someone a name to perform hypothetical examples on. When you have &lambda n → n + 1, you can think about this entity n and what you're going to do with it. When all you have is succ, the only entity you have to work with is the operation itself (and oh, how the category theorists jump with glee!).

So? This may be just fine! It's all about talking at the right level of abstraction. If I'm thinking about a pipeline of conversions, having too many arguments may be a total distraction. I'd rather think of name->number as the composition

string->number ⋅ symbol->string ⋅ name->symbol

than as λ x → string->number(symbol->string(name->symbol(x))) because I really am thinking about this as a compound operation. The entities are functions, and I'm building a compound function from smaller ones. Watching the marble go into the pipe isn't relevant or interesting.

But when you generalize as absolutely far as you can, you sometimes create strange beasts that have no simple intuitions. This is why fold is so much harder to understand than map. Again, though, there isn't a single conclusion to draw. Sometimes these abstract monstrosities turn out to be so powerful that we get used to them out of necessity. People use fold a lot, so it just becomes one of those things you have to learn. It doesn't have an easy intuition, but it sure is useful.

But if you fill your program with such abstractions, good luck getting anyone to read it.

Abstracting over second-class language forms

Macros are good for abstracting over second-class language forms, which can't be done with ordinary mechanisms like functional abstraction or inheritance. But people are often used to scanning for literal occurrences of these second-class forms in order to understand a program; I noticed this when Sam was trying to read my code and couldn't figure out at first where a name I'd exported had come from, because I'd written a quick macro that took a bunch of names and defined and provided them all in the same way.

Stylistically, one approach to mitigate this is to keep the name as close to the code that defines it as possible. Here's a macro version of for-each that allows you to place an anonymous transformer right near the list of expression arguments it will receive:

(define-syntax syntax-for-each
  (syntax-rules ()
    [(_ transformer (arg ...))
     (begin
       (define-syntax anonymous transformer)
       (anonymous arg)
       ...)]))

(syntax-for-each (syntax-rules ()
                   [(_ name)
                    (begin
                      (define name #|whatever|#)
                      (provide name))]
                 (foo bar baz bat razzle frizzle fratz))

At least if you were doing a text search for the identifier name, it would show up immediately below the code that defines it.

Incidentally, notice the use of the internal define-syntax. This is necessary, because if you use let-syntax, the code inside of it is no longer at the top level, so any definitions produced by the macro would be treated as internal defines. Yuck.

Monday, February 21, 2005

Why compositionality matters

The idiom of program design as language design can be seen as taking the robustness or "viscosity" of a solution seriously. PL designers can't get away with saying, "oh, this corner case doesn't matter because nobody would possibly ever use the language that way." For one thing, people always use languages in way you don't expect. For another, meta-programming tools like compilers that generate code in the language you've designed will use your language in ways that no human being would be caught dead doing. In other words, compilers can be seen as combinatorial test-case designers.

By contrast, building software by case analysis doesn't scale. When configuration options X, Y, and Z don't play well together in various combinations, you have to make a million special cases, the code becomes brittle, and the users must be careful never to combine features wrong.

Compositional program design (or the "closure property," in SICP-speak) is a way of building in correctness from the start, avoiding the combinatorial explosion of proof obligation¹ as more features or configuration options get added. You start with simple program constructs that are well-behaved, and then only allow means of combination that preserve the good behavior. This is a common trick in PL, ranging from the design of syntax to proof by logical relations.

¹I don't love this phrase, because in my experience people talk about proving properties of programs much more than they actually do it. But at least informally, we always reason about the correctness of programs, and anything that simplifies this reasoning is usually seen as an improvement, and not just by theoreticians.

Higher-order macros

One of the benefits of Macros that Work over syntactic closures as a macro system is the ability to discover the binding relationships of macro arguments after they get expanded, not before.

If we first parsed completely before doing macro expansion, could we not replace all occurrences of names by pointers to symbol table entries and thereby eliminate the capturing problem? The answer is no. One of the most important uses of macros is to provide syntactically convenient binding abstractions. The distinction between binding occurrences and other uses of identifiers should be determined by a particular syntactic abstraction, not predetermined by the parser.

I've puzzled over this statement for quite some time now. The only instance I have thought of where binding relationships can't be predetermined is in higher-order macros, where an operator can be passed to a macro as an argument that may then be used to determine the binding relationships of other arguments:

(define-syntax apply-syntax
 (syntax-rules ()
   [(_ operator var)
    (operator var)]))

If we apply this macro to a normal expression, var does not get bound:

(apply-syntax add1 x)
=> (add1 x)

If we apply it to a macro that binds its argument, var does get bound:

(apply-syntax make-abstraction x)
=> (lambda (x) (+ x 17))

If we apply it to quote, var gets treated as a symbolic literal:

(apply-syntax quote x)
=> (quote x)

With syntactic closures, you have to commit to all the binding relationships up front in the definition of apply-syntax, which means you can't pass it arguments that don't adhere to those binding relationships.

How important is higher-order macro programming? Particularly, how important is this "binding-polymorphic" style of higher-order macro programming? I'm not sure, but I have a feeling that syntactic closures may be more useful than it's given credit for.

Friday, February 18, 2005

Compile-time environments

PLT maintains a phase separation between macro expansion and program evaluation. The first phase is known as macro elaboration-time, and the latter as runtime. The runtime contains a normal environment mapping variables to values, but at elaboration-time, there are two environments at work:

the syntactic environment, which binds identifiers to their static behavior for the macro expansion algorithm, i.e., whether the identifier is mapped to lambda, let-syntax, or a macro (etc.) -- identifiers are bound in this environment by define-syntax, for example.
the transformer environment, which is a regular Scheme evaluation environment, but exists during the elaboration phase, not a runtime -- identifiers are bound in this environment by define-for-syntax, for example.

The syntax-local-value procedure allows you to get at the mapping of an identifier in the syntactic environment. If you define a structure type my-struct, the define-struct macro expands into a new define-syntax that binds my-struct to some static information about the structure type that you can get at with (syntax-local-value #'my-struct). You can't really change this information (short of hacks with macro-redefinition, which tend to be brittle), because the syntactic environment is rather jealously guarded by the macro expansion algorithm. If you want to maintain more flexible and complicated information at elaboration time, that should go in the transformer environment.

Thursday, February 17, 2005

More testing

Yep, programmers love writing tests! It's especially fun in a high-level language, where you can whip up abstractions with break-neck speed. I've found macros especially useful for testing. For one thing, macros are great for abstracting over before/after patterns (so much better than the setUp/tearDown idiom in JUnit). Another thing they're great at is data languages, which can be useful for describing an expected result or interaction sequence.

Some of the tools available for PLT Scheme that make testing such a joy -- aside from higher order functions and macros, I mean -- include SchemeUnit and PLaneT. I don't think I can underestimate the importance of PLaneT. Without a release mechanism, code reuse is nothing but a pleasant fantasy. And now that SchemeUnit is in PLaneT, I can actually release my tests with my code, so anyone can try them out.

An up-and-coming beauty is MzTake, a scriptable debugger based on event monitoring. Tonight Sam and I debugged a concurrent procedure, and I automated our tests with asynchronous channels. This is very cool, and a big improvement over the brittle, timing-based test scripts I used to write in Perl. But I still had to instrument my code by hand (thereby polluting the primary functionality of the code -- an AOP no-no). I'll try some MzTake scripting soon to see if I can do any better.

Monday, February 14, 2005

Doing infinity with parametricity

After talking about using laziness to simulate infinite quantities, I've started thinking about parametricity.

In this Dr. Dobbs interview, Phil Wadler talks about how you can avoid code explosion in languages with parametric polymorphism by exploiting parametricity results: the exact same algorithm applies at all types, so you don't have to duplicate the code at each instantiation of the type parameter.

This property applies at the value level, too: the same algorithm works for all inputs (of the required type), so you can reuse the same code without duplication. Kenichi's paper with Matsuoka and Yonezawa, which is on using selective duplication to simulate infinite towers of reflective interpreters with a finite number of copies of a meta-circular interpreter, probably offers some insight into the interaction between laziness and sharing in simulating infinities. Kenichi just gave me a copy of that paper today, and it looks very cool.

Computers do infinity with laziness

I'm realizing that using laziness to simulate infinite quantities in finite resources is a fundamental trick in computer science. The obvious examples are coalgebraic structures in Haskell like streams, or stream processing in any language, for that matter (e.g., Java InputStreams).

But you can see this everywhere. A fascinating example is that functions represent infinite entities but can be represented in finite space by only computing their results on arguments by need. (Don't confuse this with call-by-need semantics; I mean that you only calculate the result of a function for any given input when you happen to apply it to that input.) In fact, the whole undecidability of function equality is essentially just the rather obvious observation that you can't compare two infinite quantities in finite time or space.

Contract checking for higher-order languages uses the same trick. Since the contract system can't test a function to see if it adheres to a contract in general, it just tests its particular inputs and outputs as they appear.

This is also how we describe functions in denotational semantics: a function is the limit of an ω-chain of successive finite approximations of the function. The length function, which is an infinite set of ordered pairs of inputs and outputs, can be simulated by only generating a finite approximation of recursion depth k for any particular input list of length k. And to prevent calculating all such approximations, which wouldn't terminate, we use the lazy Z-combinator in a call-by-value language, or we just use a lazy language.

Macros that paint

Aha! I'm starting to have a clue what Ryan has been talking about all these years with "painting" and macro expansion. In Macros that Work, every renaming that happens in a particular transcription is distinct; there aren't any relationships between renamings of different identifiers in a single transcription.

But in Dybvig et al, the programmed macro system has library procedures that allow the programmer to apply the renamings that were used for one identifier to another one, so that you can say things like "this occurrence of x is the same as that occurrence of x." Furthermore, all the identifiers in a single transcription step get the same renaming applied to them, so that you can say "this occurrence of x has the same renaming as that step." In certain cases this allows you to say "this x is the same as that x" without actually pointing to the particular occurrence of x; you only need any syntax object from the same transcription step. Not a particularly abstract facility! Yikes.

I think it's time I read about syntax-case.

Sunday, February 13, 2005

Nice filesystem testing pattern

I've come across a useful pattern for unit testing with the filesystem: frequently I want to create some "sandbox" directory where I'll stick a bunch of temporary files and subdirectories for testing, and when the test is over, I want to delete the whole thing. But any intermediate files or directories I create during testing should not be deleted, because I may still need them later in the test. So only the first directory I create should be automatically deleted.

This requires two things: 1) a macro for creating a new directory and evaluating subexpressions in the context of that directory, and 2) a facility for enabling and disabling the automatic deletion of temporary directories.

(define keep-new-directories? (make-parameter #f))

(define-syntax in-new-directory
 (syntax-rules ()
   [(_ dir-e e1 e2 ...)
    (let ([dir dir-e])
      (rm-rf dir)
      (make-directory dir)
      (let ([result (parameterize ([current-directory dir]
                                   [keep-new-directories? #t])
                      e1 e2 ...)])
        (unless (keep-new-directories?)
          (rm-rf dir))
        result))]))

Another nice property of this pattern is that if I want to test my test, i.e., to inspect the files it's creating, I can temporarily disable the automatic deletion of the sandbox directory by wrapping the whole thing in a (parameterize ([keep-new-directories? #t]) ...).

Meta-conclusion: dynamic scope is useful.

Friday, February 11, 2005

Modules and shadowing

In the lambda calculus, you can simulate modules with let; you just stick all your modules in one big let expression. But one of the major differences between modules and ordinary functions is that you don't want to allow shadowing between modules, because when you require a bunch of modules, you require them at the same level of nesting, whereas nesting is always apparent in the lambda calculus.

PLT Scheme almost gets this right; you can't require two modules that export the same name without explicitly resolving the conflict (either by excluding one of the conflicting bindings, or by prefixing all bindings imported from one of the modules).

But this isn't true for the literals list in a macro definition. The patterns in a macro definition match a subexpression of a macro invocation not against the symbolic name of a syntactic literal, but against its binding. This way even keywords have lexical scope. So if you shadow the binding of a particular identifier, you can no longer use it as a literal for a macro. For example:

(let ([else #f))
  (cond
    [#f 'false]
    [else 'else]))

falls off the end of the cond expression, because the keyword else does not have the same binding as it did in the lexical context of the macro definition.

This is the right behavior for ordinary lexical scope, but not for modules. What you want is for a module to export its keyword names as well -- this will force the client module to resolve conflicts explicitly just as with ordinary identifiers.

Otherwise you get annoying errors like this: requiring SRFI-1, which exports a procedure any (which is essentially ormap), and the standard contract.ss library, which has several macros that allow a special literal any, representing the trivial contract, causes the SRFI's binding of any to shadow the literal's binding. Writing contracts with any then cause very confusing errors:

->: expected contract or procedure of arity 1, got #<procedure:any>

Ryan has suggested that authors of macros with non-empty literals lists should export the literal names as well by hand. Here's a macro that does essentially that:

(module keywords mzscheme
  (define-syntax provide-keywords
    (syntax-rules ()
      [(_ name ...)
       (begin
         (begin
           (define-syntax (name stx)
             (raise-syntax-error
              'name
              "cannot directly use literal keyword"
              stx))
           (provide name))
         ...)]))
  (provide provide-keywords))

The macro writer can then implement a macro, export it as usual, and then export its literals, like so:

(provide cond)
(provide-keywords else)

Then clients that import a module that tries to bind else will be forced to resolve the contract explicitly. (And trying to use the keyword outside the context of the macro it was intended for results in a syntax error.)

The Little Concurrent Calculist

I still need to figure out how to multitask. I can write code, read papers, or develop a semantics, but I can't seem to do any two in the same day (or even week). I think part of the trick will be tangible organizational practices. I like the idea of organizing as much as possible on the computer, because I just can't cope with things that take up physical space. In particular, stacks of paper are pure evil.

Here are some of the practices I'm picking up:

keeping my notes in a wiki
keeping a research journal as a blog (yer lookin at at)
keeping my research library on CiteULike

Yet to resolve:

calendar
planning out my days
making (realistic) long-term plans

I'm not quite dead, sir

I finally managed to corner both Mitch and Will at the same time yesterday and we had a very useful little meeting about the macro system I'm working on. He was familiar with the snags we've hit, and has thought about them enough to have ideas about how to overcome them. He was also very optimistic about the prospects for this project. I'm excited to get working on this again.

Code code code...

Been working on some PLaneT packages the last few days. Sam's current project is a tool for analyzing Java programs that some of the libraries I wrote last summer could be useful for, so I'm updating them for v300 of PLT Scheme. Naturally I've discovered ways that the code could be cleaned up. I'm getting happier with the libraries all the time.

Research blog

I'm going to experiment with writing a research blog to document my progress in grad school. I got the idea from Phil Wadler's students.

Perhaps Mitch can use this as a way to spy on me, too...

The Little Calculist