Wednesday, February 23, 2005

Point-free style

There's a lot of argument about point-free style. I think it's a red herring -- the style is not good or bad per se. The problem is with a conflation of two different aspects of abstraction:
  1. generalization -- a function is an abstraction in the sense that it's a generalization of an algorithm; n + 1 will give you the successor of n when n is 17, but it will in fact give you the successor of n for any value of n (under certain restrictions, of course). So λ is a universal quantifier, meaning &lambda n → n + 1 is a successor function for all integers.
  2. setting the level of detail -- abstraction is also about forgetting irrelevant details and focusing on important details. In terms of engineering, we eliminate the irrelevant details so that they can be changed without having any observable effect on the parts that depend on them, and we keep in the important details so that all the configuration options are available. But in terms of communication, we set the level of abstraction at an intuitive level to make sense to the reader.
There may not be a single right way to set the appropriate level of abstraction, but there are also plenty of ways to set an inappropriate level of abstraction. While there's a lot of attention given to setting a high enough level of abstraction, it's important to think about the ramifications of setting too high a level of abstraction.

Since, as we all know, programs are written for human readers, the level of abstraction is the most important tool for communicating a system effectively to another person. Putting in too much detail (i.e., a low level of abstraction) overwhelms the reader with excessive amounts of detail. People are stupid, so that's no good. But it's also well-known that people understand in terms of examples, and they can generalize much better once they have particular examples to start with. Setting too high a level of abstraction is a sure way to confuse the reader, make them feel stupid, and make them dislike you. (But hey, if that's your thing..)

How does this relate to point-free style? Well, for one thing, an argument name gives someone a name to perform hypothetical examples on. When you have &lambda n → n + 1, you can think about this entity n and what you're going to do with it. When all you have is succ, the only entity you have to work with is the operation itself (and oh, how the category theorists jump with glee!).

So? This may be just fine! It's all about talking at the right level of abstraction. If I'm thinking about a pipeline of conversions, having too many arguments may be a total distraction. I'd rather think of name->number as the composition
than as λ x → string->number(symbol->string(name->symbol(x))) because I really am thinking about this as a compound operation. The entities are functions, and I'm building a compound function from smaller ones. Watching the marble go into the pipe isn't relevant or interesting.

But when you generalize as absolutely far as you can, you sometimes create strange beasts that have no simple intuitions. This is why fold is so much harder to understand than map. Again, though, there isn't a single conclusion to draw. Sometimes these abstract monstrosities turn out to be so powerful that we get used to them out of necessity. People use fold a lot, so it just becomes one of those things you have to learn. It doesn't have an easy intuition, but it sure is useful.

But if you fill your program with such abstractions, good luck getting anyone to read it.


Anton van Straaten said...

An additional point which these sort of discussions raise for me is the fact that our language editing technology is still so primitive. I want the option on my editor to toggle the view between using point-free style and not, etc. All those wonderful source transformations that go on during compilation are wasted on compilers; I want some of the benefits myself!

Tangential point: the example you chose caused me a double take: the arrows in operator names like "symbol->string" point left to right, but the composition operator binds right to left. The arrows tricked me into wanting to compose the expression the wrong way around. Abstraction clash...

BTW, you have some super high-quality blog entries here. I thought I'd better post something to bring the tone down a notch! ;)

Felix said...

The arrows are in the wrong direction because we read from left to right but for some reason we view functions as taking their inputs from their right side.

If Scheme (or Math in general) were written like Forth (postfix), the code would look like:

(((nm name->symbol) symbol->string) string->number)

or rather:

(n (compose name->symbol symbol->string string->number))

In fact, Category Theorists have switched from using circle as composition and instead use semicolon, except that it applies in the other direction! So that
f ; g ; h == h o g o f

It makes sense, once you wrap your head around it.