Thursday, April 27, 2006

Jewels of the lambda calculus: abstraction as generalization

Since there are few things as likely to inspire in me a sense of religious reverence as the lambda calculus, I've decided to start a new series on some of the deep principles of computer science lurking in the lambda calculus.

Let's start with the crown jewel, abstraction:
This simple form represents the first rule of programming. Ruby hackers call it the "don't repeat yourself" rule. Perl hackers call it laziness: good programmers avoid repetitive, manual tasks. No matter the size, shape, or form, any time you write a program, you create an abstraction, a piece of code that automates a common task that otherwise would have to be repeated by a human.

Lambda abstractions represent the generalization of repeated patterns by parameterizing over their differences. If you perform several computations that are similar but not identical:
(9/5 × 23 + 32)
(9/5 × 40 + 32)
(9/5 × 13 + 32)
then each one of these computations is a special case of a more generalized procedure. The abstraction of the three is represented by parameterizing over the portion where they differ:
λc.(9/5 × c + 32)
From functions to abstract classes to Microsoft Office macros to parametric polymorphism, every form of abstraction is at its heart this same phenomenon.


Anonymous said...

So I would argue that the concept of "abstraction" as captured by the lambda binding has been around since the 17th century in the form of mathematical functions, so LC is certainly not responsible for this. But yea, syntactic generalization over input/output pairs (curves) was a grand discovery back then, and has influenced everything since.

On a somewhat related note, there is work in HCI challenging the long-held assumption that abstraction is a Good Thing. This is backed by evidence that large projects are massively duplicated, despite the fact that duplication is seen as a bad thing... Linux is something like 40% duplicated, some commercial projects a lot worse. Some may argue that it's a case of bad software engineers, argument which, although compelling, doesn't change the state of affairs.

So there is work that embraces duplication but spares repetitive work by providign clever mechanisms for working with it. (E.g. linked editing, simultanous editing.)


Anonymous said...

Marius wrote:

So I would argue that the concept of "abstraction" as captured by the lambda binding has been around since the 17th century in the form of mathematical functions, so LC is certainly not responsible for this.

Indeed, but Dave did never claim that the lambda-calculus is responsible for this. He only recognizes abstraction as a 'deep principle of computer science' and identifies it as an essential feature of the lambda-calculus.

Anonymous said...

True. But then saying that "LC is *related to* CS concepts" (as opposed to "responsible for") doesn't say much, does it? LC is a ultra-general calculus of computable functions, so it captures everything about functions and computation. Anyhow :)

adrassi said...

From a formal/structural perspective, there is a tremendous difference between mathematical functions and lambda abstractions.

Mathematical functions are sets of pairs, no more, no less. Although we say "let f be the function such that f(x) = x^2", this is a meta-level "abbreviation" for a set of pairs { (0, 0), (1, 1), (2, 4), ... }. As formal objects, mathematical functions are purely extensional in nature: structurally, they pair an input with its output, without revealing the actual (intensional) connection between the two. However, since we generally specify functions intensionally in our metalangauge, we can reason about them as such, and say for example that if (x, y) is in f, then y = x^2.

Mathematical functions, as sets of pairs, say nothing interesting about computation or abstraction. Those aspects are only present in the metalanguage (say, English) in which they are discussed.

The point is that the lambda calculus formalizes the metalanguage we use to talk about functions. Lambda abstractions are mathematical structures with which and through which we can reason about how a function arrives at an output from an input. In particular, variables and binding become formal objects, rather than informal notation. This comes at the cost of some machinery, e.g. capture-avoiding subsitution.

To see the difference succinctly, consider that the mathematical functions f(x) = x + x and g(x) = 2*x are structurally indistuinguishable. But the lambda abstractions f = λ x. x+x and g = λ x. 2*x are quite distinct, and we have a formal framework within which to understand and reason about the distinction.