The Little Calculist: types

Showing posts with label types. Show all posts

Friday, February 12, 2010

Eich's Law

Found this gem in a C++ comment while digging in the SpiderMonkey codebase:

After much testing, it's clear that Postel's advice to protocol designers ("be liberal in what you accept, and conservative in what you send") invites a natural-law repercussion for JS as "protocol":

"If you are liberal in what you accept, others will utterly fail to be conservative in what they send."

The comment is unsigned, but it sounds like Brendan.

Friday, February 22, 2008

I am a Super Big Fan of disjoint union datatypes in programming languages, but there are places where they are really inconvenient. Having to inject and project data between related types can be prohibitively cumbersome, especially when dealing with large data definitions such as the AST definitions for realistic programming languages. I know of a couple of languages where "true" unions are used instead, including Typed Scheme and the draft ECMAScript Edition 4. In both cases, unions are being added to a language where the runtime is already paying the price for runtime type tags, so keeping the variants distinct where they don't overlap doesn't introduce any extra burden.

But I was thinking this morning, what would happen if you tried to add true unions to Hindley-Milner languages? For concreteness, let's imagine extending Haskell. So instead of writing

data Foo = I Int
         | S String

you could instead write

type Foo = Int | String

Now if you want to pattern match on such a construct, you have to explicitly mention the type; there's no other way to distinguish the variants. So you could imagine type-annotating the patterns in a match expression:

showFoo :: Foo -> String
showFoo foo = case foo of
                n :: Int -> show n
                s :: String -> s

Note that because there's nothing prevent overlap in the variants of a true union, the order in which you match the patterns is significant. Alternatively, you could write the definition of showFoo in pattern-definition style:

showFoo :: Foo -> String
showFoo (n :: Int) = show n
showFoo (s :: String) = s

Consider the possible drawbacks to such a feature:

Cost of tagging:

One of the benefits of (sound) static typing is the ability to compile to efficient runtimes that avoid tagging runtime objects with their datatype. The Foo type, by contrast would require its Ints and Strings to be tagged. But this is the wrong way of looking at it; the language requires you to tag them anyway if you use the disjoint union type, so there's no additional cost over what you would already have been paying. For non-union types, you can still do the same amount of erasure.

Possible overlap:

You can express overlapping types like (Int | (Int | String)) with true unions, which makes the order of patterns significant, could result in surprising (order-dependent) logical relationships between case dispatches, and could generally lead to messy type definitions. Maybe a more principled way of looking at it is a disjoint union can be thought of as a type abstraction, whereas with a true union you might have to know its full definition to use it effectively. But hey, the same is true of type-definitions but they're still useful; and besides, nothing's preventing you from using disjoint unions when they're more appropriate.

Case exhaustiveness:

Standard ML's pattern matching facility is carefully designed to allow the compiler to prove or disprove case exhaustiveness. I tend to think this is a bad trade-off; the language is significantly crippled to enable a very weak theorem prover to prove a theorem that's of limited utility. Sure, it has its uses, but when you know things the theorem prover doesn't, you have to abandon pattern matching entirely. Other language designers seem to agree with me, since Haskell and I think Ocaml also don't check for case exhaustiveness.

Inference problems:

I have no expertise in Hindley-Milner inference, but I'm sure true unions are not inferrable. But one thing recent Haskell research has demonstrated is that convenient but uninferrable extensions to Hindley-Milner type systems can often be safely added with the requirement that they be explicitly annotated. I bet you could add true unions, require explicit type annotations when they're used, and get along just fine.

Function types and polymorphic types:

This is where it gets tougher. How do you tag these kinds of types? Do you do runtime subtyping checks for things like the more-polymorphic-than relation? I don't have an answer. A simple one is to restrict unions to types with simple tags, although those kinds of arbitrary restrictions lead to lack of compositionality in language design. But perhaps it's possible to make this work with the entire type system.

Some of this design space has been explored with Ocaml's polymorphic variants. I should take a look at the literature on that.

Thursday, March 01, 2007

'a typing

There are a lot of typing paradigms running around these days without clear definitions. One of the difficulties is that any given phrase carries a number of different meanings, including (but not necessarily limited to) the phrase's apparent meaning from its constituent words, the phrase's common usage in any one of a number of different communities, the originator's intended meaning of the phrase, the original intended application domain of the paradigm, or the subsequent historical application domain of the paradigm.

Here's what I tend to mean by the following phrases. Disclaimer: I try to keep a balance between the aforementioned categories of meaning, although my community's common usage generally takes precedence. In any case there's a lot of vagueness and variability in the terminology, not to mention controversy. Read this at your own risk.

static typing: systems for reasoning about and enforcing inherent language invariants algorithmically and without involving program evaluation. In order to guarantee the invariants, the runtime semantics is only allowed to evaluate statically well-typed programs.

dynamic typing: systems for enforcing inherent language invariants incrementally during program evaluation via runtime checks.

soft typing: systems for detecting possible violations of imposed (i.e., not inherent) language invariants algorithmically and without involving program evaluation. The runtime semantics does not depend on programs being well-typed according to the soft typing algorithm.

optional typing: systems for detecting possible violations of imposed language invariants algorithmically and without involving program evaluation. The runtime semantics does not depend on program being well-typed according to the optional typing algorithm. (Come to think of it, I can't come up with a distinction between soft typing and optional typing off-hand.)

pluggable typing: systems of optional typing that admit any number of different static checking algorithms.

hybrid typing: systems for enforcing inherent language invariants algorithmically both via static reasoning and dynamic checks. This paradigm was originally introduced as a way to combine the static guarantees of statically typed languages with the more expressive annotation languages of dynamic contract systems.

gradual typing: systems for enforcing inherent language invariants algorithmically both via static reasoning and dynamic checks. This paradigm was originally introduced to facilitate program evolution by permitting incremental annotation of programs.

strong/weak typing: your guess is as good as mine.

The Little Calculist

Friday, February 12, 2010

Eich's Law

Friday, February 22, 2008

True unions

Thursday, March 01, 2007

'a typing

Blog Archive

Recommended Reading

Haunts

About Me