Thursday, February 19, 2009

PLT System Facilities (Software Components): Modules

Lexical scope in general and lambda in particular go a long way towards supporting modular, separate development. As I mentioned before, PLT Scheme builds a number of first-class, module-like component systems on top of lambda. But it also contains a more primitive module system in which modules are not first-class. This serves a number of purposes.

First of all, static modules provide systematic support for packaging, compiling, and deploying code. First-class modules are flexible and expressive, but they don't have anything to say about compilation and deployment. Somewhere along the line there has to be a notion of what the compiler takes as input, and when you have separate development, you need separate deployment and ideally separate compilation.

Another critical purpose of static modules is the ability to modularize static entities in a language. In ML, for example, modules can import and export types. Scheme is of course more dynamic than ML, but it still has its own crucial compile-time abstractions: macros. With dynamic modules, there's no straightforward way to import and export static entities like macros. A secondary benefit of static modules is that you can import all the bindings from another module at once without having to spell them all out; this is admittedly less important but still very convenient.
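To make the point about macros concrete, here is a minimal sketch, assuming the PLT Scheme 4.x library names (scheme/base) and that the forms are evaluated at the top level, for example in the REPL. The module names util and client and the bindings swap! and double are made up for illustration.

```scheme
;; A module that exports a macro (swap!) alongside an ordinary function (double).
(module util scheme/base
  (provide swap! double)
  ;; swap! is a compile-time entity: it exists only during expansion.
  (define-syntax-rule (swap! a b)
    (let ([tmp a])
      (set! a b)
      (set! b tmp)))
  (define (double n) (* 2 n)))

;; A client imports both bindings with one require, without listing them.
(module client scheme/base
  (require 'util)
  (provide result)
  (define x 1)
  (define y 2)
  (swap! x y)                          ; expanded at compile time: now x = 2, y = 1
  (define result (list x (double y)))) ; the list (2 2)
```

Because the import is resolved statically, the compiler knows that swap! is a macro before it expands the body of client, which is the part a purely dynamic module system has no straightforward way to express.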

Finally, PLT Scheme was designed to support multiple languages. For pedagogical purposes, this has allowed the PLT group to design multiple, concentric subsets of the language tailored to the How to Design Programs curriculum. It has also facilitated language research by making it easy to design and implement new languages (by macro-compiling them to Scheme) and to research language interactions in a multi-language environment. The relevant piece of the module system is a single hook at the beginning of a module definition: the grammar of a module is an S-expression containing the symbol module, a symbol naming the module, and an S-expression indicating the language of the body:
(module foo scheme body ...)
Typically the language chosen is the special built-in language scheme, as above, or the somewhat leaner scheme/base. But the language position works by simply importing another module that implements the required macros for compiling the body. In place of scheme, you can put in a module path for any module installed on the system.
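As a rough illustration of the language position, the following sketch defines a tiny language module and then uses it in place of scheme. It assumes PLT Scheme 4.x names and top-level evaluation (for example in the REPL); the names raquet, function, demo, and squares are hypothetical.

```scheme
;; A language module: everything scheme/base provides, except that
;; lambda is exported under the name function instead.
(module raquet scheme/base
  (provide (except-out (all-from-out scheme/base) lambda)
           (rename-out [lambda function])))

;; The language position names the module above rather than scheme.
;; Inside this body, function is bound and lambda is not.
(module demo 'raquet
  (provide squares)
  (define squares
    (map (function (n) (* n n)) '(1 2 3))))
```

The body of demo sees only the bindings that raquet provides, so a language module can remove, rename, or redefine any form of the language it builds on.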

It's also possible to specify a custom reader for a PLT language so it doesn't even have to be restricted to an S-expression syntax. The initial reader allows you to specify a language with the special shebang-like #lang syntax:
#lang scheme/base
body ...
From that point on, the language's reader has access to the input stream to parse it any way it likes.

Anyone can implement a language module, and people have developed PLT implementations of Algol, Java, ML, and JavaScript, to name a few.

Update: Added in the missing word "dynamic" above. Also, this is a better link for module paths.

Also, Sam is right in the comments: the scheme language really isn't special or built-in in any significant way. It's simply another module provided in the standard PLT collections.

1 comment:

Sam Tobin-Hochstadt said...

Note that `scheme' is not at all built-in. That's one of the reasons that we made the switch for what identifiers in module requires meant - now `scheme' is just a library, whereas before `mzscheme' had to be built-in. There are a few built-in modules, of which `'#%kernel' is the most important, but it's very rare to interact with them.