Saturday, April 02, 2005

Three ways to label expressions

Static analyses like CFA compute results for each subexpression of a program. Consequently the result of such an analysis is a map from individual subexpressions to answers. But consider the following program:
((lambda (x) x) (lambda (x) x))
Since a program may contain identical subexpressions, there needs to be some way to distinguish seemingly identical terms. But the program already contains enough information to distinguish all subexpressions: their distinction is precisely the program context in which they occur. So one way to label subexpressions is to make the result of the analysis a map from program contexts to results:
ContextVal
But comparison of two expressions by recursively comparing their entire program context is inefficient. We can optimize this by representing each position in the program as a unique label, and then implement these labels with a datatype that has a fast comparison operation (such as integers or symbols):
LabelVal
But there's another possible representation hack. Sometimes we happen to be working with a language whose expressions themselves contain information that distinguishes them from all other expressions. A program in A-normal form binds every subexpression to a local variable. If we make sure that each such variable has a unique name*, then our analysis can simply map the variable of each let-expression to a result:
VarVal
*As they point out on p.142 of Principles of Program Analysis, there's not actually a requirement that the labels be unique, but an analysis that identifies two distinct expressions will potentially be less precise.

No comments: