The Little Calculist: Haskell's hidden mutation

Friday, July 01, 2005

Haskell's hidden mutation

Here's a clearer version of the Haskell function I described earlier today:

minTree t = t'
    where (n, t') = replaceMin t n

replaceMin (Leaf n) m = (n, Leaf m)
replaceMin (Node t1 t2) m = (n, Node t1' t2')
    where (n1, t1') = replaceMin t1 m
          (n2, t2') = replaceMin t2 m
          n = min n1 n2

In a sense, the binding of a new variable n introduces an implicit box that will get filled in at some point. So because we have letrec binding semantics, we can use the same box as both an output and an input to the replaceMin function. But the box can only be filled once; if, in the process of computing the value to be placed in the box, we end up having to use the value in the box, we end up with a dependency cycle and then loop. This program works because the box gets set at the end of the computation, and doesn't get used until after that point.

We can explicitly encode the order of the computation using continuations. I'll use Scheme now just to show there's no hidden reliance on laziness anymore.

(define (min-tree t n)
  (let ([m (box #f)])
    (match-let ([(n . t*) (replace/min t m)])
      (set-box! m (n identity))
      t*)))

(define (replace/min t m)
  (match t
    [($ Leaf n)
     (cons (lambda (k) (k n)) (make-Leaf m))]
    [($ Node t1 t2)
     (match-let ([(k1 . t1*) (replace/min t1 m)]
                 [(k2 . t2*) (replace/min t2 m)])
       (cons (lambda (k)
               (k1 (lambda (n1)
                     (k2 (lambda (n2)
                           (k (min n1 n2)))))))
             (make-Node t1* t2*)))]))

Now you can see exactly when we set the value of the box--after the computation finishes--and that we don't do anything with the returned "min-computations" except embed them in larger min-computations until after the traversal, when we run the final computation.