Thursday, November 19, 2015

Simple construction of fixed-point data types

(This post is literate Haskell, available here.)

This post will show a trick for making it easier to work with terms representated as fixed-points of functors in Haskell. It is assumed that the reader is familiar with this representation.

First some extensions:

{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeFamilies #-}

Representing terms as fixed-points

A common generic term representation is as a fixed-point of a functor:

newtype Term f = In { out :: f (Term f) }

A simple example functor is the following, representing arithmetic expressions:

data Arith a
    = Num Int
    | Add a a
  deriving (Functor)

One problem with the Term representation is that it’s quite cumbersome to construct with it. For example, the following is the representation of the expression (1+2)+3(1+2)+3:

term1 :: Term Arith
term1 =
    In (Add
      (In (Add
        (In (Num 1))
        (In (Num 2))
      ))
      (In (Num 3))
    )

This post will show a simple trick for inserting the In constructors automatically, allowing us to write simply:

term2 :: Term Arith
term2 = deepIn10 $ Add (Add (Num 1) (Num 2)) (Num 3)

Deep conversion of terms

First, let’s have a look at the type of the above expression:

_ = Add (Add (Num 1) (Num 2)) (Num 3) :: Arith (Arith (Arith a))

We see that the type has three levels of Arith because the tree has depth three. So in general, we need a function with the following type scheme:

f (f (f ... (f (Term f)))) -> Term f

The purpose of this function is to insert an In at each level, from bottom to top. The reason for having Term f in the argument type (rather than just a) is that In expects an argument of type Term f.

This function has to be overloaded on the argument type, so we put it in a type class:

class Smart f a
  where
    deepIn  :: f a -> Term f
    deepOut :: Term f -> f a

Here f is the functor and a can be Term f, f (Term f), f (f (Term f)), etc. The class also contains the inverse function deepOut for reasons that will be explained later.

First we make the base instance for when we only have one level:

instance Smart f (Term f)
  where
    deepIn  = In
    deepOut = out

Next, we make a recusive instance for when there is more than one level:

instance (Smart f a, Functor f) => Smart f (f a)
  where
    deepIn  = In . fmap deepIn
    deepOut = fmap deepOut . out

And that’s it!

Ambiguous types

Now there’s only one problem. If we try to use deepIn $ Add (Add (Num 1) (Num 2)) (Num 3), GHC will complain:

No instance for (Smart Arith a0) arising from a use of ‘deepIn’

As said earlier, the type of the given expression is Arith (Arith (Arith a)), which means that we get an ambiguous type a which means that it’s not possible to know which of the above instances to pick for this a.

So we need to constrain the type before calling deepIn. One idea might be to use a type function for constraining the a. Unfortunately, as far as I know, it’s not possible to make a type function that maps e.g.

Arith (Arith ... (Arith a))

to

Arith (Arith ... (Arith (Term Arith)))

This is because a is a polymorphic variable, and type functions cannot case split on whether or not a type is polymorphic.

But there is one thing we can do. The type Arith (Arith (Arith a)) shows that the expression has (at most) three levels. But there is nothing stopping us from adding more levels in the type.

Why not make a version of deepIn10 that forces 10 levels in the type?

type Iter10 f a = f (f (f (f (f (f (f (f (f (f a)))))))))

deepIn10 :: (Functor f, a ~ Iter10 f (Term f)) => f a -> Term f
deepIn10 = deepIn

This function can be used to insert In constructors in terms of at most depth 10.

Constructing terms from existing terms

The above example, term2, was constructed from scratch. We might also want to construct terms that contain other, already constructed, terms. This is where deepOut comes into play:

term3 = deepIn10 $ Add (deepOut term2) (Num 4)