Tuesday, September 29, 2015

Strictness can fix non-termination!

(This post is literate Haskell, available here.)

I’ve always thought that strictness annotations can only turn terminating programs into non-terminating ones. Turns out that this isn’t always the case. As always, unsafePerformIO changes things.

import Data.IORef
import System.IO.Unsafe

Consider the following function for turning an IORef into a pure value:

unsafeFreezeRef :: IORef a -> a
unsafeFreezeRef = unsafePerformIO . readIORef

(I explain why I’m interested in this function below.)

This function is of course unsafe, but we may naively expect it to work in the following setting where it’s used inside a function that returns in IO:

modRef :: (a -> a) -> IORef a -> IO ()
modRef f r = writeIORef r $ f $ unsafeFreezeRef r

However, this doesn’t work, because the readIORef inside unsafeFreezeRef happens to be performed after writeIORef, leading to a loop.

So, strangely, the solution is to use a strictness annotation to force the read to happen before the write:

modRef' :: (a -> a) -> IORef a -> IO ()
modRef' f r = writeIORef r $! f $ unsafeFreezeRef r

This is the first time I’ve seen a strictness annotation turn a non-terminating program into a terminating one.

Here is a small test program that terminates for modRef' but not for modRef:

test = do
    r <- newIORef "hello"
    modRef' tail r
    putStrLn =<< readIORef r

I’m not sure this tells us anything useful. We should probably stay away from such ugly programming anyway… But at least I found this example quite interesting and counterintuitive.

Why on earth would anyone write unsafeFreezeRef?

In case you wonder where this strange example comes from, I’m working on a generic library for EDSLs that generate C code. Among other things, this EDSL has references that work like Haskell’s IORefs.

The purpose of unsafeFreezeRef in the library is to avoid temporary variables in the generated code when we know they are not needed. The idea is that the user of unsafeFreezeRef must guarantee that result is not accessed after any subsequent writes to the reference. Reasoning about when a value is used is quite easy in the EDSL due to its strict semantics. For example, we don’t need any strictness annotation in the definition of modRef due to the fact that setRef (the equivalent of writeIORef) is already strict.

The reason I noticed the problem of laziness and unsafeFreezeRef was that the EDSL evaluator turned out to be too lazy when writing to references. So I had a program which produced perfectly fine C code, but which ran into a loop when evaluating it in Haskell. The solution was to make the evaluator more strict; see this commit and this.