# Getting close to the conceptual metal

Posted on January 27, 2020

A few years ago I watched the *Structure and interpretation of computer programs* video lectures, kindly posted publicly by MIT OpenCourseWare. There’s so much interesting material in them but one part really struck me, in *5B: Computational Objects* Gerald Jay Sussman defines a pair “in terms of nothing but air, hot air”.

```
(define (cons x y)
(λ (m) (m x y)))
(define (car x)
(x (λ (a b) a)))
(define (cdr x)
(x (λ (a b) b)))
```

Previous lectures had established pairs as a sort of fundamental thing. After all, pairs could be used to make lists and in Lisp once you’ve got pairs and lists you’ve got *a lot*. Sussman, however, was revealing that there was a level below that, there was an even more fundamental thing: functions.

Later, through learning Haskell, I became familiar with the concept of algebraic data types. It was astonishing to me that you could make so much out of so little. You seemingly only need types that are “pair-like” (*products*) and types that are “either-like” (*sums*).

```
data Product a b
= P a b
data Sum a b
= A a
| B b
newtype Maybe a
= Maybe (Sum () a)
newtype List a
= List (Maybe (Product a (List a)))
newtype NonEmpty a
= NonEmpty (Product a (List a))
```

When Sussman chalked up that pair made of nothing but “hot air” he attributed the encoding to Alonzo Church. Appreciating that many other types can be made from pair-like things and either-like things, I wondered if there existed a Church encoding for `Either`

. After all, with `Either`

and `Pair`

it sure seemed like I’d be able to construct anything else I needed. I found Church encodings for lots of other interesting things, like booleans and natural numbers, but nothing quite like `Either`

.

Eventually I’d happen across Scott encoding, named for Dana Scott to whom it is attributed. It’s super interesting.

## Scott encoding

With Scott encoding we have a consistent method for representing any algebraic datatype in the untyped lambda calculus. It goes something like this: say we have a datatype *D*, with *N* constructors, (*d*_{1} … *d _{N}*), such that constructor

*d*has arity

_{i}*A*. The Scott encoding for constructor

_{i}*d*of

_{i}*D*would be:

If you’ve never seen this sort of notation before don’t worry we’ll be moving to mostly Haskell soon enough. Until then we could have a couple of swings at stating this in words.

- Each constructor,
*d*_{1}through*d*, is responsible for accepting the arguments needed to call its chosen continuation, any one of_{N}*c*_{1}through*c*._{N} - There is one continuation per constructor (
*c*is the continuation for_{i}*d*), and each continuation needs to be passed all the arguments its constructor was passed (_{i}*d*was passed_{i}*x*_{1}…*x*and so_{Ai}*c*will passed them as well)._{i}

In this way values are held in the closure of each constructor. These internal values are *used* when the consumer of a value of type *D* passes in a continuation which will accept them as arguments.

To develop an intuition for how this actually works we can try converting some well-known Haskell data types to this encoding.

### Pair

To Scott encode any one constructor for a given datatype we need to know how many other constructors it has, and how many arguments each of them accept. Haskell’s pair type looks like syntax, however by asking GHCi for some information about it we can see that its much like any other user-defined datatype.

```
$ ghci
> :info (,)
data (,) a b = (,) a b -- Defined in `GHC.Tuple'
```

It has one constructor which happens to share the name of its type.

```
> :type (,)
(,) :: a -> b -> (a, b)
```

For our purposes we’ll rename the `(,)`

type to *Pair* and the `(,)`

constructor to *pair*. As *Pair’s* only constructor, *pair* need only accept one continuation *c*. The arguments that *pair* accepts, such that it can call *c* with them, are the resulting *Pair’s* first and second elements.

But what is *Pair*? How might we write the type of a function that accepts a *Pair*? First we might rewrite *pair* in Haskell, initially as a fairly direct translation.

Then, without the first explicit lambda.

That can be loaded into GHCi or into a Repl.it Haskell session in order to find its inferred type.

```
$ ghci
GHCi, version 8.6.5: http://www.haskell.org/ghc/ :? for help
Loaded GHCi configuration from /home/brad/.ghci
> pair x1 x2 = \c -> c x1 x2
> :type pair
pair :: t1 -> t2 -> (t1 -> t2 -> t3) -> t3
```

You might be wondering why these lambdas are arranged the way they are. Why not use any of the following, equivalent, arrangements.

The answer is that I’m sneakily trying to reveal what the *Pair* component of `pair`

’s type is, and, as it ends up, what the datatype component of any Scott encoded constructor’s type happens to be.

The equivalent Haskell type, `(,) a b`

has one less type variable. It’s interesting to think about what it would mean for `Pair`

to ditch `t3`

somehow.

One way of looking at this is that it’s up to the continuation (*c*) what it does with *x*_{1} and *x*_{2}, the constructor (*pair*) doesn’t get a say. This means that whatever *pair* returns must work *for anything* its *consumer* may want to produce. In Haskell types we might say that `Pair`

must work *for all* possible types its provided continuation could return.

Is this equivalent to `(,)`

? How might we tell? One way might be to write a function which converts `(,)`

s into `Pair`

s and one which converts `Pair`

s into `(,)`

s. Then, given those two functions, we should be able to observe that their composition doesn’t *do* anything. Or put another way: for these two functions:

Their composition in either direction is equivalent to an identity function.

If these two functions can be written, and they have this property, it means that `Pair`

and `(,)`

are isomorphic.

So then, is it possible to write `from`

?

Which is sort of asking: given an `a`

and a `b`

, is it possible to construct a `Pair a b`

?

How about `to`

?

Recall that `p`

is a function waiting to be passed a continuation. That continuation will be passed both an `a`

and a `b`

.

So now, given an `a`

and a `b`

can we construct an `(a, b)`

?

With that done: is *to* ∘ *from* an identity function? It does seem so.

```
> (to . from) (1, 2)
(1,2)
> (to . from) (True, "Nifty")
(True,"Nifty")
```

Without a `Show`

or `Eq`

instance for `(->)`

it’s hard to observe the same for *from* ∘ *to*. We can at least observe what *to* ∘ *from* ∘ *to* does.

```
> (to . from . to) (pair 1 2)
(1,2)
> (to . from . to) (pair True "Very nifty")
(True,"Very nifty")
```

Well, I’m convinced. Are you? For further tangible evidence you could try re-implementing everything in `Data.Tuple`

in terms of `Pair`

. It’s a pretty fun exercise and helped me to become more comfortable with these lambda-only pairs.

### Either

In Haskell `Either`

is defined like this:

For the purposes of Scott encoding we can say that *Either* has two constructors which each accept one argument. This means that whatever each constructor returns will accept two continuations which themselves each accept one argument. Each constructor will accept an argument to pass to their chosen continuation.

The first constructor *left*, then, will accept one argument to pass to the first continuation.

And the second will accept one to pass to the second continuation.

Each continuation can accept an argument of a different type (say *c _{1}* takes an

*α*and

*c*takes a

_{2}*β*) but both should return something of the same type (say

*γ*). The reason for this is how values of the

*Either*type are used, this may become evident later. Using a polymorphic lambda calculus rather than an untyped lambda calculus we can write out how those types line up.

For me at least, it’s a little easier to see in Haskell.

```
left :: a -> (a -> c) -> (b -> c) -> c
left x = \c1 c2 -> c1 x
right :: b -> (a -> c) -> (b -> c) -> c
right x = \c1 c2 -> c2 x
```

As we did with *Pair*, we might ask: where’s *Either* in all of that? And, as with *Pair*, I have sneakily positioned certain lambdas to try and point it out. However, in this case another method might be to put *left* and *right* side by side and view their shared return type.

Which might yield the following.

As with *Pair* we can make sure that the choice of `c`

is left solely up to the consumer of an `Either`

.

```
type Either a b
= forall c. (a -> c) -> (b -> c) -> c
left :: a -> Either a b
left x = \c1 c2 -> c1 x
right :: b -> Either a b
right x = \c1 c2 -> c2 x
```

Can we write *to* and *from* for converting between our *Either* and Haskell’s?

For *from* we can be given a `Left`

containing an `a`

or a `Right`

containing a `b`

.

So we have two questions to answer. Given an `a`

can I construct an `Either a b`

?

Given a `b`

can I construct an `Either a b`

?

For *to* it would be nice if we could use a `case`

expression as we did with *from* and in a sense we can.

This was less clear when looking at `Pair`

and `(,)`

but Scott-encoded datatypes are almost ready-made `case`

expressions. Each continuation is like a constructor-only pattern-match.

As with *to* we again have two questions to answer, and they’re very similar. First, given an `a`

can I construct a `Prelude.Either a b`

?

Next, given a `b`

can I construct a `Prelude.Either a b`

?

Are `Prelude.Either`

and `Either`

equivalent?

```
> to (from (Left "Hrm?"))
Left "Hrm?"
> to (from (to (left "Ahhhhh")))
Left "Ahhhhh"
> to (from (Right 2))
Right 2
> to (from (to (right 14)))
Right 14
```

Looks likely. Re-implementing everything in `Data.Either`

would sure be convincing.

## From hot air to solid ground

This is all well and good but I’m still left wondering what could actually be *built* with this. Knowing, intellectually, that Haskell’s `Either`

and the lambdas-only *Either* are equivalent isn’t as satisfying as using *Either* and *Pair* to write real(ish) programs.

We *could* use *Either* and *Pair* to write a little REPL which evaluates Lisp-like expressions.

```
$ runhaskell HotAir.hs
Enter an expression to evaluate. E.G.
(+ (* (+ 12 7) 100) 58)
> (+ 1 2)
3
>
```

Yeah, let’s do that.

To make things less onerous this little language could be limited to only adding and multiplying natural numbers. No `define`

, no `λ`

. Nothing other than positive integer literals (`0`

, `1`

, `2`

…), addition (`+`

) and multiplication (`*`

).

```
> (define (sqr x) (* x x))
Error: Unexpected char: 'd'
>
```

Before we begin: this is going to move quite fast. Don’t worry if you miss some of the details of the implementation. It’s more important to be *generally* aware of what the program does and to keep in mind that we’re building it as a bit of a stress test for our lambda-only datatypes. If you’d like to learn more about how to build programs in Haskell I’d reccomend all the material produced by Typeclasses and the Data61 Functional Programming Course.

Now, the first thing we should do is redefine *Pair* and *Either* using `newtype`

, rather than `type`

(GHC is pretty great, but sometimes we need to cut it some slack).

```
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE RankNTypes #-}
{-# OPTIONS_GHC -Wall -Werror #-}
module HotAir where
import Data.Function (($))
newtype Pair a b
= Pair
{ runPair :: forall c. (a -> b -> c) -> c
}
pair :: a -> b -> Pair a b
pair x1 x2 = Pair $ \c -> c x1 x2
newtype Either a b
= Either
{ runEither :: forall c. (a -> c) -> (b -> c) -> c
}
left :: a -> Either a b
left x = Either $ \c1 _ -> c1 x
right :: b -> Either a b
right x = Either $ \_ c2 -> c2 x
```

Next, we might define a monadic parser.

```
import Data.String (String)
newtype Parser a
= Parser
{ parse :: String -> Either String (Pair a String)
}
```

A `satisfy`

function for creating a `Parser`

which compares the next character using given `Char -> Bool`

predicate would be handy.

```
{-# LANGUAGE LambdaCase #-}
import Data.List ((++))
import Data.Bool (Bool)
import Data.Char (Char)
satisfy :: (Char -> Bool) -> Parser Char
satisfy p = Parser $ \case
[] -> left "Unexpected EOF"
(c : cs) ->
if p c
then right (pair c cs)
else left ("Unexpected char: " ++ show c)
```

We can use it to define a `char`

function which matches a specific `Char`

.

`Functor`

, `Applicative`

and `Alternative`

instances for `Parser`

will make composing and manipulating them nicer.

```
import Control.Applicative
( Alternative ((<|>), empty),
Applicative ((<*>), pure)
)
import Data.Function (const)
import Data.Functor (Functor (fmap))
instance Functor Parser where
fmap :: (a -> b) -> Parser a -> Parser b
fmap f pa =
Parser $ \str ->
runEither (parse pa str)
left
(\p -> right (runPair p (pair . f)))
instance Applicative Parser where
pure :: a -> Parser a
pure a = Parser (right . pair a)
(<*>) :: Parser (a -> b) -> Parser a -> Parser b
pf <*> pa =
Parser $ \str ->
runEither (parse pf str)
left
(\p -> runPair p (\f -> parse (fmap f pa)))
instance Alternative Parser where
empty :: Parser a
empty = Parser $ const (left "Empty")
(<|>) :: Parser a -> Parser a -> Parser a
Parser l <|> Parser r =
Parser $ \str ->
runEither (l str)
(const (r str))
right
```

A convenient way to ignore unused input will be useful, too.

```
fst :: Pair a b -> a
fst = (`runPair` const)
execParser :: Parser a -> String -> Either String a
execParser p str =
runEither (parse p str)
left
(right . fst)
```

That should be enough to write an `eval`

function for our little language.

```
import Control.Applicative ((*>), (<*), some)
import Data.Char (isDigit)
import Data.Functor ((<$), (<$>))
import GHC.Num ((*), (+))
import Numeric.Natural (Natural)
eval :: String -> Either String Natural
eval = execParser expression
where
expression :: Parser Natural
expression = natural <|> application
natural :: Parser Natural
natural = read <$> some (satisfy isDigit)
application :: Parser Natural
application =
char '('
*> operator
<* space
<*> expression
<* space
<*> expression
<* char ')'
operator :: Parser (Natural -> Natural -> Natural)
operator = (+) <$ char '+' <|> (*) <$ char '*'
space :: Parser Char
space = char ' '
```

Finally, a `main`

function that lets a user input expressions and view the result of evaluating them.

```
module HotAir
( main
)
where
import Control.Monad (forever)
import System.IO
( IO,
getLine,
print,
putStr,
putStrLn
)
main :: IO ()
main = do
putStrLn "Enter an expression to evaluate. E.G."
putStrLn "(+ (* (+ 12 7) 100) 58)"
forever $ do
putStr "> "
line <- getLine
runEither (eval line)
(putStrLn . ("Error: " ++))
print
```

And here it is, a little REPL made of *mostly* hot air.

```
{-# LANGUAGE InstanceSigs #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE RankNTypes #-}
{-# OPTIONS_GHC -Wall -Werror #-}
module HotAir
( main
)
where
import Control.Applicative
( (*>),
(<*),
Alternative ((<|>), empty),
Applicative ((<*>), pure),
some
)
import Control.Monad (forever)
import Data.Bool (Bool)
import Data.Char (Char, isDigit)
import Data.Eq ((==))
import Data.Function (($), (.), const)
import Data.Functor ((<$), (<$>), Functor (fmap))
import Data.List ((++))
import Data.String (String)
import GHC.Num ((*), (+))
import Numeric.Natural (Natural)
import System.IO
( IO,
getLine,
print,
putStr,
putStrLn
)
import Text.Read (read)
import Text.Show (Show (show))
newtype Pair a b
= Pair
{ runPair :: forall c. (a -> b -> c) -> c
}
pair :: a -> b -> Pair a b
pair x1 x2 = Pair $ \c -> c x1 x2
fst :: Pair a b -> a
fst = (`runPair` const)
newtype Either a b
= Either
{ runEither :: forall c. (a -> c) -> (b -> c) -> c
}
left :: a -> Either a b
left x = Either $ \c1 _ -> c1 x
right :: b -> Either a b
right x = Either $ \_ c2 -> c2 x
newtype Parser a
= Parser
{ parse :: String -> Either String (Pair a String)
}
satisfy :: (Char -> Bool) -> Parser Char
satisfy p = Parser $ \case
[] -> left "Unexpected EOF"
(c : cs) ->
if p c
then right (pair c cs)
else left ("Unexpected char: " ++ show c)
char :: Char -> Parser Char
char = satisfy . (==)
instance Functor Parser where
fmap :: (a -> b) -> Parser a -> Parser b
fmap f pa =
Parser $ \str ->
runEither (parse pa str)
left
(\p -> right (runPair p (pair . f)))
instance Applicative Parser where
pure :: a -> Parser a
pure a = Parser (right . pair a)
(<*>) :: Parser (a -> b) -> Parser a -> Parser b
pf <*> pa =
Parser $ \str ->
runEither (parse pf str)
left
(\p -> runPair p (\f -> parse (fmap f pa)))
instance Alternative Parser where
empty :: Parser a
empty = Parser $ const (left "Empty")
(<|>) :: Parser a -> Parser a -> Parser a
Parser l <|> Parser r =
Parser $ \str ->
runEither (l str)
(const (r str))
right
execParser :: Parser a -> String -> Either String a
execParser p str =
runEither (parse p str)
left
(right . fst)
eval :: String -> Either String Natural
eval = execParser expression
where
expression :: Parser Natural
expression = natural <|> application
natural :: Parser Natural
natural = read <$> some (satisfy isDigit)
application :: Parser Natural
application =
char '('
*> operator
<* space
<*> expression
<* space
<*> expression
<* char ')'
operator :: Parser (Natural -> Natural -> Natural)
operator = (+) <$ char '+' <|> (*) <$ char '*'
space :: Parser Char
space = char ' '
main :: IO ()
main = do
putStrLn "Enter an expression to evaluate. E.G."
putStrLn "(+ (* (+ 12 7) 100) 58)"
forever $ do
putStr "> "
line <- getLine
runEither (eval line)
(putStrLn . ("Error: " ++))
print
```

You can save that code into a file and execute it with `runhaskell`

if you want to see it in action.

```
$ runhaskell HotAir.hs
Enter an expression to evaluate. E.G.
(+ (* (+ 12 7) 100) 58)
> (+ (* (+ 12 7) 100) 58)
1958
>
```

That was quite a whirlwind.

I *know* I really shouldn’t be surprised that I could write my program with (more or less) only lambdas. Lambda calculus is “a universal model of computation” after all. I also shouldn’t be surprised that translating lambda calculus terms into Haskell is so natural as Haskell based on a lambda calculus. But in this post I really wanted to focus on the wonder I feel when being able to *use* abstract notions like lambda calculus to write runnable programs. My hope is that by writing this I might help someone find something they find wonderful, and encourage them to explore it.

One possible avenue, which I might go into in a future post, is this: the above program still makes use of some pre-defined Haskell datatypes like `Bool`

, `Natural`

, `[a]`

and `Char`

. Does it have to? Could we build those entirely out of lambdas too? How far can we go?

Huge thanks to Brad Bow for his clarifying suggestions.