Getting close to the conceptual metal
Posted on 27 January, 2020
A few years ago I watched the Structure and interpretation of computer programs video lectures, kindly posted publicly by MIT OpenCourseWare. There's so much interesting material in them but one part really struck me, in 5B: Computational Objects Gerald Jay Sussman defines a pair "in terms of nothing but air, hot air".
define (cons x y)
(
(λ (m) (m x y)))
define (car x)
(
(x (λ (a b) a)))
define (cdr x)
( (x (λ (a b) b)))
Previous lectures had established pairs as a sort of fundamental thing. After all, pairs could be used to make lists and in Lisp once you've got pairs and lists you've got a lot. Sussman, however, was revealing that there was a level below that, there was an even more fundamental thing: functions.
Later, through learning Haskell, I became familiar with the concept of algebraic data types. It was astonishing to me that you could make so much out of so little. You seemingly only need types that are "pair-like" (products) and types that are "either-like" (sums).
data Product a b
= P a b
data Sum a b
= A a
| B b
newtype Maybe a
= Maybe (Sum () a)
newtype List a
= List (Maybe (Product a (List a)))
newtype NonEmpty a
= NonEmpty (Product a (List a))
When Sussman chalked up that pair made of nothing but "hot air" he attributed the encoding to Alonzo Church. Appreciating that many other types can be made from pair-like things and either-like things, I wondered if there existed a Church encoding for Either
. After all, with Either
and Pair
it sure seemed like I'd be able to construct anything else I needed. I found Church encodings for lots of other interesting things, like booleans and natural numbers, but nothing quite like Either
.
Eventually I'd happen across Scott encoding, named for Dana Scott to whom it is attributed. It's super interesting.
Scott encoding
With Scott encoding we have a consistent method for representing any algebraic datatype in the untyped lambda calculus. It goes something like this: say we have a datatype D, with N constructors, (d1 … dN), such that constructor di has arity Ai. The Scott encoding for constructor di of D would be:
If you've never seen this sort of notation before don't worry we'll be moving to mostly Haskell soon enough. Until then we could have a couple of swings at stating this in words.
- Each constructor, d1 through dN, is responsible for accepting the arguments needed to call its chosen continuation, any one of c1 through cN.
- There is one continuation per constructor (ci is the continuation for di), and each continuation needs to be passed all the arguments its constructor was passed (di was passed x1 … xAi and so ci will passed them as well).
In this way values are held in the closure of each constructor. These internal values are used when the consumer of a value of type D passes in a continuation which will accept them as arguments.
To develop an intuition for how this actually works we can try converting some well-known Haskell data types to this encoding.
Pair
To Scott encode any one constructor for a given datatype we need to know how many other constructors it has, and how many arguments each of them accept. Haskell's pair type looks like syntax, however by asking GHCi for some information about it we can see that its much like any other user-defined datatype.
$ ghci
> :info (,)
data (,) a b = (,) a b -- Defined in `GHC.Tuple'
It has one constructor which happens to share the name of its type.
> :type (,)
(,) :: a -> b -> (a, b)
For our purposes we'll rename the (,)
type to Pair and the (,)
constructor to pair. As Pair's only constructor, pair need only accept one continuation c. The arguments that pair accepts, such that it can call c with them, are the resulting Pair's first and second elements.
But what is Pair? How might we write the type of a function that accepts a Pair? First we might rewrite pair in Haskell, initially as a fairly direct translation.
= \x1 x2 -> \c -> c x1 x2 pair
Then, without the first explicit lambda.
= \c -> c x1 x2 pair x1 x2
That can be loaded into GHCi or into a Repl.it Haskell session in order to find its inferred type.
$ ghci
GHCi, version 8.6.5: http://www.haskell.org/ghc/ :? for help
Loaded GHCi configuration from /home/brad/.ghci
> pair x1 x2 = \c -> c x1 x2
> :type pair
pair :: t1 -> t2 -> (t1 -> t2 -> t3) -> t3
You might be wondering why these lambdas are arranged the way they are. Why not use any of the following, equivalent, arrangements.
= \x1 -> \x2 -> \c -> c x1 x2
pair = \x1 -> \x2 c -> c x1 x2
pair = \x1 x2 c -> c x1 x2 pair
The answer is that I'm sneakily trying to reveal what the Pair component of pair
's type is, and, as it ends up, what the datatype component of any Scott encoded constructor's type happens to be.
type Pair t1 t2 t3
= (t1 -> t2 -> t3) -> t3
The equivalent Haskell type, (,) a b
has one less type variable. It's interesting to think about what it would mean for Pair
to ditch t3
somehow.
type Pair t1 t2
= (t1 -> t2 -> _) -> _
One way of looking at this is that it's up to the continuation (c) what it does with x1 and x2, the constructor (pair) doesn't get a say. This means that whatever pair returns must work for anything its consumer may want to produce. In Haskell types we might say that Pair
must work for all possible types its provided continuation could return.
type Pair t1 t2
= forall t3. (t1 -> t2 -> t3) -> t3
pair :: Pair a b
= \c -> c a b pair a b
Is this equivalent to (,)
? How might we tell? One way might be to write a function which converts (,)
s into Pair
s and one which converts Pair
s into (,)
s. Then, given those two functions, we should be able to observe that their composition doesn't do anything. Or put another way: for these two functions:
from :: (a, b) -> Pair a b
= _
from
to :: Pair a b -> (a, b)
= _ to
Their composition in either direction is equivalent to an identity function.
If these two functions can be written, and they have this property, it means that Pair
and (,)
are isomorphic.
So then, is it possible to write from
?
from :: (a, b) -> Pair a b
= _ from (a, b)
Which is sort of asking: given an a
and a b
, is it possible to construct a Pair a b
?
from :: (a, b) -> Pair a b
= pair a b from (a, b)
How about to
?
to :: Pair a b -> (a, b)
= _ to p
Recall that p
is a function waiting to be passed a continuation. That continuation will be passed both an a
and a b
.
to :: Pair a b -> (a, b)
= p (\a b -> _) to p
So now, given an a
and a b
can we construct an (a, b)
?
to :: Pair a b -> (a, b)
= p (\a b -> (a, b)) to p
With that done: is to ∘ from an identity function? It does seem so.
> (to . from) (1, 2)
(1,2)
> (to . from) (True, "Nifty")
(True,"Nifty")
Without a Show
or Eq
instance for (->)
it's hard to observe the same for from ∘ to. We can at least observe what to ∘ from ∘ to does.
> (to . from . to) (pair 1 2)
(1,2)
> (to . from . to) (pair True "Very nifty")
(True,"Very nifty")
Well, I'm convinced. Are you? For further tangible evidence you could try re-implementing everything in Data.Tuple
in terms of Pair
. It's a pretty fun exercise and helped me to become more comfortable with these lambda-only pairs.
Either
In Haskell Either
is defined like this:
data Either a b
= Left a
| Right b
For the purposes of Scott encoding we can say that Either has two constructors which each accept one argument. This means that whatever each constructor returns will accept two continuations which themselves each accept one argument. Each constructor will accept an argument to pass to their chosen continuation.
The first constructor left, then, will accept one argument to pass to the first continuation.
And the second will accept one to pass to the second continuation.
Each continuation can accept an argument of a different type (say c1 takes an α and c2 takes a β) but both should return something of the same type (say γ). The reason for this is how values of the Either type are used, this may become evident later. Using a polymorphic lambda calculus rather than an untyped lambda calculus we can write out how those types line up.
For me at least, it's a little easier to see in Haskell.
left :: a -> (a -> c) -> (b -> c) -> c
= \c1 c2 -> c1 x
left x
right :: b -> (a -> c) -> (b -> c) -> c
= \c1 c2 -> c2 x right x
As we did with Pair, we might ask: where's Either in all of that? And, as with Pair, I have sneakily positioned certain lambdas to try and point it out. However, in this case another method might be to put left and right side by side and view their shared return type.
left :: a -> ((a -> c) -> (b -> c) -> c)
right :: b -> ((a -> c) -> (b -> c) -> c)
Which might yield the following.
type Either a b c
= (a -> c) -> (b -> c) -> c
As with Pair we can make sure that the choice of c
is left solely up to the consumer of an Either
.
type Either a b
= forall c. (a -> c) -> (b -> c) -> c
left :: a -> Either a b
= \c1 c2 -> c1 x
left x
right :: b -> Either a b
= \c1 c2 -> c2 x right x
Can we write to and from for converting between our Either and Haskell's?
from :: Prelude.Either a b -> Either a b
= _
from
to :: Either a b -> Prelude.Either a b
= _ to
For from we can be given a Left
containing an a
or a Right
containing a b
.
from :: Prelude.Either a b -> Either a b
=
from e case e of
Left a -> _
Right b -> _
So we have two questions to answer. Given an a
can I construct an Either a b
?
from :: Prelude.Either a b -> Either a b
=
from e case e of
Left a -> left a
Right b -> _
Given a b
can I construct an Either a b
?
from :: Prelude.Either a b -> Either a b
=
from e case e of
Left a -> left a
Right b -> right b
For to it would be nice if we could use a case
expression as we did with from and in a sense we can.
to :: Either a b -> Prelude.Either a b
=
to e
e-> _)
(\a -> _) (\b
This was less clear when looking at Pair
and (,)
but Scott-encoded datatypes are almost ready-made case
expressions. Each continuation is like a constructor-only pattern-match.
As with to we again have two questions to answer, and they're very similar. First, given an a
can I construct a Prelude.Either a b
?
to :: Either a b -> Prelude.Either a b
=
to e
e-> Left a)
(\a -> _) (\b
Next, given a b
can I construct a Prelude.Either a b
?
to :: Either a b -> Prelude.Either a b
=
to e
e-> Left a)
(\a -> Right b) (\b
Are Prelude.Either
and Either
equivalent?
> to (from (Left "Hrm?"))
Left "Hrm?"
> to (from (to (left "Ahhhhh")))
Left "Ahhhhh"
> to (from (Right 2))
Right 2
> to (from (to (right 14)))
Right 14
Looks likely. Re-implementing everything in Data.Either
would sure be convincing.
From hot air to solid ground
This is all well and good but I'm still left wondering what could actually be built with this. Knowing, intellectually, that Haskell's Either
and the lambdas-only Either are equivalent isn't as satisfying as using Either and Pair to write real(ish) programs.
We could use Either and Pair to write a little REPL which evaluates Lisp-like expressions.
$ runhaskell HotAir.hs
Enter an expression to evaluate. E.G.
(+ (* (+ 12 7) 100) 58)
> (+ 1 2)
3
>
Yeah, let's do that.
To make things less onerous this little language could be limited to only adding and multiplying natural numbers. No define
, no λ
. Nothing other than positive integer literals (0
, 1
, 2
…), addition (+
) and multiplication (*
).
> (define (sqr x) (* x x))
Error: Unexpected char: 'd'
>
Before we begin: this is going to move quite fast. Don't worry if you miss some of the details of the implementation. It's more important to be generally aware of what the program does and to keep in mind that we're building it as a bit of a stress test for our lambda-only datatypes. If you'd like to learn more about how to build programs in Haskell I'd reccomend all the material produced by Typeclasses and the Data61 Functional Programming Course.
Now, the first thing we should do is redefine Pair and Either using newtype
, rather than type
(GHC is pretty great, but sometimes we need to cut it some slack).
{-# LANGUAGE BlockArguments #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE NoImplicitPrelude #-}
{-# OPTIONS_GHC -Wall -Werror #-}
module HotAir where
newtype Pair a b = Pair
runPair ::
{forall c.
-> b -> c) ->
(a
c
}
pair :: a -> b -> Pair a b
= Pair \c -> c x1 x2
pair x1 x2
newtype Either a b = Either
runEither ::
{forall c.
-> c) ->
(a -> c) ->
(b
c
}
left :: a -> Either a b
= Either \c1 _ -> c1 x
left x
right :: b -> Either a b
= Either \_ c2 -> c2 x right x
Next, we might define a monadic parser.
import Data.String (String)
newtype Parser a = Parser
parse ::
{String ->
Either String (Pair a String)
}
A satisfy
function for creating a Parser
which compares the next character using given Char -> Bool
predicate would be handy.
{-# LANGUAGE LambdaCase #-}
import Data.Bool (Bool)
import Data.Char (Char)
import Data.List ((++))
satisfy :: (Char -> Bool) -> Parser Char
= Parser \case
satisfy p -> left "Unexpected EOF"
[] : cs) ->
(c if p c
then right (pair c cs)
else left ("Unexpected char: " ++ show c)
We can use it to define a char
function which matches a specific Char
.
import Data.Eq ((==))
char :: Char -> Parser Char
= satisfy . (==) char
Functor
, Applicative
and Alternative
instances for Parser
will make composing and manipulating them nicer.
import Control.Applicative
Alternative (empty, (<|>)),
( Applicative (pure, (<*>)),
)import Data.Function (const)
import Data.Functor (Functor (fmap))
instance Functor Parser where
fmap :: (a -> b) -> Parser a -> Parser b
fmap f pa =
Parser \str ->
runEither (parse pa str)
left-> right (runPair p (pair . f)))
(\p
instance Applicative Parser where
pure :: a -> Parser a
pure a = Parser (right . pair a)
(<*>) :: Parser (a -> b) -> Parser a -> Parser b
<*> pa =
pf Parser \str ->
runEither (parse f str)
left-> runPair p (\f -> parse (fmap f pa)))
(\p
instance Alternative Parser where
empty :: Parser a
= Parser (const (left "Empty"))
empty
(<|>) :: Parser a -> Parser a -> Parser a
Parser l <|> Parser r =
Parser \str ->
runEither (l str)const (r str))
( right
A convenient way to ignore unused input will be useful, too.
fst :: Pair a b -> a
fst = (`runPair` const)
execParser :: Parser a -> String -> Either String a
=
execParser p str
runEither (parse p str)
left. fst) (right
That should be enough to write an eval
function for our little language.
import Control.Applicative (some, (*>), (<*))
import Data.Char (isDigit)
import Data.Functor ((<$), (<$>))
import GHC.Num ((*), (+))
import Numeric.Natural (Natural)
eval :: String -> Either String Natural
= execParser expression
eval where
expression :: Parser Natural
= natural <|> application
expression natural :: Parser Natural
= read <$> some (satisfy isDigit)
natural application :: Parser Natural
=
application '(' *> operator <* space
char <*> expression <* space
<*> expression <* char ')'
operator :: Parser (Natural -> Natural -> Natural)
= (+) <$ char '+' <|> (*) <$ char '*'
operator space :: Parser Char
= char ' ' space
Finally, a main
function that lets a user input expressions and view the result of evaluating them.
module HotAir (main) where
import Control.Monad (forever)
import System.IO
IO,
( getLine,
print,
putStr,
putStrLn,
)
main :: IO ()
= do
main putStrLn "Enter an expression to evaluate. E.G."
putStrLn "(+ (* (+ 12 7) 100) 58)"
$ do
forever putStr "> "
<- getLine
line
runEither (eval line)putStrLn . ("Error: " ++))
(print
And here it is, a little REPL made of mostly hot air.
{-# LANGUAGE BlockArguments #-}
{-# LANGUAGE InstanceSigs #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE NoImplicitPrelude #-}
{-# OPTIONS_GHC -Wall -Werror #-}
module HotAir (main) where
import Control.Applicative
Alternative (empty, (<|>)),
( Applicative (pure, (<*>)),
some,*>),
(<*),
(
)import Control.Monad (forever)
import Data.Bool (Bool)
import Data.Char (Char, isDigit)
import Data.Eq ((==))
import Data.Function (const, ($), (.))
import Data.Functor (Functor (fmap), (<$), (<$>))
import Data.List ((++))
import Data.String (String)
import GHC.Num ((*), (+))
import Numeric.Natural (Natural)
import System.IO
IO,
( getLine,
print,
putStr,
putStrLn,
)import Text.Read (read)
import Text.Show (Show (show))
newtype Pair a b = Pair
runPair ::
{forall c.
-> b -> c) ->
(a
c
}
pair :: a -> b -> Pair a b
= Pair \c -> c x1 x2
pair x1 x2
fst :: Pair a b -> a
fst = (`runPair` const)
newtype Either a b = Either
runEither ::
{forall c.
-> c) ->
(a -> c) ->
(b
c
}
left :: a -> Either a b
= Either \c1 _ -> c1 x
left x
right :: b -> Either a b
= Either \_ c2 -> c2 x
right x
newtype Parser a = Parser
parse ::
{String ->
Either String (Pair a String)
}
satisfy :: (Char -> Bool) -> Parser Char
= Parser \case
satisfy p -> left "Unexpected EOF"
[] : cs) ->
(c if p c
then right (pair c cs)
else left ("Unexpected char: " ++ show c)
char :: Char -> Parser Char
= satisfy . (==)
char
instance Functor Parser where
fmap :: (a -> b) -> Parser a -> Parser b
fmap f pa =
Parser \str ->
runEither (parse pa str)
left-> right (runPair p (pair . f)))
(\p
instance Applicative Parser where
pure :: a -> Parser a
pure a = Parser (right . pair a)
(<*>) :: Parser (a -> b) -> Parser a -> Parser b
<*> pa =
pf Parser \str ->
runEither (parse pf str)
left-> runPair p (\f -> parse (fmap f pa)))
(\p
instance Alternative Parser where
empty :: Parser a
= Parser (const (left "Empty"))
empty
(<|>) :: Parser a -> Parser a -> Parser a
Parser l <|> Parser r =
Parser \str ->
runEither (l str)const (r str))
(
right
execParser :: Parser a -> String -> Either String a
=
execParser p str
runEither (parse p str)
left. fst)
(right
eval :: String -> Either String Natural
= execParser expression
eval where
expression :: Parser Natural
= natural <|> application
expression natural :: Parser Natural
= read <$> some (satisfy isDigit)
natural application :: Parser Natural
=
application '(' *> operator <* space
char <*> expression <* space
<*> expression <* char ')'
operator :: Parser (Natural -> Natural -> Natural)
= (+) <$ char '+' <|> (*) <$ char '*'
operator space :: Parser Char
= char ' '
space
main :: IO ()
= do
main putStrLn "Enter an expression to evaluate. E.G."
putStrLn "(+ (* (+ 12 7) 100) 58)"
$ do
forever putStr "> "
<- getLine
line
runEither (eval line)putStrLn . ("Error: " ++))
(print
You can save that code into a file and execute it with runhaskell
if you want to see it in action.
$ runhaskell HotAir.hs
Enter an expression to evaluate. E.G.
(+ (* (+ 12 7) 100) 58)
> (+ (* (+ 12 7) 100) 58)
1958
>
That was quite a whirlwind.
I know I really shouldn't be surprised that I could write my program with (more or less) only lambdas. Lambda calculus is "a universal model of computation" after all. I also shouldn't be surprised that translating lambda calculus terms into Haskell is so natural as Haskell based on a lambda calculus. But in this post I really wanted to focus on the wonder I feel when being able to use abstract notions like lambda calculus to write runnable programs. My hope is that by writing this I might help someone find something they find wonderful, and encourage them to explore it.
One possible avenue, which I might go into in a future post, is this: the above program still makes use of some pre-defined Haskell datatypes like Bool
, Natural
, [a]
and Char
. Does it have to? Could we build those entirely out of lambdas too? How far can we go?
Huge thanks to Brad Bow for his clarifying suggestions.