A while back, an email from a collaborator proposed a coalgebraic view

of the grammar for a new process calculus. This prompted me to do

some reading up on algebras versus coalgebras – something I had been

meaning to do for a long while.

Luckily, there is a very good [tutorial by Bart Jacobs and Jan

Rutten][tutorial]. At 38 pages, it’s a bit intimidating but it’s very

accessible for the practicing computer scientist (essentially, exactly

what it says in its introduction). The examples are helpful but it

still deserves several read-throughs to really get the most out of it

— each time I’ve scanned through it, I’ve found something new!

So what’s it about? For a few decades now, computer scientists have

been familiar with tools such as [BNF][bnf] for defining language

grammars, and

1. defining functions over such grammars by cases, and

2. proving their properties by structural induction.

Viewed algebraicly, the BNF grammar induces a functor `T`, for which

there exists an initial algebra `a: T X –> X`. [Initiality][initial]

of `a` gives us the `fold` function, which allows us to define

functions *by cases*, as well as the proof-by-induction proof

principle. These algebraic concepts are, in general, better known

than the coalgebraic ones.

In this article, I’d like to share with you, a concrete illustration

of the first of the two points above, namely `fold`, and its dual

`unfold`.

Let’s turn to a concrete example: lists over `int`. In this case, we

have the grammar

list ::= nil | cons(n, list)

This induces a functor `T` defined by `T X = 1 + int * X`, where `1`

denotes the singleton set, `+` denotes disjoint union and `*` denotes

cartesian product.

An algebra for this `T` (also called a `T`-algebra) is a carrier set

`X` with function `a: 1 + int * X –> X`. For the remainder of this

example, for convenience, we’ll work with the equivalent *pair* of

functions of types `1 –> X` and `int * X –> X`.

The functional programmers amongst you will instantly recognise these

type signatures and think of the algebra `X = list`, the type of

integer lists, with functions `nil: 1 –> X` and `cons: int * X –>

X`. In fact, `list`, `nil` and `cons` make up the initial algebra for

this `T`.

Let’s consider another algebra `int`, `knil`, `kons`:

knil: 1 –> int

kons: int * int –> int

Since `nil`, `cons` is an initial algebra, and `knil`, `kons` is also

an algebra, by initiality, there is a unique function `f` of type

`list –> int` which makes the following diagram commute.

T f

T list = 1 + int * list ————–> 1 + int * int = T int

| |

| |

| nil,cons | knil,kons

| |

| |

V V

list ———————> int

f

This `f` is usually called `fold knil kons`. (This `fold` is actually

`fold-right` rather than `fold-left`.)

Now if it so happens that

knil = 0

kons x y = x + y

then `fold knil kons` is precisely what you’d expect, namely the *list

sum* function of type `list –> int`. Initiality allows us to define

*list sum* by cases, aka structural recursion: we give the nil case

and the cons case.

A coalgebra is a carrier set `X` with a function `c` of type `X –> 1

+ int * X`, i.e. `X –> T X` (note that the `T` has moved to the other

side of the arrow). Function `c` takes an element of `X` and returns

either something in the left of the disjoint union (i.e. the element

in `1`) or something in the right (i.e. a pair in `int * X`). One

intuition is to think of `c` as a destructor — it takes apart

elements from `X`.

The function `c` can equivalently be considered as three functions `p:

X –> bool`, `f: X –> int` and `g: X –> X` related to `c` as

follows: if `p x` is true then `c` maps `x` to the element in the left

of the disjoint union; if `p x` is false then `c` maps `x` to the pair

`(f x, g x)` in the right. We assume without loss of generality that

`p`,`f`,`g` is equivalent to `c`.

The final coalgebra (which is the dual of initial algebra) for our

example is the type of possibly infinite lists, `stream`, and the

triple `isnil?`,`head`,`tail`:

isnil?: stream –> bool

head: stream –> int

tail: stream –> stream

Now consider `nat`, the type of natural numbers, as a coalgebra with

functions `iszero?`, `odd`, `rshift`:

iszero?: nat –> bool

odd: nat –> int

rshift: nat –> nat

Finality of `isnil?`, `head` and `tail` means there is a unique

function `f` of type `nat –> stream` which makes the following

diagram commute.

f

nat ———————> stream

| |

| |

| iszero?,odd,rshift | isnil?,head,tail

| |

| |

V V

T nat = 1 + int * nat ————> 1 + int * stream = T stream

T f

The function denoted by the top horizontal arrow in the commuting

square above is usually called `unfold iszero? odd rshift`.

Now if it so happens that `odd` returns `1` on odd inputs and `0`

otherwise, and `rshift` returns the binary right shift of its input,

then `unfold iszero? odd rshift` is the function which takes a natural

number and returns its binary digits in reverse (but note that `0`

gives the empty list). (Compare this example to Tony’s [base

conversion code][baseconv].)

So wrapping up, we’ve illustrated the origins of `fold` and `unfold`

for integer lists from the algebraic perspective. It is possible to

generalise this example and thus derive `fold` and `unfold` for other

recursive data types such as binary trees.

[tutorial]: http://www.cs.ru.nl/B.Jacobs/PAPERS/JR.pdf “A Tutorial on (Co)Algebras and (Co)Induction”

[bnf]: http://en.wikipedia.org/wiki/Backus-Naur_form “Backus-Naur form”

[initial]: http://en.wikipedia.org/wiki/Initial_object “Initial Object”

[baseconv]: https://www.lshift.net/blog/2006/03/08/base-conversions-in-scheme “Base conversions in scheme”