Books, Badgers and Big Numbers
Posted on 21 June, 2025
I stumbled into what might be some combinatorial esoterica and I can't stop thinking about it.
Books
Up until April 2023 we had a pretty special second hand book shop up the road, next to the local café1. While it was there my mid morning coffee runs usually involved having a browse and I often picked something up.
A lot of the books I was buying were non-fiction, and it's this I'd like to blame on my over-thinking their organisation. I wanted everything grouped by subject, rather than author or title or year or anything like that. So in the end I signed up for a LibraryThing account and used their codes to sort everything.
Badgers
Some time last year LibraryThing let me know they'd made me a Year In Review. You know, like Spotify does. It's not really my jam, but I clicked through anyway and ended up reading this sentence:
And make sure your floors will support the added weight of 4.90 adult badgers!
I'd scanned in all our books at once, so this doesn't represent how many books we added to our collection in 2024, but rather the whole lot. At any rate, it's common now that when I look at our bookshelves I find myself trying to imagine five adult badgers somehow sitting in them. Which is hard, given that I have no idea how big an adult badger is.
Big Numbers
Despite the badgers, it's not all that many books really. But even still I felt the need to choose some sort of ordered arrangement. Now, without worrying about the usefulness of any particular arrangement, I wonder how many options there really are? How many ways could I have arranged these books?
In my last post I talked about enumerating possibilities, but in all the examples I used I permitted every possible option. Nothing was excluded, even if two coin tosses or dice throws resulted in the same outcome I still counted them as a unique possibility. What I was talking about there were Cartesian Products. That is: given a set of options and a set of options the total range of possibilities of combing both is , the Cartesian Product of the sets and .
The total number of possibilities is the size of each input set multiplied together.
What if we imagine that arrangements are the same as combined sets of possibilities? That we can choose an element (any element) from the set 'number of elements in ' (a.k.a. ) times? We might then say that the total number of arrangements of the elements in , is
More concretely.
But we know that's not right. There are many things in there which aren't arrangements. Arranging a collection won't duplicate any of its elements, so things like and shouldn't be counted.
What we need is a method to count arrangements that accounts for elements being used only once. Because that's how we might lay out an arrangement of some collection. We'd select an item to start with, then select the next item from what's left, then the next from what's left, and so on until we've placed every item and run out of options to select from.
For example, given the following collection of items.
We could arrange them like so.
Step | Options | Selection | Result |
---|---|---|---|
1 | ![]() |
![]() |
![]() |
2 | ![]() |
![]() |
![]() |
3 | ![]() |
![]() |
![]() |
At each step the set of options reduces by one. Once 'b' is put down, only 'a' and 'c' are left. On the next step once 'a' is put down only 'c' is left. So, in general, rather than this.
What we have instead is this.
For the 'a', 'b', 'c' example then we'd have.
And here they all are.






The result you get from multiplying all the numbers less than some number is called the factorial of that number. It's written like this: , and can be defined like this:
Right now we've got 419 books which can be sorted by a code. So how many ways could I have arranged them? What's the factorial of 419?
It's this extremely large number.
2,
And here's the kind of question that led to my current cosy combinatorial corner: given a sorted list of all possible arrangements, what is the (deep breath) 2,
Lexicographic order
How might arrangements be sorted? To answer that I'd need to define when one arrangement should come before another. Here's a scheme.
- Step through pairs of elements from both arrangements. I.E. First look at the first element of arrangement and the first element from arrangement , then the second element from arrangement and the second element from arrangement , and so on.
- For the first pair with mismatching elements, if the element from should come before the element from then the whole arrangement should come before the arrangement
.
For example, here's two arrangements of the first three letters of the alphabet.
- Compare the first element of each ordering: and . They match, so move onto the next pair of elements.
- Compare the second element of each ordering: and . They don't match and comes before , therefore comes before .
Sorting elements this way will result in them being in Lexicographic order, and it might seem like the way mostly everything is sorted in day to day life … because it is! This is how we sort words in the dictionary, hence the name.
Indexing
The example above sorts arrangements with reference to a pre-existing ordering of their elements: the alphabet. But it didn't have to. We could've used some other ordering, and ended up with a different result. If we lived in an alternate timeline where the Latin alphabet began rather than then the order of our example arrangements would be reversed and would come before
The alphabet we have is one possible arrangement. One of 403,
When we can use a sequence of numbers to refer to all the elements of a collection we're able to say that the collection is indexed by those numbers. The numbers point to elements in the collection.2
For our books I'll use Alphabetical by title as the conventional ordering. This then defines the Lexicographic order for the list of possible arrangements.
Lehmer codes
Right, here we go: Lehmer codes index all possible arrangements of collections of elements, and they do so in Lexicographic order.
To see how we can revisit my earlier example of laying out the first three letters of the alphabet.
Step | Options | Selection | Result |
---|---|---|---|
1 | ![]() |
![]() |
![]() |
2 | ![]() |
![]() |
![]() |
3 | ![]() |
![]() |
![]() |
Lehmer codes end up recording the steps in this procedure as a sequence of choices of index. Like so:
- Select the second option.
- Select the first option from what's left.
- Select the first option from what's left.
Now, the indexes in Lehmer codes all begin at 0, rather than 1, so for the arrangement: , we have this sequence of choices:
Here are all the choices someone could make when laying out the first three letters of the alphabet.
Step 1 | Step 2 | Step 3 |
---|---|---|
0 from (a, b, c) ![]() |
0 from (b, c) ![]() |
0 from (c) ![]() |
1 from (b, c) ![]() |
0 from (b) ![]() |
|
1 from (a, b, c) ![]() |
0 from (a, c) ![]() |
0 from (c) ![]() |
1 from (a, c) ![]() |
0 from (a) ![]() |
|
2 from (a, b, c) ![]() |
0 from (a, b) ![]() |
0 from (b) ![]() |
1 from (a, b) ![]() |
0 from (a) ![]() |
Which can be condensed into this list of sequences.
And here's the nifty trick: this is counting, but it's counting in a number system quite unlike the one we use day to day.
At the first step there are (a.k.a ) possible outcomes, then at the second step there are . I could also say that at the second step there are choices. It's the same thing, but by saying it that way I'm trying to emphasise that the first choice puts the rest of the sequence into one of three branches.
I could also try to emphasize that this way: when you choose 'b' you skip over the possibilities you might've reached if you'd chosen 'a'. When you choose 'c' you skip over the possibilities from choosing 'a', and also the from 'b'. So when choosing 'c' you skip over possibilities.
When viewed this way it's is not completely unlike ordinary counting. If asked to find the 243rd something you could skip over the first 200 things, then you could skip over the next 40 things, finally arriving at the 3rd thing after that. By choosing 2 in the hundreds place this number goes down the branch where the range of possibilities is within .
What's different in the case of these sequences of index choices is that instead of hundreds, tens and ones we have
What happens then if we multiply each index in the above list of sequences by its factorial place value? (Just as we do with ordinary numbers.)
Interesting, and now if we add the places together? (Again, just as we do with ordinary numbers.)
What we're looking at here is the Factorial number system, and how factorial numbers can be converted into ordinary, decimal numbers.
Now, factorial numbers are not just factorial numbers, they are also Lehmer codes. The number 2 is the decimal representation of the Lehmer code . Which is a sequence of choices made when laying out an arrangement of elements. It's possible to transform that sequence of choices into indexes into the original collection. If we do so with the sequence we get these indexes: . Following those indexes back to the letters they point to we get
This gives me everything I need to answer my silly question: What is the 2,
> BS.writeFile "sorted_books.json" (Aeson.encode (decodeLehmer books (toFactorial n)))
(0.02 secs, 17,994,448 bytes)
[
{"author": "Tufte, Edward R.",
"code": "001.4226",
"title": "The Visual Display of Quantitative Information"
,
}
{"author": "Weizenbaum, Joseph",
"code": "001.64",
"title": "Computer Power and Human Reason"
,
}
{"author": "Gleick, James",
"code": "003.857",
"title": "Chaos: Making a New Science"
,
}
{"author": "Galloway, Alexander",
"code": "004.09",
"title": "Uncomputable: Play and Politics In the Long Digital Age"
// ...
I think that's really something.
Having written this I can think of a couple of reasons I've found Lehmer codes so interesting.
Firstly, there's something indescribably cool about how they work. Embedded in the numbers of this very strange number system is the procedure for encoding and decoding them as arrangments of elements. The strange number system is perfectly suited to encoding and decoding arrangements, it's factorial! This, to me, feels simultaneously obvious and obscure and I love it.
Secondly, when I first came across Lehmer codes I was unaware of so many of the foundations upon which they're built, and it left me a bit awe struck. I was overwhelmed thinking about all the epistemic iterations3 it would take to arrive at something like this. All the large and small thoughts, and all the people who thought them.
At the end of the day I have no idea if Lehmer codes will ever prove to be practically useful to me. Regardless, I do know that I find them wonderful.
We were very sad to see Logical Unsanity close. The ABC did a nice feature at the time: Quirky tin shed bookshop 'born out of laziness' offers booklovers' sanctuary↩︎
… from indicare "to point out," …
This is a term I've just come across from reading Beyond Measure by James Vincent (2022). It was coined by Hasok Chang in his book Inventing Temperature (2004).↩︎