Books, Badgers and Big Numbers

Posted on 21 June, 2025

I stumbled into what might be some combinatorial esoterica and I can't stop thinking about it.

Books

Up until April 2023 we had a pretty special second hand book shop up the road, next to the local café1. While it was there my mid morning coffee runs usually involved having a browse and I often picked something up.

The cover the book Chaos by James Gleick

The cover the book Linear Algebra and Its Applications by Gilbert Strang

The cover of the book The Wonderful World of Steam Locomotives by P. B. Whitehouse

The cover of the book Dancing With Cats by Burton Silver and Heather Busch

A lot of the books I was buying were non-fiction, and it's this I'd like to blame on my over-thinking their organisation. I wanted everything grouped by subject, rather than author or title or year or anything like that. So in the end I signed up for a LibraryThing account and used their codes to sort everything.

Badgers

Some time last year LibraryThing let me know they'd made me a Year In Review. You know, like Spotify does. It's not really my jam, but I clicked through anyway and ended up reading this sentence:

And make sure your floors will support the added weight of 4.90 adult badgers!

I'd scanned in all our books at once, so this doesn't represent how many books we added to our collection in 2024, but rather the whole lot. At any rate, it's common now that when I look at our bookshelves I find myself trying to imagine five adult badgers somehow sitting in them. Which is hard, given that I have no idea how big an adult badger is.

Big Numbers

Despite the badgers, it's not all that many books really. But even still I felt the need to choose some sort of ordered arrangement. Now, without worrying about the usefulness of any particular arrangement, I wonder how many options there really are? How many ways could I have arranged these books?

In my last post I talked about enumerating possibilities, but in all the examples I used I permitted every possible option. Nothing was excluded, even if two coin tosses or dice throws resulted in the same outcome I still counted them as a unique possibility. What I was talking about there were Cartesian Products. That is: given a set of options AA and a set of options BB the total range of possibilities of combing both is A×BA \times B, the Cartesian Product of the sets AA and BB.

A={a,b,c}B={1,2,3}A×B={(a,1)(a,2)(a,3)(b,1)(b,2)(b,3)(c,1)(c,2)(c,3)} \begin{align*} A &= \{\text{a}, \text{b}, \text{c}\} \\ B &= \{1, 2, 3\} \\ A \times B &= \begin{Bmatrix} (\text{a}, 1) & (\text{a}, 2) & (\text{a}, 3) \\ (\text{b}, 1) & (\text{b}, 2) & (\text{b}, 3) \\ (\text{c}, 1) & (\text{c}, 2) & (\text{c}, 3) \end{Bmatrix} \end{align*}

The total number of possibilities is the size of each input set multiplied together.

|A|=3|B|=3|A×B|=|A|×|B|=9 \begin{align*} |A| &= 3 \\ |B| &= 3 \\ |A \times B| &= |A| \times |B| \\ &= 9 \end{align*}

What if we imagine that arrangements are the same as combined sets of possibilities? That we can choose an element (any element) from the set AA 'number of elements in AA' (a.k.a. |A||A|) times? We might then say that the total number of arrangements of the elements in AA, is |A|A|||A^{|A|}|.

𝑁𝑜. 𝑎𝑟𝑟𝑎𝑛𝑔𝑒𝑚𝑒𝑛𝑡𝑠 𝑜𝑓 𝐴=|A|×|A|×...×|A||A| \mathit{\text{No. arrangements of } A} = \overbrace{|A| \times |A| \times ... \times |A|}^{|A|}

More concretely.

A={a,b,c}A|A|=A×A×A={(a,a,a)(a,a,b)(a,a,c)(a,b,a)(a,b,b)(a,b,c)(a,c,a)(a,c,b)(a,c,c)(b,a,a)(b,a,b)(b,a,c)(b,b,a)(b,b,b)(b,b,c)(b,c,a)(b,c,b)(b,c,c)(c,a,a)(c,a,b)(c,a,c)(c,b,a)(c,b,b)(c,b,c)(c,c,a)(c,c,b)(c,c,c)}|A|A||=|A|×|A|×|A|=27 \begin{align*} A &= \{\text{a}, \text{b}, \text{c}\} \\ A^{|A|} &= A \times A \times A \\ &= \begin{Bmatrix} (\text{a}, \text{a}, \text{a}) & (\text{a}, \text{a}, \text{b}) & (\text{a}, \text{a}, \text{c}) \\ (\text{a}, \text{b}, \text{a}) & (\text{a}, \text{b}, \text{b}) & (\text{a}, \text{b}, \text{c}) \\ (\text{a}, \text{c}, \text{a}) & (\text{a}, \text{c}, \text{b}) & (\text{a}, \text{c}, \text{c}) \\ (\text{b}, \text{a}, \text{a}) & (\text{b}, \text{a}, \text{b}) & (\text{b}, \text{a}, \text{c}) \\ (\text{b}, \text{b}, \text{a}) & (\text{b}, \text{b}, \text{b}) & (\text{b}, \text{b}, \text{c}) \\ (\text{b}, \text{c}, \text{a}) & (\text{b}, \text{c}, \text{b}) & (\text{b}, \text{c}, \text{c}) \\ (\text{c}, \text{a}, \text{a}) & (\text{c}, \text{a}, \text{b}) & (\text{c}, \text{a}, \text{c}) \\ (\text{c}, \text{b}, \text{a}) & (\text{c}, \text{b}, \text{b}) & (\text{c}, \text{b}, \text{c}) \\ (\text{c}, \text{c}, \text{a}) & (\text{c}, \text{c}, \text{b}) & (\text{c}, \text{c}, \text{c}) \end{Bmatrix} \\ |A^{|A|}| &= |A| \times |A| \times |A| \\ &= 27 \end{align*}

But we know that's not right. There are many things in there which aren't arrangements. Arranging a collection won't duplicate any of its elements, so things like (a,a,a)(\text{a}, \text{a}, \text{a}) and (c,b,b)(\text{c}, \text{b}, \text{b}) shouldn't be counted.

What we need is a method to count arrangements that accounts for elements being used only once. Because that's how we might lay out an arrangement of some collection. We'd select an item to start with, then select the next item from what's left, then the next from what's left, and so on until we've placed every item and run out of options to select from.

For example, given the following collection of items.

The letters, 'a', 'b' and 'c'

We could arrange them like so.

Step Options Selection Result
1 The letters 'a', 'b' and 'c' The letter 'b' The letter 'b'
2 The letters 'a' and 'c' The letter 'a' The letter 'b'
3 The letter 'c' The letter 'c' The letter 'b'

At each step the set of options reduces by one. Once 'b' is put down, only 'a' and 'c' are left. On the next step once 'a' is put down only 'c' is left. So, in general, rather than this.

𝑁𝑜. 𝑎𝑟𝑟𝑎𝑛𝑔𝑒𝑚𝑒𝑛𝑡𝑠 𝑜𝑓 𝐴=|A|×|A|×...×|A||A| \mathit{\text{No. arrangements of } A} = \overbrace{|A| \times |A| \times ... \times |A|}^{|A|}

What we have instead is this.

𝑁𝑜. 𝑎𝑟𝑟𝑎𝑛𝑔𝑒𝑚𝑒𝑛𝑡𝑠 𝑜𝑓 𝐴=|A|×(|A|1)×(|A|2)...×1 \mathit{\text{No. arrangements of } A} = |A| \times (|A| - 1) \times (|A| - 2) ... \times 1

For the 'a', 'b', 'c' example then we'd have.

A={a,b,c}𝑁𝑜. 𝑎𝑟𝑟𝑎𝑛𝑔𝑒𝑚𝑒𝑛𝑡𝑠 𝑜𝑓 𝐴=|A|×(|A|1)×1=3×2×1=6 \begin{align*} A &= \{\text{a}, \text{b}, \text{c}\} \\ \mathit{\text{No. arrangements of } A} &= |A| \times (|A| - 1) \times 1 \\ &= 3 \times 2 \times 1 \\ &= 6 \end{align*}

And here they all are.

The letters 'a', 'b' and 'c'
The letters 'a', 'c' and 'b'
The letters 'b', 'a' and 'c'
The letters 'b', 'c' and 'a'
The letters 'c', 'a' and 'b'
The letters 'c', 'b' and 'a'

The result you get from multiplying all the numbers less than some number is called the factorial of that number. It's written like this: n!n!, and can be defined like this:

n!=n×(n1)×(n2)×...×2×1 n! = n \times (n - 1) \times (n - 2) \times ... \times 2 \times 1

Right now we've got 419 books which can be sorted by a code. So how many ways could I have arranged them? What's the factorial of 419?

419×418×417×...×3×2×1 419 \times 418 \times 417 \times ... \times 3 \times 2 \times 1

It's this extremely large number.

2,908,421,057,896,340,474,361,403,432,093,534,061,335,411,576,803,237,495,067,620,711,879,592,887,139,553,095,110,943,225,026,262,406,188,778,320,680,189,451,711,755,153,962,269,093,225,791,463,391,282,309,784,673,974,891,888,521,144,472,954,435,530,212,518,938,132,427,719,386,733,483,011,170,577,979,186,407,238,784,514,071,038,335,200,586,427,023,669,168,853,348,715,855,934,206,590,666,886,642,210,446,229,488,444,697,075,683,231,304,763,674,216,898,857,425,205,633,114,990,510,459,902,307,876,760,425,583,148,615,724,632,751,661,133,203,237,905,432,668,028,736,816,553,087,322,961,245,249,766,243,189,634,620,342,650,104,310,228,160,720,464,432,084,583,406,587,932,793,842,882,817,312,083,420,305,011,614,174,094,373,347,653,327,314,151,733,173,716,990,349,453,642,618,332,498,813,131,703,848,734,139,461,889,356,833,404,468,988,162,294,586,549,217,704,913,958,653,349,142,669,023,627,237,073,393,370,291,667,739,306,611,350,729,516,327,327,902,034,218,650,157,031,351,111,738,938,367,328,649,342,699,827,177,333,219,328,746,468,751,048,295,766,233,367,259,458,382,317,867,840,592,639,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000.

And here's the kind of question that led to my current cosy combinatorial corner: given a sorted list of all possible arrangements, what is the (deep breath) 2,844,113,050,416,543,052,622,601,377,724,177,730,794,635,580,558,165,101,289,087,907,211,046,617,413,162,202,727,992,682,826,505,610,602,434,122,752,894,055,956,478,001,246,987,975,654,867,880,927,440,389,088,878,653,260,211,539,374,415,866,208,792,187,943,989,161,515,680,780,226,637,325,041,170,571,521,944,136,857,756,357,610,408,690,480,987,642,359,289,070,142,617,861,100,311,817,255,304,659,710,866,175,534,883,866,241,789,715,155,834,223,460,969,127,343,122,506,553,271,818,293,567,727,306,529,166,052,023,268,045,186,141,937,347,041,818,966,122,038,640,793,796,799,685,516,455,929,549,507,013,186,833,430,265,450,466,044,505,744,178,793,393,254,499,311,497,130,361,821,989,719,697,224,981,301,053,842,088,290,183,796,995,774,643,108,623,128,572,424,596,295,207,864,719,379,479,419,093,894,525,035,709,405,252,305,572,483,849,439,595,150,582,362,925,993,365,553,189,173,069,073,069,239,396,792,666,051,139,882,098,736,318,227,488,496,884,365,414,402,236,391,348,245,541,096,370,960,228,964,587,593,160,349,264,902,263,727,302,169,421,863,892,154,866,370,306,041,614,127,727,830,967,236,995,581,667,801,392,480,447,679,422,359,386,205,525,674,979,615,059,536,733,032,646,807,296,164,119,104,440,573,184,560,777,118,243rd one? You know … for example.

Lexicographic order

How might arrangements be sorted? To answer that I'd need to define when one arrangement should come before another. Here's a scheme.

  • Step through pairs of elements from both arrangements. I.E. First look at the first element of arrangement xx and the first element from arrangement yy, then the second element from arrangement xx and the second element from arrangement yy, and so on.
  • For the first pair with mismatching elements, if the element from xx should come before the element from yy then the whole arrangement xx should come before the arrangement yy.

For example, here's two arrangements of the first three letters of the alphabet.

x=(b,c,a)y=(b,a,c) \begin{align*} x &= (\text{b}, \text{c}, \text{a}) \\ y &= (\text{b}, \text{a}, \text{c}) \end{align*}

  1. Compare the first element of each ordering: x1=bx_1 = \text{b} and y1=by_1 = \text{b}. They match, so move onto the next pair of elements.
  2. Compare the second element of each ordering: x2=cx_2 = \text{c} and y2=ay_2 = \text{a}. They don't match and a\text{a} comes before c\text{c}, therefore yy comes before xx.

Sorting elements this way will result in them being in Lexicographic order, and it might seem like the way mostly everything is sorted in day to day life … because it is! This is how we sort words in the dictionary, hence the name.

Indexing

The example above sorts arrangements with reference to a pre-existing ordering of their elements: the alphabet. But it didn't have to. We could've used some other ordering, and ended up with a different result. If we lived in an alternate timeline where the Latin alphabet began (c,a,b)(\text{c}, \text{a}, \text{b}) rather than (a,b,c)(\text{a}, \text{b}, \text{c}) then the order of our example arrangements would be reversed and xx would come before yy.

The alphabet we have is one possible arrangement. One of 403,291,461,126,605,635,584,000,000 options in fact. To reduce confusion when talking about any alternative arrangements we can choose to ignore the specific elements of whatever collection we're arranging and instead refer only to their position in some select conventional ordering. The alphabet is the convention for letters, where 'a' is first, 'b' is second, 'c' is third, and so on. We can then discuss other arrangements by saying only where elements 1, 2, and 3 are. For example, the arrangement (b,a,c)(\text{b}, \text{a}, \text{c}) becomes (2,1,3)(2, 1, 3).

When we can use a sequence of numbers to refer to all the elements of a collection we're able to say that the collection is indexed by those numbers. The numbers point to elements in the collection.2

For our books I'll use Alphabetical by title as the conventional ordering. This then defines the Lexicographic order for the list of possible arrangements.

Lehmer codes

Right, here we go: Lehmer codes index all possible arrangements of collections of elements, and they do so in Lexicographic order.

To see how we can revisit my earlier example of laying out the first three letters of the alphabet.

Step Options Selection Result
1 The letters 'a', 'b' and 'c' The letter 'b' The letter 'b'
2 The letters 'a' and 'c' The letter 'a' The letter 'b'
3 The letter 'c' The letter 'c' The letter 'b'

Lehmer codes end up recording the steps in this procedure as a sequence of choices of index. Like so:

  1. Select the second option.
  2. Select the first option from what's left.
  3. Select the first option from what's left.

Now, the indexes in Lehmer codes all begin at 0, rather than 1, so for the arrangement: (b,a,c)(\text{b}, \text{a}, \text{c}), we have this sequence of choices: (1,0,0)(1, 0, 0).

Here are all the choices someone could make when laying out the first three letters of the alphabet.

Step 1 Step 2 Step 3

0 from (a, b, c)

The letter 'a'

0 from (b, c)

The letters 'a' and 'b'

0 from (c)

The letters 'a', 'b' and 'c'

1 from (b, c)

The letters 'a' and 'c'

0 from (b)

The letters 'a', 'c' and 'b'

1 from (a, b, c)

The letter 'b'

0 from (a, c)

The letters 'b' and 'a'

0 from (c)

The letters 'b', 'a', and 'c'

1 from (a, c)

The letters 'b' and 'c'

0 from (a)

The letters 'b', 'c' and 'a'

2 from (a, b, c)

The letter 'c'

0 from (a, b)

The letters 'c' and 'a'

0 from (b)

The letters 'c', 'a' and 'b'

1 from (a, b)

The letters 'c' and 'b'

0 from (a)

The letters 'c', 'b' and 'a'

Which can be condensed into this list of sequences.

  • (0,0,0)(0, 0, 0)
  • (0,1,0)(0, 1, 0)
  • (1,0,0)(1, 0, 0)
  • (1,1,0)(1, 1, 0)
  • (2,0,0)(2, 0, 0)
  • (2,1,0)(2, 1, 0)

And here's the nifty trick: this is counting, but it's counting in a number system quite unlike the one we use day to day.

At the first step there are 3!3! (a.k.a 3×2×13 \times 2 \times 1) possible outcomes, then at the second step there are 2!2!. I could also say that at the second step there are (3×2×1)/3(3 \times 2 \times 1) / 3 choices. It's the same thing, but by saying it that way I'm trying to emphasise that the first choice puts the rest of the sequence into one of three branches.

I could also try to emphasize that this way: when you choose 'b' you skip over the 2!2! possibilities you might've reached if you'd chosen 'a'. When you choose 'c' you skip over the 2!2! possibilities from choosing 'a', and also the 2!2! from 'b'. So when choosing 'c' you skip over 2×2!2 \times 2! possibilities.

When viewed this way it's is not completely unlike ordinary counting. If asked to find the 243rd something you could skip over the first 200 things, then you could skip over the next 40 things, finally arriving at the 3rd thing after that. By choosing 2 in the hundreds place this number goes down the branch where the range of possibilities is within 200n<300200 \leq n \lt 300.

What's different in the case of these sequences of index choices is that instead of hundreds, tens and ones we have 2!2!s, 1!1!s and 0!0!s.

What happens then if we multiply each index in the above list of sequences by its factorial place value? (Just as we do with ordinary numbers.)

  • (0×2!,0×1!,0×0!)=(0,0,0)(0 \times 2!, 0 \times 1!, 0 \times 0!) = (0, 0, 0)
  • (0×2!,1×1!,0×0!)=(0,1,0)(0 \times 2!, 1 \times 1!, 0 \times 0!) = (0, 1, 0)
  • (1×2!,0×1!,0×0!)=(2,0,0)(1 \times 2!, 0 \times 1!, 0 \times 0!) = (2, 0, 0)
  • (1×2!,1×1!,0×0!)=(2,1,0)(1 \times 2!, 1 \times 1!, 0 \times 0!) = (2, 1, 0)
  • (2×2!,0×1!,0×0!)=(4,0,0)(2 \times 2!, 0 \times 1!, 0 \times 0!) = (4, 0, 0)
  • (2×2!,1×1!,0×0!)=(4,1,0)(2 \times 2!, 1 \times 1!, 0 \times 0!) = (4, 1, 0)

Interesting, and now if we add the places together? (Again, just as we do with ordinary numbers.)

  • 0+0+0=00 + 0 + 0 = 0
  • 0+1+0=10 + 1 + 0 = 1
  • 2+0+0=22 + 0 + 0 = 2
  • 2+1+0=32 + 1 + 0 = 3
  • 4+0+0=44 + 0 + 0 = 4
  • 4+1+0=54 + 1 + 0 = 5

What we're looking at here is the Factorial number system, and how factorial numbers can be converted into ordinary, decimal numbers.

Now, factorial numbers are not just factorial numbers, they are also Lehmer codes. The number 2 is the decimal representation of the Lehmer code (1,0,0)(1, 0, 0). Which is a sequence of choices made when laying out an arrangement of elements. It's possible to transform that sequence of choices into indexes into the original collection. If we do so with the sequence (1,0,0)(1, 0, 0) we get these indexes: (1,0,2)(1, 0, 2). Following those indexes back to the letters they point to we get (b,a,c)(\text{b}, \text{a}, \text{c}).

This gives me everything I need to answer my silly question: What is the 2,844, … 243rd possible arrangement of our 5-ish badgers worth of books? I only have to take that number, subtract one, convert that from decimal to factorial, and then use those digits as instructions for laying out an arrangement. This all happens very quickly. It takes about two hundredths of a second to go from an enormous number to a list of books.

> BS.writeFile "sorted_books.json" (Aeson.encode (decodeLehmer books (toFactorial n)))
(0.02 secs, 17,994,448 bytes)
[
  {
    "author": "Tufte, Edward R.",
    "code": "001.4226",
    "title": "The Visual Display of Quantitative Information"
  },
  {
    "author": "Weizenbaum, Joseph",
    "code": "001.64",
    "title": "Computer Power and Human Reason"
  },
  {
    "author": "Gleick, James",
    "code": "003.857",
    "title": "Chaos: Making a New Science"
  },
  {
    "author": "Galloway, Alexander",
    "code": "004.09",
    "title": "Uncomputable: Play and Politics In the Long Digital Age"
    // ...

I think that's really something.


Having written this I can think of a couple of reasons I've found Lehmer codes so interesting.

Firstly, there's something indescribably cool about how they work. Embedded in the numbers of this very strange number system is the procedure for encoding and decoding them as arrangments of elements. The strange number system is perfectly suited to encoding and decoding arrangements, it's factorial! This, to me, feels simultaneously obvious and obscure and I love it.

Secondly, when I first came across Lehmer codes I was unaware of so many of the foundations upon which they're built, and it left me a bit awe struck. I was overwhelmed thinking about all the epistemic iterations3 it would take to arrive at something like this. All the large and small thoughts, and all the people who thought them.

At the end of the day I have no idea if Lehmer codes will ever prove to be practically useful to me. Regardless, I do know that I find them wonderful.


  1. We were very sad to see Logical Unsanity close. The ABC did a nice feature at the time: Quirky tin shed bookshop 'born out of laziness' offers booklovers' sanctuary↩︎

  2. … from indicare "to point out," …

    Index - Etymology, Origin & Meaning↩︎

  3. This is a term I've just come across from reading Beyond Measure by James Vincent (2022). It was coined by Hasok Chang in his book Inventing Temperature (2004).↩︎