Corporate espionage through post-it notes

Tomorrow, you (a corporate spy!) will be in the office across the street and I want to signal to you what I’ve been doing throughout the day. I will do so by putting up post-it notes on a window. I have three colours of post-its: 🟥, 🟨 and 🟩.1¹ If a lot of people have problem with their fonts not rendering these unicode characters as coloured squares, let me know.

We have just three colours, so we cannot assign one colour to each task – that would only let us communicate the three most common tasks. What if we use two colours? Still not quite there, since that only lets us communicate \(3^2 = 9\) different tasks. Naïvely, we might think we need three colours per task, resulting in the following codebook2² This is practically counting in ternary, except using coloured post-its instead of the digits 0,1,2.:

Codeword	Task
🟩 🟩 🟩	Reading
🟩 🟩 🟨	Writing tests
🟩 🟩 🟥	Meetings
🟩 🟨 🟩	Debugging
🟩 🟨 🟨	Code review
🟩 🟨 🟥	Toolmaking
🟩 🟥 🟩	Lunch
🟩 🟥 🟨	Slack
🟩 🟥 🟥	Standup
🟨 🟩 🟩	Programming
🟨 🟩 🟨	Documenting

Now if you look over at my window and you see this:

🟩 🟩 🟥 🟩 🟩 🟥 🟩 🟩 🟨 🟩 🟨 🟨 🟩 🟩 🟩

you know that I first spent quite some time in meetings (2× 🟩 🟩 🟥), then worked on tests (🟩 🟩 🟨), reviewed code (🟩 🟨 🟨) and finally did some reading (🟩 🟩 🟩). But that was a long sequence of post-its! If my boss walks past, he will think it looks very suspicious and tear them down. Using this codebook, communicating \(n\) tasks will always take \(3n\) post-it notes.

How can we approach creating a codebook that results in the shortest coded message on average, to avoid rousing suspicion?

Huffman codes are short on average

The key idea is that for a rare task, like programming or documentation, we can afford to use a long codeword. In fact, if it lets us use a short codeword for something common like reading, we should be fine with needing more than three post-it notes to encode something rare like documentation.

The Huffman code is an algorithm we can follow that generates an optimal codebook. Here’s how to do it.

First, list all tasks in order of frequency. Check – we already did.

Task	Fraction of time
Reading	21 %
Writing tests	17 %
Meetings	13 %
Debugging	10 %
Code review	8 %
Toolmaking	7 %
Lunch	6 %
Slack	5 %
Standup	5 %
Programming	3 %
Documenting	2 %

Assign post-it colours to the three least common tasks (to keep things simple, always assign colours in the same order):

Task	Fraction of time	Code
Reading	21 %
Writing tests	17 %
Meetings	13 %
Debugging	10 %
Code review	8 %
Toolmaking	7 %
Lunch	6 %
Slack	5 %
Standup	5 %	🟩
Programming	3 %	🟨
Documenting	2 %	🟥

Now, and this is going to be a little tricky to follow, but mentally group these three together into a supertask, that has frequency 5 + 3 + 2 = 10 %. Move this supertask up to its correct place in the frequency table.

Task	Fraction of time	Code
Reading	21 %
Writing tests	17 %
Meetings	13 %
(1) Standup	5 %	🟩
(1) Programming	3 %	🟨
(1) Documenting	2 %	🟥
Debugging	10 %
Code review	8 %
Toolmaking	7 %
Lunch	6 %
Slack	5 %

Assign again colours to the three least common tasks.

Task	Fraction of time	Code
Reading	21 %
Writing tests	17 %
Meetings	13 %
(1) Standup	5 %	🟩
(1) Programming	3 %	🟨
(1) Documenting	2 %	🟥
Debugging	10 %
Code review	8 %
Toolmaking	7 %	🟩
Lunch	6 %	🟨
Slack	5 %	🟥

Mentally group them together the same way, and the group has a combined frequency of 7+6+5=18 %, so it goes higher up in the table.

Task	Fraction of time	Code
Reading	21 %
(2) Toolmaking	7 %	🟩
(2) Lunch	6 %	🟨
(2) Slack	5 %	🟥
Writing tests	17 %
Meetings	13 %
(1) Standup	5 %	🟩
(1) Programming	3 %	🟨
(1) Documenting	2 %	🟥
Debugging	10 %
Code review	8 %

Assign post-its to the three least common – but treat the entire group (1) as a single entity this time. It means all three items in group (1) gets prefixed with one colour.

Task	Fraction of time	Code
Reading	21 %
(2) Toolmaking	7 %	🟩
(2) Lunch	6 %	🟨
(2) Slack	5 %	🟥
Writing tests	17 %
Meetings	13 %
(1) Standup	5 %	🟩 🟩
(1) Programming	3 %	🟩 🟨
(1) Documenting	2 %	🟩 🟥
Debugging	10 %	🟨
Code review	8 %	🟥

Then group these together – treating the group (1) as a unit again. Note that the entire group (1) has a probability of 10 %, so the sum for group (3) is 10+10+8 %, which launches it into the top of the table.

Task	Fraction of time	Code
(3) (1) Standup	5 %	🟩 🟩
(3) (1) Programming	3 %	🟩 🟨
(3) (1) Documenting	2 %	🟩 🟥
(3) Debugging	10 %	🟨
(3) Code review	8 %	🟥
Reading	21 %
(2) Toolmaking	7 %	🟩
(2) Lunch	6 %	🟨
(2) Slack	5 %	🟥
Writing tests	17 %
Meetings	13 %

Repeat again.

Task	Fraction of time	Code
(4) (2) Toolmaking	7 %	🟩 🟩
(4) (2) Lunch	6 %	🟩 🟨
(4) (2) Slack	5 %	🟩 🟥
(4) Writing tests	17 %	🟨
(4) Meetings	13 %	🟥
(3) (1) Standup	5 %	🟩 🟩
(3) (1) Programming	3 %	🟩 🟨
(3) (1) Documenting	2 %	🟩 🟥
(3) Debugging	10 %	🟨
(3) Code review	8 %	🟥
Reading	21 %

At this point, we only have three groups remaining: (4), (3), and reading. We prefix with one colour each again.

Task	Fraction of time	Code
(5) (4) (2) Toolmaking	7 %	🟩 🟩 🟩
(5) (4) (2) Lunch	6 %	🟩 🟩 🟨
(5) (4) (2) Slack	5 %	🟩 🟩 🟥
(5) (4) Writing tests	17 %	🟩 🟨
(5) (4) Meetings	13 %	🟩 🟥
(5) (3) (1) Standup	5 %	🟨 🟩 🟩
(5) (3) (1) Programming	3 %	🟨 🟩 🟨
(5) (3) (1) Documenting	2 %	🟨 🟩 🟥
(5) (3) Debugging	10 %	🟨 🟨
(5) (3) Code review	8 %	🟨 🟥
(5) Reading	21 %	🟥

That’s it! This is our Huffman code. With this new, more efficient codebook, the message from before becomes

🟩 🟥 🟩 🟥 🟩 🟨 🟨 🟥 🟥

This is indeed the same message: meetings (2× 🟩 🟥), tests (🟩 🟨), review (🟨 🟥) and finally reading (🟥). And we got the message across using just 9 post-it notes, compared to the 15 from before.

Here are two, perhaps obvious features of the Huffman code3³ There is also one thing that confused me at first, but after brief thought I realised how satisfying it is. There is only one task that gets assigned a single post-it note, namely reading (🟥). Instinctively, I would think the code would get shorter if also the next most common task, writing tests, would get a single post-it note. But! If both reading and writing tests got a single-post-it codeword, then we would not be able to have three other two-post-it codewords (writing tests (🟩 🟨), meetings (🟩 🟥), debugging (🟨 🟨), and code review (🟨 🟥)), and with the distribution given, that would result in a worse average message length..

Short codewords are reserved for common tasks. And indeed, it can assign longer-than-necessary codes to very rare tasks.
Codewords are immediately decodable – once you reach the end of a codeword, you’ll know it. There is no ambiguity and you don’t need to read ahead to understand a previous codeword.

Sometimes it doesn’t work out so nicely

The extremely sharp-eyed reader will have noticed that the percentages in the table don’t quite add up to 100 %. This is because in the real data, I have another entry for “others” at 3 %.

Task	Fraction of time
Reading	21 %
Writing tests	17 %
Meetings	13 %
Debugging	10 %
Code review	8 %
Toolmaking	7 %
Lunch	6 %
Slack	5 %
Standup	5 %
Programming	3 %
Miscellaneous	3 %
Documenting	2 %

If we tried the above process with this table, we would find at the end that we have two groups left and three colours of post-its to hand out. The solution is to pad out the table with dummy tasks that happen 0 % of the time. Here’s how it would start:

Task	Fraction of time	Code
Reading	21 %
Writing tests	17 %
Meetings	13 %
Debugging	10 %
Code review	8 %
Toolmaking	7 %
Lunch	6 %
Slack	5 %
Standup	5 %
(1) Miscellaneous	3 %	🟩
(1) Documenting	2 %	🟨
(1) Dummy	0 %	🟥
Programming	3 %

And then we continue, treating the group (1) as a unit:

Task	Fraction of time	Code
Reading	21 %
Writing tests	17 %
Meetings	13 %
(2) Standup	5 %	🟩
(2) (1) Miscellaneous	3 %	🟨 🟩
(2) (1) Documenting	2 %	🟨 🟨
(2) (1) Dummy	0 %	🟨 🟥
(2) Programming	3 %	🟥
Debugging	10 %
Code review	8 %
Toolmaking	7 %
Lunch	6 %
Slack	5 %

And after a few more steps:

Task	Fraction of time	Code
(5) (3) Toolmaking	7 %	🟩 🟩
(5) (3) Lunch	6 %	🟩 🟨
(5) (3) Slack	5 %	🟩 🟥
(5) Writing tests	17 %	🟨
(5) Meetings	13 %	🟥
(4) (2) Standup	5 %	🟩 🟩
(4) (2) (1) Miscellaneous	3 %	🟩 🟨 🟩
(4) (2) (1) Documenting	2 %	🟩 🟨 🟨
(4) (2) (1) Dummy	0 %	🟩 🟨 🟥
(4) (2) Programming	3 %	🟩 🟥
(4) Debugging	10 %	🟨
(4) Code review	8 %	🟥
Reading	21 %

We are now ready for the last assignment:

Task	Fraction of time	Code
(5) (3) Toolmaking	7 %	🟩 🟩 🟩
(5) (3) Lunch	6 %	🟩 🟩 🟨
(5) (3) Slack	5 %	🟩 🟩 🟥
(5) Writing tests	17 %	🟩 🟨
(5) Meetings	13 %	🟩 🟥
(4) (2) Standup	5 %	🟨 🟩 🟩
(4) (2) (1) Miscellaneous	3 %	🟨 🟩 🟨 🟩
(4) (2) (1) Documenting	2 %	🟨 🟩 🟨 🟨
(4) (2) (1) Dummy	0 %	🟨 🟩 🟨 🟥
(4) (2) Programming	3 %	🟨 🟩 🟥
(4) Debugging	10 %	🟨 🟨
(4) Code review	8 %	🟨 🟥
Reading	21 %	🟥

We still have that dummy task, which we can drop from the codebook. Now that we had 12 tasks to represent, the Huffman code introduced some four-post-it codewords in order to be able to keep short codewords for common tasks.

Huffman also handles skewed distributions

Imagine now that we had a much more skewed distribution, something like what might have happened very early in my career:

Task	Fraction of time
Reading	40 %
Programming	31 %
Lunch	6 %
Standup	5 %
Debugging	4 %
Slack	3 %
Miscellaneous	3 %
Meetings	2 %
Toolmaking	2 %
Documenting	2 %
Code review	1 %
Writing tests	1 %

For this distribution, we would expect the Huffman code to generate short codewords for reading and programming, and longer codewords for the rest. Let’s find out what happens!

Task	Fraction of time	Code
Reading	40 %	🟩
Programming	31 %	🟨
(5) (4) (2) Toolmaking	2 %	🟥 🟩 🟩 🟩
(5) (4) (2) Documenting	2 %	🟥 🟩 🟩 🟨
(5) (4) (2) (1) Code review	1 %	🟥 🟩 🟩 🟥 🟩
(5) (4) (2) (1) Writing tests	1 %	🟥 🟩 🟩 🟥 🟨
(5) (4) (2) (1) Dummy	0 %	🟥 🟩 🟩 🟥 🟥
(5) (4) Standup	5 %	🟥 🟩 🟨
(5) (4) Debugging	4 %	🟥 🟩 🟥
(5) (3) Slack	3 %	🟥 🟨 🟩
(5) (3) Miscellaneous	3 %	🟥 🟨 🟨
(5) (3) Meetings	2 %	🟥 🟨 🟥
(5) Lunch	6 %	🟥 🟥

Indeed! At this point, the value of reserving short codewords (single post-it notes) for the two most common tasks was high enough that the Huffman code even created some five-post-it codewords for the least common tasks.4⁴ Lunch apparently took enough of my time even as a junior that it deserved a two-post-it codeword.

What does a typical day look like with this codebook? Maybe something like

🟩 🟨 🟨 🟩 🟥 🟥 🟥 🟩 🟥 🟨 🟨 🟨 🟩

Despite the large variation in codeword length in this code, we still have that immediate decodability5⁵ There is no codeword that starts with 🟥 🟩 🟥 and then goes on to something else, so when we see that sequence, we know the codeword ends there and corresponds to debugging. which is convenient:

🟩 (reading) 🟨 (programming) 🟨 (programming) 🟩 (reading)

🟥 🟥 (lunch) 🟥 🟩 🟥 (debugging) 🟨 (programming)

🟨 (programming) 🟨 (programming) 🟩 (reading)

I don’t want to be a corporate post-it spy

I don’t either. The real reason I wrote this article is that I couldn’t figure out how to convert the Huffman coding algorithm into flashcards, so I wanted to write it down somewhere for future me.

If you have an idea on how to flashcard things like these, let me know!

Two Wrongs

Huffman Codes – How Do They Work?