# Advent of Code 2017 in R

# Table of Contents

I was really impressed by the way Peter Norvig showed his solutions to the 2016 Advent of Code puzzles in one, continuous iPython notebook. With my new powers of embedding R code in Org mode, I want to do something similar this year.

# Day 0: Preparations

I am *very* unprepared, but I’m going to pretend I’m not and update this section
as I go …

I have only just started learning R, so I’m not an expert on library choices
yet. For this article, I just import all of `tidyverse`

because parts of it are
needed here and there.^{1}^{1} The `tidyverse`

library contains, among other things,
the `magrittr`

library which provides basically two things: the `%>%`

pipe
operator and the `.`

implicit variable. Even though it’s a tiny library, it can
make the code so much easier to read and write! It also includes the `stringr`

,
`dplyr`

, `tidyr`

and `ggplot2`

libraries which have as their strongest
selling points that they provide a uniform grammar to talk about strings, data
transformation, data cleaning and plotting, respectively.

library(tidyverse) ## Note to self: dessa importer borde INTE behövas... library(tidyr) library(readr) library(stringr) library(purrr) library(dplyr) library(ggplot2) library(httr) library(testthat) get_input <- function(day) "https://two-wrongs.com/src/aux/aoc17/input%d.txt" %>% sprintf(day) %>% GET %>% content %>% str_trim as_lines <- . %>% textConnection %>% readLines

I also prepare a function to download the input for each day’s puzzle. The
`httr`

library provides the `GET`

function. I find this `%>%`

based `magrittr`

style very clean and easy to read. Without it, I would have written the
`get_input`

function as

get_input_nopipe <- function(day) { url <- "https://two-wrongs.com/src/aux/aoc17/input%d.txt" response <- GET(sprintf(url, day)) readLines(str_trim(content(response))) }

Sure, if you are used to C-like languages, then maybe you even prefer this verbose, nested tangle of spaghetti version. But if you have some experience with the Unix shell, or functional programming, you’ll probably see the structure more clearly in the first version.

# Day 1: Inverse Captcha

In this puzzle, we get as our input a long string of digits, and then we are supposed to pick all digits that match the next digit in the string (including the last digit, if it matches the first) and return the sum of those. So, e.g. 122311 should return 4, as in 2+1+1.

## Solution

offset <- function (v, n) c(tail(v, n), head(v, -n)) test_that("offset works", { expect_equal(offset(c(), 2), c()) expect_equal(offset(c(1), 5), c(1)) v <- c(5, 2, 9, 4, 1, 6) ## Offseting backward and forward should be inverses expect_equal(offset(v, -4), offset(v, 2)) }) repeated <- function(v) which(v == offset(v, 1)) test_that("repeated works", expect_equal( repeated(c(1, 2, 2, 3, 1, 1, 5, 1)), c(1, 3, 6))) solve_captcha <- function (...) str_split(list(...), "") %>% map(. %>% parse_integer %>% { sum(.[repeated(.)]) }) %>% as_vector test_that("day 1 examples pass", expect_equal( solve_captcha("1122", "1111", "1234", "91212129"), c(3, 4, 0, 9))) print(solve_captcha(get_input(1)))

And we appear to have the correct result!

[1] 1141

## Reasoning

Vector | 1 | 2 | 2 | 3 | 1 | 1 |

Shifted vector | 2 | 2 | 3 | 1 | 1 | 1 |

Matching elements | = | = | = | |||

Boolean result | F | T | F | F | T | T |

`which` result |
2 | 5 | 6 | |||

Indexes into | v | v | v | |||

Vector again | 1 | 2 | 2 | 3 | 1 | 1 |

Sum | 2 | 5 | 6 |

To solve the captcha, I start by splitting the input string(s) into characters and convert each character to a number. Now I want the index for each number that matches the next number.

To get this, I think it will be convenient to do a vectorised comparison
(regular `==`

operator in R) between the vector of numbers and itself – offset
by one place^{2}^{2} I guess “cycled” or “rolled” one place ahead would be more
correct terminology, since the back comes onto the head.. However, this
`offset`

function is not part of any R library I use, so I had to define it as

offset <- function (v, n) c(tail(v, n), head(v, -n))

The vector of booleans we get from the `==`

operator can be
translated into indices with the `which`

function. In other words, the following
function returns the indices that contain elements which are repeated.

repeated <- function(v) which(v == offset(v, 1))

Finally, summing the elements referenced by those indices is a piece of cake – we can simply index the vector by the other vector and we get a new vector with only those elements.

# Day 2: Corruption Checksum

Today we have a bunch of numbers arranged in rows. We need to compute, and I bet
you will have to re-read this, *the sum of the differences of the largest and
smallest number of each row*.

## Solution

compute_checksum <- function(document) { read_tsv(document, col_names=FALSE) %>% group_by(row=rownames(.)) %>% gather("col", "val", -row) %>% summarise(span=diff(range(val, na.rm=TRUE))) %>% summarise(checksum=sum(span)) %>% .[["checksum"]] } test_that("day 2 example passes", { expect_equal(compute_checksum(" 5 1 9 5 7 5 3 2 4 6 8"), 18) }) print(compute_checksum(get_input(2)))

Resulting in

[1] 45351

## Reasoning

We can solve this the correct way, but we can also very quickly get a sense of
whereabouts the correct answer should be. One of the cool things about R is how
easy it is to just *look* at the numbers, and instantly get an intuitive sense
for what they are like.

day2_data <- gather( read_tsv(get_input(2), col_names=FALSE) %>% mutate(row=as.integer(rownames(.))), column, value, -row ) day2_plot <- ggplot() + theme_classic() + theme(legend.position="none", panel.background=element_blank(), plot.background=element_blank()) + scale_x_continuous(expand=c(0,0), breaks=seq(0, 6500, 500)) + scale_y_continuous(expand=c(0,0)) + scale_fill_gradient(low="gray50", high="black") day2_plot + geom_histogram( data=day2_data %>% select(row, value), aes(x=value, fill=row, group=row), breaks=seq(0,6500,100))

It looks as though most values hang out below 500. However – that doesn’t help us a whole lot because to solve this problem we need to look only at the extremes of each row.

Spontaneously, it looks like most rows will have a difference of about 5000
between their extremes. Maybe a standard deviation of 500? Just guessing; it’s
really hard to tell from this picture. If the guess was true, then we could use
the central limit theorem^{3}^{3} The central limit theorem says that when you sum
up a bunch of random values with mean \(\mu\) and standard deviation \(\sigma\),
your result will have a normal distribution with mean \(n\mu\) and standard
deviation \(\sigma\sqrt{n}\). to approximate the answer as

\[N(16 \times 5 000, 500 \times \sqrt{16}) = N(80 000, 2 000)\]

for which a 95% confidence interval would land in the range 76000–84000, a guess
which I happen to know is wildly incorrect.^{4}^{4} Why do the numbers overall
represent such a bad estimate for the answer? Probably because of the skewed
distribution with the long tail. It’s hard to tell from that plot where all rows
begin and end.

Instead, let’s annotate the data with its maximum and minimum across the groups, and plot those ranges instead.

day2_summary <- day2_data %>% group_by(row) %>% summarise(low=min(value), high=max(value)) %>% select(row, low, high) %>% arrange(high-low) %>% mutate(size_order=as.integer(rownames(.))) day2_plot + geom_tile(data=day2_summary, aes(x=size_order, y=(low+high)/2, width=1, height=high-low)) + scale_x_continuous("", expand=c(0,0), breaks=c()) + scale_y_continuous("value", expand=c(0,0), breaks=seq(0, 6200, 500)) + geom_hline(aes(yintercept=3500), linetype="dashed", color="chocolate2") + geom_hline(aes(yintercept=100), linetype="dashed", color="chocolate2")

This makes it more clear that maybe we’re looking at a real range of 100–3500,
and then the central limit theorem gives us the range 50000–58000.^{5}^{5} Which is
actually fairly close to the true value!

Anyway, just straight up computing the value gives us the correct answer with no problems.

# Day 3: Spiral Memory

This puzzle had me stumped for an embarassingly long time.

## Solution

spiral_distance <- Vectorize(function(i) { ## Compute how many rows we need to get to i rows <- rep(0:ceiling(sqrt(4*i)/4), each=4) ## Compute the values of the corners for rows corner <- matrix(4*rows^2 + 2*rows*(0:3 - 1), ncol=4, byrow=T) distance <- abs(corner - i) nearest <- which(distance == min(distance), arr.ind=TRUE) ## The distance is 2*r - d, but since we are ## 1-indexing we need to subtract by one. manhattan <- max(0, 2*(nearest[1,"row"] - 1) - distance[nearest][1] - 1) unname(manhattan, force=TRUE) }) test_that("day 3 examples pass", { examples <- c(1, 12, 23, 1024) solutions <- c(0, 3, 2, 31) expect_equal(spiral_distance(examples), solutions) }) print(spiral_distance(277678))

It is correct!

[1] 475

## Reasoning

At some point, I concluded that I couldn’t figure it out in my head; I needed to draw a grid on paper. In doing so, I discovered that if the memory addresses are zero-indexed instead, the index of the four corners of the rectangles formed appear to follow a pattern.

Radius | Upper-right (c=0) | Upper-left (c=1) | Lower-left (c=2) | Lower-right (c=3) |
---|---|---|---|---|

r=0 | 0 | 0 | 0 | 0 |

r=1 | 2 | 4 | 6 | 8 |

r=2 | 12 | 16 | 20 | 24 |

r=3 | 30 | 36 | 42 | 48 |

Figure 3: Two equivalent paths to the same destination

This is nice, because computing the minimum distance to the centre from any corner is trivial: it’s twice the radius: once to get to either axis, and then once again to get to the origin. This is also the farthest distance possible given a particular radius.

So if you have the distance to the corner (which we can compute by simply subtracting the memory addresses of the location of the data and the corner) and the radius the corner belongs to, then it’s trivial to compute the distance to the data location by subtracting the two.

```
corner <- matrix(
c(0, 0, 0, 0,
2, 4, 6, 8,
12, 16, 20, 24,
30, 36, 42, 48,
56, 64, 72, 80),
ncol=4, byrow=T)
print(corner)
```

[,1] [,2] [,3] [,4] [1,] 0 0 0 0 [2,] 2 4 6 8 [3,] 12 16 20 24 [4,] 30 36 42 48 [5,] 56 64 72 80

Assuming we have the matrix above, we can compute the distances to all corners for an index, say, 23.

address <- 23 proximity <- abs(corner - address) print(proximity)

[,1] [,2] [,3] [,4] [1,] 23 23 23 23 [2,] 21 19 17 15 [3,] 11 7 3 1 [4,] 7 13 19 25 [5,] 33 41 49 57

We use `abs`

because we’re only interested in the distance, not in which
direction the corner lies. We see already which corner is the closest – all we
have to do is convince the computer.

nearest <- which(proximity == min(proximity), arr.ind=TRUE) print(nearest)

row col [1,] 3 4

We ask the computer the same thing we asked ourselves. Which corner has the minimum proximity to our location? That’s the nearest one.

distance <- 2*nearest[1,"row"] - proximity[nearest][1] - 2 print(distance)

row 3

Then computing the distance from our location is done the way it’s supposed to. But how do we get the matrix of corners we started from? Well, with a little thinking we can arrive at the conclusion that the matrix is built from the recursive formula

\[ n(r,c) = \begin{cases} 0 & \mathrm{if}\quad{}r = 0 \\ n(r-1,3) + 2r + 2rc & \mathrm{otherwise} \end{cases} \]

Which basically says that for rectangles $r = 0, 1, 2, 3, … $ and corners \(c = 0,1,2,3\), we can find the address for the corner by taking the base address of the rectangle and add the corner number times the distance to the corner.

Shuffling things around, these leads to the closed (i.e. not recursive) formula

\[ n(r, c) = 4r^2 + 2r(c-1) \]

Solving this equation gives us, for any address \(n\), a maximum number of rows

\[ \left \lceil \frac{1 + \sqrt{4n + 1}}{4} \right \rceil \]

Which we simplify a bit (we’re not interested in exact results) and then use to compute the matrix with the closed formula.

At this point, we *could* go even further and eliminate the matrix entirely, by
simply calculating which two corners are candidates for being closest and then
discarding everything else. However, that is left as an exercise for the reader.

# Day 4: High-Entropy Passphrase

Finally! Something I can bang out in five minutes!

## Solution

unique_words = function(str) { words <- str_split(str, " ") duplicates <- map(words, anyDuplicated) ## Return boolean vector which is true only for ## elements that are not duplicated duplicates == 0 } count_valid_passphrases = function(passphrases) sum(unique_words(passphrases)) test_that("day 4 examples pass", { expect_equal(unique_words("aa bb cc dd ee"), TRUE) expect_equal(unique_words("aa bb cc dd aa"), FALSE) expect_equal(unique_words("aa bb cc dd aaa"), TRUE) }) print(count_valid_passphrases(get_input(4) %>% as_lines))

[1] 451

## Reasoning

There’s not much to say here. Pull out the words, look for duplicates, check if the amount of duplicates is zero. Count those lines.

This was, however, the first time I got an incorrect answer! Mostly because I
misread the question as “How many *invalid* passphrases are there?” rather than
how many *valid*.

# Day 5: Twisty Trampolines

This day was really fun! I decided early that I wanted a vectorised solution rather than the obvious way, so I had to rewire my brain a bit to get it working correctly.

## Solution

This is an optimised version, see the simpler version for an equivalent, but theoretically much more pleasing version.

twisty_trampolines <- function(relative) { n <- length(relative) ## Convert relative jumps to absolute vector indices memory <- matrix(relative + 1:n, n, 1) ## Initialise program counter pc <- matrix(diag(n)[1,]) ## Preallocate jump matrix target <- matrix(FALSE, n, n) ## While there are active program counters while (any(pc > 0)) { ## Boolean vector of active program counters, controls ## which values are updated this iteration active <- pc > 0 ## Construct program counter matrix with jump targets target[, active] <- outer(1:n, memory[active, 1], `==`) ## Increment value in this memory location memory[active, 1] <- memory[active, 1] + pc[active, 1] ## Compute new program counter locations from jump matrix pc[,1] <- target[, active, drop=FALSE] %*% pc[active, 1] } ## Figure out how much the memory has changed from initial value sum(memory - relative - 1:n) } test_that("day 5 example passes", { expect_equal(twisty_trampolines(c(0,3,0,1,-3)), 5) }) print(twisty_trampolines( get_input(5) %>% as_lines %>% as.integer ))

After a little while^{6}^{6} Unsurprisingly, this approach is slower than a naïve
loop concerned with only one-dimensional values., we get our result.

## Reasoning

The first thing I did, before I even decided on an approach, was to convert the
relative jump locations to absolute vector indices. It just seemed easier to
deal with. I then picked an approach where I represent the pc^{7}^{7} program
counter, i.e. index of current memory address as a one-hot vector. I had a weak
intuition this would let me play to the strengths of R and the built-in vector
operations. So when the program starts, the pc is

\[ \left( \begin{array}{cc} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{array} \right) \]

Then, after two very productive decisions, I hit a brick wall. I had no idea how to continue. I sat down and had a Deep Think. How can I compute the new one-hot pc vector based on the memory contents and the old pc? I decided to store the computation of the new pc as a matrix where the columns represent the current pc, and the rows represent the next pc. In other words, if the memory is

\[ \left( \begin{array}{cc} 3 & 1 & 4 & 3 & 2 \end{array} \right)^{\intercal} \]

then the matrix representation of this memory would be

\[ \left( \begin{array}{cc} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array} \right) \]

This representation of the memory has surprisingly many neat properties. The
most important one for our purposes is that when this matrix is multiplied by
the pc vector, we get the new pc. I want you to pause and think about what
just happened. By picking the right data structures, we managed to turn our
(rather complicated) problem into “perform a multiplication”.^{8}^{8} Generally,
before starting to answer a question, it may be helpful to think of alternative
ways to ask the same question. Surprisingly often, you just have to ask the
question in the right way to get an obvious, simple and clear answer with no
effort.

\[ \left( \begin{array}{cc} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array} \right) \dot{} \left( \begin{array}{cc} 0 \\ 0 \\ 0 \\ 1 \\ 0 \end{array} \right) = \left( \begin{array}{cc} 0 \\ 0 \\ 1 \\ 0 \\ 0 \end{array} \right) \]

Another neat property of this matrix is that if we want to have *multiple
parallel program counters*, all we have to do is put more “ones” into the pc.
We don’t have to change anything in our code, and all simultaneous program
counters will *Just Work*. In a similar fashion, we can also introduce
non-deterministic jumps^{9}^{9} Currently, each memory location contains a jump to a
single, specific other memory location. We can also imagine that, for example,
when the program counter points to location 7, it can jump to location 3 with a
75% probability, location 10 with a 10% probability, and stay where it is with a
15% probability. The resulting vector will then be the probabilities of the
program counter being a particular value. All for free; no changes to the code
required. We’re now starting to realise the power of representing things as
matrices. by putting more than one value in a column in the target matrix.

The `target`

matrix is constructed in R as the equality-based outer product of
the memory and its indices.

twisty_trampolines <- function(relative) { n <- length(relative) ## Convert relative jumps to absolute vector indices memory <- matrix(relative + seq(relative), n, 1) ## Initialise program counter pc <- matrix(diag(n)[1,]) ## While there are active program counters while (any(pc > 0)) { ## Construct program counter matrix with jump targets target <- outer(seq(memory), memory, `==`) ## Increment value in this memory location memory <- memory + pc ## Compute new program counter locations from jump matrix pc <- target %*% pc } ## Figure out how much the memory has changed from initial value sum(memory - relative - seq(relative)) }

However, there’s a big problem with this code.

Sure, it’s very general and handles multiple program counters and nondeterministic jump targets … when do we use that? We don’t. This puzzle only has a single program counter and jump target. So only one out of every thousand values we compute will actually be used. That’s bad.

Without loss of generality, we can optimise our code to handle the salient case
much faster: we simply refuse to update values that we know will not be used. We
compute the boolean vector `pc > 0`

, which is a filter of the *active* memory
locations. We then slice the vectors and matrices in our computations such that
only the active memory locations are updated.^{10}^{10} Other locations *will* contain
invalid values, but since they’re inactive, that doesn’t bother us. Once they
become active, they will be computed correctly again.

We make sure the `target`

matrix, which is the largest data structure in the
function, is pre-allocated at the start of the function, rather than allocated
anew each iteration. A micro-optimisation we also perform is that we replace
several calls to `seq`

by the nearly constant vector `1:n`

, which should be
slightly faster.

# Day 6: Memory Reallocation

## Solution

memory_realloc <- function(memory) { n <- length(memory) unseen <- function(history, memory) { history %>% group_by(time) %>% filter(bank == seq(memory), blocks == memory) %>% nrow(.) == 0 } history <- data_frame(time=integer(), bank=integer(), blocks=integer()) i <- 0 while (unseen(history, memory)) { ## Add this memory to history history <- bind_rows(history, data_frame(time=i, bank=seq(memory), blocks=memory)) from <- which.max(memory) to <- sort(((1:memory[from] + from - 1) %% n) + 1) memory[from] <- 0 memory[1:4] <- memory[1:4] + tabulate(to, nbins=n) i <- i+1 } print(history) history %>% summarise(m=max(time)) } memory_realloc(c(0,2,7,0))

## Reasoning

time bank blocks 0 1 0 0 2 7 0 2 2 3 3 3 3 3 3 3 0 2 0 0 2 2 4 5 6 7 8 9 10 (mod n) . 2 2 4 1 2 3 4 1 2 . . . . 1 1 2 2 2 2 3 4 4 . . 2 4 1 2 1 1 3 4 5 6 3 4 4 (mod n) 2 0 1 2 1 1 3 4 1 2 3 4 4 . . 1 1 1 2 3 3 4 4 4 . . 3 1 2 3 0 1 2 3 . . . 0 2 3 4 0 2 3 0 . . . . 1 3 4 1 1 3 0 1 . . . . 2 4 1 2