Two Wrongs

Debugging Common Lisp in Slime

I was making a little “game” in Lisp to help me understand a concept better. Basically, it shows me a grid of X’s, O’s and _’s. I get to remove one each turn, and then the final configuration should follow some rules. The goal of the game is not actually to reach such a final configuration, but rather to determine, in as few moves as possible, whether or not that configuration is possible.

However, the following happened.

Remove? e
_ (a)    O (b)    X (c)  

X (d)    _ (e)    X (f)  

X (g)    O (h)    X (i)  

Remove? 

See how each column contains only _’s and a single kind of letter? That’s supposed to be a win, but the game wasn’t counting it. It just let me keep playing.

Interactive Stack Trace

Debugging this the Lisp way was a pleasure, though. I hit C-c C-c to signal a condition at the current point of evaluation, which naturally threw me into the debugger. It presented me with this screen, which I have abbreviated, mostly because it is harder to read without the correct typography.

Interrupt from Emacs
   [Condition of type SIMPLE-ERROR]

Restarts:
 0: [CONTINUE] Continue from break.
 1: [ABORT-READ] Abort reading input from Emacs.
 2: [RETRY] Retry SLIME REPL evaluation request.
 3: [*ABORT] Return to SLIME's top level.
 4: [ABORT] abort thread (#<THREAD "repl-thread" RUNNING {1001A8FFA3}>)

Backtrace:
  1: (SWANK::DEBUG-IN-EMACS #<SIMPLE-ERROR "Interrupt from Emacs" {10057293D3}>)
  2: (SWANK:INVOKE-SLIME-DEBUGGER #<SIMPLE-ERROR "Interrupt from Emacs" {10057293D3}>)
  3: (SWANK:SIMPLE-BREAK "Interrupt from Emacs")
  4: (SWANK/BACKEND:CHECK-SLIME-INTERRUPTS)
 ...
 13: ((:METHOD STREAM-READ-CHAR (SWANK/GRAY::SLIME-INPUT-STREAM)) #<SWANK/GRAY::SLIME-INPUT-STREAM {1001997A13}>) [fast-method]
 14: (READ-CHAR #<SWANK/GRAY::SLIME-INPUT-STREAM {1001997A13}> NIL NIL #<unused argument>)
 15: (READ-LINE #<TWO-WAY-STREAM :INPUT-STREAM #<SWANK/GRAY::SLIME-INPUT-STREAM {1001997A13}> :OUTPUT-STREAM #<SWANK/GRAY::SLIME-OUTPUT-STREAM {1001A77973}>> T NIL #<unused argument>)
 16: (GAME-ROUND 3 3)
 17: (PLAY 3 3)
 18: (SB-INT:SIMPLE-EVAL-IN-LEXENV (PLAY 3 3) #<NULL-LEXENV>)
 19: (EVAL (PLAY 3 3))
 20: (SWANK::EVAL-REGION "(play 3 3) ..)"
 21: ((LAMBDA NIL :IN SWANK-REPL::REPL-EVAL))
 ...

I know the check for a winning game happens in the game-round function, so I expanded that stack frame in the backtrace seen above. It showed me the local variables, none of which were particularly surprising.

16: (GAME-ROUND 3 3)
     Locals:
       BOARD = #2A((_ O X) (X _ X) (X O X))
       M = 3
       N = 3
       REMOVALS = 2

To get to the root of the problem, I wanted to run the won-game check manually, with the values in that stack frame. Doing so is trivial in sldb, the Slime debugger: I simply put the cursor somewhere in the frame and pressed e for eval. I entered (print (unit-columns? board)) and it gave me back NIL, meaning it didn’t detect the situation as it should have.

A quick look at the function,

(defun unit-columns? (board)
  "If all columns are single-coloured, the game is won!"
  (loop for i from 0 below (array-dimension board 0)
     always
       (loop
          for j from 0 below (array-dimension board 1)
          for elem = (aref board i j)
          for previous = elem then
            (if (tile? previous) previous elem)
          always (same-colour previous elem))))

and it was clear that I had accidentally swapped the dimensions of the board for this check! As written, the outer loop walks dimension 0 and the inner loop walks dimension 1, so the function tests each row rather than each column. I normally iterate the board in row-major order, but this was the one case where I needed to do it in column-major order. I swapped the two iteration clauses, (for i … board 0) and (for j … board 1), and pressed C-c C-c to recompile the function and replace the old one.
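For reference, here is a minimal sketch of what the function might look like after the swap. The helpers tile? and same-colour are not shown in this post, so the definitions below are assumptions made only so the example runs on its own: blanks are taken to be the symbol _, and a blank is taken to match any colour. The variable *board* is likewise a hypothetical name for the board seen in the stack frame.

```lisp
;; Hypothetical helpers: the post does not show TILE? or SAME-COLOUR,
;; so these definitions are assumptions made for illustration.
(defun tile? (elem)
  "True when ELEM is a real tile rather than the blank symbol _."
  (not (eq elem '_)))

(defun same-colour (a b)
  "Blanks match anything; two real tiles must be the same symbol."
  (or (not (tile? a)) (not (tile? b)) (eq a b)))

(defun unit-columns? (board)
  "If all columns are single-coloured, the game is won!"
  ;; Outer loop: one pass per column (dimension 1).
  (loop for j from 0 below (array-dimension board 1)
        always
        ;; Inner loop: walk down the rows (dimension 0) of column J.
        (loop for i from 0 below (array-dimension board 0)
              for elem = (aref board i j)
              ;; Keep the first real tile seen in this column.
              for previous = elem then (if (tile? previous) previous elem)
              always (same-colour previous elem))))

;; The board from the GAME-ROUND stack frame (hypothetical variable name).
(defparameter *board* #2A((_ O X) (X _ X) (X O X)))

(unit-columns? *board*) ; returns T under the assumptions above
```

With the loops in the original order, the same board fails on its first row (_ O X), which matches the NIL seen in the debugger.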

In the debugger, the old stack frame was still highlighted, so just to make sure I pressed e again, and evaluated the same print as before. This time it detected the situation correctly!

The only thing that remained was to continue running the program with the corrected function. Still with the cursor on the stack frame of interest, I pressed r, which restarted that specific stack frame, now with the correct definition in place.

Then this happened.

Remove? e
_ (a)    O (b)    X (c)  

X (d)    _ (e)    X (f)  

X (g)    O (h)    X (i)  

You won the game! It took 2 removals.

I guess it works!

Epilogue

What makes this so pleasant is the interactive, keyboard-driven stack trace. With sldb, you don’t merely look at the stack trace; you play around with it. You inspect every part of the running state of the program, you evaluate expressions inside various stack frames, you reassign local variables, and then you restart execution at an arbitrary stack frame.

This is generally not possible in exception-driven languages, because by the time the exception has bubbled up to the debugger, the stack has already been unwound. The trace is still there to observe, but it no longer has any connection to your running program.

In Common Lisp, with its condition system, the traceback forms a live snapshot of the current running state of the program, which can be modified to your delight before continuing to run the program.
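The same property can be sketched without the debugger. In the minimal example below, a handler established with handler-bind runs while the frame that signalled the error is still on the stack; it picks the use-value restart, and the computation carries on from the point of the error. All function names here are hypothetical and unrelated to the game’s actual code. The debugger offers the same kind of choice interactively through its list of restarts.

```lisp
;; Hypothetical example, not taken from the game's source.
(defun tile-from-char (char)
  "Translate CHAR into a tile symbol, offering USE-VALUE on bad input."
  (restart-case
      (case char
        (#\X 'x)
        (#\O 'o)
        (#\_ '_)
        (t (error "Unknown tile character: ~s" char)))
    (use-value (value)
      :report "Supply a tile to use instead."
      value)))

(defun row-from-string (string)
  "Translate every character of STRING into a tile."
  (map 'list #'tile-from-char string))

(defun row-from-string-leniently (string)
  "Like ROW-FROM-STRING, but replace unknown characters with a blank."
  ;; The handler runs before any unwinding happens, so the restart
  ;; inside TILE-FROM-CHAR is still available to invoke.
  (handler-bind ((error (lambda (condition)
                          (declare (ignore condition))
                          (use-value '_))))
    (row-from-string string)))

(row-from-string-leniently "X?O") ; returns (X _ O)
```

Note that the mapping in row-from-string is not restarted from the beginning: only the failing call is patched up, and the remaining characters are processed as usual.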