Entropic Thoughts

Types as Interfaces


For the past few days, I have been toying with an idea for a board game. To test it out, I wanted to write a simple implementation of it. Here’s an example of a type we might need in a critical phase of the game.

-- | A quote for a proposal.
data Quote = Quote
  { _proposal :: Proposal
  , _premium :: Int
  , _share :: Int
  }

In that phase, values of this type need to be communicated back and forth between players in a unicast fashion, so we might want to add fields to indicate who sent it and who received it.

-- | A quote that has both owner and offering player.
data QuoteEtc = QuoteEtc
  { _proposal :: Proposal
  , _premium :: Int
  , _share :: Int
  , _owner :: PlayerId -- ^ Player that owns the process.
  , _offering :: PlayerId -- ^ Player that offers this quote.
  }

But then I realised there are quite a lot of these types where source-and-target fields are required, and that we could annotate them without modifying the underlying type, by creating another type to hold that data:

-- | Create a unicast message from any type a.
data Msg a = Msg
  { _sender :: PlayerId
  , _recipient :: PlayerId
  , _payload :: a
  }

With this, we can construct a value of type Msg Quote which represents the Quote annotated by sender and recipient.

Msg
  { _sender = PlayerId 2
  , _recipient = PlayerId 0
  , _payload = Quote
      { _proposal = undefined
      , _premium = 5
      , _share = 3
      }
  }

Functions that only operate on the _sender and _recipient fields can be written with the appropriate type signature and be generic over payload. (This example uses object-oriented-style getters. If you’re unfamiliar with optics/lenses, think of the operator ^. as you would a regular . to get a property from a value in a language like Python or C#.)

-- | Determines if a message is intended for player.
msg_for :: Player -> Msg a -> Bool
msg_for player msg =
  player^.name == msg^.recipient

This function does not care what the payload of the message is; it can be used on any list of messages, like

filter (msg_for broker) messages

to extract a list of messages intended for that player, regardless of whether the payload is Quote or anything else. (Though beginners with Haskell beware: whenever we pass a list of messages into this function, it still needs to be a homogeneous list of just one concrete type. We cannot mix both Msg Proposal and Msg Quote in the same list.)
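
To make the payload-generic filtering concrete, here is a minimal self-contained sketch. It swaps the lens getters for plain record accessors so it needs nothing beyond base. The Player record with a playerName field is an assumption (the article does not show that type), and the String and Int payloads are stand-ins for Quote and Proposal.

```haskell
newtype PlayerId = PlayerId Int
  deriving (Eq, Show)

-- Hypothetical minimal Player; the article does not show its definition.
newtype Player = Player
  { playerName :: PlayerId
  }

-- | Create a unicast message from any type a.
data Msg a = Msg
  { _sender :: PlayerId
  , _recipient :: PlayerId
  , _payload :: a
  }

-- | Determines if a message is intended for player.
-- Only the recipient field is inspected, so the payload type stays free.
msg_for :: Player -> Msg a -> Bool
msg_for player msg = playerName player == _recipient msg

broker :: Player
broker = Player (PlayerId 0)

-- Two homogeneous lists with different (stand-in) payload types.
textMessages :: [Msg String]
textMessages =
  [ Msg (PlayerId 2) (PlayerId 0) "hello"
  , Msg (PlayerId 0) (PlayerId 1) "world"
  ]

numberMessages :: [Msg Int]
numberMessages = [Msg (PlayerId 1) (PlayerId 0) 42]

-- The same predicate filters both lists.
brokerTexts :: [String]
brokerTexts = map _payload (filter (msg_for broker) textMessages)

brokerNumbers :: [Int]
brokerNumbers = map _payload (filter (msg_for broker) numberMessages)
```

Loading this in GHCi and evaluating brokerTexts and brokerNumbers shows the one predicate working on both lists, while each list on its own still holds a single concrete payload type.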

Even though Msg is a plain type constructor, it acts an awful lot like an interface. That is the main point of this article: wrapper types like this are a pattern that can be used to design simple code.


There is a common complaint against what we just did above. People say that types-as-interfaces are not composable. Let’s find out what they mean. Imagine we wanted some messages to be timestamped. We could add an optional _timestamp field to the message type:

-- | Create a message from any type a, with optional timestamp.
data MsgEtc a = MsgEtc
  { _sender :: PlayerId
  , _recipient :: PlayerId
  , _timestamp :: Maybe UTCTime
  , _payload :: a
  }

but what if we also wanted to timestamp some of the other objects we have, like Quote? Aha! We just learned how to do this! We create a new wrapper type:

-- | Annotate any value with a timestamp.
data Timestamped a = Timestamped
  { _timestamp :: UTCTime
  , _contents :: a
  }

Just as before, we can now tack on another layer of data and make a Timestamped (Msg Quote).

Timestamped
  { _timestamp = now
  , _contents = Msg
      { _sender = PlayerId 2
      , _recipient = PlayerId 0
      , _payload = Quote
          { _proposal = undefined
          , _premium = 5
          , _share = 3
          }
      }
  }

Clearly, this approach does compose, because we just composed both Timestamped a and Msg a. But remember the msg_for function we had that determined whether a message was intended for a particular recipient? It had the signature

msg_for :: Player -> Msg a -> Bool

meaning it takes any Msg a, but it is not possible to give it a Timestamped (Msg a); we have to unwrap the message from the timestamp first. However, if we gave it a Msg (Timestamped Quote), it would work, perhaps counter-intuitively.
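
The difference between the two nestings can be sketched as follows. This uses plain record accessors rather than lenses, a hypothetical Player type, a String stand-in for Quote, and an arbitrary fixed time; UTCTime comes from the time package that ships with GHC.

```haskell
import Data.Time.Calendar (fromGregorian)
import Data.Time.Clock (UTCTime (..))

newtype PlayerId = PlayerId Int
  deriving (Eq, Show)

-- Hypothetical minimal Player; the article does not show its definition.
newtype Player = Player
  { playerName :: PlayerId
  }

data Msg a = Msg
  { _sender :: PlayerId
  , _recipient :: PlayerId
  , _payload :: a
  }

data Timestamped a = Timestamped
  { _timestamp :: UTCTime
  , _contents :: a
  }

msg_for :: Player -> Msg a -> Bool
msg_for player msg = playerName player == _recipient msg

-- An arbitrary fixed time, chosen only for illustration.
noon :: UTCTime
noon = UTCTime (fromGregorian 2024 1 1) 43200

broker :: Player
broker = Player (PlayerId 0)

-- Timestamp on the outside, message on the inside.
outer :: Timestamped (Msg String)
outer = Timestamped noon (Msg (PlayerId 2) (PlayerId 0) "quote")

-- Message on the outside, timestamp on the inside.
inner :: Msg (Timestamped String)
inner = Msg (PlayerId 2) (PlayerId 0) (Timestamped noon "quote")

-- The outer timestamp must be peeled off before msg_for applies;
-- writing `msg_for broker outer` would be rejected by the type checker.
outerForBroker :: Bool
outerForBroker = msg_for broker (_contents outer)

-- Here Msg is already the outermost layer, so no unwrapping is needed.
innerForBroker :: Bool
innerForBroker = msg_for broker inner
```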

The complaint here is not that the approach does not compose at all (clearly it does), but that it does not compose well: the order in which we choose to annotate our data with extra fields affects whether or not we can pass them into existing functions.

I think that’s basically fine. If we think about it, isn’t Timestamped (Msg Quote) a different-feeling thing from a Msg (Timestamped Quote)? But let’s assume we wanted to fix it. What is near at hand?


We could make a typeclass

class HasRecipient a where
  get_receiver :: a -> PlayerId

This is more like a real interface, which can be implemented by any type that has a recipient. Here are two implementations we would want:

instance HasRecipient (Msg a) where
  get_receiver msg = msg^.recipient

-- The nested instance head requires FlexibleInstances under Haskell 2010.
instance HasRecipient (Timestamped (Msg a)) where
  get_receiver tsd = tsd^.contents.recipient

We could then rewrite msg_for in terms of this typeclass instead.

msg_for :: HasRecipient msg => Player -> msg -> Bool
msg_for player msg =
  player^.name == get_receiver msg

and this will work for any value of a type that implements HasRecipient, including Msg a and Timestamped (Msg a).
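
Put together, the typeclass version might look like the sketch below. It again uses plain record accessors in place of lenses; the Player type, the String payload, and the fixed time are illustrative assumptions.

```haskell
{-# LANGUAGE FlexibleInstances #-}

import Data.Time.Calendar (fromGregorian)
import Data.Time.Clock (UTCTime (..))

newtype PlayerId = PlayerId Int
  deriving (Eq, Show)

-- Hypothetical minimal Player; the article does not show its definition.
newtype Player = Player
  { playerName :: PlayerId
  }

data Msg a = Msg
  { _sender :: PlayerId
  , _recipient :: PlayerId
  , _payload :: a
  }

data Timestamped a = Timestamped
  { _timestamp :: UTCTime
  , _contents :: a
  }

class HasRecipient a where
  get_receiver :: a -> PlayerId

instance HasRecipient (Msg a) where
  get_receiver = _recipient

-- Nested instance head; this is what needs FlexibleInstances.
instance HasRecipient (Timestamped (Msg a)) where
  get_receiver = _recipient . _contents

msg_for :: HasRecipient msg => Player -> msg -> Bool
msg_for player msg = playerName player == get_receiver msg

broker, other :: Player
broker = Player (PlayerId 0)
other = Player (PlayerId 1)

plain :: Msg String
plain = Msg (PlayerId 2) (PlayerId 0) "quote"

-- The date is arbitrary and only there to build a value.
stamped :: Timestamped (Msg String)
stamped = Timestamped (UTCTime (fromGregorian 2024 1 1) 0) plain
```

Both msg_for broker plain and msg_for broker stamped type check here. An alternative worth considering is a single instance HasRecipient a => HasRecipient (Timestamped a), which delegates to whatever is inside and does not need the extension.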

Okay, so let’s roll down this slippery slope. Maybe we have a function

logger :: Show a => [Timestamped a] -> IO ()

which logs things in order of timestamp. This function will not take a list of Msg (Timestamped Quote), because there it is the Quote inside that is timestamped, not the outer value.

We could apply our newly discovered hammer and create a similar HasTimestamp typeclass.

class HasTimestamp a where
  get_timestamp :: a -> UTCTime

instance HasTimestamp (Timestamped a) where
  get_timestamp tsd = tsd^.timestamp

instance HasTimestamp (Msg (Timestamped a)) where
  get_timestamp msg = msg^.payload.timestamp
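
As a rough sketch of how such a class could back the ordering step of a logger, the helper below sorts anything with a timestamp. The name in_time_order is hypothetical, the payloads are String stand-ins, and plain record accessors replace the lenses.

```haskell
{-# LANGUAGE FlexibleInstances #-}

import Data.List (sortOn)
import Data.Time.Calendar (fromGregorian)
import Data.Time.Clock (UTCTime (..))

newtype PlayerId = PlayerId Int
  deriving (Eq, Show)

data Msg a = Msg
  { _sender :: PlayerId
  , _recipient :: PlayerId
  , _payload :: a
  }

data Timestamped a = Timestamped
  { _timestamp :: UTCTime
  , _contents :: a
  }

class HasTimestamp a where
  get_timestamp :: a -> UTCTime

instance HasTimestamp (Timestamped a) where
  get_timestamp = _timestamp

-- Nested instance head; this is what needs FlexibleInstances.
instance HasTimestamp (Msg (Timestamped a)) where
  get_timestamp = _timestamp . _payload

-- Hypothetical helper: the ordering step a logger would need.
in_time_order :: HasTimestamp a => [a] -> [a]
in_time_order = sortOn get_timestamp

-- Arbitrary dates, one day apart.
early, late :: UTCTime
early = UTCTime (fromGregorian 2024 1 1) 0
late = UTCTime (fromGregorian 2024 1 2) 0

messages :: [Msg (Timestamped String)]
messages =
  [ Msg (PlayerId 2) (PlayerId 0) (Timestamped late "second")
  , Msg (PlayerId 1) (PlayerId 0) (Timestamped early "first")
  ]

-- Payload texts in timestamp order.
orderedTexts :: [String]
orderedTexts = map (_contents . _payload) (in_time_order messages)
```

With only these two instances, a Timestamped (Msg (Timestamped a)) would match the first one and silently use the outer timestamp, which may or may not be what a caller expects.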

But at this point it gets a little confusing for this author’s brain – at least if long-term maintenance is desired. What would be the best instance for Timestamped (Msg (Timestamped Quote)), for example?

And as we said before, aren’t Timestamped (Msg Quote) and Msg (Timestamped Quote) slightly different kinds of things? Do we really need to be able to pass both unaltered into that logging function?


What we really should do is take a cue from network protocol design. These are robust things that have stood the test of time.

Image data might be stored in a TGA file with headers giving information on how to interpret it. It will be placed into an HTTP request with its own headers. This gets stuffed inside a TCP packet with further headers. That in turn is enveloped in an IP datagram with headers. Which might then run along a wire inside an Ethernet frame, carrying – you guessed it – its own headers.

At no point during transmission does a switch hold up an Ethernet frame to the light and ask, “So what HTTP content type are you transmitting?” We have designed these protocols to be layered with meaningful structure from outside to in. Maybe we can do that in our code as well.

(To be fair, I would not be surprised if it was possible to find network switches in the wild that inspect IP headers, or routers that look at HTTP headers. So maybe that whole layered protocol thing was a mistake and I’m full of crap!)

Maybe we don’t need both Timestamped (Msg Quote) and Msg (Timestamped Quote) in our application, and just one of them is enough. A mathematician creates generalisations that work with everything. An engineer strips away the variants that are less important and adapts the code to the big demands at hand. This makes things simpler along the way.
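
One way the layered idea could look in code, assuming the application settles on Timestamped (Msg a) as its only nesting: each function below reads just the outermost layer and hands the rest inward, much like a switch that only reads frame headers. The names since, inbox, and recent_inbox are hypothetical, as are the String payloads and dates.

```haskell
import Data.Time.Calendar (fromGregorian)
import Data.Time.Clock (UTCTime (..))

newtype PlayerId = PlayerId Int
  deriving (Eq, Show)

data Msg a = Msg
  { _sender :: PlayerId
  , _recipient :: PlayerId
  , _payload :: a
  }

data Timestamped a = Timestamped
  { _timestamp :: UTCTime
  , _contents :: a
  }

-- Timestamp layer: keep entries stamped at or after the cutoff, then
-- strip the layer. Knows nothing about what is inside.
since :: UTCTime -> [Timestamped a] -> [a]
since cutoff = map _contents . filter ((>= cutoff) . _timestamp)

-- Message layer: keep messages addressed to the player, then strip the
-- layer. Knows nothing about the payload.
inbox :: PlayerId -> [Msg a] -> [a]
inbox pid = map _payload . filter ((== pid) . _recipient)

-- Layers are peeled from the outside in, one function per layer.
recent_inbox :: UTCTime -> PlayerId -> [Timestamped (Msg a)] -> [a]
recent_inbox cutoff pid = inbox pid . since cutoff

-- Arbitrary dates, one day apart.
early, late :: UTCTime
early = UTCTime (fromGregorian 2024 1 1) 0
late = UTCTime (fromGregorian 2024 1 2) 0

entries :: [Timestamped (Msg String)]
entries =
  [ Timestamped early (Msg (PlayerId 2) (PlayerId 0) "old")
  , Timestamped late (Msg (PlayerId 2) (PlayerId 0) "new")
  , Timestamped late (Msg (PlayerId 0) (PlayerId 1) "other")
  ]

-- Payloads sent to player 0 at or after the later date.
recentForBroker :: [String]
recentForBroker = recent_inbox late (PlayerId 0) entries
```

No typeclasses are needed in this version; the cost is that the nesting order is fixed by the composition in recent_inbox.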

If we do need both, maybe it’s fine to treat them as two different types (they are!) and not try to make functions generic over them. Maybe. Chris Penner seems to argue for the opposite. I don’t claim to have any silver bullets.