Three years ago, @hwayne posted A Totally Polished and not-at-all half-baked Take on Static vs Dynamic Typing to his email newsletter. When I first read it, I found it refreshing to read a more balanced programming-related post than I typically encounter, and I also learned some new ideas I'd never heard about before. However, this Gist is not really about all that; it's about one specific statement in the post that is incorrect. The error doesn't affect any of the arguments being made, nor am I criticizing Wayne for the error. Rather, correcting the statement allows me to explain some pretty interesting concepts that might be insightful to some people who read this.
Here is the statement I'll be talking about, along with the context of the statement:
Maybe a -> a
is a subtype ofa -> Maybe a
: anything that typechecks with the latter should typecheck with the former.²[...]
²Depending on how you think about
Maybe
. If you considerMaybe a
asa ∪ {Nothing}
then anyf: Maybe a -> a
accepts a strict superset of values and outputs a strict subset of values asa -> Maybe a
. Haskell doesn't do this because 1) it treatsJust 2
as a distinct type from2
and 2) function subtyping adds a lot of complexity.For example, function subtyping plays hell with parametric polymorphism.
Maybe a -> a
is a subtype ofa -> a
. That means we lose the theorem that the onlyf: a -> a
isid
!
It is the last sentence that is incorrect:
That means we lose the theorem that the only
f: a -> a
isid
!
The theorem is actually maintained even with this subtyping. The confusion lies in conflating two subtly distinct concepts. First, I will explain the two concepts and how they differ from each other. Next, I will use the two concepts to explain why the statement is incorrect. And then, I will explain some other interesting things related to parametricity and subtyping.
(As an aside, I believe the first sentence of the quote has a typo where "latter" and "former" should be swapped; values of a subtype should typecheck as also being a value of a supertype, not the other way around.)
Let's start by looking at the statement before it, which is correct (for the notion of subtyping used in the quote):
Maybe a -> a
is a subtype ofa -> a
.
(We will refer to this statement as S1.)
What exactly does this statement mean? It means, for example, that:
Maybe Int -> Int
is a subtype ofInt -> Int
.
Maybe String -> String
is a subtype ofString -> String
.
Maybe (Maybe Bool) -> Maybe Bool
is a subtype ofMaybe Bool -> Maybe Bool
.
And, more generally, it means:
For any specific type
τ
, the instantiated typeMaybe τ -> τ
is a subtype of the instantiated typeτ -> τ
.
When I say "the instantiated type", I mean the replacement of the type variable a
with a specific type, such as Int
, String
, or Maybe Bool
. You see, the a -> a
in S1 is not itself a type, but merely represents many specific types, such as Int -> Int
, that share a particular form.
Now, let us look at the following statement, which is also correct:
[T]he only
f: a -> a
isid
.
(We will refer to this statement as S2.)
What exactly does this statement mean? Does the use of a
in S2 mean the same thing as it did in S1? Let's try replacing a
with specific types, as we did for S1:
The only
f: Int -> Int
isid
.
The only
f: String -> String
isid
.
The only
f: Maybe Bool -> Maybe Bool
isid
.
Are these statements correct? No! (+1)
, reverse
, and fmap not
are respective counterexamples to these three statements. So we know S2 cannot mean:
For any specific type
τ
, the onlyf: τ -> τ
isid
.
So what does it mean? It means:
The only
f: forall a. a -> a
isid
.
You may not have seen the forall
keyword before, but the polymorphic types you see in Haskell are implicitly forall
types. Recall how I mentioned that the a -> a
in S1 was not itself a type, but simply represented types like Bool -> Bool
. In contrast, forall a. a -> a
is an actual type, distinct from types like Bool -> Bool
. We can say that id
has the type forall a. a -> a
just like we can say that not
has the type Bool -> Bool
.
However, a value of a forall
type can be used as any instantiation of the type inside the forall
. So even though id
has type forall a. a -> a
, we can use it as a Bool -> Bool
. Typically, when someone talks about a value with a forall
type, they will implicitly be using it as an arbitrary instantiation of the type inside the forall
, as opposed to the forall
type itself. For example, you might encounter a sentence like:
iterate
has type(a -> a) -> a -> [a]
, so it takes ana -> a
and ana
and returns an[a]
.
This actually means:
iterate
has typeforall a. (a -> a) -> a -> [a]
, so for any specific typeτ
, it can be used as type(τ -> τ) -> τ -> [τ]
, which takes aτ -> τ
and aτ
and returns a[τ]
.
But why did this same instantiation not work for S2? Notice that I said "a value of a forall
type can be used as any instantiation [...]". But the erroneous instantiation in S2 wasn't for a value with a forall
type because S2 is a statement about the type forall a. a -> a
itself, which, as I said earlier, is a distinct type from its instantiations.
Now that we've distinguished the two concepts, let's go back to the original erroneous statement:
Maybe a -> a
is a subtype ofa -> a
. That means we lose the theorem that the onlyf: a -> a
isid
!
I think what happened is that Wayne was thinking of a -> a
as forall a. a -> a
, but was thinking of Maybe a -> a
as its specific instantiations, like Maybe Int -> Int
, which have values other than id
. But specific instantiations of a -> a
also have values other than id
, as I mentioned when I discussed S2. An instantiation of Maybe a -> a
isn't a subtype of forall a. a -> a
, it's a subtype of the corresponding instantiation of a -> a
. So this doesn't affect the theorem that the only f: forall a. a -> a
is id
.
In fact, the notion of subtyping we're considering here can't affect parametricity at all. In object-oriented languages, subclassing adds new values to an existing type (the superclass). But the subtyping we're considering here doesn't do that. Instead, the values of the subtype correspond to existing values of the supertype. Since we're not adding or removing values from any type, the free theorems are the same as they are in real Haskell.
How do the values of a subtype correspond to existing values of a supertype? Here's an example: consider the function maybe 0 (+1) :: Maybe Int -> Int
. Since Int -> Int
is a supertype of Maybe Int -> Int
, maybe 0 (+1)
should be usable as an Int -> Int
. And it is: the Int
inputs to the Int -> Int
are propagated to the Just
branch of maybe 0 (+1)
, while the Nothing
branch is ignored. This returns the input incremented by 1, but this is just the function (+1)
, which was already a possible value for Int -> Int
. So maybe 0 (+1)
in the subtype corresponds to the already existing (+1)
function in the supertype.
I will call two values "equivalent" if they behave the same when used as the same type. So maybe 0 (+1)
is equivalent to (+1)
because when maybe 0 (+1)
is used as Int -> Int
, it behaves the same as (+1)
. Notice that maybe 1 (+1)
is also equivalent to (+1)
, since the Nothing
branch of a Maybe Int -> Int
isn't observable when it's used as an Int -> Int
.
Interestingly, this causes tension with another way people sometimes look at subtyping. If we think of a type as a set of values, then we can think of a subtype as a subset of the supertype. A subset has at most as many elements as the original set. But function subtyping, specifically the contravariance of parameter types, doesn't quite work like this. For example, Maybe Bool -> Bool
is a subtype of Bool -> Bool
, but the former has 8 elements, while the latter has only 4. This is because each value of Bool -> Bool
is equivalent to 2 different values of Maybe Bool -> Bool
: one which maps Nothing
to True
and one which maps Nothing
to False
.
I said earlier that S1 was intended to be about instantiations rather than forall
types. But it turns out that the following statement is also true!
forall a. Maybe a -> a
is a subtype offorall a. a -> a
.
In order for this to be true, forall a. Maybe a -> a
can't have any values that aren't equivalent to some forall a. a -> a
. Since the only forall a. a -> a
is id
, one of the following statements must be true:
- There are values of type
forall a. Maybe a -> a
, but they are all equivalent toid
. - There are no values of type
forall a. Maybe a -> a
.
It turns out that it's the second statement that's the case. forall a. Maybe a -> a
has no values because it is actually impossible to write a function with that type. Here's an attempt to write such a function:
f :: Maybe a -> a
f (Just x) = x
f Nothing = {- ...??? -}
We need to return an a
even when passed Nothing
, but in that case, there doesn't exist any possible a
to return! No a
values were passed in to the function (since Nothing
doesn't contain an a
value), and parametricity means we cannot create a new a
out of thin air. This means there can't be any functions with type forall a. Maybe a -> a
.
Furthermore, it is also true that:
forall a. a -> a
is a subtype offorall a. a -> Maybe a
.
For this to be true, all values with type forall a. a -> a
must be equivalent to some forall a. a -> Maybe a
. The only value with type forall a. a -> a
is id
, so we just need to show that id
is equivalent to some forall a. a -> Maybe a
. The Just
function has type forall a. a -> Maybe a
and returns a value equivalent to its input, so the function is equivalent to id
.
Are there any other values of type forall a. a -> Maybe a
? There happens to be exactly one other value of this type: const Nothing
. There are no other values of this type, because any such function must satisfy one of the following:
- It always returns a
Just
value. We already counted theJust
function. A different function would have to return aJust
that contains a differenta
than its input for at least one input, but there is nowhere else for any othera
to come from, so there cannot be any other function that always returns aJust
. - It always returns
Nothing
. We already countedconst Nothing
. Any function that always returnsNothing
is equal to this, so there are no other functions to consider for this case. - It sometimes returns a
Just
value and sometimes returnsNothing
. But such a function is impossible. What determines which case the function returns for a particular invocation? Since we're only talking about pure functions here, it cannot be based on randomness or I/O; it can only be based on the function's input. But parametricity means we cannot examine the input at all to find any characteristics we can use to decide whether to return aJust
value orNothing
for that particular input. Thus, a function of typeforall a. a -> Maybe a
must always use the same constructor for the return value for all inputs.
So we have determined that forall a. a -> Maybe a
has exactly two values: Just
and const Nothing
. Also, we have shown that forall a. a -> a
is a subtype of forall a. a -> Maybe a
. Since we showed earlier that forall a. Maybe a -> a
is a subtype of forall a. a -> a
, we also have that forall a. Maybe a -> a
is a subtype of forall a. a -> Maybe a
by the transitive property of subtyping. This resembles the statement that Wayne had originally started his discussion of this topic with:
Maybe a -> a
is a subtype ofa -> Maybe a
: anything that typechecks with the [former] should typecheck with the [latter].
However, as I said earlier, he was referring to instantiated types; it is this subtype relationship that we make use of in our everyday programming. What I have just shown is regarding forall
types; this subtype relationship is basically just an interesting academic exercise.
So let's end on a more practical note. In the original post, Wayne referred to function subtyping as "something unrepresentable in Haskell", but you actually can make use of the subtyping we've been discussing, and it is often quite simple:
- If you have
f: a -> a
, thenJust . f
gets you an equivalenta -> Maybe a
. - If you have
f: Maybe a -> a
, thenf . Just
gets you an equivalenta -> a
. - If you have
f: Maybe a -> a
, thenJust . f . Just
gets you an equivalenta -> Maybe a
.
And in general, if you want to utilize type-safe function subtyping for Maybe
(or other types, like Either
), you can use function composition to get the "upcasting" functions you need. So while some people may say "Maybe Not", you can instead feel free to "Just Use Just
".