While I was researching how to do level-order traversals of a binary tree in Haskell, I came across a library called tree-traversals which introduced a fancy Applicative instance called Phases. It took me a lot of effort to understand how it works. Although I still have some unresolved issues, I want to share my journey.
Note: I was planning to post this article on Reddit. But I gave up because it was too long so here might be a better place.
See the discussion.
Note: This article is written in a beginner-friendly way. Experts may find it tedious.
Note: The author is not a native English speaker and is glad to accept corrections, refinements and suggestions.
The Phases
Applicative is capable of reordering effects in the chain of the underlying Applicative, without changing the function application semantics.
To demonstrate this, let's import it first.
ghci> import Control.Applicative.Phases
Then we define two effectful computations.
ghci> let foo = putStrLn "foo" *> pure (3 :: Int)
ghci> let bar = putStrLn "bar" *> pure (4 :: Int)
Tip: If you have trouble reading these
*>
expressions, you can interpret them monadically. Which is to saylet foo = putStrLn "foo" *> pure (3 :: Int)
is equivalent to
let foo = putStrLn "foo" >> return (3 :: Int)
or even
let foo = do { putStrLn "foo"; return (3 :: Int) }
(This is only a trick for readability and is definitely not to say every Applicative is a Monad, although in the case of
IO
it's true.)
Let's chain these computations.
ghci> pure (,) <*> foo <*> bar
foo
bar
(3,4)
(One could use <$>
or even liftA2
for conciseness. But I prefer pure
and <*>
because they feel more "primitive".)
To adapt the Phases
Applicative, add now
ghci> runPhasesForwards (pure (,) <*> now foo <*> now bar)
foo
bar
(3,4)
Which just yields the same result. Boring.
Use delay
to postpone one effect.
ghci> runPhasesForwards (pure (,) <*> delay (now foo) <*> now bar)
bar
foo
(3,4)
Use delay
twice to postpone one effect ... by two turns.
ghci> let baz = putStrLn "baz" *> pure (5 :: Int)
ghci> runPhasesForwards (pure (,,) <*> delay (delay (now foo)) <*> delay (now bar) <*> now baz)
baz
bar
foo
(3,4,5)
Note although the order of the effects changes, the associated function application does not. Which can be seen in the above computation yielding the value (3,4,5)
, which corresponds to the values "extracted" in the order of foo,bar,baz
.
It works like magic.
The definition of Phases
is
data Phases f a where
Lift :: f a -> Phases f a
(:<*>) :: f (a -> b) -> Phases f a -> Phases f b
Tip: if you are unfamiliar with the above syntax, learn about the
GADTs
extension first.A side effect of
GADTs
is that it enables the use of existential quantification in data constructors. The definition of(:<*>)
exploits this. It "hides" the typea
inside the constructor.If we use the
ExistentialQuantification
extension instead, the above definition would bedata Phases f a = Lift (f a) | forall x. f (x -> a) :<*> Phases f xIn this definition, the constructor
(:<*>)
"hides" the typex
. (Notex
does not appear to the left hand side of=
, thus is not a part of the type signature.)
This definition resembles that of a Free Applicative a lot because the (:<*>)
constructor captures the <*>
operation instead of actually performing it. So I read the Free Applicative paper.
Tip: A free something, explained in a non Category Theory way, is a wrapper data type that satisfies something but does not require the underlying type to satisfy it. Instead of performing the actual computation, it captures the arguments into corresponding data constructors, accumulating a chain-like structure, which allows the computation to be interpreted in different ways later. If you are interested, try Free Monads first.
To my surprise, there are two kinds of Free Applicatives: the left-parenthesised and the right-parenthesised.
The left-parenthesised version has the following definition:
data FreeAL f a = PureL a
| forall b. FreeAL f (b -> a) :*: f b
Which corresponds to the canonical form for chaining Applicatives:
pure f <*> x1 <*> x2 <*> ... <*> xn
The right-parenthesised version has the following definition:
data FreeA f a = Pure a
| forall b. f (b -> a) :$: FreeA f b
Which corresponds to right-parenthesised canonical form:
fn <*> (... (f2 <*> (f1 <*> (pure x))))
This form is rarely seen in the real world.
The paper claimed these two definitions were isomorphic and it took me some time to figure out why.
To turn a left-parenthesised form into a right-parenthesised one:
(pure f <*> x) <*> y
=
fmap (&) x <*> (fmap (&) y <*> (pure (\fy -> \fx -> f fx fy)))
where (&)
is the reverse function application operator. (Provided in Data.Function
)
i.e.
(&) x f = f x
To turn a right-parenthesised form into a left-parenthesised one:
u <*> (v <*> pure x)
=
pure (\fu -> \fv -> fu (fv x)) <*> u <*> v
This isomorphism / duality should not be very surprising for two reasons.
- It's analogous to how to implement one of
foldl
/foldr
via the other. - It can be seen as a generalization of the interchange law of Applicatives. (
u <*> pure y = pure ($ y) <*> u
)
The definition of Phases
is almost a definition of the right-parenthesised Free Applicative.
data Phases f a = Lift (f a)
| forall x. f (x -> a) :<*> Phases f x
-- vs --
data FreeA f a = Pure a
| forall b. f (b -> a) :$: FreeA f b
But it isn't. Because the Lift
constructor can capture any effectful applicative computation while the Pure
constructor can only capture a plain value.
I have been playing with Haskell for several years. But until this time I learn that Applicative does not restrict the order of its effects.
Take IO
as an example.
The following expression
pure (,) <*> getChar <*> getChar
when executed, reads two characters from stdin and wraps them in a pair.
Suppose the user inputs ab
.
You might get ('a', 'b')
.
You might also get ('b', 'a')
! This will happen if the second getChar
is executed first.
This indetermination will drive you crazy.
So most Applicatives including IO
are implemented in a way that you can assume a left-to-right order on effects. This guarantees the latter behavior will never occur.
Not specifying of the order of effects is a feature, not a bug™. Some libraries exploit it. Backwards
from the transformers package runs effects in reverse order. Concurrently
from the async package runs effects in parallel.
It's a pity that most Applicative tutorials don't point out this issue. One exception is the tutorial from wikibooks, which includes a dedicated section.
Other references include this Reddit post and this Stack Overflow question.
Guess what this expression will output when executed?
add <*> (multiply <*> pure 10)
where
add = putStrLn "add" *> pure (+ (1 :: Int))
multiply = putStrLn "multiply" *> pure (* (2 :: Int))
The answer is
add
multiply
Which is counterintuitive. At the same time, an astute reader will realize the effects in the chain of Applicatives is assocative. This is to say if one can combine effects with <>
, the final effect of (effect1 <> effect2) <> effect3
is the same as that of effect1 <> (effect2 <> effect3)
. Plus there is pure
, a way to wrap a value into an Applicative with no effect (I guess that's why it's called pure
), which corresponds to mempty
. So we can conclude that it forms a special instance of Monoid!
Being a Monoid means, for any "order-sensitive" Applicative like IO
, the order of the associated effects depends only on the order in which the computation appears in the expression, not on the order in which the computation is performed. If we want to shuffle the effects, we need to move the computations around without changing the semantics of the overall computation. And the "overall computation" in the chain of Applicatives is ... a function application.
The function application operator is the most notorious operator in the world.
- It's not associative:
f a b ≠ f (a b)
- It's not commutative:
f x ≠ x f
- It doesn't even have a visible notation! Suppose it has one,
!
, then a Haskell programzipWith (,) [1..10] ['a'..'z']
would have to be written aszipWith ! (,) ! [1..10] ! ['a'..'z']
. (We don't treat($)
as a valid candidate because a function application operator should be left-associative.)
But we can fix that.
- We know that
f x y ≠ f (x y)
. But we can combinex
andy
first then apply the result tof
byf x y = (uncurry f) ((,) x y)
. - We know that
f (g x) ≠ (f g) x
. But we can composef
andg
first then applyx
to the composition byf (g x) = ((.) f g) x
. - We know that
f x ≠ x f
. But we can swap them byf x = ($ x) f
. - We know that
f x y ≠ f y x
. But we can swapx
andy
byf x y = (flip f) y x
. - We know that
(f x) (g y) ≠ (f g) (x y)
. But we can use(f x) (g y) = (\(x', y') -> f x' (g y')) ((,) x y)
. This will change the order of occurrence of terms fromf,x,g,y
tof,g,x,y
.
These techniques are useless, until you find they can be applied in the Applicative world.
By the way, the function application operator in the Applicative world has a visible notation, that is, <*>
.
Now if you stare at the definition of the Applicative
instance for Phases
, it should look less like a bunch of abstract nonsense to you.
instance Applicative f => Applicative (Phases f) where
pure = now . pure
Lift mf <*> Lift ma = Lift $ mf <*> ma
Lift mf <*> (mh :<*> ty) = liftA2 (.) mf mh :<*> ty
(mg :<*> tx) <*> Lift ma = liftA2 flip mg ma :<*> tx
(mg :<*> tx) <*> (mh :<*> ty) = liftA2 (\g h ~(x,y) -> g x (h y)) mg mh :<*> liftA2 (,) tx ty
It "lifts" the above shuffling techniques to the Applicative level!
We have divided our analysis of Applicatives into two parts - the function application part and the effect part. The former is done. Here comes the latter.
I struggled for a long time trying to understand what happens to the effects when two Phases
are chained with a <*>
. Then I came up with the following data type.
data EffectList a = Singleton a
| Cons a (EffectList a)
deriving Show
Here I model effects as a
. And a
should be a Monoid
.
For example:
- String
"1"
represents "some effect that outputs 1 to stdout when executed". - String
"12"
represents "some (composed) effect that outputs 1 then outputs 2 to stdout when executed". - String
""
represents "no effect".
This data type definition is very similar to that of Phases
,
data Phases f a = Lift (f a)
| forall x. f (x -> a) :<*> Phases f x
except it replaces the f (x -> a)
part.
Then I invent an operator <+>
(I just make up the notation), defined as
infixl 4 <+>
(<+>) :: Monoid a => EffectList a -> EffectList a -> EffectList a
(Singleton xs) <+> (Singleton ys) = Singleton (xs <> ys)
(Singleton xs) <+> (Cons ys m) = Cons (xs <> ys) m
(Cons xs m) <+> (Singleton ys) = Cons (xs <> ys) m
(Cons xs m) <+> (Cons ys n) = Cons (xs <> ys) (m <+> n)
I define it by referencing the definition of <*>
for Phases
.
For example, one case of <*>
is
(mg :<*> tx) <*> Lift ma = liftA2 flip mg ma :<*> tx
If we focus on the effect part and blur the others, we will get
(mg :<*> tx) <*> (Lift ma) = (mg ... ma) :<*> tx
Which corresponds to
(Cons xs m) <+> (Singleton ys) = Cons (xs <> ys) m
Using the same method, we can derive the other cases.
Before experimenting with this operator, let's introduce two functions.
now :: Monoid a => a -> EffectList a
now = Singleton
delay :: Monoid a => EffectList a -> EffectList a
delay = Cons mempty
Which are defined in the same way as those with the same name in Phases
. I list them below for your information.
-- | schedule an action to run in the current phase
now :: f a -> Phases f a
now = Lift
-- | delay all actions by a phase
delay :: Applicative f => Phases f a -> Phases f a
delay ta = pure id :<*> ta
Let's try it.
ghci> now "1" <+> now "2" <+> now "3" <+> now "4"
Singleton "1234"
ghci> now "1" <+> now "2" <+> delay (now "3") <+> now "4"
Cons "124" (Singleton "3")
ghci> delay (now "1") <+> now "2" <+> delay (delay (now "3")) <+> now "4"
Cons "24" (Cons "1" (Singleton "3"))
We can interpret the results in this way: "effects" with the same "nested level" are concatenated together.
For instance, in the third example:
- We get
"24"
first becausenow "2"
andnow "4"
are nested under zerodelay
s. - We get
"1"
next becausenow "1"
is nested under onedelay
. - We get
"3"
last becausenow "3"
is nested under twodelay
s.
To further explain what the <+>
operator does, let's list the (desugared) operands and the result in the third example in parallel.
Cons "" (Singleton "1")
Singleton "2"
Cons "" (Cons "" (Singleton "3"))
Singleton "4"
----------------------------------------------
Cons "24" (Cons "1" (Singleton "3"))
Can you recognize this computation?
It's zipWith (<>)
! (To be more precise, it's zipWith4 (<>)
.)
But not exactly. Here are the differences:
zipWith
accepts normal lists while<+>
acceptsEffectList
s, which is defined as non-empty lists.zipWith
handles ragged lists by dropping the extra parts, while<+>
keeps them.- To use
zipWith
, you use it likeliftA2
. To use<+>
, you use it like<*>
(i.e. interleave the operands with it)
For the sake of completeness, here is the definition of run
, the counterpart to runPhasesForward
. It fuses the effects stored in a EffectList
.
run :: Monoid a => EffectList a -> a
run (Singleton xs) = xs
run (Cons xs m) = xs <> run m
You can use it like this.
ghci> run $ Cons "24" (Cons "1" (Singleton "3"))
"2413"
With the aid of EffectList
, it would be natural to understand how Phases
works. Effects in the same phase will be concatenated in the same cell while effects in the later phases will be concatenated in the next ones.
- Evgeniy Malov provides an implementation for level order traversals of a binary tree. It contains a
zipWith'
function, which keeps the extra parts of ragged lists too! Seems thePhases
Applicative really captures some essentials. - The Applicative paper also mentions
zipWith
as an instance of Applicative.zapp
is similar to<+>
.
- I still consider my method (divide an Applicative into the function application part and the effect part and analyze them one by one) a little complicated. If you have a more concise or a more clever explanation, please let me know.
- Does
EffectList
have a canonical name? Is there an isomorphic structure mentioned in a paper?