The popular CI server Jenkins provides a rich API to access information about builds.
This API speaks JSON but the JSON it returns has a rather strange shape.
I needed to extract the Git revision built by a job, but Jenkins hides this information in a specific object in a “catch-all” actions array which contains JSON objects of different shapes, many of which may or may not be present.
With Circe and some Shapeless magic I managed to decode this irregular JSON in a type-safe and fail-safe way (ignoring unknown JSON objects).
The actions array in the JSON model of a build looks as follows.
I converted the JSON to YAML to remove the syntactic boilerplate of JSON and make the snippet easier to read, and I also removed irrelevant parts:
actions:
- parameters:
  - name: SERVICE_BUILD
    value: '2840'
  # […]
  - name: GIT_COMMIT
    value: 922cc937eb9c9142ebf0d8672a2b13f5fd28ae3e
- causes: # …
- {}
- buildsByBranchName:
    refs/remotes/origin/master:
      # …
  lastBuiltRevision:
    SHA1: 922cc937eb9c9142ebf0d8672a2b13f5fd28ae3e
    branch:
    - SHA1: 922cc937eb9c9142ebf0d8672a2b13f5fd28ae3e
      name: refs/remotes/origin/master
  scmName: ''
- {}
As you can see, this JSON array contains objects of vastly different shapes (compare parameters to the Git object with buildsByBranchName and lastBuiltRevision), and even empty objects; I have no clue why these exist.
I need the GIT_COMMIT parameter and the SHA1 value in lastBuiltRevision, and I’d like to ignore all these empty objects and objects I don’t need (like causes).
To decode this mess I went straight to Circe, the JSON library of my choice.
Circe uses Decoder type classes to describe how JSON decodes into case classes.
A straightforward decoder for the inner lastBuiltRevision object looks as follows:
import io.circe._

final case class LastBuiltRevision(sha1: String)
object LastBuiltRevision {
  implicit val lastBuiltRevisionDecoder: Decoder[LastBuiltRevision] =
    Decoder.forProduct1("SHA1")(LastBuiltRevision(_))
}
For the actions array I define an ADT that describes all known actions:
sealed trait Action
object Action {
  final case class Git(lastBuiltRevision: LastBuiltRevision)
      extends Action
  object Git {
    implicit val gitDecoder: Decoder[Git] =
      Decoder.forProduct1("lastBuiltRevision")(Git(_))
  }
  final case class Parameter(name: String, value: String)
  object Parameter {
    implicit val parameterDecoder: Decoder[Parameter] =
      Decoder.forProduct2("name", "value")(Parameter(_, _))
  }
  final case class Parameters(parameters: List[Parameter])
      extends Action
  object Parameters {
    implicit val parametersDecoder: Decoder[Parameters] =
      Decoder.forProduct1("parameters")(Parameters(_))
  }
  implicit val actionDecoder: Decoder[Action] = {
    import cats.syntax.functor._
    Decoder[Parameters].widen
      .or(Decoder[Git].widen)
  }
}
The Git and Parameters case classes with their straightforward Decoder instances describe the corresponding objects.
My Decoder[Action] then tries both decoders in order: it returns the result of the first decoder that succeeds, and fails only if both decoders fail.
I need to widen the decoders to Decoder[Action] explicitly because Decoder is invariant.
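To make the invariance point concrete, here is a minimal sketch. It assumes circe and cats are on the classpath, and it uses a simplified, hypothetical Git variant with a flat sha1 field instead of the nested model above; the object name WidenDemo and the sample JSON are made up for illustration.

```scala
import cats.syntax.functor._
import io.circe.Decoder
import io.circe.parser.decode

object WidenDemo {
  sealed trait Action
  // Hypothetical, simplified variant used only for this sketch.
  final case class Git(sha1: String) extends Action

  val gitDecoder: Decoder[Git] = Decoder.forProduct1("SHA1")(Git.apply)

  // Does not compile: Decoder is invariant, so Decoder[Git] is not a
  // subtype of Decoder[Action] even though Git is a subtype of Action.
  // val broken: Decoder[Action] = gitDecoder

  // widen comes from the cats Functor syntax and only changes the static type.
  val widened: Decoder[Action] = gitDecoder.widen

  // The same effect without cats: map every result to the supertype.
  val mapped: Decoder[Action] = gitDecoder.map(git => git: Action)

  def main(args: Array[String]): Unit = {
    val json = """{"SHA1": "922cc93"}"""
    println(decode[Action](json)(widened))
    println(decode[Action](json)(mapped))
  }
}
```

Both decoders produce the same result; widen merely saves the explicit map.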
To skip over unknown actions like causes I introduce another ADT to represent either a known and decoded action, or an unknown action with its JSON:
sealed trait MaybeAction
object MaybeAction {
  final case class Known(action: Action) extends MaybeAction
  final case class Unknown(contents: Json)
      extends MaybeAction
  implicit val maybeActionDecoder: Decoder[MaybeAction] =
    Decoder[Action]
      .map(Known)
      .or(Decoder[Json].map(Unknown))
}
The Decoder instance of MaybeAction tries to decode an Action and maps the result to a Known action.
If this fails it wraps the JSON into an Unknown action instead.
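The fallback behaviour can be tried out in isolation. The following sketch is a trimmed-down copy of the model with a single known action (Parameters) and assumes circe is on the classpath; the object name MaybeActionDemo and the sample input are made up for illustration.

```scala
import io.circe.{Decoder, Json}
import io.circe.parser.decode

object MaybeActionDemo {
  // Trimmed-down model: Parameters is the only known action in this sketch.
  final case class Parameter(name: String, value: String)
  final case class Parameters(parameters: List[Parameter])

  implicit val parameterDecoder: Decoder[Parameter] =
    Decoder.forProduct2("name", "value")(Parameter.apply)
  implicit val parametersDecoder: Decoder[Parameters] =
    Decoder.forProduct1("parameters")(Parameters.apply)

  sealed trait MaybeAction
  final case class Known(action: Parameters) extends MaybeAction
  final case class Unknown(contents: Json) extends MaybeAction

  // Try the known action first; otherwise keep the raw JSON.
  implicit val maybeActionDecoder: Decoder[MaybeAction] =
    Decoder[Parameters]
      .map[MaybeAction](Known(_))
      .or(Decoder[Json].map[MaybeAction](Unknown(_)))

  // Sample input: one parameters object, one causes object, one empty object.
  val input: String =
    """[
      |  {"parameters": [{"name": "GIT_COMMIT", "value": "922cc93"}]},
      |  {"causes": []},
      |  {}
      |]""".stripMargin

  val decoded: Either[io.circe.Error, List[MaybeAction]] =
    decode[List[MaybeAction]](input)

  def main(args: Array[String]): Unit =
    decoded.foreach(_.foreach(println))
}
```

The first element decodes to Known, while the causes object and the empty object both end up as Unknown, so decoding the list as a whole never fails on an unexpected element.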
Now I can decode the entire build into a Build case class that holds a list of MaybeAction values:
final case class Build(actions: List[MaybeAction]) {
  def knownActions: List[Action] = actions.collect {
    case MaybeAction.Known(action) => action
  }
}
object Build {
  implicit val buildDecoder: Decoder[Build] =
    Decoder.forProduct1("actions")(Build(_))
}
The Decoder decodes the actions field as List[MaybeAction] and wraps it into a Build value.
The Build case class offers a knownActions method to collect all known actions.
To get the build revision I can now decode a Build from a response String and use collectFirst to extract the Git action which contains the SHA1 I’m looking for:
import io.circe.parser.decode
import scala.io.Source

def main(args: Array[String]): Unit = {
  val sha1 = for {
    // Parse and decode the whole response
    build <- decode[Build](
      Source
        .fromResource("jenkins-response.json")
        .mkString)
    // Pick the first Git action, or fail if there is none
    revision <- build.knownActions
      .collectFirst { case Action.Git(rev) => rev }
      .toRight(
        DecodingFailure(
          s"No Git information found in $build",
          List.empty))
  } yield revision.sha1
  println(s"Revision SHA1: $sha1")
}
I take advantage of right-biased Either (Scala 2.12 and later) here; after the for-comprehension sha1 holds either a Left with a circe Error (a ParsingFailure or a DecodingFailure) or a Right[String] with the desired hash.
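The short-circuiting of a right-biased Either in a for-comprehension can be seen in isolation with the standard library alone. This is only a sketch: the lookup helper, the map keys, and the String error type are hypothetical stand-ins for the decoding steps above.

```scala
object EitherDemo {
  // Hypothetical helper: a missing key becomes a Left with a message.
  def lookup(data: Map[String, String], key: String): Either[String, String] =
    data.get(key).toRight(s"missing key: $key")

  // Each step only runs if the previous one produced a Right;
  // the first Left is returned unchanged.
  def revision(data: Map[String, String]): Either[String, String] =
    for {
      branch <- lookup(data, "branch")
      sha1   <- lookup(data, branch)
    } yield sha1.take(7)

  def main(args: Array[String]): Unit = {
    val build = Map(
      "branch" -> "master",
      "master" -> "922cc937eb9c9142ebf0d8672a2b13f5fd28ae3e")
    println(revision(build))     // Right(922cc93)
    println(revision(Map.empty)) // Left(missing key: branch)
  }
}
```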
Let’s recap the Action decoder again:
implicit val actionDecoder: Decoder[Action] = {
  import cats.syntax.functor._
  Decoder[Parameters].widen
    .or(Decoder[Git].widen)
}
It explicitly lists all variants; whenever I add a new action I have to extend the Decoder as well.
Because a sealed trait is a coproduct of its variants, I can also write a generic Decoder with Shapeless that automatically decodes all variants of the Action trait. This is why the trait is sealed: the compiler then knows all variants at compile time.
The shapeless.Coproduct corresponding to my Action trait has the following type:
Action.Git :+: Action.Parameters :+: shapeless.CNil
This type reads as “either an Action.Git or an Action.Parameters”.
The CNil tail serves as the recursion anchor when iterating over coproducts at the type level; a coproduct can never have this value at runtime.
A value of this type, e.g., an Action.Parameters value, looks as follows:
Inr(Inl(Parameters(List(…))))
Inr reads as “skip this position”, and Inl means “this position has a value”.
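To see Inl and Inr in action, the following sketch injects values into a coproduct by hand. It assumes Shapeless 2.x on the classpath; the Action variants are simplified stand-ins for the real model, and the names CoproductDemo and ActionRepr are made up for illustration.

```scala
import shapeless.{:+:, CNil, Coproduct, Inl, Inr}

object CoproductDemo {
  sealed trait Action
  object Action {
    // Simplified stand-ins for the real variants.
    final case class Git(sha1: String) extends Action
    final case class Parameters(names: List[String]) extends Action
  }

  type ActionRepr = Action.Git :+: Action.Parameters :+: CNil

  // Coproduct[C](value) wraps the value in as many Inr layers as needed
  // to reach its position, followed by a single Inl.
  val git: ActionRepr =
    Coproduct[ActionRepr](Action.Git("922cc93"))
  val params: ActionRepr =
    Coproduct[ActionRepr](Action.Parameters(List("GIT_COMMIT")))

  def main(args: Array[String]): Unit = {
    println(git == Inl(Action.Git("922cc93")))
    println(params == Inr(Inl(Action.Parameters(List("GIT_COMMIT")))))
    // select returns the value only if the coproduct holds that variant.
    println(params.select[Action.Parameters])
    println(params.select[Action.Git])
  }
}
```

Git sits in the first position and therefore needs only an Inl, whereas Parameters sits in the second position and gets one Inr around its Inl.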
I can now inductively define a Decoder instance for the coproduct representation of Action:
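import shapeless.{:+:, CNil, Coproduct, Generic, Inl, Inr}

// Base case: never reached at runtime, but terminates implicit resolution
private implicit val cnilDecoder: Decoder[CNil] =
  Decoder.failed(DecodingFailure("CNil", List.empty))
// Inductive case: try the head variant, then fall back to the tail
private implicit def cconsActionDecoder[H <: Action, T <: Coproduct](
    implicit decodeH: Decoder[H],
    decodeT: Decoder[T]
): Decoder[H :+: T] =
  decodeH.map(Inl[H, T]).or(decodeT.map(Inr[H, T]))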
As said, CNil never occurs at runtime and only serves as the base case for inductive definitions like this, so the CNil case above is unreachable, but I need to define it anyway to make recursive implicit resolution terminate.
Let’s look at the more interesting ccons or :+: case: it summons two Decoder instances, one for a head type H that is a subtype of Action, and another for a tail type T that is a subtype of Coproduct. Here I recursively move through the coproduct until implicit resolution ends up at the CNil case.
With these two instances I can define a Decoder for the H :+: T case, i.e., the T coproduct with an H in front of it.
The Decoder does what the explicit decoder did as well: it tries to decode the H action, and falls back to decoding the other actions through the recursive Decoder instance for T.
In either case I need to lift the result to a Coproduct.
If I can decode an H I lift it with Inl to say that I have a value for this position in the coproduct, whereas if I fall back to T I put an Inr around the result to say that I need to skip this position and move on to the tail.
I define both implicits as private to the Action companion object so that they do not leak to other code, where they might wreak havoc on implicit resolution; after all these instances are quite generic.
I can then use these new instances to define the actual Decoder[Action] instance:
private def genericActionDecoder[Repr <: Coproduct](
    implicit genericAction: Generic.Aux[Action, Repr],
    decodeRepr: Decoder[Repr]
): Decoder[Action] = decodeRepr.map(genericAction.from)

implicit val actionDecoder: Decoder[Action] =
  genericActionDecoder
The genericActionDecoder takes the Generic instance for Action.
Thanks to Scala’s path-dependent types I do not need to explicitly specify the Coproduct shape of Action; instead I introduce a generic type parameter Repr and use it to refer to the coproduct representation of Action.
Implicit resolution then does the rest and figures out what concrete type applies here.
I also need a Decoder instance for the representation, which comes from the coproduct implicits I defined before.
This method again is private; I call it exactly once to create the Decoder[Action] instance.
This avoids repeating the expensive derivation whenever I need a Decoder[Action] instance, and avoids leaking generic implicits into other scopes.
With this generic definition I can now extend Action with further variants whenever I need to decode more of the actions array, and I just need to write a Decoder for the new variant.
Shapeless then does the rest and gives me a complete Decoder for all action variants.
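As a rough end-to-end sketch of such an extension, the following self-contained example adds a hypothetical Causes variant next to a simplified Git variant. It assumes circe and Shapeless 2.x on the classpath. The shape of the causes entries is not modelled and is kept as raw Json, and the object names ExtendedActions and ExtendedActionDecoder as well as the decodeActions helper are made up for illustration. Unlike the code above, the derivation lives in a separate object placed after the ADT rather than in the companion.

```scala
import io.circe.{Decoder, DecodingFailure, Json}
import io.circe.parser.decode
import shapeless.{:+:, CNil, Coproduct, Generic, Inl, Inr}

object ExtendedActions {
  sealed trait Action
  // Simplified variant: only the SHA1 of lastBuiltRevision.
  final case class Git(sha1: String) extends Action
  // Hypothetical new variant; entries stay raw Json (shape not modelled).
  final case class Causes(causes: List[Json]) extends Action

  implicit val gitDecoder: Decoder[Git] =
    Decoder.instance(
      _.downField("lastBuiltRevision").get[String]("SHA1").map(Git(_)))
  // The only code written for the new variant is this decoder.
  implicit val causesDecoder: Decoder[Causes] =
    Decoder.forProduct1("causes")(Causes.apply)
}

object ExtendedActionDecoder {
  import ExtendedActions._

  private implicit val cnilDecoder: Decoder[CNil] =
    Decoder.failed(DecodingFailure("CNil", List.empty))

  private implicit def cconsActionDecoder[H <: Action, T <: Coproduct](
      implicit decodeH: Decoder[H],
      decodeT: Decoder[T]
  ): Decoder[H :+: T] =
    decodeH.map(Inl[H, T](_)).or(decodeT.map(Inr[H, T](_)))

  private def genericActionDecoder[Repr <: Coproduct](
      implicit genericAction: Generic.Aux[Action, Repr],
      decodeRepr: Decoder[Repr]
  ): Decoder[Action] = decodeRepr.map(genericAction.from)

  // Derived once; picks up Causes without any change to this object.
  implicit val actionDecoder: Decoder[Action] = genericActionDecoder

  def decodeActions(json: String): Either[io.circe.Error, List[Action]] =
    decode[List[Action]](json)

  def main(args: Array[String]): Unit =
    println(decodeActions(
      """[{"causes": []}, {"lastBuiltRevision": {"SHA1": "922cc93"}}]"""))
}
```

Nothing in ExtendedActionDecoder mentions Causes: the new variant is found through the Generic representation of Action and the implicit causesDecoder.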