Version: 2022.02.25a
This is the “Effective Scala Case Class Patterns” guide I wished I had read years ago when starting my Scala journey. I spent many hundreds of hours on futile tangents to earn the simplified, actionable nuggets described below.
Because it is a fantastic integration of both OOP and FP, the case class is a key workhorse in any Scala software engineering project. Within Scala, it is primarily designed and intended (but not exclusively) to be used as an (immutable) FP Product Type.
Unfortunately, the default case class pattern…
case class Longitude(value: Double)
…while terse, flexible, convenient, and extensively utilized, suffers from a number of issues which fall into the following categories:
1. Extension confusion
2. Elevated reasoning complexity
3. Poor design for FP
4. Future technical debt
5. Security vulnerabilities
This article aims to propose several new boilerplate patterns that you can use to replace the defaults provided by the Scala compiler to address the above issue categories. I’ll step through an evolutionary process that will produce patterns of increasing detail, any of which can be an “upgrade” stage you might prefer.
DELAYED (bug in IntelliJ’s Template engine):
Additionally, I will be providing the patterns as IntelliJ File Templates making them as easy to instantiate while coding as using the Scala default.
After many years of using case classes, it has become obvious that extending them via inheritance is just a bad idea.
Thus, our very first pattern is to merely prepend final to the original example:
final case class Longitude(value: Double)
This ensures no descendants, well-intended or malicious, can be defined inheriting from the case class where they inadvertently or purposefully abuse the “Liskov Substitution Principle”.
Implementing this desirably affects Overview categories 2 (Elevated reasoning complexity), 3 (Poor design for FP), 4 (Future technical debt), and 5 (Security vulnerabilities).
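To make the risk concrete, here is a minimal sketch of what can go wrong when a case class is left open for extension. The names Point, LabeledPoint, and InheritanceDemo are hypothetical and chosen only for illustration. The subclass adds state that the compiler-generated equals knows nothing about, so two instances that differ are still reported as equal.

```scala
// Hypothetical example: this case class is deliberately NOT final.
case class Point(x: Int, y: Int)

// A subclass that adds state the compiler-generated equals/hashCode never sees.
class LabeledPoint(x: Int, y: Int, val label: String) extends Point(x, y)

object InheritanceDemo {
  val plain: Point          = Point(1, 2)
  val home: LabeledPoint    = new LabeledPoint(1, 2, "home")
  val work: LabeledPoint    = new LabeledPoint(1, 2, "work")

  // The generated equals compares only x and y, so the labels are ignored.
  val homeEqualsWork: Boolean  = home == work
  val plainEqualsHome: Boolean = plain == home
}
```

With final in front of the case class, the LabeledPoint definition is rejected by the compiler, so this situation cannot arise.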
One of the biggest frustrations a Scala newcomer faces is when they want to “enhance” a simple default case class by making the companion object explicit. For example, if one were to naively prepend the object like so…
object Longitude {
}
final case class Longitude(value: Double)
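The FunctionN inheritance of the compiler-generated “default companion object” just disappears (in Scala 2; apply and unapply are still generated). And code that was dependent upon it will now fail to compile (ex: the tupled method on a case class with two or more parameters).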
As a software engineer new to Scala, this can be quite confronting and disorienting. There isn’t any “official guidance” on how one might go about making the default companion object explicit. Googling for this isn’t trivial. Here’s how I explored this problem space on StackOverflow in 2014.
So, the solution is to extend the companion object with FunctionN so it looks like this:
object Longitude extends (Double => Longitude) {
  def apply(value: Double): Longitude =
    new Longitude(value)
}
final case class Longitude(value: Double)
Scastie Snippet and IntelliJ Code Template Gist
To better understand this, here’s a 2013 post on StackOverflow where I had to explore this in more depth myself.
And this pattern will now be the basis upon which we fill out the remainder of the template.
Implementing this desirably affects Overview category 1 (Extension confusion).
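As a small sketch of what the FunctionN extension buys back, the snippet below passes the companion itself wherever a function is expected and uses an inherited Function1 method. The CompanionAsFunctionDemo object and its member names are hypothetical, added only to hold the usage.

```scala
final case class Longitude(value: Double)

object Longitude extends (Double => Longitude) {
  def apply(value: Double): Longitude =
    new Longitude(value)
}

object CompanionAsFunctionDemo {
  // The companion is a Function1, so it can be handed directly to map.
  val longitudes: List[Longitude] = List(-73.9d, 151.2d).map(Longitude)

  // Members inherited from Function1, such as andThen, are available on the companion.
  val roundTrip: Double => Double = Longitude.andThen(_.value)
}
```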
One of the maxims in OOP’s DbC (Design by Contract, which originated in Eiffel) and FP is to prevent invalid states from being representable. Achieving this goal dramatically reduces the “guard” code (i.e. precondition checks) for any clients making use of the case class instances.
A first pass naive implementation (which is also found and recommended in almost all Scala textbooks) is to use the require functionality within the case class’s constructor. It looks like this:
final case class Longitude(value: Double) {
  require(value >= -180.0d, s"value [$value] must be greater than or equal to -180.0d")
  require(value <= 180.0d, s"value [$value] must be less than or equal to 180.0d")
}
Scastie Snippet and IntelliJ Code Template Gist
This implementation throws an exception at the first require that fails.
There are three problems with this approach:
- If there are other parameters that also need to be validated, the client needs multiple passes to discover every failure, because only the first failing require is reported. A single pass would have sufficed if multiple errors were allowed to be returned.
- The require implementation forces the client to deal with exceptions (avoid using exceptions for expected errors, like really). In most cases, the performance overhead of the exception infrastructure (both CPU effort and memory pressure/churn) is both significant and essentially unoptimizable (this remains contested). And even outside the poor performance reasons, FP implementations strongly prefer “error by value” as opposed to “error by exception”.
- It doesn’t allow the client to “check the preconditions” prior to instantiation. This mixing of concerns prevents optimization opportunities where the constructor, and therefore the memory allocation for the instance, is never invoked because the preconditions are already known to have not been met.
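The first problem is easiest to see with more than one parameter. The sketch below uses a hypothetical two-parameter Coordinate class (not part of the pattern in this article) where both arguments are invalid, yet only the first failure surfaces.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical two-parameter class, for illustration only.
final case class Coordinate(latitude: Double, longitude: Double) {
  require(latitude >= -90.0d && latitude <= 90.0d, s"latitude [$latitude] is out of range")
  require(longitude >= -180.0d && longitude <= 180.0d, s"longitude [$longitude] is out of range")
}

object RequireDemo {
  // Both arguments are invalid, but construction stops at the first failing require.
  val attempt: Try[Coordinate] = Try(Coordinate(100.0d, 200.0d))

  // Only the latitude failure is visible to the client at this point.
  val message: String = attempt match {
    case Failure(e: IllegalArgumentException) => e.getMessage
    case Failure(other)                       => s"unexpected failure: $other"
    case Success(coordinate)                  => s"unexpected success: $coordinate"
  }
}
```

The client has to fix the latitude and try again before learning the longitude was also wrong.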
We can address all of these concerns in one fell swoop using a standard separation of concerns pattern.
First, we move all of the validation logic into its own method, generateInvalidStateErrors. Then, we ensure the apply method first validates the passed parameter value(s) and only invokes the new operator when they are accepted. It should now look like this…
object Longitude extends (Double => Longitude) {
  def generateInvalidStateErrors(value: Double): List[String] =
    if (value < -180.0d)
      List(s"value of value [$value] must not be less than -180.0d")
    else
      if (value > 180.0d)
        List(s"value of value [$value] must not be greater than 180.0d")
      else
        Nil
  def apply(value: Double): Longitude =
    generateInvalidStateErrors(value) match {
      case Nil =>
        new Longitude(value)
      case invalidStateErrors =>
        throw new IllegalStateException(invalidStateErrors.mkString("|"))
    }
}
final case class Longitude(value: Double)
Scastie Snippet and IntelliJ Code Template Gist
While the pattern is now in place, there is still a hole where a client can just use the new operator to bypass the apply method in the companion object. That is fixed by marking the case class constructor as private. That looks like this…
final case class Longitude private(value: Double)
Scastie Snippet and IntelliJ Code Template Gist
It looks like we’re done, right?
Oops! Sneaky attack vectors ahead!
It turns out there are two other compiler-generated constructor pathways we must address:
- readResolve method - Supports the compiler-generated Serializable interface. This is especially pernicious because Java deserialization allocates the memory for the case class instance, and then directly injects the (possibly malicious) deserialized contents into the instance’s memory. This completely bypasses both the apply method and the object constructor. And this means no validation takes place whatsoever.
- copy method - Uses the new operator, and can do so because the method is within the private scope of the constructor. This bypasses the validation we moved into the companion object and invoke via the apply method.
In each of these cases, we want to reroute the method to the companion object’s apply method. It should look like this…
final case class Longitude private(value: Double) {
  private def readResolve(): Object =
    Longitude(value)
  def copy(value: Double = value): Longitude =
    Longitude(value)
}
Scastie Snippet and IntelliJ Code Template Gist
If you know the case class will never be used anywhere that utilizes Java serialization, then feel free to remove the readResolve method.
While I, too, hate Java Serialization, remember many platforms, including those like Akka, Kafka, and Spark, continue to depend upon Java serialization. And this means when they do so, if the readResolve method is missing, you’ve left your case class open to a malicious attack that bypasses your case class’s immutable invariant encoded in the precondition check implemented in the generateInvalidStateErrors method.
We have now ensured there are no reasonable ways to instantiate this case class without going through the precondition check (validation of state prior to invoking instantiation overhead). There are pathological pathways that can be used that involve the illicit use of the Java reflection API, and there is no real way for us to protect against those.
The fully expressed pattern should now look like this…
object Longitude extends (Double => Longitude) {
  // Returns every precondition failure for the given value; Nil means the value is valid.
  def generateInvalidStateErrors(value: Double): List[String] =
    if (value < -180.0d)
      List(s"value of value [$value] must not be less than -180.0d")
    else
      if (value > 180.0d)
        List(s"value of value [$value] must not be greater than 180.0d")
      else
        Nil
  // The only public pathway to an instance; it validates before allocating.
  def apply(value: Double): Longitude =
    generateInvalidStateErrors(value) match {
      case Nil =>
        new Longitude(value)
      case invalidStateErrors =>
        throw new IllegalStateException(invalidStateErrors.mkString("|"))
    }
}
final case class Longitude private(value: Double) {
  // Reroutes Java deserialization through the validating apply method.
  private def readResolve(): Object =
    Longitude(value)
  // Reroutes copy through the validating apply method.
  def copy(value: Double = value): Longitude =
    Longitude(value)
}
Scastie Snippet and IntelliJ Code Template Gist
Implementing this desirably affects Overview categories 2 (Elevated reasoning complexity), 4 (Future technical debt), and 5 (Security vulnerabilities).
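Here is a small sketch exercising the fully expressed pattern, to show the precondition check being used on its own and the apply and copy pathways both rejecting an invalid value. The ClosedPathwaysDemo object and its member names are hypothetical, and Try is used only to capture the thrown exception for inspection.

```scala
import scala.util.Try

object Longitude extends (Double => Longitude) {
  def generateInvalidStateErrors(value: Double): List[String] =
    if (value < -180.0d)
      List(s"value of value [$value] must not be less than -180.0d")
    else if (value > 180.0d)
      List(s"value of value [$value] must not be greater than 180.0d")
    else
      Nil

  def apply(value: Double): Longitude =
    generateInvalidStateErrors(value) match {
      case Nil =>
        new Longitude(value)
      case invalidStateErrors =>
        throw new IllegalStateException(invalidStateErrors.mkString("|"))
    }
}

final case class Longitude private(value: Double) {
  private def readResolve(): Object =
    Longitude(value)

  def copy(value: Double = value): Longitude =
    Longitude(value)
}

object ClosedPathwaysDemo {
  val sydney: Longitude = Longitude(151.2d)

  // The preconditions can be checked without instantiating or throwing anything.
  val precheck: List[String] = Longitude.generateInvalidStateErrors(500.0d)

  // Both remaining public pathways go through the same validation.
  val viaApply: Try[Longitude] = Try(Longitude(500.0d))
  val viaCopy: Try[Longitude]  = Try(sydney.copy(value = 500.0d))
}
```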
The default strategy with case classes is to use “error by exception”. That is what require does. If the Boolean condition is false, it throws an exception wrapping the error string you provide.
From a proper FP design perspective, exceptions are considered a poor way to manage known error conditions, like a case class’s preconditions. Exceptions are acceptable for exceptional things like running out of memory or failing to open a database connection. However, they should be avoided when the error is just part of the method’s domain.
For example, it is an inappropriate use of exceptions for a square root method to throw an exception when passed a negative number. The square root method should be defined to return either an error (String) if the input number is negative, or the actual result if the number is non-negative.
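A minimal sketch of that square root example follows. The SafeSqrt object and the sqrtE name are hypothetical (the E suffix simply mirrors the naming used for applyE in this article); the error is returned as a Left value instead of being thrown.

```scala
object SafeSqrt {
  // Hypothetical helper: a negative input is part of the method's domain,
  // so it is reported as a value (Left) rather than as an exception.
  def sqrtE(x: Double): Either[String, Double] =
    if (x < 0.0d)
      Left(s"x [$x] must not be negative")
    else
      Right(math.sqrt(x))

  // Clients handle both cases explicitly, with no try/catch involved.
  def describe(x: Double): String =
    sqrtE(x) match {
      case Right(root) => s"sqrt($x) = $root"
      case Left(error) => s"error: $error"
    }
}
```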
To add “error by value”, we will add an additional applyE method (where E is for Error) which uses an Either to cover both the correct and the erred input parameter cases. The method looks like this…
def applyE(value: Double): Either[List[String], Longitude] =
  generateInvalidStateErrors(value) match {
    case Nil =>
      Right(new Longitude(value))
    case invalidStateErrors =>
      Left(invalidStateErrors)
  }
This looks remarkably similar to the apply method. In fact, it is so similar, it is essentially code duplication. So, to remove code duplication, we will reimplement the apply method in terms of the applyE method, so that apply now looks like this…
def apply(value: Double): Longitude =
  applyE(value) match {
    case Right(longitude) =>
      longitude
    case Left(invalidStateErrors) =>
      throw new IllegalStateException(invalidStateErrors.mkString("|"))
  }
Scastie Snippet and IntelliJ Code Template Gist
Implementing this desirably affects Overview category 3 (Poor design for FP).
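From the client side, “error by value” might be consumed as in the sketch below, which assembles the pieces described so far into one self-contained unit. The ApplyEDemo object and its member names are hypothetical; it processes a batch of raw values and collects the valid instances and the error messages without a single try/catch.

```scala
object Longitude extends (Double => Longitude) {
  def generateInvalidStateErrors(value: Double): List[String] =
    if (value < -180.0d)
      List(s"value of value [$value] must not be less than -180.0d")
    else if (value > 180.0d)
      List(s"value of value [$value] must not be greater than 180.0d")
    else
      Nil

  def applyE(value: Double): Either[List[String], Longitude] =
    generateInvalidStateErrors(value) match {
      case Nil =>
        Right(new Longitude(value))
      case invalidStateErrors =>
        Left(invalidStateErrors)
    }

  def apply(value: Double): Longitude =
    applyE(value) match {
      case Right(longitude) =>
        longitude
      case Left(invalidStateErrors) =>
        throw new IllegalStateException(invalidStateErrors.mkString("|"))
    }
}

final case class Longitude private(value: Double) {
  private def readResolve(): Object =
    Longitude(value)

  def copy(value: Double = value): Longitude =
    Longitude(value)
}

object ApplyEDemo {
  val inputs: List[Double] = List(151.2d, 500.0d, -73.9d, -181.0d)

  // Every input is attempted; failures are ordinary values, not thrown exceptions.
  val results: List[Either[List[String], Longitude]] = inputs.map(Longitude.applyE)

  val longitudes: List[Longitude] = results.collect { case Right(longitude) => longitude }

  val errors: List[String] = results.flatMap {
    case Left(invalidStateErrors) => invalidStateErrors
    case Right(_)                 => Nil
  }
}
```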
With this new pattern in place, we have now ensured all precondition checking travels through a single method. And the same with instantiation. Assuming immutability has been retained, adding a memoization (a.k.a. caching) strategy is now trivial.
Here’s an example of the companion object modified to incorporate memoization.
object Longitude extends (Double => Longitude) {
  private var cachedInvalidStateErrors: Map[Double, List[String]] = Map.empty
  private var cachedInstances: Map[Double, Longitude] = Map.empty
  def generateInvalidStateErrors(value: Double): List[String] = {
    cachedInvalidStateErrors.get(value) match {
      case Some(invalidStateErrors) => invalidStateErrors
      case None =>
        val invalidStateErrors =
          if (value < -180.0d)
            List(s"value of value [$value] must not be less than -180.0d")
          else if (value > 180.0d)
            List(s"value of value [$value] must not be greater than 180.0d")
          else
            Nil
        val newItem = (value, invalidStateErrors)
        cachedInvalidStateErrors = cachedInvalidStateErrors + newItem
        invalidStateErrors
    }
  }
  …
  def applyE(value: Double): Either[List[String], Longitude] =
    generateInvalidStateErrors(value) match {
      case Nil =>
        Right(
          cachedInstances.get(value) match {
            case Some(longitude) => longitude
            case None =>
              val longitude = new Longitude(value)
              val newItem = (value, longitude)
              cachedInstances = cachedInstances + newItem
              longitude
          }
        )
      case invalidStateErrors =>
        Left(invalidStateErrors)
    }
}
Scastie Snippet and IntelliJ Code Template Gist
The memoization strategy shown in the above code snippet is for EXAMPLE PURPOSES ONLY because it’s a terrible default strategy.
Please use one of the many other options available. And specifically, investigate ScalaCache. It is a great generalized caching library that allows choosing between different specialized backing implementations.
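As one possible alternative that stays within the standard library, here is a hedged sketch that swaps the unsynchronized var Map for scala.collection.concurrent.TrieMap. This is my own assumption of a reasonable stand-in, not the ScalaCache API, and it is still unbounded (nothing is ever evicted), so treat it as a sketch rather than a recommendation.

```scala
import scala.collection.concurrent.TrieMap

object Longitude extends (Double => Longitude) {
  // Assumption: an unbounded, thread-safe standard-library map stands in for a real cache.
  private val cachedInstances: TrieMap[Double, Longitude] = TrieMap.empty

  def generateInvalidStateErrors(value: Double): List[String] =
    if (value < -180.0d)
      List(s"value of value [$value] must not be less than -180.0d")
    else if (value > 180.0d)
      List(s"value of value [$value] must not be greater than 180.0d")
    else
      Nil

  def applyE(value: Double): Either[List[String], Longitude] =
    generateInvalidStateErrors(value) match {
      case Nil =>
        // Racing threads may each build a candidate, but all of them get back the one stored instance.
        Right(cachedInstances.getOrElseUpdate(value, new Longitude(value)))
      case invalidStateErrors =>
        Left(invalidStateErrors)
    }

  def apply(value: Double): Longitude =
    applyE(value) match {
      case Right(longitude) =>
        longitude
      case Left(invalidStateErrors) =>
        throw new IllegalStateException(invalidStateErrors.mkString("|"))
    }
}

final case class Longitude private(value: Double) {
  private def readResolve(): Object =
    Longitude(value)

  def copy(value: Double = value): Longitude =
    Longitude(value)
}
```

Because every pathway funnels through applyE, repeated requests for the same valid value return the same cached instance.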
- Never override the .equals and .hashCode methods
  - When you think you need to do so, use a normal class, and then ensure you very carefully follow the non-trivial full override pattern captured in this StackOverflow post
- Avoid using a case class if any state mutability is needed because the default assumption is the case class is representing a concurrency safe immutable value
  - When state mutability is required, use a normal class and be sure to specify if it is concurrency safe
- Avoid using the sealed case objects/classes pattern for enumerations (a.k.a. FP Sum Type)
  - In Scala 2.x, prefer using the Enumeratum library
  - In Scala 3.x, prefer using the new Enum type
Even if you find some of the above “boilerplate” undesirable, I hope you enjoyed and learned something about case classes such that it makes them more useful to you in your future Scala software engineering challenges.