SIP-ZZ - NewType Classes
This is a proposal to introduce syntax for classes in Scala that can get completely inlined, so operations on these classes have zero overhead compared to external methods. Some use cases for inlined classes are:
-
Inlined implicit wrappers. Methods on those wrappers would be translated to extensions methods.
-
New numeric classes, such as unsigned ints. There would no longer need to be a boxing overhead for such classes. So this is similar to value types in .NET.
-
Classes representing units of measure. Again, no boxing overhead would be incurred for these classes.
The proposal is currently in an early stage. It’s not yet been implemented, and the proposed implementation strategy is too complicated to be able to predict with certainty that it will work as specified. Consequently, details of the proposal might change driven by implementation concerns.
The gist of the proposal is to allow user-defined classes to extend
from NewType
in situations like this:
class C (val u: U) extends NewType {
def m1(ps1) = ...
...
def mN(psN) = ...
}
Such classes are called newtype classes. A newtype class C must satisfy the following criteria:
-
C
must have exactly one parameter, which is marked withval
and which haspublic
accessibility. The type of that parameter (e.g.U
above) is called the underlying type ofC
. -
C
may not have secondary constructors. -
C
must not be acase class
. -
C
may not define concreteequals
,hashCode
, ortoString
methods. -
C
must be either a top-level class or a member of a statically accessible object. -
C
must be ephemeral (as defined in SIP-15).
The runtime representation of C
is always the same as the runtime
representation of U
(its underlying type). This means that
ClassTag[U]
is used as C
's class tag, Array[C]
erases to []U
in Java, and so on.
The following implicit assumptions apply to newtype classes.
-
Newtype classes are implicitly treated as
final
, so they cannot be extended by other classes. -
Newtype classes are implicitly assumed to have structural equality and hash codes. Their universal methods (
equals
,toString
,hashCode
) cannot be implemented, and are erased to those defined byU
.
Unlike value classes, newtype classes cannot extend universal traits.
Newtype classes are expanded as follows. For concreteness, we assume a
newtype class Logarithm
that is defined like this:
class Logarithm(val exponent: Double) extends NewType {
def toDouble: Double = math.exp(exponent)
def plus(that: Logarithm): Logarithm = Logarithm.logOf(toDouble + that.toDouble)
def plus(n: Double): Logarithm = Logarithm.logOf(toDouble + n)
def times(that: Logarithm): Logarithm = new Logarithm(exponent + that.exponent)
}
object Logarithm {
def logOf(n: Double): Logarithm = {
require(n > 0.0)
new Logarithm(math.log(n))
}
}
Let the extractable methods of a newtype class be all methods that are
directly declared in the class. (It is not possible for a newtype
class to inherit methods.) For each extractable method m
, we create
another method named extension$m
in the companion object of that
class (if no companion object exists, a fresh one is created). The
extension$m
method takes an additional parameter in first position
which is named $this
and has runtime representation as its
type. Generally, in a value class
class C(val u: U) extends AnyVal
a method
def m(params): R = body
is expanded to the following method in the companion object of class C:
def extension$m($this: C, params): R = body2
Here body2
is the same as body with each occurence of this
or
C.this
replaced by $this
.
Overloaded methods may be also augmented with an additional integer to
distinguish them after types are erased (see the transformations of
the add
method in the following steps).
In our example, the Logarithm
companion would be expanded as
follows:
object Logarithm {
def logOf(n: Double): Logarithm = {
require(n > 0.0)
new Logarithm(math.log(n))
}
// generated methods follow
def extension$toDouble($this: Logarithm): Double =
math.exp($this.exponent)
def extension1$plus($this: Logarithm, that: Logarithm): Logarithm =
Logarithm.logOf($this.toDouble + that.toDouble)
def extension2$plus($this: Double, n: Double): Logarithm =
Logarithm.logOf($this.toDouble + n)
def extension$times($this: Logarithm, that: Logarithm): Logarithm =
new Logarithm($this.exponent + that.exponent)
}
In this step any call to a method that got extracted in step 1 into a companion object gets redirected to the newly created method in that companion object. Generally, a call:
p.m(args)
where m
is an extractable method declared in a newtype class C
gets rewritten to:
C.extension$m(p, args)
For instance the two calls in the following code fragment:
val x: Logarithm = new Logarithm(1.0)
val y: Logarithm = new Logarithm(2.0)
val z = x.times(y)
val a = x.plus(12345.0)
would be rewritten to
val x: Double = 1.0
val y: Double = 2.0
val z = Logarithm.extension$times(x, y)
val a = Logarithm.extension2$plus(x, 12345.0)
At this point, using U
(the underlying type of C
) we calculate
C
's transitive underlying type (called W
). This is done by the
following recursive definition:
- If
U
is a newtype class, use the transitive underlying type ofU
. - Otherwise, use
U
as the transitive underlying type.
We now do the following three replacements:
-
We replace every occurence of the type
C
in a symbol’s type or in a tree’s type annotation byW
. -
We replace every occurence of
new C(e)
wille
. -
We replace every occurence of
(c: C).u
withc
. (After applying replacement 1 the type ofc
here will actually beW
.)
We then re-typecheck the program.
Types such as Array
, ClassTag
, Class
, etc. will all be rewritten
according to these same rules:
Array[C]
becomesArray[W]
ClassTag[C]
becomesClassTag[W]
Class[C]
becomesClass[W]
Newtype classes are removed at this stage. That is, the class C
is
removed (although the companion object C
stays). Other than the
companion, there should be no mention of C
(either as a class or a
type) in any tree.
The effect of this is that at runtime values whose provided type was
C
are not distinguishable from values whose type is W
.
The program statements on the left are converted using steps 1 to 3 to the statements on the right.
var m, n: Logarithm var m, n: Double
var o: AnyRef var o: AnyRef
m = n m = n
o = m o = m.asInstanceOf[AnyRef]
m plus n Logarithm.extension1$plus(m, n)
o.isInstanceOf[Logarithm] o.isInstanceOf[Double]
o.asInstanceOf[Logarithm] o.asInstanceOf[Double]
m.isInstanceOf[Logarithm] m.isInstanceOf[Double]
m.asInstanceOf[Logarithm] m.asInstanceOf[Double]
After all 3 steps the Logarithm
class is translated to the following
code.
object Logarithm {
def logOf(n: Double): Double = {
require(n > 0.0)
math.log(n)
}
// generated methods follow
def extension$toDouble($this: Double): Double =
math.exp($this)
def extension1$plus($this: Double, that: Double): Double =
Logarithm.logOf(Logarithm.extension$toDouble($this) + Logarithm.extension$toDouble(that))
def extension2$plus($this: Double, n: Double): Double =
Logarithm.logOf(Logarithm.extension$toDouble($this) + n)
def extension$times($this: Double, that: Double): Double =
$this + that
}
Note that the two plus
methods end up with the same type in object
Logarithm
. (The fact that they also have the same body is
accidental). That’s why we needed to distinguish them by adding an
integer number.
The same situation can arise in other circumstances as well: Two overloaded methods might end up with the same type after erasure. In the general case, Scala would treat this situation as an error, as it would for other types that get erased. So we propose to solve only the specific problem that multiple overloaded methods in a newtype class itself might clash after erasure.
We could imagine this newtype feature being generalized to tuples. In that case, a newtype class could wrap N values (0-22), and be represented either by 0-22 positional parameters (in the extension methods) or by a single tuple of the appropriate arity (when a reified value was needed, e.g. for use in a collection).
Newtype classes as written here can already wrap tuples in a transparent way, so the required addition would be allowing multiple parameters to be provided positionally rather than in a tuple (i.e. adding two kinds of extension method for every method in the base class).
Work in this direction should only start once single-parameter newtype classes are implemented (at least experimentally), since one of the places multi-value newtype classes may struggle is in performance.