[TOC]
* language is to write correct high-performance code in processing large volumes of static, changing and streaming data. Currently many programming languages are not well geared to manipulate large and changing data required by modern big data tasks. Many languages used to write Big Data and Database systems use language which were not really designed to build such systems.
A modern built language should be usable in:
- mass manipulation of data which is
- streaming
- changing
- reasonably low level for systems programming
- highly efficient and fast for
- low latency
- high throughput workloads
The type system would be statically typed and would include:
- nominal typing
- structural typing
- range types
Type definition is by using type
keyword. E.g.
type A
In the above case the type inherits the structure from {}
with the nominative label A
.
Multiple types can be defined:
type A, B, C
Each items can have own elements and common elements. E.g. when sub-typing is involved:
type A{def a : X = x}, B{def b : X = x}, C{def c : X = x} <:: Z <X>(x: X){def common : X = x}
Common elements appear towards the end.
When sub-typing is not involved
type A{def a: X = x}, B{def b: X = x}, C{def c: X = x} : <X>(x: X){def common: X = x}
Only one primary constructors can be specified. Multiple constructors can be simulated. E.g.
type A{def a : X = x}, B{def b: X = x}, C{def c: X = x} : <X <:: Number>(x X){
// Factory methods
def .<X>(x, y : int), .Named1<X>(x, y: int) = <X>(x + y), <X>(x, y)
def .<X>(x, y : int; z long) = <X>(x + y, z :: int)
def .Named2<X>(x, y: int; z: long) = Named1<X>(x + y, z :: int)
// Non memeber constant
def .c = 1
// Non memeber functions
def mAddP, mMinusP : <X>(p: X) -> X = common + p + c, mAddP<X>(-p)
def common: X = x
// Memeber functions
def mAddP, mMinusP : <X>(p: X) -> X = common + p, mAddP<X>(-p)
}
def A.a' : Int = 1 // Associated with A
Following type operators are available:
:
this is to define structural type alias::
this is to define nominative type alias:::
this is to asset structural or nominative type. This cannot be used in a type definition.:>
this is to define a structural super type of::>
this is to define a nominative super type of:::>
this is to asset a structural or nominative super type of<:
this is to define a structural sub type of<::
this is to define a nominative sub type of<:::
this is to asset a structural or nominative sub type of<:>
this is to asset a structural sub or super type of<::>
this is to asset a nominative sub or super type of<:::>
this is to asset a structural or nominative sub or super type of&
intersection type|
union or sum type*
product type=:=
test structural type equivalence=::=
test nominative type equivalence>:>
test structural super type>::>
test nominative super type<:<
test structural sub type<::<
test nominative sub typein
ordered set / range definition#
expand type structure
Types can be atomic or composite:
- Atomic types. E.g.
Int
. These are also language defined. E.g.type <32, BusSize, 0, PlatformEndiannes> Int
this means typeInt
is 32 bits and aligned atBusSize
bit boundary and stating at bit 0. - Composite types – made up of elements of atomic types
- unnamed tuples with unnamed elements (tuples). E.g. Type definition:
type A(Int, String)
; value definitiondef a: A = (1, "Bob")
- unnamed tuples with named elements (records). E.g. Type definition:
type B{ def rank: Int; def name: String }
; value definition:def b: B = {rank = 1, name = "Bob"}
- named tuples with unnamed elements (named types). E.g. Type definition:
type ClassRank(Int, String)
; value definitionsdef c:: ClassRank = ClassRank(1, "Bob")
- named tuples with named elements (names records). E.g. Type definition:
type ClassRank`{ def rank: Int; def name: String}
; value definitionsdef d :: ClassRank` = ClassRank`{rank = 1, name = "Bob"}
- unnamed tuples with unnamed elements (tuples). E.g. Type definition:
NB: the distinction is in the definitions where ::
is used for nominative and :
is used for structural definitions. If it is nominative the name is also used as part of the initialization, e.g., ClassRank(1, "Bob")
, ClassRank`{rank = 1, name = "Bob"}
.
Following typing relationships hold:
- tuples are super type of records
- tuples are super type of names tuples
- records are super type of named records
- named tuples are super type of named records
- transitively tuples are super type of named records through records
The named part in named records and named tuples if the nominative portion of the types system while the rest is the structural part.
Atomic types are language defined. E.g.
type <32, BusSize, 0, PlatformEndiannes> Int
Here <>
appear next to the type
keyword. 1st parameter is the number of bits followed by the alignment, offset and endiannes.
Tuples are constructed as:
type Point : (Int, Int)
def a: Point = (1, 2)
They can be deconstructed as:
def (b, c: Int) = a
The data structure can be extended as follows:
type A : (Int, Int)
type B <: A (String) // <:: can be used for nominative inheritance. Now B will be(Int, Int, String)
type B <: A (String, #A) // Now B will be(String, Int, Int)
Named tuples are constructed as:
type Point(Int, Int) // Shorthand definition
type Point :: | Point(Int, Int) // Alternative definition
def a = Point(1, 2)
They can be deconstructed as:
def (b, c: Int) = a
Or
def Point(b, c: Int) = a
In the latter case a
should be named Point
.
Tuples are constructed as:
type Point : {def x, y: Int}
def a: Point = {x = 1, y = 2}
They can be deconstructed as:
def {b, c: Int} = a
or
def {b: Int = a.x, c: Int = a.y}
Named tuples are constructed as:
type Point{def x, y: Int}
def a = Point{x = 1, y = 2}
They can be deconstructed as:
def {b, c: Int} = a
or
def {b: Int = a.x, c: Int = a.y}
or
def Point{b, c: Int} = a
or
def Point{b: Int = a.x, c: Int = a.y}
In the latter 2 case a
should be named Point
.
Type and value parameters are specified following .
and |
. E.g.
type Rectangle<T <:: Number>{def x, y : T}
type Square <:: Rectangle . <S>{def common :: T}
| <T <:: Number>{def p : T} => <.T = T >{.x = .y = x} // T and p are parameters and not associated with the structure, while S and common are
def s = Square.<Int>{10}
Types can be extended and also multiple parameters supported
type A{def a :: String} . | {def x, y : String} => A{ a = x + y }
type B{b :: String} <:: A .
| {x : String, y : String} => A.{x, y}, B{ b = y }
def b = B.{"Hello", "World"}
You can have multiple parameters. They can also be named. E.g.
type C{c : String} .
| {x: String} => C{c = x}
| Alt{y : String} => C{c = Alt.y}
def c = C.Alt{"C"}
This is loosely analogous to constructors in other languages.
Following type relations can be defined
- Aliases
- Subtype
- Supertype
Type aliases are defined using the :
, ::
or :::
operators.
type B : A // Type B is structural type which is an alias to the structure of A. If A is nominative the nominative component is ignored.
type C :: A // Type C is nominative type which is an alias to the structure and name of A if A is also a nominative type
type X<D ::: A>(d: D) // Type D is structural or nominative type of A
Sub types are defined using the <:
, <::
or <:::
operators.
type B <: A // Type B is structural sub type of A ignoring the nominative part of A if any
type C <:: A // Type C is nominative sub type of A
type X<D <::: A>(D) // Type D is structural or nominative sub type of A
Multiple subtypes can be defined as:
type D{def p, d : Int}, E{def p, d : Int}, F{def p, d : Int} <:: G, H, I . {common : Int}
| {common : Int} => {.common = common}
Super types are defined using the :>
, ::>
or :::>
operators.
type B :> A // Type B is structural super type of A ignoring the nominative part of A if any
type C ::> A // Type C is nominative super type of A
type X<D :::> A>(D) // Type D is structural or nominative super type of A
Bi variant types are defined using the <:>
, <::>
or <:::>
operators.
type X<B <:> A>(B) // Type B is structural sub or super type of A ignoring the nominative part of A if any
type X<C <::> A>(C) // Type C is nominative sub or super type of A
type X<D <:::> A>(D) // Type D is structural or nominative sub or super type of A
When structural types or variables are defined the nominative component is dropped, i.e.,
- named tuples unnamed elements become unnamed tuples with unnamed elements
- named tuples with named elements become unnamed tuples with named elements
Tuples are a super type or records with the same type signature.
Named Tuples and Named Records can be used in Discriminated Unions.
type Pet{def name: String} :: | Cat | Dog // name is a common field
If types are pre defined then type unions can be used for similar effect. E.g.
type Cat{def name : String}
type Dog{def name : String}
type Pet :: Cat | Dog
Each of the types can also be extended. But if it is sub-typing all union members can only have sub-typing. Like wise when super-typing.
type PetRecord{def age : Int} <:: Pet, // name (from the previous definition) and age are common to all and the super-type also has name which is initialized.
| PetCat <:: Cat
| PetDog{def pedigree : String} <:: Dog // pedigree is a new field
| PetParrot
Super-typing:
type PetRecord{def age : Int} ::
| PetCat
| PetDog{def pedigree : String}
| PetParrot
type Pet{def name: String} ::> PetRecord, // A name field is added
| Cat ::> PetCat
| Dog ::> PetDog
Also, each type can have shared type and value parameters or individual parameters.
type X<T1, T2>{def p1: T1} :: // T1, T2 and p1 belongs to X and is common to A and B also
| A<T3>{def p2: Int} // T3, p2 belongs to A
| B{def p3: String} // p3 belongs to B
In case of structural types. Elements should also be different. E.g.
type P :
| A{def a: Int}
| B{def b: Int}
Following is not supported it cannot be matched on structure alone
type P :
| A{def p: Int}
| B{def p: Int}
In all cases, the definition of each item is preceded by |
.
Each type of the language will have structure and an optional symbol attached. In nominative type checking symbol hierarchy and structure is checked. This is only applied to named tuples and named records. In structural type checking, of the purpose of this language, the symbol hierarchy is not checked, but the structure is checked. This is applied to records and tuples.
Nominative type system is implemented by maintaining a hierarchy of symbols associated with the type names. In type checking, it is checked to see if the type is a structurally a subtype and the symbol participates in the hierarchy. E.g. type B <:: A
will record that symbol B
lower in the symbol hierarchy than A
.
In structural type checking, the structure is checked to see if:
- Subtype has 0 or more new field than the supertype
- All subtype field are the same type or subtype of the respective supertype fields
- All subtype function arguments are the same or subtypes, while the return values are of the same or supertype
The type system will support nominal typing.
type B <:: A
The above denotes B is a subtype of A. Similarly, the above relationships can also be denoted as:
type A ::> B
The latter can also be used to rewrite type hierarchies. Though this might a not viewed well in an OOP context, this many of the use cases which Bytecode generation libraries facilitate. This also gives the language a dynamic language feel, due to flexibility and also avoids certain pitfalls in supporting mixed structural typing and nominative typing.
Similarly, a type can be a subtype of more than one other type. This is akin to multiple inheritance in OOP.
type C <:: A & B
More explicitly you can specify how each type is treated.
type C <:: { #A } & B // C inherits only the structure of A but inherits the structural and labelling of B
Multiple inheritance is denoted by a &
separated list of types.
The following scheme can be used to define structure associated with a type:
type A {
def a : Int
}
Similarly, the structure can be extended as follows:
type B <:: A {
def b : Int
}
Or
type B {
def b : Int
} <:: A
Or
type B <:: A . {
def b : Int
}
In the above case B
inherits the structure of A
, the nominative relationship and the following structure:
{
def b : Int
}
Type alias is defined as:
type X :: Y
This means X
is of type Y
.
NB: When ::
appear towards the right in the RHS this is casting or type cohesion. E.g.
def a = x :: Int
Structural typing works similar to nominative typing. E.g.
type B <: A
The above defines B
with the structure of A
. In the case,
type C <: B
then C
will have the structure from B
. In the context of a language like Java, C#, etc. this will be like copying all the body of class
A
and B
into C
.
The following scheme can be used to define structure associated with a type:
type A {
def a : Int
}
Structural type alias is defined as:
type X : Y
In this case X
has the same structure of Y
but not the sub typing relationships.
Both structural and nominative types can be mixed in defining other structural and nominative types. E.g.
type A {
def a : Int
}
type B {
def b : Int
}
type C <: A // C only gets the structure of A
type C <:: { #A } // C only gets the structure of A, hence the same as above
type D <:: B // D gets the structure and establish the nominative hierarchical relationship between B and D
type X : { // No sub typing relationship, hence structure is inherited
#A // expand A
#B // expand B
}
type Y <: A, B { // Sub typing relationship and structure is inherited
def f1: Int
#B // expand B
def f2: Int
#A // expand A
def f3: Int
}
type A
type B <: A
type C <: A
type X : A | B | C // Sum type
type Y : A * B * C // Product type
type Z<S, T> : S | T
type X :: A & B & C // X is a sub type of A, B & C
type Y : A & B & C // Y shared structure among types A, B & C
type Z :: A | B | C // Equivalent to sum type: A + B + C
type X in 1..1..10 // Ordered range of values between 1 and 10 both inclusive with step 1. This can be also written as: 1..10
type Y <:: X in 2..9 // Any range type between 2 to 9 inclusive of type X or Y can be assigned
type Z <: X in 2..9 // Value of range type X between 2 to 9 inclusive can be assigned regardless of type
type A in P, Q, R, S, T // Ordered set of values
type B in P | Q | R | S | T // Unordered set of values
type C in 1 | 2 | 3 | 4 | 5 // Unordered set of values
type D in 1.|.1.|.10 // Unordered set of values in range 1 to 10 inclusive with step 1. This can be also written as: type D of 1.|.10
type E <: Int in Int // All numbers in the range of Int. Same as: type E <: Int
type F <: E in 1... // All numbers in the range of Int from 1
type G <: Int in ..|.. // All numbers in the range of Int as nominal numbers
type H <: G in 1.|.. // All numbers in the range of Int from 1 as nominal numbers
type I <: A in P..S // Ordered subrange of A
type J <: A in P.|.S // Unordered subrange of A
Tag types associate value of other types.
type A
def a = 1
type @X = 1 // Literal value tag
type @Y = A // Type tag
type @Z = a // Variable value tag. Variable should be know at compile time.
type I : Int @(1) // Int type with tag
def u : Int @X = 1 @X // Equivalent to def u :: Int @(1) = 1 @(1)
def v : Int @(1) = 1 @(1) // Add tag to type check. Or def v : Int @(1) = 1 : Int @(1)
type @P = _ > 5 // Value tag who's value is greater than 5
type @Q = ? <: A // Type tag with a sub type of A
type @R = ? <: A & _ > 5 // Multiple tags. All tags must be satisfied.
type @S = A & !1 // Must exclude values tagged as 1
def x : Int = v ~@(1) @(A, 2) // remove tag 1 and retag with A and 2
type J : Int @() // No tags allowed
def y : Int = v ~@ // remove all tag
type @T = A | 1 // Must have tags 1 or A
type @U = ? | _ // Can have any value or type tag
type @V = A & 1 | 3 // Must have tags A and 1 or 3
type @H = @X, 1, 2, 3 // H inherits tags for X plus 1, 2, and 3
If @
appears at the start of type definition it is treated as and annotation. E.g. @my def f = 1
. If it appears at the end it is treated as a tag type like in the above examples. Annotations are not used for type checking but all tags should type check for assignment.
In tag type definitions logical operators (&
, |
, !
, ^
) are allowed. The values which are assigned can have multiple tags. They are added by one at a time (@A
) or as a list (@(A, 1)
). Value tags can be removed (~@(A, 1)
).
Constraints can be imposed through the type system.
type A : Int > 5 // A in an Int greater than 5
type B{def b : Int}
type C : B{def > 5} // C is of B with property b greater than 5
Type parameters can be values or types. E.g.
Wide card types are denoted by ?
.
def a : A<?> // ? denotes any type
def b : B<X: ?, X, Y> // X is any type. You just have to mark only on parameter
Existential type can be used by having ?
wild cards in the RHS of the type definition.
type C : A<?> // existential type
type D : B<X: ?, X, X> // existential type with constraint that all parameters are the same
type E : A<?<...>> // ?<...> a type with a type constructors which takes multiple arguments
type F : A<?<?, ...>> // ?<?, ...> a type with a type constructors which takes one or more arguments
type G : A<?<?>> // a single type with a type constructors which takes another type taking zero arguments
Each non constant value which shared among threads is treated as and active object. There 3 elements in the active object stream:
- Inputs
- Operations
- Outputs
Variables are defined with def
. E.g.
def x : Int = 1
def y : Int > 5 = 6
def a : A { a >= 6 } = A { a = 6 } // a should be of type a with field a which should be greater or equals 6
def v : Vector(length == 10) = p // property length must be 10
def b :> B { a :: Int, b :: Int > 0 } = C { a = -1, b = 1 }
Functions takes a named or unnamed record or tuple to another named or unnamed record or tuple.
type f1<T>(a: T) -> T // A simple function
type f2 <:: f1 // f2 is a sub type of f1, i.e., it takes a T or sub type of T to T or a super type of T
_
is used to search for a value in the environment. Similarly, [...]
can be used to search in the declaration scope, e.g., [a*]
a variable stating with a
in local scope or a[a]
a variable a
within structure a
.
Within a symbol *
means zero or more and ?
exactly 1 character.
def x : Int = _ :: Int // find last defined Int in scope
def x : Int = _ // find last defined Int in scope. Int is inferred.
def x : Int = (_ :: Int)[0] // find the 1st defined int in scope
def x : Int = _[0] // find the 1st defined int in scope. Int is inferred
def x : Int = (_[`a*] :: Int)[0] // find list value this is an Int and starts with a in the environment and select the 1st
def x : Int = _[`a*][0] // find list value this is an Int and starts with 'a' in the environment and select the 1st. In is inferred.
def y : Int[] = a[> 5] // find elements in 'a' with indices greater than 5
def y : Int[] = a[6...] // find elements in 'a' with indices greater than 5
def y : Int[] = a[6..10] // find elements in 'a' with indices between 6 and 10 inclusive
def y : Int[] = a[a > 5] // find elements in 'a' with values greater than 5
def z : A = _[{ xxx :: Int }] // find unnamed record with field xxx which is an Int
def z : A = _[Person{ age :: Int == 10 }] // find named record with field age which is an Int and equals 10
def a : String = (_ : Person(name == X : String == ~"M")) & (_ : Person(father == X, sex == 'M')) => X[0] // find a the name of the 1st person found who is male and the father's name starts with "M"
def a : Person = (Y : Person(name == X : String == ~"M")) & (Y : Person(father == X, sex == 'M')) => Y[0] // find a the 1st person found who is male and the father's name starts with "M". Y is interpreted as the search value as the expression is in the value position and Y appears on the RHS of => operator
def b : Person(sex =:::= X) => X // find the type of field sex in Person and define as type of b
def c : Int = x :: Int => 1, x :: String => 2, y :: Int => 3 // if x is an Int valuate to 1, else if x is a String evaluate to 2, else if y is an Int evaluate to 3
Annotations can be applied on:
- types
- variables
- closures
Quotations are denoted by either $" ... "
or ${ ... }
.
Annotations processing can be used to extend the language.
System defined annotation generally restrict the language as means to give an extra level of safety.
The language itself does not implement access modifiers but implemented through annotation. All elements in structures are public by default.
@access(grant | limit)
- limit or grant access to a class of types@access(grant = ["A", ":> B"])
grant access toA
or sub type ofB
but limit access to any other type@access(grant = ["A", ":> B"])
limit access toA
or sub type ofB
but grant access to any other type
I'm curious abut some of this language's design decisions. For example, why the use of symbols like
>:
to denote type relationships? I think that having too many symbols makes a language hard to read, and also greatly increases the learning curve. Same withA @B()
, etc.Also, in the case of range types, say
Int > 5
, it becomes impossible to determine whether the literal6
is anInt
,Int > 1
,Int > 2
,Int > -23456
, etc. This then necessitates type annotations, which as mentioned before, are confusing and somewhat cryptic.Also, why the need for
$
when creating strings? I think the symbol"
itself is already enough.Next, I don't think member access should be facilitated through annotations; it's not consistent with the other design decisions, which would also increase learning curve.
Overall, I'm wondering, how does this design accomplish the goal of both data manipulation and systems programming effectively? The concepts introduced are very high-level, which would require much overhead for a systems programming language, and generally the constructs in it (like a very dense type system) are not really useful in such a context. I'm also not sure why the features outlined for this language would be very useful for data manipulation at all. I find that data manipulation is more of an issue of library design and availability than a language issue.