Notes on object-oriented programming

Overview of object-oriented techniques

Inheritance

Inheritance is a mechanism by which a class is populated with fields and methods not only from its explicit definition, but also from a set of one or more parent classes. Languages with inheritance (among them Java, C++ and Python) typically allow classes to declare that some methods cannot be redefined by child classes (final in Java), that some methods must be redefined (abstract in Java), and that some fields may not be used by child classes (private versus protected).

Although inheritance was the marquee feature of influential object-oriented languages like Java and C++, its reputation has suffered considerably in recent years. Even Effective Java, one of the most popular guides to the Java language, recommends severely curtailing its use, favoring composition over inheritance and interfaces over abstract classes.

One issue is that inheritance breaks encapsulation. Effective Java gives the example of subclassing a Set class in order to count the number of times an object was added to the set. Suppose that the Set parent class defined both add and addAll methods, to add a single object and a collection of objects, respectively. If you wrote something like

// add and addAll return boolean in Java's Set, so the overrides must too
@Override public boolean add(T x) {
    this.count += 1;
    return super.add(x);
}

@Override public boolean addAll(Collection<? extends T> xs) {
    this.count += xs.size();
    return super.addAll(xs);
}

you would be at the mercy of the implementation detail of whether the Set parent class uses add internally to implement addAll. If it did, then objects added with addAll would be counted twice. In general, a class which is meant to be inherited from must document all internal use of overridable methods, which is a burden on both the class author and its users. Instead, it’s better for the new InstrumentedSet class to wrap Set by keeping a Set object as an internal variable, and defining all the methods that Set defines by delegating to the internal set. This way, the wrapper class is in full control of how its methods are called. However, this only works if there is an interface that Set implements; otherwise you wouldn’t be able to use InstrumentedSet where a Set is expected.
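To make the double counting concrete, here is a minimal Python sketch. SimpleSet is a hypothetical parent class (not a real library type) whose add_all happens to be implemented with add; CountingSubclass mirrors the overrides above, and InstrumentedSet is the wrapper alternative.

```python
class SimpleSet:
    """Hypothetical parent class; add_all happens to be built on add."""

    def __init__(self):
        self._items = set()

    def add(self, x):
        self._items.add(x)

    def add_all(self, xs):
        for x in xs:
            self.add(x)  # internal use of an overridable method

    def __len__(self):
        return len(self._items)


class CountingSubclass(SimpleSet):
    """Inheritance version: depends on how SimpleSet.add_all is written."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def add(self, x):
        self.count += 1
        super().add(x)

    def add_all(self, xs):
        xs = list(xs)
        self.count += len(xs)
        super().add_all(xs)  # dispatches back to our add, counting again


class InstrumentedSet:
    """Composition version: wraps a set and delegates to it."""

    def __init__(self, inner):
        self._inner = inner
        self.count = 0

    def add(self, x):
        self.count += 1
        self._inner.add(x)

    def add_all(self, xs):
        xs = list(xs)
        self.count += len(xs)
        self._inner.add_all(xs)  # the inner set never calls the wrapper's add

    def __len__(self):
        return len(self._inner)


sub = CountingSubclass()
sub.add_all([1, 2, 3])
print(sub.count)  # 6: each element was counted twice

wrapped = InstrumentedSet(SimpleSet())
wrapped.add_all([1, 2, 3])
print(wrapped.count)  # 3
```

The wrapper's count stays correct however SimpleSet implements add_all, because calls made inside the inner object never reach the wrapper's methods.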

Interfaces

An interface is a specification of a set of methods that a class must implement. Some languages, like Java, support both interfaces and inheritance; others, like Rust and Go, only have interfaces (Rust calls them “traits”). Interfaces have many of the features that inheritance does: an interface can define default implementations of methods, and an interface can require that implementors also implement another interface. The major difference is that implementors of an interface don’t (necessarily) share any of the same internal state.
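Python has no interface keyword, so the sketch below approximates one with abstract base classes that declare no fields. All of the names (Named, Greeter, Robot, Person) are made up for illustration. It shows a default method, an interface that requires another interface, and two implementors with unrelated internal state.

```python
from abc import ABC, abstractmethod


class Named(ABC):
    """Interface: anything that can report a name."""

    @abstractmethod
    def name(self) -> str: ...


class Greeter(Named):
    """Interface that requires Named and adds a default method."""

    def greet(self) -> str:
        # Default implementation, written only in terms of the interface.
        return f"Hello from {self.name()}"


class Robot(Greeter):
    def __init__(self, serial: int) -> None:
        self._serial = serial  # state private to Robot

    def name(self) -> str:
        return f"robot-{self._serial}"


class Person(Greeter):
    def __init__(self, first: str, last: str) -> None:
        self._first, self._last = first, last  # unrelated to Robot's state

    def name(self) -> str:
        return f"{self._first} {self._last}"

    def greet(self) -> str:
        # Overrides the default implementation.
        return f"Hi, I'm {self.name()}"


print(Robot(7).greet())  # Hello from robot-7
print(Person("Ada", "Lovelace").greet())  # Hi, I'm Ada Lovelace
```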

Generics

A generic type is one that is parameterized on one or more other types. The most common case is a container type that is parameterized on the type of its elements, e.g. List<Integer> or Map<String, String>. Depending on whether your language lets you write generic functions as well as generic classes, generics can serve a purpose similar to interfaces, e.g.

function foo<T>(x: T) -> void {
    x.bar();
}

is similar to

interface Bar {
    bar() -> void;
}

function foo(x: Bar) -> void {
    x.bar();
}
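A rough Python rendering of the same comparison, using type hints. The names HasBar, Widget, foo_generic and foo_interface are hypothetical, and the hints are only checked by a static type checker, not at runtime. Unlike the pseudocode, the type variable needs an explicit bound before a checker will accept the call to bar.

```python
from typing import Protocol, TypeVar


class HasBar(Protocol):
    """Structural interface: any object with a bar() method."""

    def bar(self) -> str: ...


T = TypeVar("T", bound=HasBar)


def foo_generic(x: T) -> T:
    x.bar()
    return x  # a type checker keeps the caller's concrete type


def foo_interface(x: HasBar) -> HasBar:
    x.bar()
    return x  # a type checker only knows this is some HasBar


class Widget:
    def __init__(self) -> None:
        self.calls = 0

    def bar(self) -> str:
        self.calls += 1
        return "bar"


w = Widget()
same = foo_generic(w)
foo_interface(w)
print(same is w, w.calls)  # True 2
```

The runtime behavior is identical. The practical difference is that the generic version lets the checker see that foo_generic(w) is still a Widget.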

Algebraic data types

An algebraic data type (ADT) is a type whose values belong to one of a finite set of variants. The boolean type is the simplest example of an ADT:

type Boolean {
    True,
    False,
}

The variants may themselves contain types, e.g.:

type OptionalString {
    Some(string),
    None,
}

ADTs may be suitable for representing flat hierarchies that might otherwise use inheritance, e.g.:

type Polygon {
    Rectangle(int, int),
    Square(int),
    Triangle(int, int, int),
}

Case studies

Squares and rectangles

A square is a type of rectangle and any function that operates on rectangles should be able to operate on squares. This is exactly the kind of ontological relationship that inheritance is meant to capture. However, although squares can be represented more compactly than rectangles (one integer field instead of two), the naive approach of Square subclassing Rectangle precludes the more efficient representation, because the Square class inherits all the fields of Rectangle.

You can avoid this problem using interfaces. If you define a Rectangle interface that requires, say, get_width, get_length and get_area methods, and then ConcreteRectangle and Square classes that implement the Rectangle interface, then you can choose completely different representations for ConcreteRectangle and Square.
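A minimal Python sketch of that design, using typing.Protocol as the interface. The method names come from the paragraph above; total_area and the use of __slots__ to make the stored fields explicit are illustrative choices.

```python
from typing import Protocol


class Rectangle(Protocol):
    def get_width(self) -> int: ...
    def get_length(self) -> int: ...
    def get_area(self) -> int: ...


class ConcreteRectangle:
    __slots__ = ("_width", "_length")  # two stored fields

    def __init__(self, width: int, length: int) -> None:
        self._width = width
        self._length = length

    def get_width(self) -> int:
        return self._width

    def get_length(self) -> int:
        return self._length

    def get_area(self) -> int:
        return self._width * self._length


class Square:
    __slots__ = ("_side",)  # one stored field

    def __init__(self, side: int) -> None:
        self._side = side

    def get_width(self) -> int:
        return self._side

    def get_length(self) -> int:
        return self._side

    def get_area(self) -> int:
        return self._side * self._side


def total_area(shapes: list[Rectangle]) -> int:
    # Works on anything that satisfies the Rectangle interface.
    return sum(shape.get_area() for shape in shapes)


print(total_area([ConcreteRectangle(2, 3), Square(4)]))  # 22
```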

Extensible frameworks

The Python standard library has many examples of extension through inheritance. The library provides a parent class that implements some functionality, and the library user writes a child class that defines methods that the parent class calls. html.parser is an example:

from html.parser import HTMLParser

class MyHTMLParser(HTMLParser):
    def handle_data(self, data):
        print("Got some data:", data)

parser = MyHTMLParser()
parser.feed("<html>...</html>")

This kind of code could also be written fairly easily with interfaces, e.g. in pseudocode:

from html.parser import HtmlParserDriver

interface HtmlParser {
    handle_data(self, data: string) -> void {} // note the default, empty implementation
    handle_start_tag(self, tag: string) -> void {}
    // etc.
}

class MyHtmlParser implements HtmlParser {
    handle_data(self, data: string) -> void {
        print("Got some data:", data);
    }
}

parser = MyHtmlParser()
driver = HtmlParserDriver(parser)
driver.feed("<html>...</html>")

One drawback of the interface approach is that it would not be easy to pass information from the driver to the parser, e.g. if the parser wanted to get the current location in the document. With inheritance the parent class could define a get_location() method or even just a readable location field.
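Python's real HTMLParser does offer something like this: subclasses inherit a getpos() method that returns the current line number and offset. A short sketch, where LocatingParser and its seen list are hypothetical names:

```python
from html.parser import HTMLParser


class LocatingParser(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self.seen = []  # (data, (line, offset)) pairs

    def handle_data(self, data):
        # getpos() is inherited from HTMLParser; no driver object is needed.
        self.seen.append((data, self.getpos()))


parser = LocatingParser()
parser.feed("<p>hi</p>")
parser.close()
for data, (line, offset) in parser.seen:
    print(f"{data!r} at line {line}, offset {offset}")
```

With the interface design, the driver would instead have to pass the location into every callback or hand the parser a reference back to itself.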

UI hierarchies

In the overview I noted that the major difference between inheritance and interfaces is that child classes inherit internal state from their parent(s), while implementors of an interface do not share any. An example of where this behavior might be desirable is in the hierarchy of objects in a UI framework. For example, suppose your UI has a TextView class and ScrollableTextView class. If ScrollableTextView can inherit from TextView, then it can reuse its (presumably complex) internal state. If ScrollableTextView and TextView merely both implemented the same interface, it would be harder to reuse code between the two classes.
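A toy Python sketch of the idea. The fields and methods here (lines, height, render, scroll) are invented for illustration; a real UI framework would carry far more state.

```python
class TextView:
    def __init__(self, text: str, height: int) -> None:
        self.lines = text.splitlines()
        self.height = height  # number of visible rows

    def first_visible_line(self) -> int:
        return 0

    def render(self) -> list[str]:
        start = self.first_visible_line()
        return self.lines[start:start + self.height]


class ScrollableTextView(TextView):
    def __init__(self, text: str, height: int) -> None:
        super().__init__(text, height)  # reuse the parent's state
        self.offset = 0

    def scroll(self, delta: int) -> None:
        # Clamp using fields inherited from TextView.
        max_offset = max(0, len(self.lines) - self.height)
        self.offset = min(max(self.offset + delta, 0), max_offset)

    def first_visible_line(self) -> int:
        return self.offset


view = ScrollableTextView("a\nb\nc\nd", height=2)
print(view.render())  # ['a', 'b']
view.scroll(1)
print(view.render())  # ['b', 'c']
```

The child adds one field and two short methods and inherits everything else. The cost is the one described under Inheritance: render calls the overridable first_visible_line, so TextView has to document that internal call.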

Similar arguments may or may not hold for other cases where data types form a genuine hierarchy, e.g. in the definition of the abstract syntax tree inside a compiler. Python's datetime, which is a subclass of date, is another example where it is convenient for the child class to inherit state from the parent class.
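The datetime case is easy to check in Python. In the sketch below, quarter_label is a hypothetical helper written against date; it works unchanged on datetime values because they inherit the year and month fields.

```python
from datetime import date, datetime


def quarter_label(d: date) -> str:
    # Uses only fields defined on date.
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


print(issubclass(datetime, date))  # True
print(quarter_label(date(2024, 5, 1)))  # 2024-Q2
print(quarter_label(datetime(2024, 11, 3, 9, 30)))  # 2024-Q4
```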

To give a concrete example, for https://iafisher.com/projects/cities/usa and https://iafisher.com/projects/cities/europe, I use a common Game base class with UsaGame and EuropeGame subclasses, something like this:

abstract class Game {
  protected localStorage: LocalStorage;
  protected allCities: City[];
  protected citiesOver1Mil: number;
  protected domMap: HTMLElement;
  protected domInput: HTMLElement;
  // etc.
}

class UsaGame extends Game {
  getProjection(): Projection {
    // define a custom projection for the U.S. map
  }

  addCity(city: City): void {
    // update U.S.-specific statistics
  }
}

Inheritance is highly convenient here because it allows me to reuse all of the functionality and internal state of the Game class with minimal hassle.
