Marginal vs. Conditional Subtyping

In computer programming, a type is a just a set of objects—not a physical set of objects in memory, but a conceptual set of objects. The type Integer is the set {0, 1, -1, ...}. Types are used to reason about the correctness of programs. A function that accepts an argument of type Integer, will work for any value 0, 1, -1, etc., for some definition of "works". Subtyping is used to define relationships between types. Type B is a subset of a type A if the set of objects defined by B is a subset of the objects defined by A. Every function that works on an instance of A also works on an instance of B, for some definition of "works". If we were just doing math, subsets would be the end of subtyping. But types as pure sets exist only conceptually. Actual types must be concretely defined. It is not very efficient to define the Integer class by writing all possible Integers! In most programs, the possible values of a type are constrained by their fields. Subtypes and supertypes typically differ by having more or fewer fields than the other. Sometimes, the subtype has more fields, and sometimes, the supertype has more fields. Many of you may be thinking, "What? It's always one way, not the other!" The funny thing is that some of you think the subtype always has more fields and others think the supertype always has more fields. That's because there are two kinds of subtyping. The first is what I call "marginal subtyping", which is encountered in application programming and is well modeled by inheritance. The second is what I call "conditional subtyping", which is encountered in mathematical programming and is well modeled by implicit conversions. Depending on the genre of programming you work in, the other kind of subtyping may be unknown and the language features needed to implement it may be maligned. But both needs are real and both features are necessary.

Marginal subtyping

Marginal subtyping occurs when the programmer has one class, say Square, and then needs a new class, say ColoredSquare, that is like the old class but it has some additional functionality that is independent of the old functionality, usually in a totally separate domain. Marginal subtyping is well implemented through traditional object oriented inheritance:

class Square:
  x: Float
  y: Float
  width: Float
  new(x, y, width):
    self.x = x
    self.y = y
    self.width = width

class ColoredSquare extends Square:
  color: Color
  new(x, y, width, color):
    super(x, y, width)
    self.color = color

This is inheritance at its best. It has eliminated the need to copy the fields, methods, and constructors of Square into ColoredSquare. Every function that requires a Square can safely accept a ColoredSquare. The Square part of the ColoredSquare can even be mutated, the x and y and width fields can be changed while the color field is carried safely along.

I call this marginal subtyping because relationship between the subtype and the supertype is like that of a distribution and a marginal distribution. In marginal subtyping, the subtype has more dimensions, more degrees of freedom, more fields than the supertype. In some sense, the Venn diagram of subtyping is little misleading. There is not some universe of possible squares and then we carve out a subregion of squares and call them colored squares. It is more that there is a universe of possible squares and then there is an additional dimension, color, perpendicular to the diagram; the ColoredSquares live in this space off the flat surface and each one can be projected down to its corresponding Square. It is a many-to-one mapping from subtype to supertype, with ColoredSquare(0,1,2,Color.black) and ColoredSquare(0,1,2,Color.cornflower_blue) both being interpreted as Square(0,1,2) whenever a Square is required.

Conditional subtyping

Conditional subtyping occurs when one class is a specific kind of another class or, identically, one class is a generalization of the other. Conditional subtyping is well implemented through implicit conversions:

class Rectangle:
  x: Float
  y: Float
  width: Float
  height: Float
  new(x, y, width, height):
    self.x= x
    self.y = y
    self.width = width
    self.height = height

class Square:
  x: Float
  y: Float
  width: Float
  new(x, y, width):
    self.x= x
    self.y = y
    self.width = width
  implicit static def to_rectangle(r: Square) -> Rectangle:
    return Rectangle(r.x, r.y, r.width, r.width)

I am following Scala's implicit conversion design, where the implicit conversion can be defined in either the class being converted from or the class being converted to. By defining the implicit conversions as part of the class, it ensures that the user of the class does not have to import the conversion functions separately. Now anytime a function requires a Rectangle, like intersect(rectangle1, rectangle2), one can actually pass in a Square, like intersect(square, rectangle), and the implicit function will automatically be applied, like intersect(Square.to_rectangle(square), rectangle).

I call the kind of subtyping required here conditional subtyping because the relationship between the supertype and the subtype is like the relationship between a distribution and a conditional distribution. In conditional subtyping, the subtype has fewer degrees of freedom and, in an ideal world, fewer fields than the supertype. Here, the Venn diagram of subtyping is quite accurate, Squares really are nothing more than a special region within Rectangles. Squares are Rectangles under the condition that width and height are equal, a condition guaranteed by the Square constructor, assuming there is no mutability to spoil it later.

Alternatives

Both inheritance and implicit conversions are often reviled as useless and complicated. This is not that surprising on its own. Both features are very powerful, often overused, and sadly, designed poorly in some languages. An additional difficulty is that marginal subtyping is often the only relationship model needed for application programming and conditional subtyping is often the only relationship needed by mathematical programmers. The unstoppable juggernaut of object oriented programming particularly encourages the use of inheritance everywhere, even on relationships that requires conditional subtyping. It may seem advisable to get rid of one or the other, but this leaves bad solutions for many problems, as shown below.

Replace inheritance with membership

If one wanted to avoid the use of inheritance, one could make the ColoredSquare class like this:

class ColoredSquare:
  circle: Square
  color: Color
  new(x, y, width, color):
    self.circle = Square(x, y, width)
    self.color = color

But this would require one to access the circle member every time one wanted to use a ColoredSquare as a Square, writing my_square.square.area() instead of my_square.area() or intersection(one_square.square, another_square.square) instead of intersection(one_square, another_square). Over an entire program, eliminating these "hasa" casts can save a lot of boilerplate and noise.

Replace inheritance with implicits

One can try to replace inheritance with implicit conversions like this:

class Square:
  x: Float
  y: Float
  width: Float
  new(x, y, width):
    self.x = x
    self.y = y
    self.width = width

class ColoredSquare:
  x: Float
  y: Float
  color: Color
  new(x, y, width, color):
    self.x = x
    self.y = y
    self.color = color
  implicit static def to_square(s: ColoredSquare) -> Square:
    return Square(s.x, s.y, s.width)

The first problem with this is a large amount of repetition compared to the original above. There is no super constructor to which to delegate, so all fields must be set in each class. This is not that bad, because the user is also freed from having to call the super constructor, a responsibility that can cause problems when one wants finer control over the order in which fields are initialized. There is more flexibility in this design and, while I have manually copied each constructor argument to its corresponding field in all my examples here, most modern languages have syntax that makes fields for some or all parameters of the constructor.

The second problem is fatal however. If these are mutable classes, then the object returned by the implicit conversion has no relationship to the original object. A change made to a field on the down-cast object is not reflected in the corresponding field on the original object, and vice versa:

val colored_square = ColoredSquare(1.0, 2.0, 3.0)
val square: Square = colored_square  # ColoredSquare.to_square(colored_square)

colored_square.x = 5.0
square.x  # Still 1.0

square.y = 4.0
colored_square.y  # Still 2.0

Replace implicits with inheritance

Because of the popularity of object oriented programming languages, using inheritance where implicit conversions would be more appropriate is very common. It is also very easy to screw up. Consider this attempt to fit Square and Rectangle into an inheritance hierarchy:

class Square:
  x: Float
  y: Float
  width: Float
  new(x, y, width):
    self.x = x
    self.y = y
    self.width = width

class Rectangle extends Square:
  height: Float
  new(x, y, width, height):
    super(x, y, width)
    self.height = height

Like ColoredSquare, Rectangle has one additional field compared to a Square, so it becomes the child class. Programmatically, this is completely legal. Conceptually, this is wrong, wrong, wrong. Remember the diagram of a subtype from above? Subtyping must satisfy an "is a" relationship—Rectangle is only a subtype of Square if every Rectangle "is a" Square. This is mathematically wrong—the worst kind of wrong.

If I remember my middle school math correctly, a square is a rectangle with two equal sides. To do this properly with inheritance we need to flip the relationship:

class Rectangle:
  x: Float
  y: Float
  width: Float
  height: Float
  new(x, y, width, height):
    self.x= x
    self.y = y
    self.width = width
    self.height = height

class Square extends Rectangle:
  new(x, y, width):
    super(x, y, width, width)

It may seem a little strange for the subtype to add no fields, and we'll come back to that in a bit. But programmatically, this is completely legal. Conceptually, this also correct, but only if an important property holds—instances of Square and Rectangle must be immutable. That is, there cannot be a stretch_horizontally method that mutates an Rectangle, because Square would have to inherit it and calling such a method on a Square makes it no longer a Square. Variables typed as Square may not point to things that a reasonable person would consider a square:

val square: Square = Square(0.0, 1.0, 2.5)
square.stretch_horizontally(2.0)
# square is still type Square, but width does not equal height

This problem is unresolvable as long as the objects are mutable. When a rectangle is stretched in math, it is a new rectangle, but in programming, mutable objects can be changed and still remain the same object, just with a different values. The reality is that the mutable square type is disjoint with the mutable rectangle type; there is no object that is both a mutable square and a mutable rectangle. That is, there is no object that can change into any rectangle while also always remaining a square. If a mutable rectangle is the best model for your problem, by all means use it, but understand that there is no type relationship between it and the square version; they are totally unrelated classes.

Ok, so let's limit ourselves to immutable objects. Technically, the classes model the concepts they are supposed to. Something is still a little bit weird about modeling these concepts with inheritance. The problems are clearer when we go to add some more classes in this hierarchy. The rectangle can be generalized further into a parallelogram. Let's put that into the type hierarchy:

class Parallelogram:
  x: Float
  y: Float
  width: Float
  height: Float
  angle: Float
  new(x, y, width, height, angle):
    self.x= x
    self.y = y
    self.width = width
    self.height = height
    self.angle = angle

class Rectangle extends Parallelogram:
  new(x, y, width, height):
    super(x, y, width, height, tau/4)

class Square extends Rectangle:
  new(x, y, width, height):
    super(x, y, width, width)

The first problem with this is that adding a new generalization requires adding a supertype to an existing class. It would not be possible to add the Parallelogram class in this way if Rectangle and Square were in a different library. With PEP 3141, Python created the class hierarchy Number >: Complex >: Real >: Rational >: Integral. This is all correct, but if you ever wanted to make a Quaternion class, you'd be out of luck, because you can't "supertype" an existing class. If you wanted to add Rhombus, Trapezoid, Quadrilateral, or Polygon to our shape classes, you'd also be out of luck unless you owned the library that defined Square and Rectangle.

The second problem with this is that the simpler classes get progressively more complicated by the addition of the general base classes. When it was just Rectangle and Square, Square had both width and height even though it only needed one number to describe both. When Parallelogram was added, each class got angle. If we wanted to add Polygon, each class would need a variable length list, even though the simpler classes could be fixed-size. This wastes a ton of memory, which is why languages that do this usually turn the classes into interfaces and have only one concrete class for each interface. Python may have the Complex interface, but float doesn't have to actually have a field for the imaginary part. And there is a separate complex class that provides the actual implementation of Complex.

So modeling mathematical relationships with object-oriented inheritance means you need two classes for every concept or tons of extra state on simple classes, and you can't generalize existing classes no matter what. It is no wonder that object oriented programming is often denigrated for its excessive complexity and awkward boilerplate. The purists aren't wrong, traditional OOP really is bad for this kind of subtyping.

Implicit conversions make it easy to add additional generalizations. One can define Parallelogram separately from Rectangle and Square like this:

class Parallelogram:
  x: Float
  y: Float
  width: Float
  height: Float
  angle: Float
  new(x, y, width, height, angle):
    self.x= x
    self.y = y
    self.width = width
    self.height = height
    self.angle = angle
  implicit static def from_rectangle(r: Rectangle) -> Parallelogram:
    return Parallelogram(r.x, r.y, r.width, r.height, tau/4)
  implicit static def from_square(s: Square) -> Parallelogram:
    return Parallelogram(s.x, s.y, s.width, s.width, tau/4)

Living in harmony

Both inheritance and implicit conversion are necessary in a programming language that intends to be universally useful because marginal subtyping and conditional subtyping both exist. In my projects, one kind or the other typically dominates. When I am writing graph algorithms, I have a lot of implicit conversions. When I am writing GUIs, I have a lot of inheritance. One thing I liked about Scala was that I could fit the algorithm and the user interface comfortably in the same program. This is probably not too surprising as the marriage of object oriented programming and functional programming was the principle design goal of Scala. If I were designing a language, I would start with Scala's design of inheritance and implicit conversion. They play together pretty well in practice, but they were not universally acclaimed. The developers recently put user-defined implicit conversions behind a feature gate. This may be a good idea, not because implicit conversions are bad, but because most new Scala programmers come from an object-oriented background and most programs are applications rather than math libraries. I think the problems that came from Scala's implicit conversions could have been avoided if they (1) had made them harder for new programmers to find and (2) didn't overuse implicit conversions in the standard library, of which the worst offender is the infamous and deprecated any2StringAdd.