In computer programming, a type is a just a set of objects—not a physical set of objects in memory, but a conceptual set of objects. The type Integer is the set {0, 1, -1, ...}. Types are used to reason about the correctness of programs. A function that accepts an argument of type Integer, will work for any value 0, 1, -1, etc., for some definition of “works”. Subtyping is used to define relationships between types. Type B is a subset of a type A if the set of objects defined by B is a subset of the objects defined by A. Every function that works on an instance of A also works on an instance of B, for some definition of “works”. If we were just doing math, subsets would be the end of subtyping. But types as pure sets exist only conceptually. Actual types must be concretely defined. It is not very efficient to define the Integer class by writing all possible Integers! In most programs, the possible values of a type are constrained by their fields. Subtypes and supertypes typically differ by having more or fewer fields than the other. Sometimes, the subtype has more fields, and sometimes, the supertype has more fields. Many of you may be thinking, “What? It’s always one way, not the other!” The funny thing is that some of you think the subtype always has more fields and others think the supertype always has more fields. That’s because there are two kinds of subtyping. The first is what I call “marginal subtyping”, which is encountered in application programming and is well modeled by inheritance. The second is what I call “conditional subtyping”, which is encountered in mathematical programming and is well modeled by implicit conversions. Depending on the genre of programming you work in, the other kind of subtyping may be unknown and the language features needed to implement it may be maligned. But both needs are real and both features are necessary.

B is a subtype of A.


Marginal subtyping

Marginal subtyping occurs when the programmer has one class, say Square, and then needs a new class, say ColoredSquare, that is like the old class but it has some additional functionality that is independent of the old functionality, usually in a totally separate domain. Marginal subtyping is well implemented through traditional object oriented inheritance:

This is inheritance at its best. It has eliminated the need to copy the fields, methods, and constructors of Square into ColoredSquare. Every function that requires a Square can safely accept a ColoredSquare. The Square part of the ColoredSquare can even be mutated, the x and y and width fields can be changed while the color field is carried safely along.

I call this marginal subtyping because relationship between the subtype and the supertype is like that of a distribution and a marginal distribution. In marginal subtyping, the subtype has more dimensions, more degrees of freedom, more fields than the supertype. In some sense, the Venn diagram of subtyping is little misleading. There is not some universe of possible squares and then we carve out a subregion of squares and call them colored squares. It is more that there is a universe of possible squares and then there is an additional dimension, color, perpendicular to the diagram; the ColoredSquares live in this space off the flat surface and each one can be projected down to its corresponding Square. It is a many-to-one mapping from subtype to supertype, with ColoredSquare(0,1,2, and ColoredSquare(0,1,2,Color.cornflower_blue) both being interpreted as Square(0,1,2) whenever a Square is required.

Conditional subtyping

Conditional subtyping occurs when one class is a specific kind of another class or, identically, one class is a generalization of the other. Conditional subtyping is well implemented through implicit conversions:

I am following Scala’s implicit conversion design, where the implicit conversion can be defined in either the class being converted from or the class being converted to. By defining the implicit conversions as part of the class, it ensures that the user of the class does not have to import the conversion functions separately. Now anytime a function requires a Rectangle, like intersect(rectangle1, rectangle2), one can actually pass in a Square, like intersect(square, rectangle), and the implicit function will automatically be applied, like intersect(Square.to_rectangle(square), rectangle).

I call the kind of subtyping required here conditional subtyping because the relationship between the supertype and the subtype is like the relationship between a distribution and a conditional distribution. In conditional subtyping, the subtype has fewer degrees of freedom and, in an ideal world, fewer fields than the supertype. Here, the Venn diagram of subtyping is quite accurate, Squares really are nothing more than a special region within Rectangles. Squares are Rectangles under the condition that width and height are equal, a condition guaranteed by the Square constructor, assuming there is no mutability to spoil it later.


Both inheritance and implicit conversions are often reviled as useless and complicated. This is not that surprising on its own. Both features are very powerful, often overused, and sadly, designed poorly in some languages. An additional difficulty is that marginal subtyping is often the only relationship model needed for application programming and conditional subtyping is often the only relationship needed by mathematical programmers. The unstoppable juggernaut of object oriented programming particularly encourages the use of inheritance everywhere, even on relationships that requires conditional subtyping. It may seem advisable to get rid of one or the other, but this leaves bad solutions for many problems, as shown below.

Replace inheritance with membership

If one wanted to avoid the use of inheritance, one could make the ColoredSquare class like this:

But this would require one to access the circle member every time one wanted to use a ColoredSquare as a Square, writing my_square.square.area() instead of my_square.area() or intersection(one_square.square, another_square.square) instead of intersection(one_square, another_square). Over an entire program, eliminating these “hasa” casts can save a lot of boilerplate and noise.

Replace inheritance with implicits

One can try to replace inheritance with implicit conversions like this:

The first problem with this is a large amount of repetition compared to the original above. There is no super constructor to which to delegate, so all fields must be set in each class. This is not that bad, because the user is also freed from having to call the super constructor, a responsibility that can cause problems when one wants finer control over the order in which fields are initialized. There is more flexibility in this design and, while I have manually copied each constructor argument to its corresponding field in all my examples here, most modern languages have syntax that makes fields for some or all parameters of the constructor.

The second problem is fatal however. If these are mutable classes, then the object returned by the implicit conversion has no relationship to the original object. A change made to a field on the down-cast object is not reflected in the corresponding field on the original object, and vice versa:

Replace implicits with inheritance

Because of the popularity of object oriented programming languages, using inheritance where implicit conversions would be more appropriate is very common. It is also very easy to screw up. Consider this attempt to fit Square and Rectangle into an inheritance hierarchy:

Like ColoredSquare, Rectangle has one additional field compared to a Square, so it becomes the child class. Programmatically, this is completely legal. Conceptually, this is wrong, wrong, wrong. Remember the diagram of a subtype from above? Subtyping must satisfy an “is a” relationship—Rectangle is only a subtype of Square if every Rectangle “is a” Square. This is mathematically wrong—the worst kind of wrong.

If I remember my middle school math correctly, a square is a rectangle with two equal sides. To do this properly with inheritance we need to flip the relationship:

It may seem a little strange for the subtype to add no fields, and we’ll come back to that in a bit. But programmatically, this is completely legal. Conceptually, this also correct, but only if an important property holds—instances of Square and Rectangle must be immutable. That is, there cannot be a stretch_horizontally method that mutates an Rectangle, because Square would have to inherit it and calling such a method on a Square makes it no longer a Square. Variables typed as Square may not point to things that a reasonable person would consider a square:

This problem is unresolvable as long as the objects are mutable. When a rectangle is stretched in math, it is a new rectangle, but in programming, mutable objects can be changed and still remain the same object, just with a different values. The reality is that the mutable square type is disjoint with the mutable rectangle type; there is no object that is both a mutable square and a mutable rectangle. That is, there is no object that can change into any rectangle while also always remaining a square. If a mutable rectangle is the best model for your problem, by all means use it, but understand that there is no type relationship between it and the square version; they are totally unrelated classes.

Ok, so let’s limit ourselves to immutable objects. Technically, the classes model the concepts they are supposed to. Something is still a little bit weird about modeling these concepts with inheritance. The problems are clearer when we go to add some more classes in this hierarchy. The rectangle can be generalized further into a parallelogram. Let’s put that into the type hierarchy:

The first problem with this is that adding a new generalization requires adding a supertype to an existing class. It would not be possible to add the Parallelogram class in this way if Rectangle and Square were in a different library. With PEP 3141, Python created the class hierarchy Number >: Complex >: Real >: Rational >: Integral. This is all correct, but if you ever wanted to make a Quaternion class, you’d be out of luck, because you can’t “supertype” an existing class. If you wanted to add Rhombus, Trapezoid, Quadrilateral, or Polygon to our shape classes, you’d also be out of luck unless you owned the library that defined Square and Rectangle.

The second problem with this is that the simpler classes get progressively more complicated by the addition of the general base classes. When it was just Rectangle and Square, Square had both width and height even though it only needed one number to describe both. When Parallelogram was added, each class got angle. If we wanted to add Polygon, each class would need a variable length list, even though the simpler classes could be fixed-size. This wastes a ton of memory, which is why languages that do this usually turn the classes into interfaces and have only one concrete class for each interface. Python may have the Complex interface, but float doesn’t have to actually have a field for the imaginary part. And there is a separate complex class that provides the actual implementation of Complex.

So modeling mathematical relationships with object-oriented inheritance means you need two classes for every concept or tons of extra state on simple classes, and you can’t generalize existing classes no matter what. It is no wonder that object oriented programming is often denigrated for its excessive complexity and awkward boilerplate. The purists aren’t wrong, traditional OOP really is bad for this kind of subtyping.

Implicit conversions make it easy to add additional generalizations. One can define Parallelogram separately from Rectangle and Square like this:

Living in harmony

Both inheritance and implicit conversion are necessary in a programming language that intends to be universally useful because marginal subtyping and conditional subtyping both exist. In my projects, one kind or the other typically dominates. When I am writing graph algorithms, I have a lot of implicit conversions. When I am writing GUIs, I have a lot of inheritance. One thing I liked about Scala was that I could fit the algorithm and the user interface comfortably in the same program. This is probably not too surprising as the marriage of object oriented programming and functional programming was the principle design goal of Scala. If I were designing a language, I would start with Scala’s design of inheritance and implicit conversion. They play together pretty well in practice, but they were not universally acclaimed. The developers recently put user-defined implicit conversions behind a feature gate. This may be a good idea, not because implicit conversions are bad, but because most new Scala programmers come from an object-oriented background and most programs are applications rather than math libraries. I think the problems that came from Scala’s implicit conversions could have been avoided if they (1) had made them harder for new programmers to find and (2) didn’t overuse implicit conversions in the standard library, of which the worst offender is the infamous and deprecated any2StringAdd.

  • NebuPookins

    > In some sense, the Venn diagram of subtyping is little misleading. There is not some universe of possible squares and then we carve out a subregion of squares and call them colored squares. It is more that there is a universe of possible squares and then there is an additional dimension, color, perpendicular to the diagram; the ColoredSquares live in this space off the flat surface and each one can be projected down to its corresponding Square.

    Your mental model is self-consistent, but it’s not the same as my mental model (which I believe is also self-consistent). There *is* a universe of possible squares, with a subregion of which are colored squares. It does form a perfectly valid Venn diagram, as long as you’re willing to extend the concept of Venn diagram to be multidimensionals. For example, a 3D Venn diagram would cut the 3D space into possibly overlapping more-or-less-spherical blobs, just like a 2D Venn diagram would cut the 2D space into possibly overlapping more-or-less-circular blobs.

    I had not encountered your mental model before, and I find it interesting. I wonder under what circumstances, or for what problems, your mental model is more convenient than mine, or if perhaps there is some fatal inconsistency in my model that is elegantly addressed by yours.

    • David Hagen

      That section is certainly more philosophical than mathematical. The main point I was trying to capture is that a black square and a red square retain some kind of membership in the space of uncolored squares. They may be equal in a square sense and unequal in a colored square sense, because it may be sensible to drop the color field and do a comparison between the dimensions only. We don’t get this in all hierarchies. For example, we don’t say that 1+2i is equal to 1+3i in an “integer” sense. It just doesn’t make sense to drop the imaginary field and do an integer comparison.

      You mental model is entire valid as a classification of objects. I see your model as more concrete than mine—uncolored squares are distinctly separate from colored squares. My model ignores the concrete runtime type as property of the object, or at least never checks that property when determining if two objects are equal. The benefit I get for this abstraction is the distinction in the preceding paragraph, which a concrete mental model would have difficulty mapping.

  • vladap

    It is exceptionally interesting article, great food for thoughts.

    I would recommend to note immediately when used that marginal subtyping and conditional subtyping are your own terms. I got an impression it is an established mathematical terminology even you note it later.

    I have got the same impression for implicit conversions. But I don’t think they are actual mathematical term. Or am I wrong? What would be the closest in math?

    Could we call it just subtyping and supertyping?

    • David Hagen

      Thanks for the feedback. I now note in the introduction that these are my terms.

      Now “implicit conversions” is not my term. It is an established programming term for a function that is applied automatically to convert an object of type Square into an object of type Rectangle whenever some other function is given a Square but it requires a Rectangle. You don’t really need this in math because there is no concept of needing to convert a Square into a Rectangle; a Square just is a Rectangle. But in a computer a Rectangle is defined by two numbers (width and height) and the function is going to work on those numbers to do its thing. You can’t directly pass in a Square which is represented by only one number. Something has to do the copying; when it’s an automatically applied function, it’s called an “implicit conversion”.

      The closest thing in math would probably be defining a mapping from every square to its corresponding rectangle. I don’t know enough formal math to know if that alone is sufficient to define a subtyping, but it should be necessary. In any case, this would be conditional subtyping. As far as I know, there is no equivalent to marginal subtyping in math because there is no concept of mutable objects—a resized window is still the same window; two documents with identical contents may still be distinguishable.