When I find something interesting and new, I post it here - that's mostly programming, of course, not everything.

Monday, November 30, 2009

Subclass and Subtype in Java

Here I just want to get together a pretty much widely-known information; somehow it escapes you if you just read Java books or google the internets. And what I was looking for was an example (hence a proof) that subclassing in Java is not always subtyping, or some kind of proof that it is.

I am talking specifically about the state of affairs in Java. We may try to do the same trick in Scala; my point here though was to have a real-life, although contrived, example of how subclassing is different from subtyping, and Java is a easy target.

First, definitions (informal, as well as the rest of this post).

Types in Java.
Java has the following kinds of types:
  • primitive (e.g. char)
  • null
  • interfaces (declared with interface keyword)
  • classes (declared with class keyword)
  • array (declared with [square brackets])
  • type variables (used in generics)

Subclassing: declaring one class to be a subclass of another (class ... extends ...) - this allows a subclass to inherit functionality from its superclass

Subtyping: A is a subtype of B if an instance of A can be legally placed in a context where an instance of B is required. The understanding of "legally" may vary.

Many sources state firmly that in Java, a subclass is always a subtype. Opinions that a subclass is not always a subtype are also widespread; and that's obviously true for some languages: Self type is a good source of subclass not being a subtype confusion; we do not have Self type in Java.

In Java 5 a covariant return type inheritance was added: if class A has a method that returns X, and its subclass B declares a method with the same name and parameter list (meaning, parameter types), and returning a subclass of X, then this method overrides the method in A.

In addition to this, arrays are also covariant: if A is a subtype of B, then A[], in Java, is a subtype of B[].

This last feature allows us to create an example showing that no! Subclasses in Java are not always subtypes! See:

public class Subtyping
interface A
A[] m();
void accept(A a);

interface B extends A // subtype of A

public static void main(String[] args) {
A a = new AImpl(); // see AImpl below
B b = new BImpl(); // see BImpl below
// now note that B is a subtype of A.
a.accept(a); // this works; substitution is trivial
b.accept(a); // this works too (substitution is trivial too)
a.accept(b); // oops, this fails! b, being a subtype of a, is not accepted at runtime

static class AImpl implements A {
public A[] m()
return new A[]{this};

public void accept(A a)
a.m()[0] = this;

static class BImpl extends AImpl implements B{
public B[] m()
return new B[]{this};

So there, this code demonstrates a subclass that is not legally a subtype. By 'legally' here I mean that the code always throws a Java runtime exception without adding any client precondition predicates.

Here's a good discussion of this issue:

public boolean lspHolds(Object o) {
return o.toString().contains("@");

Monday, November 16, 2009

Not Just Subtyping

I've been trying to figure out how come the only relationship between types we use to encounter is "sub". Type a is a subtype of type b. Validated by subsitution principle.

If we limit ourselves with "sub", we'll have infinite arguments on whether a square can be considered as a subtype of rectangle; whether a colored-point can be considered a subtype of point; whether an int can be considered, under certain conditions, as a subtype of float; whether, in Java, if A implements B, A is a subtype of B. Opinions differ.

I think we should look deeper and maybe consider this case by case.

  1. Is square a rectangle? Seems so; the objection being that when we start stretching the rectangle, not every square remains a rectangle. Modification is a weird thing. Even in simple cases, i = 1, then (i == 1) becomes true; but it is not actually true, because next moment i = 2, and the expression is false. No temporal logic transforms true into false, there's something wrong there. But if we discard modification, we can state this: each square 'is a' rectangle. We have an injection. Two different squares treated as rectangles remain different.
  2. Is colored-point a point? This is a totally different case. The world of colored-points is different from the world of points: it is bigger. Yes, a colored-point can be used instead of a point in a substitution. So, in a sense, it 'is a' point. But two different colored-points can become the same point if we forget about their color. Each point 'is a' projection of a colored-point, and eachcolored-point can be treated as a point. So what we have here is a projection here.
  3. Let's try to implement an unordered-pair. We can do it by taking a pair type and then defining equality in such a way that (a, b) == (b, a). Again, we have a projection, but in this case we went the other way around: not building a new type by adding a dimension, but discovered a new type by defining an equivalence relationship between pairs. So what we have here is a factorization.
  4. In the case of int -> float, we have a more general case. Not all operations are preserved; not all floats come from ints, and not all two different ints map to two different floats. Still, it is a legal type transformation, done usually by default, called coercion.
  5. In Java-like languages, if a class implements an interface, it thus declares that it can be used in any context where the interface is required. It projects itself into the realm of the given interface, and that's all we know; the "true nature" of the class implementing an interface remains obscure.
  6. In the context of 5., the Java keyword "extends" is meaningless. That is, in reality it has many meanings: "borrows the functionality", "implements the same set of interfaces", "has access to internal namespace". Non of this has anything to do with types relationships.
  7. 'is a' actually has many meanings too. We probably never can be sure whether A 'is a' B: we should always ask: "in which contexts" (that is, "what do you mean?"). If we look at this:
the pyramid here is neither a square nor a triangle. But when projected, it can be percieved as one. If we say "he's a soldier", it does not mean that's all the person is; what we mean is just a projection.

So, how about substitution? We can substitute A with B if we know the mapping that maps B into the context where A should be present. If we are talking about coercion (and we know that an int is not a subtype of float), for instance, we can substitute a float with an int value if coercion is legal. In this and other cases, our ability to substitute means we know a canonical mapping.

Now examples.

  1. Injection. Take types A and B; build A+B, a union. Some languages support unions, some don't. Some support them implicitly, like Scala with its case classes. By definition, there are canonical inclusions from A and B into A+B; so it would be logical to call A and B subtypes of A+B.
  2. Projection. Take types A and B, and their product, A×B. One can represent A×B as the type of pairs (a, b), with a of type A and b of type B. There are two projections from A×B to A and B. Neither A nor B can be seriously called a subtype of A×B, of course. They can be considered as a special case of factor-types. Below is another example of factor-type.
  3. Surjection. Given A×A, a type of pairs (a1, a2), we can build a type of unordered pairs by equalizing (a1, a2) with (a2, a1). This gives us an equivalence relationship, and a projection from A×A to A×A /2, the type of unordered pairs.
  4. Enum surjection. Suppose we build an enum class for Rainbow. It turns out in different cultures the number of colors is different. In China, Japan, Russia, there are seven colors, while in some other countries (US for instance) there are 6. Two colors, light-blue and dark-blue, in Asian rainbow, both map to just blue in US rainbow. So every time we pass a US rainbow instance to a function, we can as well coerce an Asian rainbow ("substitute") into US version, using the standard mapping. It is similar to projection, where the receiving side has no clue about the hidden multiplicity of possible parameter values.

In computer science there is a variety of terms that are supposed to denote this variety of "-typing", but somehow they either are all called kinds of subtyping or declared not to belong to the "-typing" realm. I suggest to apply pretty simple categorical notions, so that things just look simpler and more manageable.


Subscribe To My Podcast