When I find something interesting and new, I post it here - that's mostly programming, of course, not everything.

Wednesday, January 20, 2010

Dispatch by Return Type: Haskell vs Java

If you are a Haskell programmer, this is all trivial for you; but if you are a Java programmer, you would ask yourself two questions:
  • is it possible?!
  • is it safe?!
Let's start with Java. In Java, we have pretty nice polymorphism which includes method overloading: using the same method name with different lists of parameters:

  • int length(Iterable i);
  • int length(Collection c);
  • int length(String s);
  • int length(T[] array);


Not that we need these specific methods, but for the sake of argument. What happens when compiler/JVM finds the right method to call: it finds the method that matches best the list of parameters, and in the case of confusion will either show an error message or (during runtime) throws an exception.

One thing is impossible, though: defining two methods that differ only in return type, something like

  • int parse(String source);
  • long parse(String source);
  • boolean parse(String source);

In principle, we could, in most cases, deduce which method we want, by the context: if we write

int n = parse("123");

the compiler could guess, of course, what exactly we mean... but no, it is not allowed.

So, a Java programmer, given a method name, expects to always get the same return type (even if signatures are different, we are used to expecting the same return type anyway).

Now, how about generics? What if we define
interface Ubiq {
T parse(String s);
}

then implement the interface and apply it to the strings?

Now we have a problem. If we define

class UbiqInt implements Ubiq<Integer> {
...
}

then we explicitly specify the type, and will be forced to write

int n = ubiqInt.parse("12345");

then all the polymorphisms is gone now.

What if we define a class that is still parameterized by T?

class UbiqImpl<T> implements Ubiq<T> {
...
}

Then the question is, how exactly can we have several different implementation of parse(String) in one class? We cannot, and we are stuck.

Now, what is different in Haskell that makes it possible? Let's take a look at Haskell code:

class Ubiq a where
parse :: String → a

instance Ubiq Bool where
parse ('T':cs) = True
parse _ = False

instance Ubiq Integer where
parse ('0':cs) = 2 * parse(cs)
parse ('1':cs) = 2 * parse(cs) + 1
parse _ = 0

check :: Bool → Integer → String
check b n = if b then show n else "no nothing"

tryme = check (parse "Tautology") (parse "1001")


What happens here? We defined a typeclass Ubiq, which is a vague analog of Java interface; and then we have a couple of implementations. But both implementations define known types, Integer and Bool to be implementing Ubiq. So that, from now on, if we have an expression parse(String) in a position where is expected to have an Integer type, then the compiler finds the appropriate implementation and "calls" it when needed. How does it happen? We just know the expected return type, so it is not hard to figure out. In cases when the compiler is confused, like in an expression show(parse("something")), the compiler will need a hint:
*Main> let x = parse("T")

:1:8:
Ambiguous type variable `a' in the constraint:
`Ubiq a' arising from a use of `parse' at :1:8-17
Probable fix: add a type signature that fixes these type variable(s)


(please do not hesitate to write your comments, questions, objections)

Followers

Subscribe To My Podcast

whos.amung.us