When I find something interesting and new, I post it here - that's mostly programming, of course, not everything.

Friday, October 24, 2008

Java's Strong Typing

In a typical case of user application dealing with some kind of "apache javabean properties access", we still want to have generics help us with strong typing, to keep code safe. For this, we introduce a parameterized class Property<T>. And write something like this:

public class ClientCode {
Object model;
public ClientCode(Object model) {
this.model = model;
}

public void businessLogic() {
Property<String> name = new Property<String>("userName", model);
Property<Date> dob = new Property<Date>("dateOfBirth", model);
Property<Boolean> isGood = new Property<Boolean>("isGood", model);

if (isGood.value() && dob.value().before(new Date(1990, 10, 24))) {
System.out.println("One good adult user: " + name.value());
}
}

public static void main(String[] args) {
new ClientCode("ignore me").businessLogic();
}
}


Naive but natural. And now the implementation of Property<T>:

package common;

public class Property<T> {
String name;
T value;

public Property(String name, Object container) {
this.name = name;
this.value = (T) "hello world";
//        org.apache.common.weird.helper.util.BeanAccessUtil.
//            getProperty(container, name);
}

public T value() { return value; }
}


Looks pretty convincing, right? But surprise! Do you think this code will throw a ClassCastException from within Property<T>? Seems like it should; we even have a variable that, in case of it being type Date, cannot accept a string in assignment. Right?

Wrong...

As you know, generic type information is lost in compilation. As a result, all the compiled class knows about value type is that it is something. It is actually an Object, and can accept whatever is assigned. More, the method value() also returns an Object. But the client is sure, due to generic parameterization, that it returns the expected type. But it does not. And the exception is thrown right in the client code.

Monday, October 20, 2008

Experimenting with Covariance and Contravariance in Java Generics

Here I'll just publish the code that demonstrates what is accepted and what is not by the compiler when you use generics. Not many comments, since most of the code is extremely trivial.

Oh, and definitions. An expression is contravariant in variable x of class X if x can be substituted with an instance y of class Y which is a subclass of X. An expression is covariant in variable x of class X if x can be substituted with an instance y of class Y which is a superclass of class X.

E.g. (a = f(b)) is contravariant in b and covariant in a.

Let's have the following class inheritance diagram ('=>' means iheritance):

E => C, D => C, C => B, B => A

In details,

class A {
String id;

A(String id) { this.id = id; }
}

class B extends A {
B(String id) { super(id); }
}

class C extends B {
C(String id) { super(id); }
}

class D extends C {
D(String id) { super(id); }
}

class E extends C {
E(String id) { super(id); }
}


And let's have an interface and a class, something similar to Map and HashMap:

interface M<X, Y> {
void put(X x, Y y);
Y get(X x);
}

class M1<X, Y> implements M<X, Y> {
M1() {};
public void put(X x, Y y){};
public Y get(X x){ return null; }
}


The following code demonstrates a ubiquitous and pretty legal usage of covariance and contravariance.

First, the example where we vary the second parameter ("the value"), and provide an exact knowledge of its type on declaration (and instantiation):

void testParameter2() {
M<String, B> mSB = new M1<String, B>();
mSB.put("", new B(""));
mSB.put("", new C(""));
B aa = mSB.get("");
M<String, C> mSC = new M1<String, C>();
mSC.put("", new C(""));
B b = mSC.get("");
C c = mSC.get("");
}


Now let's try a contravariant version of wildcard in the second parameter of M. Turns out that using wildcard duly blocks some of the "expected" functionality.

void testParameter2WithExtendsWildcard() {
M<String, ? extends C> mSCx = new M1<String, D>();
mSCx = new M1<String, E>();
// Actually, only E should be accepted, but this information is lost on assignment.
// So the next line won't compile:
mSCx.put("", new C(""));
// put(java.lang.String,capture#660 of ? extends common.VarianceTest.C) in
// common.VarianceTest.M<java.lang.String,capture#660 of ? extends common.VarianceTest.C>
// cannot be applied to (java.lang.String,common.VarianceTest.C)

//
// And we don't know if it is D (it is not).
// So the next line won't compile:
mSCx.put("", new D(""));
// put(java.lang.String,capture#511 of ? extends common.VarianceTest.C)
// in common.VarianceTest.M
// cannot be applied to (java.lang.String,common.VarianceTest.D)

B b = mSCx.get("");
C c = mSCx.get("");
}


Trying the same, but 'super' instead of 'extends':


void testParameter2WithSuperWildcard() {
M<String, ? super C> mSCs = new M1<String, C>();
mSCs = new M1<String, B>();
// We don't know what's hiding behind 'super', maybe it's Object
// So the next line won't compile:
mSCs.put("", new B(""));
// put(capture#701 of ? extends common.VarianceTest.C,java.lang.String)
// in common.VarianceTest.M
// cannot be applied to (common.VarianceTest.C,java.lang.String)

mSCs.put("", new C(""));
mSCs.put("", new D(""));
// We don't know what's hiding behind 'super', maybe it's Object
// So the next line won't compile:
B b = mSCs.get("");
// incompatible types found
// : capture#222 of ? super common.VarianceTest.C required: common.VarianceTest.B

Object any = mSCs.get("");
}


The same with parameter 1. No surprises here:

void testParameter1() {
M<B, String> mBS = new M1<B, String>();
mBS.put(new B(""), "");
mBS.put(new C(""), "");
String s = mBS.get(new B(""));
s = mBS.get(new B(""));
M<C, String> mCS = new M1<C, String>();
mCS.put(new C(""), "");
mCS.put(new D(""), "");
s = mCS.get(new C(""));
s = mCS.get(new D(""));
}


The following example shows that it is totally useless to specify 'extends' wildcard in contravariant position:

void testParameter1WithExtendsWildcard() {
String s;
M<? extends C, String> mCxS = new M1<C, String>();
mCxS = new M1<D, String>();
// Who knows what is the actual type of param1...
// The following two lines do not compile:
mCxS.put(new C("");
// put(capture#701 of ? extends common.VarianceTest.C,java.lang.String)
// in common.VarianceTest.M
// cannot be applied to (common.VarianceTest.C,java.lang.String)

mCxS.put(new D(""), "");
// put(capture#701 of ? extends common.VarianceTest.C,java.lang.String)
// in common.VarianceTest.M
// cannot be applied to (common.VarianceTest.D,java.lang.String)


// Same here, parameter type unknown
// The following two lines do not compile either:
s = mCxS.get(new C(""));
// get(capture#897 of ? extends common.VarianceTest.C) in common.VarianceTest.M // of ? extends common.VarianceTest.C,java.lang.String>
// cannot be applied to (common.VarianceTest.C)

s = mCxS.get(new D(""));
// get(capture#897 of ? extends common.VarianceTest.C) in common.VarianceTest.M // of ? extends common.VarianceTest.C,java.lang.String>
// cannot be applied to (common.VarianceTest.D)


}


Now let's try the same with 'super' instead of 'extends'. This means that an instance of any superclass instance can show up (down to Object), we just have no way to know which one it is:


void testParameter1WithSuperWildcard() {
M<? super C, String> mCsS = new M1<C, String>();
mCsS = new M1<B, String>();
// Who knows what should be the actual type of param1...
// The following line does not compile:
mCsS.put(new B(""), "");
// put(capture#248 of ? super common.VarianceTest.C,java.lang.String) in
// common.VarianceTest.M
// cannot be applied to (common.VarianceTest.B,java.lang.String)

// C can be cast to any superclass of C
mCsS.put(new C(""), "");
mCsS.put(new D(""), "");
String s = mCsS.get(new C(""));
s = mCsS.get(new D(""));
s = mCsS.get(new C(""));
// Who knows what should be the actual type of param1...
// The following line does not compile:
s = mCsS.get(new B(""));
// get(capture#330 of ? super common.VarianceTest.C) in common.VarianceTest.M // of ? super common.VarianceTest.C,java.lang.String>
// cannot be applied to (common.VarianceTest.B)

}


Generics with wildcards are not broken. The lack of understanding on how they work stems from the lack of understanding of covariant and contravariant substitution. Once you grasp it, the rest is easy.

And there's something specific about wildcards. If it is ? extends A, it does not mean that anything extending A can substitute. It means that there is something that extends A, and there' no way to figure out what.

Monday, October 06, 2008

Unlimited Cartesian Product in Java - part 2

(This is the second part, see part 1).

Here is the code that implements Cartesian product in Java. The implementation amounts to providing the following methods of class Cartesian:

 
public static <X, XS extends Iterable<X>>
Iterable<Iterable<X>> product(final XS... xss) {
return product(Arrays.asList(xss));
}

public static <X, XS extends Iterable<X>>
Iterable<Iterable<X>> product(final Iterable<XS> xss) {
return new Cartesian<X, XS>().calculateProduct(xss);
}


So we could write code like this:


Iterable<Iterable<Integer>> theProduct =
Cartesian.product(Arrays.asList(
Arrays.asList(1, 2, 3),
Arrays.asList(4, 5, 6)));
assertEquals(9, Set(actual).size());


and like this:


Iterable<Iterable<Integer>> theProduct =
Cartesian.product(
Arrays.asList(1, 2, 3),
Arrays.asList(4, 5, 6));


Note that we could not use Iterable<Iterable<X>>, since, imagine, in this case we could not even build a product of Set<Set<X>>.

How do we calculateProduct? The idea is that we define it by recursion. If we have X0, X1,..., Xn, we define product(X0, X1,..., Xn) as X0 x product(X1, X1,..., Xn); and if the sequence is actually empty, we return a singleton,
new ArrayList<Iterable>() {{
add(Collections.EMPTY_LIST);
}};
(I use crazybob's contraption to build the singleton).


private Iterable<Iterable<X>>
calculateProduct(final Iterable<XS> xss) {
if (xss.iterator().hasNext()) {
Pair<XS, Iterable<XS>> pair = split(xss);
XS head = pair.x();
Iterable<XS> tail = pair.y();
return appendComponentToProduct(head, calculateProduct(tail));
} else {
return new ArrayList<Iterable<X>>() {{
add(Collections.EMPTY_LIST);
}};
}
}


Here I use two methods, split and appendComponentToProduct. The first one is obvious (okay, it's an exercise if you are a beginner); the second one is here. See what it does: we have a component, X0, say,
[1, 2, 3]
, and a product of previous components X1 x ... x Xn, which is an iterable of iterables, e.g. [[4, 5], [5, 6]]; we need to append elements of X0 to all elements of the partial product. That appendage function is called consWith: it takes an x and appends it to all partial product elements. We apply this function to the whole X0; what we receive has to be flattened to produce a list of tuples [x0, x1, ..., xn]


private Iterable<Iterable<X>>
appendComponentToProduct(
XS component,
Iterable<Iterable<X>> partialProduct) {
// E.g. [1, 2, 3], [[4, 6], [5, 6]] ->
// [[[1, 4, 6], [1, 5, 6]], [[2, 4, 6], [2, 5, 6]], [[3, 4, 6], [3, 5, 6]]] ->
// [[1, 4, 6], [1, 5, 6], [2, 4, 6], [2, 5, 6], [3, 4, 6], [3, 5, 6]]
return flatten(consWith(partialProduct).map(component));


Flattening is, again, a trivial exercise; let's take a look at our consWith. It is a method that returns a function (which we apply to the component X0:


public <ElementOfProduct extends Iterable<X>>
Function<X, Iterable<Iterable<X>>>
consWith(final Iterable<ElementOfProduct> pxs) {
return new Function<X, Iterable<Iterable<X>>>() {
// takes an x, returns a function appending x to sequences
public Iterable<Iterable<X>> apply(final X x) {
// Takes a sequence [x1, x2, ...], returns [x, x1, x2, ...]
return new Function<ElementOfProduct, Iterable<X>>() {
public Iterable<X> apply(ElementOfProduct y) {
return concat(Arrays.asList(x), y);
}
}.map(pxs);
}
};
}


A little bit dizzy? It's okay. We have arrived to the summit.

This is Higher Order Java!

Wednesday, October 01, 2008

Unlimited Cartesian Product in Java - part 1

You probably know what Cartesian product is. Take two sets, A and B, their product, A x B, is a set of all pairs (a, b), where a belongs to A and b belongs to B. Something like that.

Here I am going to show how to build a Cartesian product for an unlimited collection of sets, or rather for an Iterable of Iterables. Unlimited means any number starting from 0. Due to limitations of Java, all these component sets (or rather component Iterables, will have the same element type.

More formally, given a sequence of sets (or of sequences), A0, A1, ... An, we will produce A = A0 x A1 x ... x An, whose elements are sequences (a0, a1, ..., an), where each ai belongs to Ai.

So, for instance if we have just A0 and A1, their product will be the following: {(a0, a1}.

With this definition, we, with some confusion, have to admit that the Cartesian product of just one A0 is a set of singletons, {(a)} for all a in A. Of course there is a canonical one-to-one correspondence between A and its "product".

More curious is the question what exactly is the product of 0 of components. In a category with limits it is a terminal object, 1. But how about JVM? There's no such thing as a terminal object.

No problem. We can use our definition, and take the set of empty sequences. There's just one empty sequence, so our canonical singleton has the following form:
{{}}. Did you notice the underlying type is not even mentioned. We have an Iterable of one single empty iterable of what type? Actually, Java deduces it from the type of the variable the product is assigned to. That's the safest way to pass the type knowledge into the executing code.

So, the solution I propose is to have something like this:

Iterable<Iterable<Integer>> actual = Cartesian.product(Arrays.asList(Arrays.asList(1, 2, 3), Arrays.asList(4, 5, 6)));
assertEquals(9, Set(actual).size());


Note the funny function name: Cartesian.product. Why do we need the special class named Cartesian? For two reasons. One, it is a neat way to provide type parameterization; second, calculating the product is not an elementary operation, so we need a bunch of private methods to help us. In JavaScript this can be accomplished by defining functions within our function; in Haskell we could comfortably add to our main definition the auxiliary "where" definitions. In Java, the "where" become class or instance methods.

So, we are introducing a special class named Cartesian, with just one exposed method, product:


public static class Cartesian<X, XS extends Iterable<X>> {
...
public static <X, XS extends Iterable<X>> Iterable<Iterable<X>> product(final Iterable<XS> xss) {
return new Cartesian<X, XS>().calculateProduct(xss);
}
...


You've noticed the IT parameter type. The reason for having this parameter is that we can pass for product calculations all kinds of Iterables of Iterables: Sets of Lists, Lists of Sets... any combination would work.

First, the algorithm. Here's how lj user=_navi_ defines it:

product :: [[a]] -> [[a]]
product [] = [[]]
product (xs:xss) = product' xs (product xss)
where
product' xs pxs = concat $ map (\x -> map (x:) pxs) xs


In plain words, for us, Java pdestrians, it means this:

Say, we have an Iterable consisting of the first component, and the tail. And suppose we already have a product (call it pxs) built for the tail.
Then we do the following: scan the first component, A0 attaching its elements ai to the head of each element of the already built product.

We do this by having a function (let's call it f(a0, as) that concatenates an element a0 with the tail [a1, ..., an], producing a new sequence [a0, ..., an].
Having function f, we build another function, g(a) that builds a list of all [a0, ..., an] - by mapping the product of the tail, pxs using the function f: g returns, for each a0, a list of concatenations built by f.

So, for each element a the function g produces a list. Now let's apply this to all the elements of the first component, A0: that is, transform the component... to what? To a sequence of sequences of sequences.

We do not need such a deep structure. We just need the sequences of the lower level, listed in some order or in no order. So we do what is known as flattening the list.

The important thing is that the product we build should be lazy. As an example, suppose we build all permutations of N elements. This is equivalent to building a cartesian product of [0..N-1] x [0..N-2] x ... x [0..1] x [0]; there's N! elements in the product. We do not want to fill our whole memory with these elements before we start processing them.


(the code is published and commented in the second part)

Followers

Subscribe To My Podcast

whos.amung.us