When I find something interesting and new, I post it here - that's mostly programming, of course, not everything.

Monday, September 22, 2008

Java Pair: a closer look

Who did not implement a Pair class, he did not write code yet. Or she.

People tired of implementing it, beg Sun to add Pair to the library (bugs 6229146, 4947273, 4983155, 4984991). Sun decided not to fix it. So the whole world under the Sun does.

A cheap solution would be to have a record with two fields, first and second, and implement equals() to compare both fields, and hashCode() as something like (first * 31) ^ second (computer people believe in magic "randomizing" properties of xor and number 31) so that one would be able to freely change the values.

Convenient, but wrong. What if you want to use pairs as keys in a map? You put an entry in the map, then change the key content, and kaboom! entry not found. A funny but working solution could be to freeze the hashCode on creation, getting them from another source of randomness in Java world, System.currentTimeMillis(): one can easily build half a million records within one milliseconds these days.

A better solution would be to have your pair immutable: take constructor parameters, store them, and only return them, but never change them. This, with a decent implementation of hashCode() and toString() is a good candidate to enter the imaginary World of Ideal Code.

But there's a nuisance. It would be great if it implemented Map.Entry< - that is, have aliases for getFirst() and getSecond(); first and second are now also known as key and value. Makes life easier if you already have a set of pairs and want to expose it as a map... yes, uniqueness of keys is required.

As an example, imagine that we have a Function<X, Y>, and a set of arguments; to expose this as a map, we have choices: either to instantiate a regular HashMap and fill it with the data; instantiate a Map as an AbstractMap where entrySet() is a prebuilt Set<Pair<X, Y>>, or do something smarter. See, if the key set is really huge, and/or evaluating the function takes some time/resources, we probably do not want to duplicate the keys and run the function over all of them. Who knows how many entries in this map will be needed by the client code.

Things can be even worse: the key set maybe large and/or virtual, so that building a set of pairs is an impossible task. What if it is a set of Natural Number, like this:

public class N extends AbstractSet<Integer> {
public Iterator iterator() {
return new Iterator() {
int n = 0;
public boolean hasNext() {
return true;
}

public Integer next() {
return n++;
}

public void remove() {
throw new UnsupportedOperationException();
}
};
}

public int size() {
return Integer.MAX_VALUE;
}
}


I am sure you do not want to build a set of pairs on such a set.

So let us not assume anything. One pair may be lazy, the other one just instantiate from two given values. We need an abstract class to generalize this.


/**
* Abstract pair class.
* Implementations should implement two methods,
* and may be lazy or regular.
* But in any case this is an unmodifiable pair.
*/
public abstract class Pair<X, Y>
implements Map.Entry<X, Y> {
public abstract X x();
public abstract Y y();

public X getKey() {
return x();
}

public Y getValue() {
return y();
}

public Y setValue(Y value) {
throw new UnsupportedOperationException();
}

private boolean equal(Object a, Object b) {
return a == b || a != null && a.equals(b);
}

public boolean equals(Object o) {
return o instanceof Pair &&
equal(((Pair) o).x(), x()) &&
equal(((Pair) o).y(), y());
}

private static int hashCode(Object a) {
return a == null ? 42 : a.hashCode();
}

public int hashCode() {
return hashCode(x()) * 3 + hashCode(y());
}

public String toString() {
return "(" + x() + "," + y() + ")";
}
}


From this class we can build a couple of concrete classes:


public class BasePair<X, Y> extends Pair<X, Y> {
private final X x;
private final Y y;

public X x() { return x; }
public Y y() { return y; }

private BasePair(X x, Y y) {
this.x = x;
this.y = y;
}

public static <X> BasePair<X, X> Pair(X[] source) {
assert source.length == 2 :
"BasePair should be built on a two-element array; got " +
source.length;
return new BasePair(source[0], source[1]);
}
}


BasePair is built from either two values or a two-element array; do we need a list-based constructor? It's possible.


public class LazyPair<X, Y> extends Pair<X, Y> {
private final X x;
private final Function<X, Y> f;
private Y y;
boolean haveY = false;

private LazyPair(X x, Function<X, Y> f) {
this.x = x;
this.f = f;
}

public X x() {
return x;
}

public Y y() {
if (!haveY) {
y = f.apply(x);
haveY = true;
}
return y;
}

@Override
public boolean equals(Object o) {
return o instanceof LazyPair ?
equal(x(), ((LazyPair)o).x()) : o.equals(this);
}

@Override
public int hashCode() {
return hashCode(x);
}
}


LazyPair calculates the function value only once, and caches it.

This seems to be all on this trivial issues. Thanks for reading!

No comments:

Followers

Subscribe To My Podcast

whos.amung.us