When I find something interesting and new, I post it here - that's mostly programming, of course, not everything.

Tuesday, August 05, 2008

Java Style. Iterating over collections

Suppose you decide to concatenate collections; it should of course look something like this:


Collection<A> cat(Collection<A>... collections);


What you need to do is return an instance of anonymous class extending AbstractCollection<A>; and you will need to implement two methods, iterator() and size().


Collection<A> cat(final Collection<A>... collections) {
return new AbstractCollection<A>() {
public Iterator<A> iterator() {
...
}

public int size() {
...
}
};
}


Implementing size() is almost trivial: we can start with this naive code


public int size() {
int size = 0;
for (Collection<A> c : collections) {
size += c.length;
}
return size;
}


Now raise your hAnds who think this is the correct solution... Wrong.

The problem is that for a large collection the result may be not precise. Quoting Collectio.size javadoc: "Returns the number of elements in this collection. If this collection contains more than Integer.MAX_VALUE elements, returns Integer.MAX_VALUE."

So, let's rewrite size():

public int size() {
int size = 0;
for (Collection<A> c : collections) {
if (c.size() == Integer.MAX_VALUE) return Integer.MAX_VALUE;
size += c.size();
if (size < 0) return Integer.MAX_VALUE;
}

return size;
}


If you find an error in the code above, let me know.

Now, iterator(). Not that I doubt you can implement it; what I want to say is that in this exercise minute details turn out to be important.

Here's our first attempt:


Collection<A> cat(final Collection<A>... collections) {
return new AbstractCollection<A>() {
public Iterator<A> iterator() {
private int i = -1;
private Iterator<A> current = null;


public boolean hasNext() {
while (i < collections.size()) {
if (current != null && current.hasNext()) return true;
current = collections[++i].next().iterator();
}
return false;
}

public A next() {
if (hasNext()) return current.next();
throw new NoSuchElementException();
}

public void remove() {
if (current != null) current.remove();
}
}
...
}
}




Here we are trying to be smart, bypassing empty collections, and getting to the one that has something; this will also work if the argument list is empty.

The problem is, it won't work if the argument list is not empty: see we are doing this assignment,

current = collections[++i].next().iterator();
- when i reaches collections.length - 1, we will go overboard. So we have to fix this, replacing the line with

if (i < collections.length) current = collections[++i].next().iterator();


This does look disgusting! In addition to checking for null, we are checking the same condition twice!

Null... can we get rid of null? Let's try this: set the initial value of current to an empty iterator:

Iterator<A> current = Collections.EMPTY_SET.iterator();


So that now we can say just

if (current.hasNext()) return true;

instead of

if (current != null && current.hasNext()) return true;


Good, good. How about index checking?

I have this idea: replace an array with a list (actually, an iterator). We take the array of arguments and turn it into a list, like this:

public Iterator<A> iterator() {
return new Iterator<A>() {
// Top iterator
private Iterator<Collection<A>> i = Arrays.asList(collections).iterator();
// Bottom iterator
private Iterator<A> j = Collections.EMPTY_SET.iterator();

public boolean hasNext() {
for (;!j.hasNext() && i.hasNext(); j = i.next().iterator());
return j.hasNext();
}

public A next() {
if (hasNext()) return j.next();
throw new NoSuchElementException();
}

public void remove() {
j.remove();
}
};
}


Amazing, heh? Where's double index checking? It's gone. It's all hidden inside the delegate iterator.
That's final.

No comments:

Followers

Subscribe To My Podcast

whos.amung.us