When I find something interesting and new, I post it here - that's mostly programming, of course, not everything.

Sunday, September 12, 2010

Dealing with Infinities in Your Code - 2

In this post I had introduced the idea of having countable sets in your code. Next I want to show how we can combine countable sets producing new countable sets.

We have two operations, union and Cartesian product. Let's start with disjoint union - the union of non-intersecting sets.

If we expect our sets to be both finite, there's nothing special here. The predicate checks whether an element belongs to one set or another; the iterator iterates over the first set, then the second.

What if one or both of our sets is/are infinite? The predicate seems to be still okay: we just check whether the element belongs to one of the sets. But the iterator... if the first set is infinite, our naive iterator will never reach the second set. I mean, not in this universe; we will need transfinite numbers readily available, which is well beyond the power of modern computers or most of modern humans. What can we do then? Well, we can alternate over two iterables.

Let's look at the code:
def union[X, X1 <: X, X2 <: X](set1: Set[X1], set2: Set[X2]): Set[X] = {
lazy val parIterable = new ParallelIterable(set1, set2)
lazy val size = if (set1.size == Integer.MAX_VALUE ||
set2.size == Integer.MAX_VALUE) {
} else {
set1.size + set2.size
(x: X) => (x.isInstanceOf[X1] && (set1 contains x.asInstanceOf[X1]))||
(x.isInstanceOf[X2] && (set2 contains x.asInstanceOf[X2]))

What we have here? A new iterable, a new size evaluator, a new predicate. We can ignore the size evaluator; nothing is calculated here until someone requests it. The predicate is obvious; the only new thing is ParallelIterable. We need it to iterate over the union of two countable sets (rather, over two iterables) in a smart way:

class ParallelIterator[X, X1 <: X, X2 <: X](
iterator1: Iterator[X1],
iterator2: Iterator[X2]) extends Iterator[X] {
var i2 : (Iterator[X], Iterator[X]) = (iterator1, iterator2)
def hasNext = iterator1.hasNext || iterator2.hasNext
def next = {
i2 = (i2._2, i2._1)
if (i2._1.hasNext) i2._1.next else i2._2.next

Note that we flip i2, so it changes from (iterator1, iterator2) to (iterator2, iterator1) and back every cycle.

As a result, in the output we just interleave members from the first component with the members of the second component. What is important here is that we do not make any assumptions regarding which iterator is finite and which is infinite; if one component does not have more elements, fine, we continue scanning the other one.

This was probably pretty simple; now we have to deal with Cartesian products, that is, the set of all pairs (x: A, y: B) where x is from the first set and y is from the second set; with no assumptions regarding the finiteness of the said sets.

That would be the next part.


9000 said...

I still doubt that
if (i2._1.hasNext) i2._1.next else i2._2.next
works as expected. It keeps iterating over first iterator as long as it hasNext(). I'd expect something to enforce interleaving that only omits it when one of hte iterators is over.
OTOH, maybe I just don't know enough Scala to see this in place :)

Vlad Patryshev said...

Look at how the iterator pair, i2, is being flipped every cycle.


Subscribe To My Podcast