When I find something interesting and new, I post it here - that's mostly programming, of course, not everything.

Thursday, September 09, 2010

Dealing with Infinities in Your Code - 1

These days it is not unusual to have an infinite sequence in a computer system. The idea is that while the sequence grows, the system upgrades, and there's no limit unless we make certain assumptions like accepting non-standard arithmetics. Then even sequences that we expect to be finite may turn out to be infinite - like Goodstein's sequence.

Still, it seems like in a foreseeable future we will only encounter countable sets. They are not necessarily infinite; any finite set is traditionally considered countable, that is, equal in size to a subset N, the set of natural numbers. The set of all real numbers, R, is a typical example of an uncountable set (it is a continuum). (Whether there is a set of intermediate size, between N and R, depends on what axiom we assume. If we assume it exists, it exists; if we assume it does not exist, it does not. Quite an eye-opener for people that use to treat Mathematics as a provider of absolute truths, right?)

What's good in a countable set is that (assuming set-theoretical axioms) one can list its element in such a way that any arbitrary element may be eventually reached. (That's due to another axiom, the Axiom of Choice - it says that we can introduce linear order in any set. If we don't assume this axiom - then we cannot.)

When I worked in Google, one of my favorite interview questions was "given an alphabet, write a code that would produce all words that сan be built based on this alphabet". Then I had to explain that when words are built out of alphabet, it does not mean each letter is used at most once. It does not. Then I had to explain that by "producing all words" I mean writing a code that could reach any word that can be built out of the alphabet. For instance, given {A,B}, the sequence "A", "AA", "AAA", etc would be infinite, but it won't constitute a solution: it will never reach letter 'B'.

In my work I limit myself with sets as they are presented in Java or Scala.

One particular problem is that, usually, an implementation of Set is supposed to have a method that returns the set's size. Not sure I know why. Probably it has something to do with array reallocations... some technical stuff. Because see, if you know that your set is infinite, you are safe, you can always return, by the rules of Java, Integer.MAX_VALUE; but what if you do not know? Say you have a function that takes another function, calls it subsequently until it halts (e.g. fails), and records the times it takes each call. What would be the size? We have to solve Halting Problem to know the answer.

On the other hand, isEmpty is a pretty decent method; it could easily belong to Iterable: to check if an Iterable is empty, one has to just get its Iterator and call its hasNext.

Also, it would be good to supply a set with a predicate, similar to what they do in Zermelo-Fraenkel Set Theory: according to the Extensionality Axiom, two sets are equal iff they consist of the same elements. Not having to count to infinity (or futher) to check if an element belongs would be a good advantage. E.g. for the set of all odds the predicate would not have to scan through all odd numbers from 1 to the given one; just check its last bit.

So that's why I came with a set that is defined by its

  • iterable that enumerates the elements;

  • predicate that "efficiently" checks if a value belongs to the set

  • size evaluator that, when called, tries to evaluate the size of the set
  • .

I use Scala for coding these days; hope it is reasonably pretty:

private def setForIterator[X](sourceIterator: => Iterator[X], sizeEvaluator: => Int, predicate: X => Boolean): Set[X] = {
new Set[X] {
override def contains(x: X) = predicate(x)
override def size = sizeEvaluator
override def iterator = sourceIterator filter predicate
override def isEmpty = !iterator.hasNext
override def -(x:X) = requireImmutability
override def +(x:X) = requireImmutability
def map[Y] (f: Functions.Injection[X,Y]) : Set[Y] = f.applyTo(this)
def map[Y] (f: Functions.Bijection[X,Y]) : Set[Y] = f.applyTo(this)
override def filter(p: X => Boolean) = filteredSet(sourceIterator, (x: X) => predicate(x) && p(x))

def setOf[X](source: Iterable[X], sizeEvaluator: => Int, predicate: X => Boolean) =
setForIterator(source.iterator, sizeEvaluator, predicate)

Had to override a bunch of methods declared in trait Set, to avoid any immediate calls of size evaluator or assuming that the size is known.

Here's how, using this class, we can define the set of natural numbers:
val N = setOf(new Iterable[Int] {
def iterator = new Iterator[Int] {
private var i: Int = -1
def hasNext = true
def next = {i += 1; i}
(x: Any) => x.isInstanceOf[Int] && x.asInstanceOf[Int] >= 0)

That's it for now. In the next part I'll talk about how to combine two countable sets. It is an old fact that a union and a product of two countable sets are countable; we will need a code that behaves according to this knowledge.

No comments:


Subscribe To My Podcast