In this article I use Pair class, see details here.
public abstract class Function<X, Y> {
public abstract Y apply(X x);
}
If that were all that we wanted, we could as well make it an interface. There is a certain trend to first introduce an interface, and then an abstract base class implementing this interface and providing the base functionality. But... how many different ways of defining a composition of two functions do we know? Exactly one, right? See, there are certain rules and properties, without which a function would not be a function. Let's make them a part of the abstract class, since there's no choice anyway.
public static <X, Y, Z> Function<X, Z>
compose(final Function< f, final Function<Y, Z> g) {
return new Function<X, Z>() {
public Z apply(X x) {
return g.apply(f.apply(x));
}
};
}
public static <T, T1 extends T> Function<T1, T> id() {
return new Function<T1, T>() {
public T apply(T1 t) { return t; }
};
}
We will be tempted to apply a
Function
to an Iterable
, a Collection
, a Set
, a List
.The naive way would be to write something like this:
Function<X, Y> f = ....
....
List<X> xs = ...
...
List<Y> ys = new ArrayList<Y>();
for (X x : xs) {
ys.add(x);
}
// good job!
I have three problems with this kind of code.
Problem 1. It runs an explicit loop. Did you ever ask yourself what is the reason programmers have been using loops in their code? Are not all the use cases already known, so that we could just mention the case, and the loop is magically applied? Want an example? String concatenation. Nobody writes a loop to concatenate two strings.
Problem 2. We may not even need all the results in the new array,
Problem 3. Although an average computer these days has more memory than an average programmer can fill, it can happen that the source array is exceptionally long; and if we fill each temporary array with data, we may be soon out of memory. And for what reason? For no reason, just because we don't know better.
But we do.
The transformed set can be virtual. Built on demand. Let's start with applying a
Function
to an Iterator
. What are we going to obtain? Another Iterator
, with exactly the same number of values, but the values are different.
public Iterator<Y> map(final Iterator<X> source) {
return new Iterator<Y>() {
public boolean hasNext() {
return source.hasNext();
}
public Y next() {
return apply(source.next());
}
public void remove() {
source.remove();
}
};
}
Iterator
s rarely materialize themselves in the code; they use to be returned by Iterable
s; so let's apply the same trick to Iterable
s.
public Iterable<Y> map(final Iterable<X> source) {
return new Iterable<Y>() {
public Iterator<Y> iterator() {
return map(source.iterator());
}
};
}
We may not feel comfortable if the resulting
Iterable
's iterator() is called multiple times; but in this case we can eventually decide to cache the values. Or maybe not, if there's a billion of them, and the function is inexpensive to calculate.The only difference between a
Collection
and an Iterable
is that in the case of Collection
we know the size. So there.
public Collection<Y> map(final Collection<X> source) {
return new AbstractCollection<Y>() {
public Iterator<Y> iterator() {
return map(source.iterator());
}
public int size() {
return source.size();
}
};
}
For List, we throw in one more method,
get(int)
:
public List<Y> map(final List<X> source) {
return new AbstractList<Y>() {
public Y get(int index) {
return apply(source.get(index));
}
public Iterator<Y> iterator() {
return map(source.iterator());
}
public int size() {
return source.size();
}
};
}
I wonder if anybody can come up with a solution that does not involve copy-pasting iterator() implementation.
Now, you have probably noticed that all these transformation methods are not static "util", but are a part of
Function
class definition. As a result, if you have a Function f
, you can write in your code List<Y> results = f.map(souceList);
Applying the same trick to a
Set
is not exactly fair. Set
"contract" say that we should not have repeating values; and in our case there is no guarantee that we would not:
public Set<Y> map(final Set<X> source) {
return new AbstractSet<Y>() {
public Iterator<Y> iterator() {
return map(source.iterator());
}
public int size() {
return source.size();
}
};
}
On certain occasions, though, the uniqueness of the function values is guaranteed, like in the case of Schwartzian Transform. Schwartzian transform consists of producing pairs (x, f(x)) for a given collection of x's. This operation is known to be useful in the times of Perl language; I found it useful in Java too.
private static <X, Y> Function<X, Pair<X, Y>> schwartzianTransform(final Function<X, Y> f) {
return new Function>() {
public Pair<X, Y> apply(final X x) {
return LazyPair(x, f);
}
};
}
It is kind of funny that I have to define this method as static; one of us (me or java) is not smart enough to build new functions inside instance methods. Feel free to go ahead and try.
All of this allows us, given a
Function
and a Set
, build a Map
:
public Map<X, Y> toMap(final Set<X> keys) {
// this innocent function makes a Pair look like a Map.Entry
Function<Pair<X, Y>, Map.Entry<X, Y>> fromPairToMapEntry = id();
final Function<X, Map.Entry<X, Y>> pairBuilder =
compose(schwartzianTransform(this), fromPairToMapEntry);
return new AbstractMap<X, Y>() {
public Set<Entry<X, Y>> entrySet() {
return pairBuilder.map(keys);
}
};
}