Friday, November 24, 2006

Higher order functions in Java with an annotation processor factory

I’m continuing my explorations of both Java 5 features like generics and annotations and of functional programming constructs. A lot of Java discussions on the web these days are about closures: writing quick anonymous methods without having to wrap them in a abstract class (think Runnable, Comparator, …). Now, closures can also be related to higher order functions, that is to say functions considered as objects in their own right. If you can reference a precise method inside an instance instead of just having to deal with interface, you can have a type of closures. For example, instead of passing a Comparator object you could pass a Function object, meaning a function that takes two T arguments and returns an integer.
To illustrate this point, let’s take an example.
You have a simple Sale class:

public class Sale {
private String customer;
private Date dateOfSale;
private float amount;

public Sale(String customer, Date dateOfSale, float amount) {
…constructor, getters and setters…

And you have a SaleManager class that encapsulates a list of sales (think retrieved from a database, etc…) and gives you a method to retrieve all sales sorted by what you want:

public class SalesManagerClassic {
private List sales;

public SalesManagerClassic(List sales){
this.sales=sales;
}

public Sale[] listSales(Comparator comp){
Sale[] ret=new Sale[sales.size()];
sales.toArray(ret);
Arrays.sort(ret,comp);
return ret;
}


}

So far so good. Now you define a SaleComparator that will provide you all kinds of comparators:
public class SaleComparatorClassic {

public Comparator compareByCustomer(){
return new Comparator(){
public int compare(Sale o1, Sale o2) {
return o1.getCustomer().compareToIgnoreCase(o2.getCustomer());
}
};
}

public Comparator compareByAmount(){
return new Comparator(){
public int compare(Sale o1, Sale o2) {
return (int) (o2.getAmount()-o1.getAmount());
}
};
}

}

As you can see, creating a comparator each time is tedious. Closure and higher order functions aim at simplifying the important code bits so that the actual strategy (how do we sort the sales) is not buried in all the boilerplate code. This is how we could use that class:

List l=new ArrayList();

SalesManagerClassic smc=new SalesManagerClassic(l);
SaleComparatorClassic scc=new SaleComparatorClassic();
Sale[] s=smc.listSales(scc.compareByCustomer());

A Function object in Java is something like that, roughly:

public abstract class Function {
private Object instance;

public Function(Object instance){
this.instance=instance;
}

public Object getInstance() {
return instance;
}
}

So it encapsulates the instance on which the function will be run. We’re not going out of the OO world right now.
Then generics in Java do not (if they do, drop me a line) allow you to specify that a class can take any number of generic type in its definition, so we’ll have to define a concrete class for all possible numbers of parameters:

public abstract class Function0 extends Function {
public Function0(Object instance){
super(instance);
}

public abstract R run();

}

The code above represents an object method that takes no argument and its return type is the generic type R (which could be Void if the method return nothing). Subclasses need to provide the actual implementation of the run method. And a function taking one parameter will be:

public abstract class Function1 extends Function {

public Function1(Object instance){
super(instance);
}

public abstract R run(P param);

public Function0 curry(final P param){
Function0 f=new Function0(getInstance()){
@Override
public R run() {
return Function1.this.run(param);
}
};
return f;
}

}

Here P is the type of the first parameter. For fun I added the curry function: if you know the first parameter but you don’t want to run the function right now, you can create a Function object with no argument that can be run later.

So what would that mean for our sales manager? This is what the code can look like:

public Sale[] listSales(final Function2 func){
Sale[] ret=new Sale[sales.size()];
sales.toArray(ret);
Arrays.sort(ret,new Comparator(){
public int compare(Sale o1, Sale o2) {
return func.run(o1, o2);
}
});
return ret;
}

The boiler plate for the comparator has been moved in one place (of course we still need it, we’re not writing our own version of Java where standard API calls takes our Function objects right now…). The Sale comparator is where the gain is made:

@FunctionAnnotation
public int compareByCustomer(Sale o1,Sale o2){
return o1.getCustomer().compareToIgnoreCase(o2.getCustomer());
}

@FunctionAnnotation
public int compareByAmount(Sale o1,Sale o2){
return (int) (o2.getAmount()-o1.getAmount());
}

The FunctionAnnotation is a simple method level annotation. This tells us: I want to be able to refer to these methods as Function objects. Note that my code for the moment does NOT deal with methods with the same name and different argument list. We’ll worry about that later… Notice for now how the methods are much simpler and how what they do is clear.

Now, what we need to do, is to do automatically the translation between simple methods with the FunctionAnnotation and Function<…> objects that we can pass to our SalesManager. For this we’ll use an annotation process factory. This factory will run on the source code for SaleComparator, and will generate a SaleComparatorFunctions class that will have methods giving you the Function<…> objects:

public static Function2 compareByAmount(final fr.moresmau.jp.func.samples.SaleComparatorFunctional instance){
return new Function2 (instance){
public Integer run(fr.moresmau.jp.func.samples.Sale param0, fr.moresmau.jp.func.samples.Sale param1) {
return instance.compareByAmount( param0, param1);
}
};
}

public static Function2 compareByCustomer(final fr.moresmau.jp.func.samples.SaleComparatorFunctional instance){
return new Function2 (instance){
public Integer run(fr.moresmau.jp.func.samples.Sale param0, fr.moresmau.jp.func.samples.Sale param1) {
return instance.compareByCustomer( param0, param1);
}
};
}

Aren’t we happy this is generated for us ? Then we can just code our calls like that:
List l=new ArrayList();

SalesManagerFunctional smf=new SalesManagerFunctional(l);
SaleComparatorFunctional h=new SaleComparatorFunctional();
Sale[] s=smf.listSales(SaleComparatorFunctions.compareByCustomer(h));

That code is about the same complexity as before. Of course using another class for the function wrapping make the code not as clear as it should be. In a later post I’ll look at alternatives, so for the moment bear with me.

So we have obtained higher order functions from a standard Java class, and we can use these functions instead of comparator objects.

Now of course I haven’t talked about the annotation process factory yet. There’s nothing magic there. We need to implement an AnnotationProcessFactory and to say we want to do something when we encounter our annotation:

public Collection supportedAnnotationTypes() {
return Arrays.asList("fr.moresmau.jp.func.FunctionAnnotation");
}

And we need to implement a AnnotationProcessor that will generate a Functions class for every class that as at least one method with the FunctionAnnotation. For that we need to use the createSourceFile method of the Filer object provided by the environment to create the function file.
Running the factory is done through the JDK apt tool:

apt -s src -factory fr.moresmau.jp.func.FunctionProcessorFactory -d bin -cp bin src/fr/moresmau/jp/func/samples/SaleComparator.java

The code for the Function objects and the AnnotationProcessFactory can be found here, the Sale example here, with unit tests.

Friday, November 17, 2006

Design By Contract in Java using Annotations, Scripting and Aspects

When you write a Java interface, you define a contract. This contract is made of methods and their signatures. Then, in the implementation code, you can write statements to check that the input parameters are not null, that return values are within a correct range, etc. The problem there of course is that if the implementation doesn't do these checks, or not properly, bugs
may creep in and they may be hard to find. "Design by Contract" is a set of techniques that
let you specify conditions on method parameters, return values, etc, in such a way that they
can be associated with interfaces and that the execution of these statements is assured by the
runtime system, freeing the developer from the tedious tasks of throwing
IllegalArgumentExceptions or assertions.

To implement such techniques, you can use some off the shelf frameworks, like jContractor, but it's easy to roll out your own. You need only three components:
  • you need to be able to attach conditions to classes and methods, without cluttering the code. For this, Java annotations are perfect
  • since we're not in Java code proper, we need a way to run our conditions, so we need a language for them. Well, why have another language, just use a Java scripting languages like Beanshell.
  • finally, the Beanshell code needs to be evaluated at runtime before or after methods calls. This is what aspect are for, right? We'll use AspectJ for that.

So, let's see a concrete example. Suppose we implement a Stack (yeah, right, I don't have a lamer example).
We start with the push method:

public interface IStack {
void push(T o);
}

Ok, but we don't like to have null values in this stack. So we want to specify in the interface definition that the first parameter cannot be null:

@ContractMethodAnnotation (
pre = {"p0!=null"}
)
void push(T o);

We cannot access the names of the method parameters at runtime, so we use a simple convention: p0 is the first parameter, etc. You can also define post conditions that can access the return value using the name "return".

Notice that we specify conditions on the interface, but that we will want the checks to apply to all its implementations.

Conditions can access methods and variables of the class you define them on, but be careful, it's easy to create circular references where two method call each other:

boolean isEmpty();

@ContractMethodAnnotation (
post = {"result>=0"}
)
int size();

It would be tempting to define a condition that isEmpty is true if size()==0 and that size()==0 if isEmpty is true... What you can do is define a class invariant condition: a condition that is always true of an object (even if it can transiently be false, it is true before and after public method invocation).

@ContractClassAnnotation (
invariants={"(isEmpty() && size()==0) || (!isEmpty() && size()>0)"}
)
public interface IStack { ... }

The implementation of the aspect is not very complicated. We define pointcuts for public methods that will check class invariants and for annotated methods that will check pre and post conditions. We use the BeanShell Interpreter object to parse and run the condition code. Thanks to the importObject Beanshell command we can run the code as if all methods referenced by the annotation code referenced the current instance.

When we encounter a failure, we can either use an assertion or throw a RuntimeException.
This can be set through the ContractAspect.setUseAsserts method. For fun, I have defined a way to set it through a bean shell configuration file, that is loaded by another aspect, that executes when the ContractAspect is statically initialized.

The code can be found here: ContractAspect.aj is the aspect implementation, ContractClassAnnotation.java and ContractMethodAnnotation.java are the definition of the annotations. SettingsAspect.aj is the aspect that reads the configuration file and turn assertions on or off. An example of a configuration file is here. As you see, it's only one line of Java code.

The Stack example can be found here, with an implementation and a unit test.


Friday, November 10, 2006

My first Haskell adventure game!

After years of coding mainly in Java, I have taken the plunge and looked
into a pure functional language. Haskell is quite popular these days, so I decided to read a book
on it and play around with it myself.
I got a bit frustrated at the maths or geometry examples, when I got to this great article Casting spels in LISP that uses the medium of a text based adventure game to teach the basic notions of LISP.
So I coded a very little (400 lines with data and documentation) text-based game in Haskell.
The code can be found here (You need a little extra module found here). In my enthusiasm I even added some Haddock documentation here.
The main features are:
- use of the IO monad to get user input and give back feedback
- use of the IO monad to save and load gaves
- use of the Data.Map structure
- use of the Show and Read classes to serialize and deserialize a Map structure
- pure functional handling of state: every change in the game state recreates an instance of the
GameState object. Yes, one day I'll look into the State monad to simplify these functions

My first impressions of Haskell:
- the syntax is quite hard to grasp, with loads of different signs. The indenting drove me mad sometimes.
- thanks for the static typing and the compilation. I found that the hardest part was to get the program to compile.
Once the program compiled all right it usually worked as intended.
- I know that pattern matching and guards are only syntaxic sugar for case statements, but I found them (the patterns, the guards) a lot more elegant and easy to understand
- I need to learn how to deal better with "newtype" and pattern matching. I used "newtype" instead of type for some data types because I needed to declare them as instances of Show and Read, but that forced me to use as-patterns for game state.
Surely there's simpler ways to do that
- yes, I know about $ to avoid parenthesis for function parameters, but with a good editor it's ok to manage

Now, if anybody has tips on how to improve the code I'll take them on board. I'm going to look at separating the data from the code, maybe with a simple DSL and parser. And yes, the state or IORef monads...