Friday, October 26, 2007

A simple functional structure in Java

I have made a little progress in creating a functional language on top of Java, code name FLOJ (Functional Language On Java (-:). For the moment I'm only working on single files (I have a working parser) and on the simple language constructs.
It's really interesting, because looking at very simple things makes you think in depth about some issues that you may not really consider in Java proper. For example, this is the definition of a very simple structure:

package fr.moresmau.jp.floj.samples;

public class Simple(String data,int port){

}


So you can see that the package system is Java's and class definitions are very similar. There are no constructors, though. Since nulls are not allowed (nulls are a huge source of bugs, I'll provide a Maybe object or something like that later),
the structure must have all its data filled in. If you want different operations to be done, you just create static functions that take parameters and create what you need, no need for several constructors. So the list of fields is entered straight after the name of the class.
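As a rough sketch of what such static functions could look like on the Java side: the class below is hand-written, not converter output, and SimpleFactory, withDefaultPort, parse and the default port of 80 are all names and values made up for illustration.

```java
// Hand-written stand-in for the structure: two fields, one constructor.
class Simple {
    final String data;
    final int port;

    Simple(final String data, final int port) {
        if (data == null) { throw new NullPointerException("data"); }
        this.data = data;
        this.port = port;
    }
}

// Hypothetical static functions that replace extra constructors.
class SimpleFactory {
    // Assumed default, chosen only for the example.
    static final int DEFAULT_PORT = 80;

    // Builds a Simple when only the data is known.
    static Simple withDefaultPort(final String data) {
        return new Simple(data, DEFAULT_PORT);
    }

    // Builds a Simple from "data:port", or from "data" alone.
    static Simple parse(final String spec) {
        final int idx = spec.lastIndexOf(':');
        if (idx < 0) { return withDefaultPort(spec); }
        final String data = spec.substring(0, idx);
        final int port = Integer.parseInt(spec.substring(idx + 1));
        return new Simple(data, port);
    }
}
```

Every path still ends in the single constructor, so the null check only has to live in one place.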
Simple is a structure with a string and an int. It's easy enough to generate the constructor in Java:

package fr.moresmau.jp.floj.samples;

public class Simple{
public final String data;

public final int port;

public Simple(final String _data, final int _port){
if(_data==null){throw new NullPointerException("_data");}
this.data=_data;
this.port=_port;
}
...


The fields are immutable, of course, so we don't bother with getXXX() methods; we just expose the fields.
And you can then add setters that return a new object:

... public Simple setData(final String _data){
if(_data==null){throw new NullPointerException("_data");}
if (_data.equals(data)){return this;}
return new fr.moresmau.jp.floj.samples.Simple(_data,port);
}

public Simple setPort(final int _port){
if (_port==port){return this;}
return new fr.moresmau.jp.floj.samples.Simple(data,_port);
}
...


Note that the null handling is not ideal; that is left for later. Note also that we avoid superfluous object creation if the data does not change.
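To make the behavior of these setters concrete, here is a small self-contained sketch. The class body mirrors the generated code shown above, reformatted by hand; SetterDemo and the sample values are only there for illustration.

```java
class Simple {
    public final String data;
    public final int port;

    public Simple(final String _data, final int _port) {
        if (_data == null) { throw new NullPointerException("_data"); }
        this.data = _data;
        this.port = _port;
    }

    // Returns this when nothing changes, a new instance otherwise.
    public Simple setData(final String _data) {
        if (_data == null) { throw new NullPointerException("_data"); }
        if (_data.equals(data)) { return this; }
        return new Simple(_data, port);
    }

    public Simple setPort(final int _port) {
        if (_port == port) { return this; }
        return new Simple(data, _port);
    }
}

class SetterDemo {
    public static void main(String[] args) {
        Simple original = new Simple("localhost", 80);
        Simple moved = original.setPort(8080);
        Simple same = original.setPort(80);

        System.out.println(original.port);     // 80: the original is untouched
        System.out.println(moved.port);        // 8080
        System.out.println(same == original);  // true: no new object was created
    }
}
```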

So far, so good.

Then I decided to implement equals, so that equals is based on the data and not pointer equality:


...
public boolean equals(final java.lang.Object o){
if (this==o){return true;}
if (o instanceof fr.moresmau.jp.floj.samples.Simple){
fr.moresmau.jp.floj.samples.Simple co=(fr.moresmau.jp.floj.samples.Simple)o;
if (!this.data.equals(co.data)){return false;}
if (this.port!=co.port){return false;}
return true;
}
return false;
}
...


And with this, of course, the problem of the overridden equals method rears its head. If I want to allow subclasses to add more fields, my equals method won't work for subclasses: it ignores the extra fields.
If I check the exact class in equals, subclasses that only add or override methods will not be equal to their superclasses.
So what can I do? I haven't decided. I could forbid subclasses from defining more data fields (I think that's what you have in Haskell: you inherit functions from your type classes but no data).
I could also make the equals method final and not generate it for subclasses, but equality would then ignore the additional fields.
Another option would be to check the class name in that equals method and generate the proper equals method for all subclasses, but that may cause problems if we then extend the FLOJ class with a Java class (which will be possible since we generate Java code).
But maybe it should be so, that a class is never equal to its subclasses? Reporting equality between two instances that have the same data but different behavior (one instance has an overridden method) is probably an error in the OO world...
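Here is a minimal sketch of the two situations discussed above. It is hand-written, not converter output: Secure, its encrypted field, StrictSimple and EqualsDemo are hypothetical names invented for the example. The first pair of classes shows how instanceof-based equals becomes asymmetric once a subclass adds a field; StrictSimple shows the exact-class check, where a subclass that adds no data still never equals its superclass.

```java
class Simple {
    final String data;
    final int port;

    Simple(final String data, final int port) {
        if (data == null) { throw new NullPointerException("data"); }
        this.data = data;
        this.port = port;
    }

    @Override
    public boolean equals(final Object o) {
        if (this == o) { return true; }
        if (o instanceof Simple) {
            final Simple co = (Simple) o;
            return data.equals(co.data) && port == co.port;
        }
        return false;
    }

    @Override
    public int hashCode() { return 31 * data.hashCode() + port; }
}

// Hypothetical subclass that adds one data field.
class Secure extends Simple {
    final boolean encrypted;

    Secure(final String data, final int port, final boolean encrypted) {
        super(data, port);
        this.encrypted = encrypted;
    }

    @Override
    public boolean equals(final Object o) {
        if (this == o) { return true; }
        if (o instanceof Secure) {
            final Secure co = (Secure) o;
            return super.equals(co) && encrypted == co.encrypted;
        }
        return false;
    }

    @Override
    public int hashCode() { return 31 * super.hashCode() + (encrypted ? 1 : 0); }
}

// Variant that compares the exact runtime class instead of using instanceof.
class StrictSimple {
    final String data;
    final int port;

    StrictSimple(final String data, final int port) {
        if (data == null) { throw new NullPointerException("data"); }
        this.data = data;
        this.port = port;
    }

    @Override
    public boolean equals(final Object o) {
        if (this == o) { return true; }
        if (o == null || getClass() != o.getClass()) { return false; }
        final StrictSimple co = (StrictSimple) o;
        return data.equals(co.data) && port == co.port;
    }

    @Override
    public int hashCode() { return 31 * data.hashCode() + port; }
}

class EqualsDemo {
    public static void main(String[] args) {
        Simple base = new Simple("host", 80);
        Simple sub = new Secure("host", 80, true);
        System.out.println(base.equals(sub)); // true: the extra field is ignored
        System.out.println(sub.equals(base)); // false: equals is no longer symmetric

        StrictSimple strict = new StrictSimple("host", 80);
        StrictSimple behaviorOnly = new StrictSimple("host", 80) { };
        System.out.println(strict.equals(behaviorOnly)); // false: same data, different class
    }
}
```

Neither variant settles the question; they only show where each option breaks.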

Anyway, creating your own language is fun, I'm learning a lot and maybe I'll even end up with something usable!! I'll keep you posted on my progress.

6 comments:

Anonymous said...

Don't want to sound too negative and off topic, but the coding style (formatting) is horrible. The code is totally unreadable. Just couldn't continue reading... Sorry.

JP Moresmau said...

And there aren't even generics yet (-: ... Seriously, apart from the fact that the code shows 1 space instead of tabs and in a couple of places doesn't go to a new line for the return false, I don't see what's so horrible about it.

Anonymous said...

The last snippet:
1. we know what package Object is in, and you defined Simple earlier so no need to type in the full class names.
2. There's nothing wrong with indenting with 1/2/3/4/10 spaces or tabs, but you have to put spaces around binary operators (==, =, +, -, etc.).
3. no need to declare "o" final. You're not referencing it from inside of an embedded anonymous class.

public boolean equals(Object o) {
  if (this == o) {
    return true;
  }
  if (o instanceof Simple) {
    Simple co = (Simple) o;
    if (!this.data.equals(co.data)) {
      return false;
    }
    if (this.port != co.port) {
      return false;
    }
    return true;
  }
  return false;
}

I'm sorry, I don't mean to be bitchy.

JP Moresmau said...

Sorry, you're not bitchy, it's good to have reactions!! What may not be clear in the post is that the code here is generated by my converter from the functional code to Java. Hence the absence of spaces, the aggressive use of final, etc... But yes since the converter generates source files (I didn't want to dive into .class format right now) I could generate prettier output. Thanks!

Anonymous said...

I take everything I said back if that's generated code. :]

Anonymous said...

As for the equals part, some years ago I saw a trick in DDJ which may do the job.

public boolean equals(Object b) {
if (!(b instanceof Simple)) {return false;}
Simple other = (Simple) b;
return this.blindlyEquals(other) && other.blindlyEquals(this);
}

where blindlyEquals is implemented as in your post (equal fields => true)

You get reflexivity and transitivity for free, so to speak.

Hope this helps.
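The trick from the comment above can be sketched roughly as follows. This is one possible reading of it, not the original DDJ code: only the blindlyEquals name comes from the comment, while the Secure subclass, its encrypted field and BlindlyDemo are hypothetical and exist only for the example.

```java
class Simple {
    final String data;
    final int port;

    Simple(final String data, final int port) {
        if (data == null) { throw new NullPointerException("data"); }
        this.data = data;
        this.port = port;
    }

    // Field-by-field comparison as seen from this class; subclasses refine it.
    protected boolean blindlyEquals(final Object o) {
        if (!(o instanceof Simple)) { return false; }
        final Simple co = (Simple) o;
        return data.equals(co.data) && port == co.port;
    }

    // Both sides must agree, so a.equals(b) and b.equals(a) give the same answer.
    @Override
    public final boolean equals(final Object o) {
        if (this == o) { return true; }
        if (!(o instanceof Simple)) { return false; }
        final Simple other = (Simple) o;
        return this.blindlyEquals(other) && other.blindlyEquals(this);
    }

    @Override
    public int hashCode() { return 31 * data.hashCode() + port; }
}

// Hypothetical subclass that adds a data field.
class Secure extends Simple {
    final boolean encrypted;

    Secure(final String data, final int port, final boolean encrypted) {
        super(data, port);
        this.encrypted = encrypted;
    }

    @Override
    protected boolean blindlyEquals(final Object o) {
        if (!(o instanceof Secure)) { return false; }
        return super.blindlyEquals(o) && encrypted == ((Secure) o).encrypted;
    }
}

class BlindlyDemo {
    public static void main(String[] args) {
        Simple base = new Simple("host", 80);
        Simple sub = new Secure("host", 80, true);
        Simple behaviorOnly = new Simple("host", 80) { };

        System.out.println(base.equals(sub));          // false
        System.out.println(sub.equals(base));          // false
        System.out.println(base.equals(behaviorOnly)); // true: no new data, still equal
    }
}
```

With this shape, a subclass that only changes behavior stays equal to its superclass, while one that adds fields does not, and the answer is the same whichever side you ask.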