Friday, June 20, 2008

Serializing Haskell objects to JSON

I'm trying to do some simple serialization of Haskell objects to save some state to disk. After tearing some of my hair out debugging parse errors caused by silly code I'd written in Read instance declarations, I've decided to use another approach. I'm going to save objects as JSON, since I've already used that JSON library.
The first task is to write generic code that can serialize a Data instance to JSON. This has probably been done somewhere else, but I need to learn, right? So I took a deep breath and dived into Data.Generics.
I quickly rounded up the issues I would face. Most notably, Strings and lists are accessed without any syntactic sugar, so to speak: Strings are lists of Char, and a non-empty list is a constructor with two fields, the head and the rest. Of course, you say, but from that I need to regenerate a JSON String or a list of JSON Value objects.
So, to start, here is how to recognize Strings so they can be treated differently from other algebraic types:

isString :: Data a => a -> Bool
isString a = (typeOf a) == (typeOf "")


There are probably better ways to do that, just tell me (-:
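One possible alternative, sketched here rather than taken from the post's final code, is to attempt a cast to String and check whether it succeeds. The name isStringByCast is made up for this illustration; cast comes from Data.Typeable.

```haskell
import Data.Data (Data)
import Data.Maybe (isJust)
import Data.Typeable (cast)

-- Hypothetical alternative to isString: a value is a String exactly when
-- casting it to String succeeds. Data a implies Typeable a, which cast needs.
isStringByCast :: Data a => a -> Bool
isStringByCast a = isJust (cast a :: Maybe String)
```

The cast also hands back the String itself, so the same call could replace the fromJust/cast pair used later when building the JSON value.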

Lists (that are not strings) can be recognized through the abstract representation of their type constructor, which is equal regardless of the actual type of the elements in the list:

isList :: Data a => a -> Bool
isList a
    | isString a = False
    | otherwise = (typeRepTyCon (typeOf a)) == (typeRepTyCon $ typeOf ([]::[Int]))
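To see why comparing type constructors works, and why the String check has to come first, here is a small self-contained sketch. The helper name sameTyCon is invented for this illustration; typeOf and typeRepTyCon are the same Data.Typeable functions used above.

```haskell
import Data.Typeable (Typeable, typeOf, typeRepTyCon)

-- Hypothetical helper: True when both values share the same outermost
-- type constructor, whatever the type arguments are.
sameTyCon :: (Typeable a, Typeable b) => a -> b -> Bool
sameTyCon x y = typeRepTyCon (typeOf x) == typeRepTyCon (typeOf y)
```

With this, sameTyCon [True] ([] :: [Int]) holds even though the element types differ, while sameTyCon (Just 'x') ([] :: [Int]) does not. Note that sameTyCon "abc" ([] :: [Int]) also holds, because a String is just [Char]; that is the reason for ruling out Strings before testing for a list.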


Now, transforming a list of the form (head,rest) to [head1,head2...]

jsonList :: Data a => a -> [JSON.Value]
jsonList l =
    concat $ gmapQ f l
    where f b
            | (isList b) = jsonList b
            | otherwise = [objToJson b]


For each child (there are two for a non-empty list and none for the empty list), we either reapply the same method, if it is the inner list, or simply transform it to JSON.
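One caveat with deciding by type alone: when the elements are themselves lists, the head also passes the list test, so a value like [[1,2],[3]] comes out as three flat values. A possible alternative, sketched here under the assumption that the argument has already been identified as a list, is to pick children by position with gmapQi (child 0 is the head, child 1 is the tail). The name listElems is made up for this illustration.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Data (Data, gmapQ, gmapQi, toConstr)

-- Hypothetical helper: apply f to every element of a list that is only
-- visible through its Data instance. Assumes the argument really is a list.
listElems :: Data a => (forall d. Data d => d -> r) -> a -> [r]
listElems f l
  | null (gmapQ (const ()) l) = []            -- the empty list has no children
  | otherwise =
      gmapQi 0 f l                            -- child 0: the head, handled by f
        : gmapQi 1 (listElems f) l            -- child 1: the tail, walked recursively

-- Small usage example: the constructor name of each element.
elemConstrs :: Data a => a -> [String]
elemConstrs = listElems (show . toConstr)
```

Because the head is always passed to f and never re-examined as a list, a list of two inner lists yields two results rather than a flattened sequence.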

And then the actual method on objects:

objToJson :: Data a => a -> JSON.Value
objToJson o | isString o = JSON.String (fromJust $ ((cast o)::(Maybe String)))
objToJson o | isList o = JSON.Array (jsonList o)
objToJson o | otherwise =
    let
        c = toConstr o
    in
        case (constrRep c) of
            AlgConstr _ -> JSON.Object (Data.Map.fromList (zip (constrFields c) (gmapQ objToJson o)))
            StringConstr s -> JSON.String s
            FloatConstr f -> JSON.Number f
            IntConstr i -> JSON.Number (fromIntegral i)


We first handle Strings, then lists, then general objects, using constrRep to distinguish between algebraic types, which become JSON objects with the proper field names (via constrFields), and the other representations, which map to JSON primitives.
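To try the dispatch order end to end without the JSON library, here is a self-contained sketch. Everything in it is an assumption for illustration: Value is a stand-in for the library's type (with an association list instead of Data.Map), Person is an invented record, and toValue/listValues are hypothetical names. It also assumes a recent GHC, where ConstrRep has CharConstr and a Rational-valued FloatConstr rather than the constructors shown above, and it walks lists by position with gmapQi instead of the concat-based approach.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data

-- Hypothetical stand-in for the JSON library's value type.
data Value
  = VString String
  | VNumber Double
  | VArray [Value]
  | VObject [(String, Value)]
  deriving (Eq, Show)

-- Invented example record; deriving Data is what makes it serializable.
data Person = Person { name :: String, age :: Int, tags :: [String] }
  deriving (Data)

-- True for any list type (Strings are ruled out earlier, in toValue).
isList :: Data a => a -> Bool
isList a = typeRepTyCon (typeOf a) == typeRepTyCon (typeOf ([] :: [Int]))

-- Elements of a list seen through its Data instance: child 0 is the head,
-- child 1 is the tail; the empty list has no children.
listValues :: Data a => a -> [Value]
listValues l
  | null (gmapQ (const ()) l) = []
  | otherwise = gmapQi 0 toValue l : gmapQi 1 listValues l

-- Strings first, then lists, then dispatch on the constructor representation.
toValue :: Data a => a -> Value
toValue o
  | Just s <- cast o = VString s
  | isList o         = VArray (listValues o)
  | otherwise        = case constrRep c of
      AlgConstr _   -> VObject (zip (constrFields c) (gmapQ toValue o))
      IntConstr i   -> VNumber (fromIntegral i)
      FloatConstr r -> VNumber (fromRational r)
      CharConstr ch -> VString [ch]
  where
    c = toConstr o
```

For example, toValue (Person "Ann" 30 ["a", "b"]) gives an object with the fields name, age and tags. A constructor declared without record syntax has no field names, so zip produces an empty object for it; that limitation carries over from the zip/constrFields approach.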

And that's it for the serialization! The Generics package is not that hard to use, but you have to look up examples to figure out how to actually use functions like gmapQ and such...

Now, I have to work on the opposite process: given a type and JSON data, reconstruct the objects... More Haskell fun!

3 comments:

Anonymous said...

Great article. Also see this if you haven't - http://lingnerd.blogspot.com/2007/12/pushing-haskells-type-system-to-limits.html

I'd like to use this for serializing my own data type. I guess I need an instance of Data for whatever type I declare.

JP Moresmau said...

Funny, I mentioned that blog post in the next blog entry about deserializing... Yes, you need to specify that your objects derive Data. I've realized that some structures like Data.Map or Data.Set raise an error on toConstr and such, so they won't work. I'll work on implementing support for those.

Jeff said...

I must have just had a blogfart. I think I did get the link for that article from you in the first place. I don't know - I keep too many firefox tabs open. Sorry.

BTW, I've combined your http server and the code for the Shellac backend together to make a JSON based webapp framework. You can grab it here -
git clone git://jaxclipse.com/monadServ.git

It is a bit rough around the edges and still only single threaded. I've got a sample app included and I've used it to make some more sophisticated apps as well.