Friday, July 11, 2008

instance Data Map where -- half done!

While testing my JSON serializer/deserializer, I made the (unpleasant) discovery that some standard structures (most notable Data.Map.Map) do not implement all of Data.Generics.Data interface. They just call error on things like toConstr and gunfold. Which is not good, since I use these methods. Why they can't just pretend that a Map K v is for generics a map of (k,v) I don't understand (they implement gfoldl that way), but hey, I'm only a humble learner of Haskell...

So I spent a few hours trying to modify my JSON code to work around this issue, to no avail (well I can get the serialization all right, but the deserialization seems to be tricky). So I've got another approach to work: since the default for Data instances is not right when your objects contains Map structures, lets rewrite these... Using the source for Data.Map and the documentation for Data.Generics.Data, I managed to get it to work, but of course the price is loads (loads, loads) of boilerplate code.

So here goes: I have the simple object: (assume import qualified Data.Map as M)

data MapObjF=MapObjF (M.Map String Int) String
deriving (Eq,Typeable,Show)

So MapObjF contains a Map String Integer and a String

And here's the Data instance definition (big breath):

instance Data MapObjF where
gfoldl k z (MapObjF m s) = k (k (z (MapObjF . M.fromList)) (M.toList m)) s
gunfold k z _ = k (k (z (MapObjF . M.fromList)))
toConstr (MapObjF _ _) = con_MapObjF
dataTypeOf _ = ty_MapObjF

con_MapObjF = mkConstr ty_MapObjF "MapObjF" [] Prefix
ty_MapObjF = mkDataType "MyModule.MapObjF" [con_MapObjF]

I define the constructor, the datatype, and implement gfoldl and gunfold transforming the map into a list of tuples. Of course if you have several constructors, loads of fields it soon becomes unwieldy. Now, is there a Haskell macro system so I can easily generate all that boilerplate for all my data types? Noooo I don't want to leave Haskell for LISP...

Tuesday, July 08, 2008

Handling errors in JSON to Haskell deserialization

My JSON deserializing code presented previously works, but when an error occurs, most likely because the JSON doesn't represent the objects you expect it to, we're a bit lost: we get a terse error message and our program aborts, because Data.Generics tend to call error when it encounters a problem, and our own calls from Maybe to Just may fail. So we need to build a way to report errors without crashing. Now, famously, there are several ways to report errors, and for a start I'll work with Either.

First, we create a error type that can help us represent both JSON parsing error and deserialization errors:

data JSOND13NError=JSONError ParseError
| D13NError String String
deriving Show

The two parameters in D13NError are the error message and the path in the JSON object where the error occured. They will be carried in the Either monad in a tuple of Strings

So parsing can either throw a JSONError or try to deserialize:

jsonStringToObj :: forall a. Data a => String -> Either JSOND13NError a
jsonStringToObj s= case parse JSON.json "JSON.parse" s of
Left err->Left (JSONError err)
Right js->case (jsonToObj js "/") of
Left (s,path) -> Left (D13NError s path)
Right o-> Right o

The path starts at "/" for the deserialization

To work with Either we need a couple of helper function, one to translate the Maybe values we get from cast to Either:

toEither :: String -> Maybe a -> String -> Either (String,String) a
toEither s Nothing path=Left (s,path)
toEither _ (Just a) _=Right a

And then the monad declaration for Either(message,path):

instance Monad (Either (String,String)) where
return a=Right a
fail s=Left (s,"/")
Right a >>= f2 =f2 a
Left s >>= f2 =Left s

With this, our main function becomes:

jsonToObj :: forall a. Data a => JSON.Value -> String -> Either (String,String) a
jsonToObj (JSON.String s) path=toEither "Not a String" (cast s) path
jsonToObj (JSON.Bool b) path=toEither "Not a Bool" (cast b) path
jsonToObj x path=do
(values,cons)<-case x of
JSON.Object m -> fieldJSONValues m myDataType path
JSON.Number fl -> Right ([],if isPrefixOf "Prelude.Int" (dataTypeName myDataType)
then mkIntConstr myDataType (round fl)
else mkFloatConstr myDataType fl)
JSON.Array [] -> Right ([],indexConstr myDataType 1)
JSON.Array (x:xs) -> Right ([(x,path),((JSON.Array xs),path)],indexConstr myDataType 2)
let StateT f=fromConstrM (StateT (\((x,path):xs) -> do
r<-jsonToObj x path
return (r,xs)
)) cons
r<- f values
return (fst r)
getArg :: a' -> a'
getArg = undefined
getType :: a
getType = undefined
myDataType = dataTypeOf (getArg getType)

We use a StateT monad transformer to wrap our Either monad in the State. I'm not sure I understand 100% of how monad transformers actually work, but it works!

And the fieldJSONValues handles a few error conditions, along with building the path where we're at, by adding after the current path the name of each field in turn

fieldJSONValues :: (M.Map String JSON.Value) -> DataType -> String -> Either (String,String) ([(JSON.Value,String)],Constr)
fieldJSONValues m dt path | isAlgType dt=
if idx>(maxConstrIndex dt)
then Left ((printf "Constructor index %d too big" idx),path)
if null fn
then Right ((map (\x->(fromJust $ M.lookup (show x) m',path++(show x)++"/")) (sort $ map (\x->((read x)::Int)) (M.keys m'))),c)
let vals=map (\(x,y)->(fromJust x,y)) (filter (\(x,y)->isJust x) (map (\x->((M.lookup x m'),path++x++"/")) fn))
in if (length vals) < (length fn)
then Left ("Not enough fields",path)
else Right (vals,c)
idx=case M.lookup constrIndexField m of
Just (JSON.Number f)->truncate f
_ -> 1
m'=M.delete constrIndexField m
c=indexConstr dt idx
fn=constrFields c
fieldJSONValues m dt path | otherwise =Left ("Not an algebraic type",path)

Once I had the previous code working, using a different monad and adding error handling was easy. I suppose I could further abstract and try not to hard code the Either monad in it but use any type of Monad or MonadError, but this gives me what I want: no program crash and the ability to recover from errors!