Friday, September 24, 2010
Haskell Neural Network: plugging a space leak
First, the good news: last week, when I posted about my little digit recognition program, I had made a mistake in my test code. When I give a rounded eight to my network, it does recognize it, even though it's been trained only on a square eight! Yoohoo!
Also, a few people pointed me to online databases of handwritten digits so that I could get more data to train my network. I duly downloaded some files and started working with them. Soon I knew I had a problem: not only was the learning slow, but the memory usage of the program was growing out of control, reaching 400MB after only a few iterations. Ha ha, I thought, so it's me against lazy evaluation again!
Well, I was right. A couple of strictness annotations in the neuron output calculation function helped performance, making the code nearly twice as fast, but didn't dent the memory consumption. Going mad with ! (BangPatterns is your friend) didn't help any further. WTF?
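To give the flavor of what I mean by strictness annotations, here's a sketch of the kind of function involved; the names and representation are made up for illustration, not lifted from my actual program:

    {-# LANGUAGE BangPatterns #-}

    -- Hypothetical neuron output: a weighted sum of inputs pushed through
    -- a sigmoid. The bang on acc forces the accumulator at every step, so
    -- we don't build a chain of (+) thunks as long as the input list.
    neuronOutput :: [Double] -> [Double] -> Double
    neuronOutput weights inputs = sigmoid (go 0 weights inputs)
      where
        go !acc (w:ws) (x:xs) = go (acc + w * x) ws xs
        go !acc _      _      = acc
        sigmoid x = 1 / (1 + exp (negate x))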
Then I started to read a bit more about seq and Weak Head Normal Form: when I put a bang pattern in front of a tuple, it evaluates the tuple constructor but not what's inside it. And, lo and behold, if at each learning iteration I traced the network to standard output, memory usage stayed constant! So I needed to tell Haskell to fully evaluate the network at each iteration, so as not to keep thunks full of pending arithmetic hanging around. A distant memory of a package called deepseq provided the final help I needed. I didn't use the package itself, because I really wanted to understand what was going on, so I rolled out a few lines of my own:
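I won't reproduce my exact network types here, but a minimal sketch of the idea, assuming a network built from nested lists and tuples of Doubles, looks like this:

    -- A tiny class in the spirit of the deepseq package: force a value
    -- all the way down, not just to weak head normal form.
    class DeepSeq a where
        deepSeq :: a -> b -> b

    instance DeepSeq Double where
        deepSeq = seq            -- a Double in WHNF is fully evaluated

    instance DeepSeq a => DeepSeq [a] where
        deepSeq []     b = b
        deepSeq (x:xs) b = deepSeq x (deepSeq xs b)

    instance (DeepSeq a, DeepSeq b) => DeepSeq (a, b) where
        deepSeq (x, y) b = deepSeq x (deepSeq y b)   -- seq alone would stop here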
And by calling deepSeq on my network at each iteration, my program runs in constant space! And I looked at the flat line of memory usage in my task manager, and I was pleased. Now I can try to improve the speed of the dang thing, but that's another set of problems...
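Schematically, the driver ends up looking like this, with the DeepSeq class above; step stands in for one learning pass over the data, a name I'm making up here:

    -- Force the freshly updated network before recursing, so each iteration
    -- starts from fully evaluated weights instead of a tower of thunks.
    train :: DeepSeq net => Int -> (net -> net) -> net -> net
    train 0 _    net = net
    train n step net = let net' = step net
                       in  net' `deepSeq` train (n - 1) step net'

With strict Doubles at the leaves, each call to deepSeq collapses the pending arithmetic right away instead of letting it pile up across iterations.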