Friday, June 24, 2011

EclipseFP: the way forward??

This is not the usual "hey I managed to get some code to work, check it out" blog post. Bear with me as I'm just thinking out loud. This is about EclipseFP and how we can move it forwards.

At the moment EclipseFP works pretty well for simple, smallish Haskell projects. People have noted some performance issues on large projects, and part of it is the huge memory footprint of GHC when used through its API. After a couple of hours of working on a Yesod project (so we're talking quasi-quotation, Template Haskell, etc.), the Scion server takes 1 GB of memory... Restarting brings it back to reasonable levels, which seems to point to memory leaks in the GHC API.
Maybe more critically, people are now asking for features that are increasingly hard to implement using Scion:
- static GHC flags like -threaded (Nominolo's working on a version of Scion that could handle these)
- C or HSC sources (or other kinds of sources requiring a preprocessor)
- Having executables or test suites referencing the library component directly in the Cabal file

The main thing with these is that Cabal handles them fine, so basically typing cabal build at the command prompt solves these issues, which undermines the usefulness of having an IDE in the first place... But they require more than just calling the GHC API: building and linking a library in-place, calling a C compiler, hsc2hs, etc...
Of course Cabal publishes an API. So, for example, I could in theory call from Scion the code Cabal uses to build an in-place version of the library. But then Cabal calls the ghc executable, so the Cabal API gives me no easy way to get back errors from GHC, etc...

What does accessing the GHC API give us? An AST that can be generated from an unsaved file (so you don't need to save your Haskell source file to see your errors, an up-to-date outline, etc.). All the rest (syntax highlighting, jump to definition, etc.) could probably be done from the AST.

So I'm wondering now whether scrapping Scion entirely might be a good idea. A rough outline of how things could work would be:
- create a temp folder in which we will work, copy in it all the project files (sources, etc)
- use cabal to build in that folder
- parse cabal output to get errors and their location
- unsaved files in the EclipseFP editor could have their unsaved contents copied into that folder (that folder represents the current state of the project, which may not be what is really saved on disk)
- have only a simple Haskell executable that, given a module file, loads it using all the built files in the temp folder and returns the AST in some easy-to-parse format. Running it as a short-lived process that exits once it has returned its result would ensure GHC cannot grab loads of memory and never release it
- the IDE code then gets the AST and does all it needs with it
- the AST could even be saved in that temp folder in some format, so that the Java code would only request it again when the original file is more recent than the saved AST, which would allow easy parsing of dependent modules' ASTs (for example to retrieve all the exported symbols, etc.)
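
The "parse cabal output" step above could look roughly like this on the Java side. This is only a sketch under assumptions: the names BuildOutputParser, Diagnostic and parse are made up for illustration, and it assumes GHC reports each problem as a "file:line:column:" header followed by indented message lines. Span-style locations such as "12:5-10" are not handled here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of EclipseFP or Scion.
final class BuildOutputParser {

    /** One problem: its location plus the full, possibly multi-line, message. */
    static final class Diagnostic {
        final String file;
        final int line;
        final int column;
        final String message;

        Diagnostic(String file, int line, int column, String message) {
            this.file = file;
            this.line = line;
            this.column = column;
            this.message = message;
        }
    }

    // Assumed header shape: "path/File.hs:LINE:COL:" optionally followed by text.
    private static final Pattern HEADER =
            Pattern.compile("^(\\S.*?):(\\d+):(\\d+):\\s*(.*)$");

    static List<Diagnostic> parse(String output) {
        List<Diagnostic> result = new ArrayList<>();
        String file = null;
        int line = 0;
        int column = 0;
        StringBuilder message = null; // non-null while inside a diagnostic

        for (String raw : output.split("\\r?\\n")) {
            Matcher m = HEADER.matcher(raw);
            if (m.matches()) {
                // A new header closes the previous diagnostic, if any.
                if (message != null) {
                    result.add(new Diagnostic(file, line, column, message.toString().trim()));
                }
                file = m.group(1);
                line = Integer.parseInt(m.group(2));
                column = Integer.parseInt(m.group(3));
                message = new StringBuilder(m.group(4));
            } else if (message != null && (raw.startsWith(" ") || raw.startsWith("\t"))) {
                // Indented lines continue the current message.
                message.append('\n').append(raw.trim());
            } else if (message != null) {
                // A blank or non-indented line ends the current diagnostic.
                result.add(new Diagnostic(file, line, column, message.toString().trim()));
                message = null;
            }
        }
        if (message != null) {
            result.add(new Diagnostic(file, line, column, message.toString().trim()));
        }
        return result;
    }
}
```

Keeping every indented continuation line means a long GHC message ends up whole in the Diagnostic, so the IDE can decide how much of it to show.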

There are a lot of ugly bits in there, maybe, but that would solve my problems... I suppose all of that could be written in Haskell and we could keep an API similar enough to Scion's, so that the Java code wouldn't need to change too much...

For the moment I'm not going to rush into things anyway, but if anybody has some feedback, I'll gladly take any opinion on board!

6 comments:

clanehin said...

I've noticed the memory leak also. I was considering submitting some kind of patch that would simply shut down scion after a user-configurable period of inactivity. I would observe that compilers in general are not designed for long-running operation. :)

I also encountered some other problems: roguestar has multiple binaries that need to be aware of each other's file system paths. Also, eclipsefp does not always report full error messages (ghc error messages can be as much as a page long with some extensions, with the important bit not always at the top). Both of these facts had me switching between eclipse and my terminal window frequently.

Nevertheless, I find the fast feedback and go-to-declaration in eclipse to be good enough, on balance, that I kept using eclipsefp.

(Unfortunately I've been out of action with hardware failures since the beginning of the month.)

My personal opinion is that I'd like to see steady incremental improvement. The idea of any kind of re-write makes me a little uncomfortable.

Thomas Schilling said...

Actually, the plan for the new design for Scion is to manage multiple short-running processes and save most of the intermediate state to disk. I don't see how removing Scion would improve anything here.

JP Moresmau said...

Clanehin, yes I know that rewrites are dangerous, that's why I'm venting my thoughts before doing anything there.
Thomas, what are the plans for Scion? When do you plan to have that new version ready? Can we help in any way?

vol said...

How do other IDEs do it? Would be interesting to compare.

Keep up the good work, I'm most grateful for something approaching a Haskell IDE.

Do you plan on releasing 2.0.5?

Pradeep said...

I have seen ReSharper, an excellent plugin for Visual Studio, parsing and writing to a scratch dir. I have also seen them doing interesting stuff with the parsed AST.

Look at this page for some cool stuff http://www.jetbrains.com/resharper/features/index.html

It would be wonderful to have refactoring in Haskell; much better possibilities lie there.

JP Moresmau said...

I suppose we'll release 2.0.5 after the summer, when Alejandro has completed enough functionality to make it worthwhile.