[Checkers] Next steps on IGJ: case studies, Javari comparison, inference

Wed Mar 19 05:51:58 EDT 2008

Mahmood-

Here is a very long response to your progress report email.  I'm copying it
to the checkers mailing list and to Jaime because it may be of interest to
them.

> I am yet to redo all my case studies from the summer to insure that  
> IGJ Checker is quite usable.

I think you mean that you want to ensure that the tool weaknesses that
reported in the paper have been eliminated.  You've made a lot of important
improvements to the system that have let it run on much larger programs and
that have fixed problems we encountered in practice.  However, you need to
double-check that you've fixed the problems you encountered in the case
study.  (It would be interesting if the framework problems exposed by your
recent testing were different than those exposed by the case studies from
last summer.)

(I think we may be able to report use of fewer annotations, too, thanks to
your method invocation inference.)

> Personally, I have a great confidence  
> in the IGJ Checker.

That's great.

Do you think the checkers framework is completely done?  Is completely done
from the point of view of the IGJ checker?  Is done enough for you (and you
want to move on)?

I agree with you that you should evaluate usability and utility.  A solid
checker (and a solid checkers framework) is a necessary but not sufficient
condition for usability.

Thank you for thinking about this issue!

> 1. Investigating the value-added of IGJ.
> The previous IGJ case studies unrevealed bad code/patterns in some of  
> the libraries, but unfortunately without revealing real bugs.  I am  
> planning to do more case studies to find the value of IGJ type system.
> 
> The value added could be due to bugs found, better engineering  
> practices.
> 
> 2. Usability testing of IGJ
> I might need to do some evangelizing for IGJ to get feedback regarding  
> the type system and its weaknesses and benefits.

These two projects sound similar to me, with the only difference being
whether you do the case studies yourself, or you find another user who
wants to try the system.  Is that correct?  Your decision between those
probably depends on your confidence in its usability and whether you want
to focus on the type system or also evaluate the toolset.

> 3. Comparing IGJ and Javari
> Javari has been on-going long-term project in the PAG lab.  I believe  
> that IGJ is more expressive (e.g. Immutable References), more flexible  
> (e.g. immutability type variables), and simpler (less needed  
> annotations).  However, we haven't yet researched how this extra  
> expressiveness and simplicity benefit the users.

This would be great to investigate.
It might be nice to throw other type systems for immutability into the mix,
too, but I'm not sure what other decent implementations exist.

The main reason we haven't been able to do this up until now is that the
tools haven't been good enough.  Thanks to your work and that of others, we
can finally do so.  (Though Telmo is lagging a bit behind you in terms of
implementation, but the Javari checker is improving a lot too.)

Probably the best way to do this is to annotate the same code with Javari
and with IGJ, and to compare the expressiveness (and the simplicity of the
annotations).

> 4. Investigating ways to improve the type system.

C uses the "restricted" type qualifier for what is more generally called
"unique" in the literature.  This is also related to the notion of
"ownership".  There is a very big literature in this area.  Primarily
because it would take a while to absorb it propose new type systems, I lean
against attempting this in the limited time that you have left.  An
alternate project that might achieve the effect you want is to implement a
checker for an existing ownership or uniqueness type system.  At least some
of the work in that field is purely theoretical; I am not sure how much
experimental work has been done, and thus whether this would be an easy
task or would require a lot of work and possibly yield some new insight.
Whether to take this on depends primarily on whether it is the key
usability problem with your system.  A problem that merely forces users to
annotate a few variables as mutable isn't as bad as something that prevents
users from utilizing the type system at all, so I think #1 and #2 should
precede work on #4 in order to keep you focused on what really matters.

> 5. Type Inference tool and framework
> It seems to me that one of the benefits of Javari over IGJ is the  
> existence of Javarifier. Having an inference tool is essential to  
> attracting user to the system.  I have talked with Matt and Jaime  
> about the various tools and framework to analyze the bytecode (and  
> especially parse annotations for symbols).  Personally, I did not feel  
> impressed by their description of Soot and ASM as it seems to me that  
> the tools are designed for purposes other than type inference.

Jaime had a choice between building on what Matthew Tschantz had started,
or writing a new system entirely from scratch.  Jaime chose to extend
Matthew's implementation.  It's impossible to know which approach was best,
even in retrospect.  Even if we build over from scratch, the existence of
the current Javarifier will tremendously ease the construction of the next
system (e.g., by acting as an oracle), and Jaime's accomplishments go well
beyond just the implementation itself, so his insights would also ease the
task of re-writing.  Obviously, Jaime will continue working on his current
codebase until graduation.

It's a requirement to process classfiles.  At the time Matthew started
Javarifier, ASM and Soot looked like the best choice for processing
classfiles.  An alternative would be to decompile the class files, and then
write a tool that only works on source code.  (This has its own problems,
of course, but maybe it would have been better in retrospect.)

Jaime probably has more information.  Jaime, a couple of pages discussing
the design tradeoffs would be interesting for your thesis, especially in
aiding people who wish to build such tools in the future.  This might
include Mahmood or Victoria Popic, who will be joining PAG and the
qualifiers project shortly.  Maybe you could even sketch that section now,
since Mahmood's questions and Victoria's interest make it timely.

If we want to stick with a classfile tool (which Jaime has shown is a
viable approach), then we should consider WALA.  Here is some text from
wisdom/java-tools.txt:

  WALA: http://wala.sourceforge.net
    IBM "T.J. Watson Libraries for Analysis" of bytecode.
    WALA is a subset of IBM's DOMO program analysis infrastructure.
    Seems like a good choice for new projects (as of late 2006).
    Should be solid, since it is used by commercial projects within IBM.
    Manu Sridharan recommends it (but I think he worked at IBM and had access
    to the implementers).
    Has lots of analyses built in.
    Documentation is a bit spotty (but so is that of other tools like Soot),
    since the developers are primarily trying to solve their own problems
    rather than support a community.
      wala.properties, the Java runtime directory is in "Getting Started":
      http://wala.sourceforge.net/wiki/index.php/UserGuide:Getting_Started
    There's a mailing list (approx 30 messages per month as of 2/2008) at
      http://sourceforge.net/mailarchive/forum.php?forum_name=wala-wala

That file has a lot of information about other tools as well -- but many of
those tools are probably abandonware by now, and that fact should be noted
in the file.

> I remember Mike recommending a type inference framework from a while  
> ago, but I haven't looked at it yet.

There's information on various tools in the group bibliographies, I think.
The only one that I think is even worth trying is JastAdd
(http://jastadd.org/).  It does both type inference and type checking.  I
have been working with its author, Torbjorn Ekman, for months trying to get
his NonNull inference to work, and it still has bugs that render it
unusable.  (Torbjorn's difficulties should come as no surprise to Jaime,
but Torbjorn had said he could have everything set in a day or two thanks
to the flexibility and power of JastAdd.)  I don't know whether it would be
easier to start over or to build on his work.  Building on others' research
prototypes is always a bit scary, and I don't know how much time he has to
help other users.

Another possibility might be write an inference tool that builds somehow on
the checkers framework or at least on the JSR 308 compiler infrastructure.
I don't know whether this approach makes sense.

                    -Mike