[Checkers] IGJ And Type Inference

Jaime Quinonez jaimeq at MIT.EDU
Wed Mar 19 14:47:10 EDT 2008


Just to clear up some points:

> 1. I believe that any type inference tool needs to work on both
> source and bytecode.  The bytecode is necessary for analyzing
> third-party libraries, whose licenses may not allow reverse
> engineering (and decompilation is a form of reverse engineering).
> The source is essential for users annotating their own (potentially
> partially annotated) code, where the annotations should go directly
> into the source.  However, we can use the bytecode as an
> intermediate representation and figure out a way to insert the
> resulting annotations back into the source.  I assume that this is
> mainly done within IDEs, and Eclipse provides a very nice mutable
> AST that allows a tool to modify the tree.
>

So the fact that some code will be partially annotated does not
affect whether to build a source- or bytecode-based inference tool,
since the tool will always be able to read the partial annotations
and decide what the right defaults should be.  As for inserting
annotations back into source code, that's one of the selling points
of the annotation file format: it is a simple format for storing
annotations that can then be inserted into either the source or the
bytecode.  If the tool outputs an annotation file, it leaves this
decision up to the user.
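
For instance, here is a minimal sketch of what "read the partial
annotations and infer the rest" looks like on Javari-style code.  The
class and field names are made up for illustration, and the qualifier
is written as a comment annotation rather than anything tied to a
particular tool:

    // Minimal sketch: a partially annotated class (names are invented).
    // The user has already marked getName()'s return type readonly; an
    // inference tool keeps that annotation as given and only needs to
    // infer qualifiers for the remaining, unannotated declarations.
    public class Customer {
        private StringBuilder log = new StringBuilder();
        private String name;

        // User-written annotation, shown in comment form; the tool
        // reads it as-is.
        public /*@ReadOnly*/ String getName() {
            return name;
        }

        public void record(String entry) {
            log.append(entry);  // mutation: 'log' cannot be inferred readonly
        }
    }

The inferred qualifiers could then be written out as an annotation
file, and the user decides whether they get inserted into the source,
the bytecode, or both.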

> 2. I believe it is highly desirable (though not essential) that the
> type inference tool not load the classes it analyzes.  Requiring
> the classes being annotated to be loaded would prevent the type
> inference tool from annotating the JDK and the libraries the tool
> itself depends on.  It would also force the user to set up the
> classpath and the JNI native library paths, which may not be
> required for inference (if we supply the user with a flag for
> default annotations for those libraries).
>

What do you mean by 'load', exactly?  To Javarify the JDK, the
Javarifier has Soot read about 2,000 JDK classes from a couple of jar
files (e.g. rt.jar).  I think it has problems with only 512 MB of
memory, but with 1 GB it does fine.
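
To make that distinction concrete, here is a rough sketch of the kind
of static reading involved.  The Soot API names below are quoted from
memory and may differ across versions, so treat them as assumptions;
the point is that the class files are parsed into Soot's own
representation and nothing is ever class-loaded into the analyzing
JVM:

    import soot.Scene;
    import soot.SootClass;
    import soot.options.Options;

    public class ReadWithoutLoading {
        public static void main(String[] args) {
            // Point Soot at the jar to analyze; this does not touch the
            // JVM's own classpath or any JNI library path.
            Options.v().set_soot_classpath("/path/to/rt.jar");

            // Parse the class file (and the classes it references)
            // statically, into Soot's intermediate representation.
            SootClass c = Scene.v().loadClassAndSupport("java.util.ArrayList");

            // The parsed class is now available for constraint generation.
            System.out.println(c.getName() + ": " + c.getMethodCount() + " methods");
        }
    }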

> On another note, my initial assumption is that the core of the type
> inference is a constraint solver with the following inputs:
> 1. Relations between symbols and the operations performed on them
> (e.g. method 'add()' was called on symbol 'myList', method 'setX'
> contains an assignment to field 'X', etc.).
> 2. A set of constraints (e.g. a mutating method may only be called
> on a mutable receiver; if no mutating method is called through a
> reference, it is immutable; etc.).
> The solver then generates the least restrictive types that satisfy
> all the constraints.  The type inference framework should provide a
> representation that can be constructed easily to capture the graph
> of the first kind of input.
>

I think what you're describing is a general framework for qualifier
inference.  That generality, along with the fact that you need
context-sensitivity and field-sensitivity to get good results, makes
the JQual paper difficult to read if you want to understand the fine
details of inferring readonly.  You may actually want to start by
reading the Javarifier ECOOP paper, which describes a constraint-based
system that infers Javari readonly qualifiers.  (Also, any feedback
you give me would be helpful in finishing the final version of that
paper.)  I've put a PDF at:

http://people.csail.mit.edu/jaimeq/javarifier-ecoop.pdf
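
As a toy illustration of the constraints you sketched above (all
names here are invented, and the qualifiers are shown as comments
rather than real annotations): calling a mutating method such as
add() through a reference forces that reference to be mutable, purely
reading uses leave it free to be readonly, and the solver picks the
least restrictive assignment:

    import java.util.ArrayList;
    import java.util.List;

    // Toy sketch of use-site constraints driving readonly inference.
    public class ConstraintSketch {
        static int sum(/*inferred: readonly*/ List<Integer> xs) {
            int total = 0;
            for (int x : xs) {   // only reads: nothing forces xs to be mutable
                total += x;
            }
            return total;
        }

        static void pad(/*inferred: mutable*/ List<Integer> ys) {
            ys.add(0);           // add() mutates the receiver, so ys must be mutable
        }

        public static void main(String[] args) {
            List<Integer> myList = new ArrayList<Integer>();
            pad(myList);                      // myList flows to a mutable parameter
            System.out.println(sum(myList));  // a readonly use adds no constraint
        }
    }

Run as plain Java this just prints 0; the interesting output of a
real inference tool would of course be the inferred qualifiers, not
the sum.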

Jaime


