[Checkers] IGJ And Type Inference
mahmood at MIT.EDU
Wed Mar 19 12:59:23 EDT 2008
Hi guys again,
I haven't studied the literature available about type inference and
the different inference tools out there yet, so I will simply post
what my intuition and goals are for sanity check. Please correct my
I would like to proceed with the following points:
1. I believe that any type inference tool need to work on both: source
and bytecode. The bytecode is necessary for analysis on third-party
libraries, whose licenses may not allow you to reverse-engineer as
decompilation is a form of reverse-engineering. The source is
essential for users to annotate their own (potentially partially
annotated) code, where annotation should go into the source directly.
However, we can use the bytecode as intermediate representation and
figure out a way to insert the resulting annotations back in the
source. I assume that this mainly done within IDEs, and eclipse
provides a very nice mutable AST tree that allows for tool to modify
2. I believe that it is highly desirable (yet not quite important)
that the type inference tool does not load the classes. Requiring to
load the classes to be annotated would prevent the type inference tool
to be able to annotate the JDK and the libraries the tool is depending
on. It will also force the user to setup the classpath and the jni
native library paths which may not be required for inference (if we
supply the user with a flag for default annotations for those
In another note, my initial assumption is that the core of the type
inference is a constraints solver with the following inputs:
1. Relations between symbols and the operations that are done on them
(e.g. method 'add()' was called on symbol 'myList', method 'setX'
contains an assignment to field 'X', etc).
2. A set of constraints (e.g. can only call mutable method on mutable
receivers; if no mutable method is called on a type it's immutable,
And then generates the least restrictive types that satisfy all the
constraints. The type inference framework should provide a
representation that could be easily constructed to generate the graph
of the first kind.
> Jaime had a choice between building on what Matthew Tschantz had
> or writing a new system entirely from scratch.
I will talk further with Jaime about his experiences with Soot and see
how to refractor his code to allow for other type inferences.
> It's a requirement to process classfiles. At the time Matthew started
> Javarifier, ASM and Soot looked like the best choice for processing
ASM and Soot by first glance gives you a view of bytecode in a very
low level view. Soot is designed for performing optimizations, thus
all the representations are geared towards that. ASM is more general
allows one to analyze the bytecode in almost anyway they like. While
we might still need them to actually parse the files, it might be nice
to have a intermediate framework that efficiently generates a graph
representation that satisfy the first condition.
> WALA: http://wala.sourceforge.net
> IBM "T.J. Watson Libraries for Analysis" of bytecode.
> WALA is a subset of IBM's DOMO program analysis infrastructure.
> Seems like a good choice for new projects (as of late 2006).
> Should be solid, since it is used by commercial projects within
I haven't gone wrong with using an IBM library. I will look at it.
> Another possibility might be write an inference tool that builds
> somehow on
> the checkers framework or at least on the JSR 308 compiler
> I don't know whether this approach makes sense.
The JSR308 infrastructure can be easily extended for a type inference
tool for source, especially that it already parses JSR308
annotations. I don't find that desirable necessarily because of the
classpath limitation described above.
However, if the type inference framework is done rightly, the
relations graph is independent from (yet close to, for performance
reasons) the parser, so we can have ASM, Soot, eclipsec, and javac
front-ends (anything one likes).
More information about the checkers