[Checkers] Nullness inference

Sat Dec 27 05:04:32 EST 2008

Fausto-

Thanks for your message.  It's great to hear from you, and I hope you are
enjoying a well-deserved Christmas break after a first term teaching.
Teaching is tremendous fun, but can also be exhausting.

I have cc'ed the checkers mailing list, which contains the developers of
the Checker Framework.

It's great that your tool can generate stub files; that is very useful, and
is just what we had in mind.

Mahmood, can you update us on the status of support for stub files?  It's
easy for me to write a little script that converts them into skeleton files
in the short term, of course.

Thanks for sending the stub files for JLex.  I notice that the JLex
maintainer recommends (http://cscott.net/Projects/) using JFlex
(http://jflex.de/) instead.  I think it's great to start with any program
at all in order to improve the tool.  I also think that when doing case
studies, we will learn most from analyzing programs that are in use and
that people care about.  For instance, we will be able to ask a developer
whether a particular reference was intended to be nullable, and bugs we
discover may be validated and fixed, which is more impressive than finding
un-verified bugs in legacy software.

One possibility as another modest piece of software that you could analyze
is the "utilMDE" package of the Daikon tool
(http://groups.csail.mit.edu/pag/daikon/download/).  It is primarily a
library, but you could use its test suite as a main program that invokes it
in many typical ways.  An advantage of using the utilMDE package, beyond
that is is actively used and maintained, is that the source code is already
fully annotated with @Nullable annotations.  (These are the complement of
@NonNull annotations:  every reference can be thought of as having one or
the other, so only one of the two is necessary.)  That means that you can
directly compare the precision of your tool with that of our (presumably
less precise) type-checker.  As a side benefit, that should make it
possible to more quickly find bugs in both tools.

As another example, the Checker Framework contains annotated versions of
many parts of the JDK.  However, the JDK is not type-safe with respect to
generics and is often written in a low-level style, so I do not think it is
a very good case study.

I'm happy to collaborate on analyzing other pieces of software as well.

> Is it ok to put the stub files into a directory structure mirroring the
> package hierarchy structure?

Yes, that is a convenient way to store them.

> There are some limitations that I forsee:
> - Since I analyse Java bytecode rather than source code, the parameter
> names are lost and replaced with the local variable used to hold them.

I don't think this is a serious problem; the synthetic variable names are
fine.  (If the bytecode was produced using the -g option to the java
compiler, then the parameter names are available in the class file.)

> - Moreover, nullness annotations which are internal to a method can be
> computed, but they are hard to relate to the source code. For instance,
> the analyser infers that in:
> 
> Object o = method_that_never_returns_null();
> 
> variable o is @NonNull, but it has not the source code to express this
> in any way. However, I think that this is not important for a stub file,
> which only deals with signatures.

You are right.  Furthermore, since our inference tools perform local
flow-sensitive type qualifier inference, annotations on local variables are
only rarely necessary in order for our type checkers to verify an
implementation.  And when only the signature is being used, the local
variable annotations aren't necessary at all.

> - Also consider that I do not use the @PolyNull annotation at the
> moment. The analyser does use dependencies in the analysis, also more
> precise than that expressed by @PolyNull, but I will think about it
> later. I think that I need some clarification first about the formal
> meaning of that annotation.

OK.  Please let us know your questions, and we will be very happy to
clarify -- and those questions will enable us to improve our documentation
as well!

We've found that PolyNull makes a very substantial difference in precision
in practice, so that will be a welcome feature to add.

In addition to the limitations that you mentioned, I have another question,
about generics.  Does your tool infer qualifiers for generic type
arguments?  For example, could it infer that a list contains non-null
elements, as in List<@NonNull String> ?  If not, then that is likely to
significantly impact precision.  Our type-checkers target Java 5 code.
(This seems reasonable to us, since Java 5 was publicly released over 4
years ago, and since anyone who is really interested in software quality
is almost certainly taking advantage of the increased type safety that
generics provide.)

Thanks again, and let us know as your tool evolves.  We are also happy to
help with answering questions, running experiments, and in other ways.

                    -Mike