Annotation File Format Specification
http://types.cs.washington.edu/annotation-file-utilities/
Java annotations are meta-data about Java program elements, as in “@Deprecated class Date { ... }”. Ordinarily, Java annotations are written in the source code of a .java Java source file. When javac compiles the source code, it inserts the annotations in the resulting .class file (as “attributes”).
Sometimes, it is convenient to specify the annotations outside the source code or the .class file.
Doing so requires an external, textual file format for Java annotations. The external file format should be easy for people to create, read, and modify. An “annotation file” serves this purpose by specifying a set of Java annotations. The Annotation File Utilities (http://types.cs.washington.edu/annotation-file-utilities/) are a set of tools that process annotation files.
The file format discussed in this document supports both standard Java SE 6 annotations and also the extended annotations proposed in JSR 308 [Ern08]. Section “Class File Format Extensions” of the JSR 308 design document explains how the extended annotations are stored in the .class file. The annotation file closely follows the class file format. In that sense, the current design is extremely low-level, and users probably would not want to write the files by hand (but might fill in a template that a tool generated automatically). As future work, we should design a more user-friendly format that permits Java signatures to be directly specified. Furthermore, since the current design is closely aligned to the class file, it is convenient for tools that operate on .class files but less convenient for tools that operate on .java files. For the short term, the low-level format will serve our purpose, which is primarily to enable testing by the Javari developers.
By convention, an annotation file ends with “.jaif” (for “Java annotation index file”), but this is not required.
Throughout this document, “name” is any valid Java simple name or fully qualified name, “type” is any valid type, “value” is any valid Java constant, and quoted strings are literal values. The Kleene qualifiers “*” (zero or more), “?” (zero or one), and “+” (one or more) denote plurality of a grammar element. A vertical bar (“|”) separates alternatives. Parentheses (“()”) denote grouping, and square brackets (“[]”) denote optional syntax, which is equivalent to “( ... ) ?”.
In the annotation file, whitespace (excluding newlines) is optional with one exception: no space is permitted between an “@” character and a subsequent name. Indentation is ignored, but is encouraged to maintain readability of the hierarchy of program elements in the class (see the example in section 3).
Comments can be written throughout the annotation file using the double-slash syntax employed by Java for single-line comments: anything following two adjacent slashes (“//”) until the first newline is a comment. This is omitted from the grammar for simplicity. Block comments (“/* ... */”) are not allowed.
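The comment rule above can be sketched in Java as follows. This is only an illustration: the class and method names are hypothetical and not part of the Annotation File Utilities, and treating “//” inside a double-quoted string value as ordinary text is an assumption that this specification does not state.

```java
// Hypothetical helper for illustration; not part of the Annotation File Utilities.
final class JaifComments {

    /** Returns the given line with any trailing "//" comment removed. */
    static String stripComment(String line) {
        boolean inString = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the escaped character, e.g. \" inside a string
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                // Assumption: slashes inside a string value do not start a comment.
                inString = true;
            } else if (c == '/' && i + 1 < line.length() && line.charAt(i + 1) == '/') {
                return line.substring(0, i);
            }
        }
        return line;
    }
}
```

For example, `JaifComments.stripComment("field bar: // no annotation")` returns `"field bar: "`. Trailing whitespace is left in place because whitespace is optional in the format.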
The annotation file itself contains one or more package definitions; each package definition describes one or more annotations and classes in that package.
| annotation-file ::= |
| package-definition+ |
The annotation file may omit certain program elements — for instance, it may mention only some of the packages in your program, or only some of the classes in a package, or only some of the fields or methods of a class. Program elements that do not appear in the annotation file are treated as unannotated.
Package definitions describe a package containing a list of annotation definitions and classes. A package definition also contains any annotations on the package itself (such as those from a package-info.java file).
| package-definition ::= |
| # To specify the default package, omit the name. |
| # Annotations on the default package are not allowed. |
| “package” [ name? “:” annotation* ] “\n” |
| ( annotation-definition | class-definition ) * |
An annotation definition describes the annotation’s fields and their types, so that they may be referenced in a compact way throughout the annotation file.
The Annotation File Utilities can read annotation definitions from the classpath, so it is optional to define them in the annotation file.
If an annotation is defined in the annotation file, then it must be defined before it is used. (This requirement makes it impossible to define, in an annotation file, an annotation that is meta-annotated with itself.) In the annotation file, the annotation definition appears within the package that defines the annotation. The annotation may be applied to elements of any package.
| annotation-definition ::= |
| “annotation” “@”name |
| [ “:” annotation-field-definition+ ] |
| “\n” |
| annotation-field-definition ::= |
| type name “\n” |
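The define-before-use requirement for annotations that are defined in the annotation file can be checked with a simple two-pass scan. The sketch below is a rough illustration under several assumptions: the class and method names are hypothetical, names are compared literally (a simple name and a fully qualified name are not matched against each other), comments are assumed to have been removed already, and an “@” inside a string value would be misread as a use. Annotations that are never defined in the file are ignored, since they may come from the classpath.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical checker for illustration; not part of the Annotation File Utilities.
final class DefinitionOrderCheck {

    // A line of the form: annotation @Name ...
    private static final Pattern DEFINITION =
            Pattern.compile("^\\s*annotation\\s+@([\\w.$]+)");
    // Any occurrence of @Name on a line.
    private static final Pattern USE = Pattern.compile("@([\\w.$]+)");

    /**
     * Returns the name of the first annotation that is used before its
     * definition in the file, or null if there is no such annotation.
     */
    static String firstUseBeforeDefinition(List<String> lines) {
        // Pass 1: collect every annotation that the file defines somewhere.
        Set<String> definedInFile = new HashSet<String>();
        for (String line : lines) {
            Matcher m = DEFINITION.matcher(line);
            if (m.find()) {
                definedInFile.add(m.group(1));
            }
        }
        // Pass 2: walk the file in order and flag a use that precedes its definition.
        Set<String> definedSoFar = new HashSet<String>();
        for (String line : lines) {
            Matcher def = DEFINITION.matcher(line);
            if (def.find()) {
                definedSoFar.add(def.group(1));
                continue;
            }
            Matcher use = USE.matcher(line);
            while (use.find()) {
                String name = use.group(1);
                if (definedInFile.contains(name) && !definedSoFar.contains(name)) {
                    return name;
                }
            }
        }
        return null;
    }
}
```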
Class definitions describe the annotations present on the various program elements. Each class definition is organized according to the hierarchy of fields and methods in the class. Class definitions are defined by the class-definition production of the following grammar.
Inner classes are treated as ordinary classes whose names happen to contain $ signs and must be defined at the top level of a class definition file. (To change this, the grammar would have to be extended with a closing delimiter for classes; otherwise, it would be ambiguous whether a field/method appearing after an inner class definition belonged to the inner class or the outer class.)
| annotation ::= |
| # The name may be the annotation’s simple name, unless the file |
| # contains definitions for two annotations with the same simple name. |
| # In this case, the fully-qualified annotation name is required. |
| “@”name [ “(” annotation-field ( “,” annotation-field )* “)” ] |
| annotation-field ::= |
| # In Java, if a single-field annotation has a field named |
| # “value”, then that field name may be elided in uses of the |
| # annotation: “@A(12)” rather than “@A(value=12)”. |
| # The same convention holds in an annotation file. |
| name “=” value |
| class-definition ::= |
| “class” name “:” annotation* “\n” |
| bound-definition* |
| field-definition* |
| method-definition* |
| field-definition ::= |
| # The annotation on the “field” line is that of the field declaration, |
| # while the annotation on the “type” line is that of the outermost type. |
| “field” name “:” annotation* “\n” |
| “type:” annotation* “\n” |
| type-argument-or-array-definition* |
| method-definition ::= |
| # The method-key consists of the name followed by the signature |
| # in JVML format, for example: “foo([ILjava/lang/String;)V”. |
| # The annotation on the “method” line is that on the method, not on the return value. |
| “method” method-key “:” annotation* “\n” |
| bound-definition* |
| return-definition? |
| receiver-definition? |
| parameter-definition* |
| variable-definition* |
| typecast-definition* |
| instanceof-definition* |
| new-definition* |
| type-argument-or-array-definition ::= |
| # The integer list here contains the values of the “location” array [Ern08]. |
| “inner-type” integer ( “,” integer )* “:” annotation* “\n” |
| bound-definition ::= |
| # The integers are respectively the parameter and bound indices of |
| # the type parameter bound [Ern08]. |
| “bound” integer “&” integer “:” annotation* “\n” |
| type-argument-or-array-definition* |
| return-definition ::= |
| “return:” annotation* “\n” |
| type-argument-or-array-definition* |
| receiver-definition ::= |
| “receiver:” annotation* “\n” |
| parameter-definition ::= |
| # The integer is the index of the parameter in the method |
| # (i.e., 0 is the first method parameter). |
| # The annotation on the “parameter” line is that of the parameter declaration, |
| # while the annotation on the “type” line is that of the outermost type of the parameter. |
| “parameter” integer “:” annotation* “\n” |
| “type:” annotation* “\n” |
| type-argument-or-array-definition* |
| variable-definition ::= |
| # The integers are respectively the index, start, and length |
| # fields of the annotations on this variable [Ern08]. |
| “local” integer “#” integer “+” integer “:” annotation* “\n” |
| type-argument-or-array-definition* |
| typecast-definition ::= |
| # The integer is the offset field of the annotation [Ern08]. |
| “typecast” “#” integer “:” annotation* “\n” |
| type-argument-or-array-definition* |
| instanceof-definition ::= |
| # The integer is the offset field of the annotation [Ern08]. |
| “instanceof” “#” integer “:” annotation* “\n” |
| new-definition ::= |
| # The integer is the offset field of the annotation [Ern08]. |
| “new” “#” integer “:” annotation* “\n” |
| type-argument-or-array-definition* |
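The JVM-level names used in the grammar above (class names containing $ signs for inner classes, and method keys in JVML format) can be derived with the standard Java library. The following is a minimal sketch: the class name JaifNames and its method names are hypothetical, and writing a class name relative to its enclosing package definition is an assumption based on the examples in this document.

```java
import java.lang.invoke.MethodType;

// Hypothetical helper for illustration; not part of the Annotation File Utilities.
final class JaifNames {

    /** Binary class name with the package prefix removed, e.g. "Map$Entry". */
    static String classNameWithinPackage(Class<?> c) {
        String binaryName = c.getName(); // e.g. "java.util.Map$Entry"
        int lastDot = binaryName.lastIndexOf('.');
        return lastDot < 0 ? binaryName : binaryName.substring(lastDot + 1);
    }

    /** Method name followed by its signature in JVML (descriptor) format. */
    static String methodKey(String name, Class<?> returnType, Class<?>... parameterTypes) {
        MethodType type = MethodType.methodType(returnType, parameterTypes);
        return name + type.toMethodDescriptorString();
    }
}
```

For example, `JaifNames.methodKey("foo", void.class, int[].class, String.class)` yields `foo([ILjava/lang/String;)V`, the method key shown in the grammar comment, and `JaifNames.classNameWithinPackage(java.util.Map.Entry.class)` yields `Map$Entry`.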
For annotations on expressions (typecasts, instanceof, new, etc.), the annotation file uses offsets into the bytecode array of the class file to indicate the specific expression to which the annotation refers. Because different compilation strategies yield different .class files, a tool that maps such annotations from an annotation file into source code must have access to the specific .class file that was used to generate the annotation file. For non-expression annotations such as those on methods, fields, classes, etc., the .class file is not necessary.
Consider the code of Figure 1. Figure 2 shows two legal annotation files, each of which represents the annotations in that code.
package p1;

import p2.*; // for the annotations @A through @D

public @A(12) class Foo {

    public int bar; // no annotation
    private @B List<@C String> baz;

    public Foo(@B List<@C String> a) @D("spam") {
        @B List<@C String> l = new LinkedList<@C String>();
        l = (@B List<@C String>)l;
    }
}
Figure 1: Example Java code with annotations.
First annotation file:

package p2:
annotation @A:
    int value
annotation @B:
annotation @C:
annotation @D:
    String value
package p1:
class Foo: @A(value=12)
    field bar:
    field baz: @B
        inner-type 0: @C
    method <init>():
        parameter #0: @B
            inner-type 0: @C
        receiver: @D(value="spam")
        local 1 #3+5: @B
            inner-type 0: @C
        typecast #7: @B
            inner-type 0: @C
        new #0:
            inner-type 0: @C

Second annotation file:

package p2:
annotation @A
    int value
package p2:
annotation @B
package p2:
annotation @C
package p2:
annotation @D
    String value
package p1:
class Foo: @A(value=12)
package p1:
class Foo:
    field baz: @B
package p1:
class Foo:
    field baz:
        inner-type 0: @C
// ... definitions for p1.Foo.<init>()
// omitted for brevity
Figure 2: Two distinct annotation files each corresponding to the code of Figure 1.
The Java language permits several types for annotation fields: primitives, Strings, java.lang.Class tokens (possibly parameterized), enumeration constants, subannotations, and one-dimensional arrays of these.
These types are represented in an annotation file as follows:
Annotation field values are represented in an annotation file as follows:
The following example annotation file shows how types and values are represented.
package p1:
annotation @ClassInfo:
    String remark
    Class favoriteClass
    Class favoriteCollection // it's probably Class<? extends Collection>
                             // in source, but no parameterization here
    char favoriteLetter
    boolean isBuggy
    enum p1.DebugCategory[] defaultDebugCategories
    @p1.CommitInfo lastCommit
annotation @CommitInfo:
    byte[] hashCode
    int unixTime
    String author
    String message
class Foo: @p1.ClassInfo(
    remark="Anything named \"Foo\" is bound to be good!",
    favoriteClass=java.lang.reflect.Proxy.class,
    favoriteCollection=java.util.LinkedHashSet.class,
    favoriteLetter='F',
    isBuggy=true,
    defaultDebugCategories={DEBUG_TRAVERSAL, DEBUG_STORES, DEBUG_IO},
    lastCommit=@p1.CommitInfo(
        hashCode={31, 41, 59, 26, 53, 58, 97, 92, 32, 38, 46, 26, 43, 38, 32, 79},
        unixTime=1152109350,
        author="Joe Programmer",
        message="First implementation of Foo"
    )
)
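A tool that writes annotation files needs to render Java values in the textual forms shown in the example above. The sketch below covers only what that example demonstrates: quoted strings, single-quoted characters, class tokens, enumeration constants written by simple name, brace-delimited arrays, and plain literals for numbers and booleans. The class name JaifValues is hypothetical. Using Java-style backslash escapes beyond \" in strings, and using the binary class name for class tokens, are assumptions. Subannotations and escaping inside character values are not handled.

```java
import java.util.List;

// Hypothetical formatter for illustration; not part of the Annotation File Utilities.
final class JaifValues {

    /** Renders a value in the textual form used by the example annotation file. */
    static String format(Object value) {
        if (value instanceof String) {
            // Assumption: backslashes and double quotes are escaped as in Java source.
            String escaped = ((String) value).replace("\\", "\\\\").replace("\"", "\\\"");
            return "\"" + escaped + "\"";
        }
        if (value instanceof Character) {
            return "'" + value + "'";
        }
        if (value instanceof Class) {
            return ((Class<?>) value).getName() + ".class";
        }
        if (value instanceof Enum) {
            return ((Enum<?>) value).name();
        }
        if (value instanceof List) {
            // A list stands in for a one-dimensional array of field values.
            StringBuilder sb = new StringBuilder("{");
            String separator = "";
            for (Object element : (List<?>) value) {
                sb.append(separator).append(format(element));
                separator = ", ";
            }
            return sb.append("}").toString();
        }
        // Numbers and booleans are written as plain literals.
        return String.valueOf(value);
    }
}
```

For example, `JaifValues.format(java.util.Arrays.asList(31, 41, 59))` produces `{31, 41, 59}`, and `JaifValues.format('F')` produces `'F'`.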
We mention two alternatives to the format described in this document. Each of them has its own merits. In the future, the other formats could be implemented, along with tools for converting among them.
An alternative to the format described in this document would be XML. XML does not seem to provide any compelling advantages. Programmers interact with annotation files in two ways: textually (when reading, writing, and editing annotation files) and programmatically (when writing annotation-processing tools). Textually, XML can be very hard to read; style sheets mitigate this problem, but editing XML files remains tedious and error-prone. Programmatically, a layer of abstraction (an API) is needed in any event, so it makes little difference what the underlying textual representation is. XML files are easier to parse, but the parsing code only needs to be written once and is abstracted away by an API to the data structure.
Another alternative is a format like the .spec/.jml files of JML [LBR06]. The format is similar to Java code, but all method bodies are empty, and users can annotate the public members of a class. This is easy for Java programmers to read and understand. (It is a bit more complex to implement, but that is not particularly germane.) Because it does not permit complete specification of a class’s annotations (it does not permit annotation of method bodies), it is not appropriate for certain tools, such as type inference tools. However, it might be desirable to adopt such a format for public members, and to use the format described in this document primarily for method bodies.