
A Java7 Grammar for VisualLangLab

Posted by sanjay_dasgupta on March 3, 2012 at 7:20 AM PST

VisualLangLab now has a Java7 (JDK7) grammar!  Read on to find out how you can use the grammar to locate usages of the new Java7 (project coin) language features in the source-code of the Oracle JDK 7u3 itself.

If you are new to VisualLangLab (an easy-to-learn-and-use parser-generator), consider reading the tutorial A Quick Tour.

Java7 (JDK7) Grammar Specification

The Java7 (JDK7) grammar used is based on the contents of Chapter-18. Syntax, of The Java Language Specification (Java SE 7 Edition). A PDF version of this book is available online. The grammar in the book has been changed a little as described below.

Java7 (JDK7) Features

The grammar includes the following Java7 "project coin" language features:

  1. Strings in switch
  2. Binary integral literals
  3. Underscores in numeric literals
  4. Multi-catch
  5. More precise rethrow (see details below)
  6. Improved type inference for generic instance creation (diamond)
  7. try-with-resources statement
  8. Simplified varargs method invocation (see details below)
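
For readers who have not yet met these features, here is a minimal sketch of the six that have their own syntax (features 1 to 4, 6 and 7). The class and method names are invented for illustration and do not come from the JDK sources or from the grammar file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: all names below are hypothetical.
class Java7FeatureSample {

    // Feature 2 (binary integral literal) and feature 3 (underscores in numeric literals)
    static final int MASK = 0b1010_1010;
    static final long MILLION = 1_000_000L;

    // Feature 1: strings in switch
    static int daysIn(String month) {
        switch (month) {
            case "feb":
                return 28;
            case "apr": case "jun": case "sep": case "nov":
                return 30;
            default:
                return 31;
        }
    }

    // Feature 4: multi-catch (one handler for two unrelated exception types)
    static int parseOrZero(String text) {
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException | NullPointerException e) {
            return 0;
        }
    }

    // Feature 6 (diamond) and feature 7 (try-with-resources)
    static List<String> lines(String text) {
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                result.add(line);
            }
        } catch (IOException e) {
            // A StringReader should not fail; wrap the checked exception just in case.
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(daysIn("feb") + " " + MASK + " " + MILLION);
        System.out.println(parseOrZero("oops") + " " + lines("a\nb"));
    }
}
```

Saving a file like this into a directory and parsing that directory is a quick way to check that each feature is announced.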

The grammar accepts code samples containing features 5 and 8 (More precise rethrow and Simplified varargs method invocation respectively), although no particular changes were made for them (relative to the Java6 grammar). Because of this, the grammar cannot distinguish these constructs from the Java6 constructs that contain them.
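
To see why a parser cannot single out features 5 and 8, consider the sketch below (names are hypothetical). Every statement in it is also syntactically valid Java6; what changed in Java7 is how the compiler checks the rethrown exception type and where it reports varargs warnings. The @SafeVarargs annotation is a Java7 library addition, but to a grammar it is just an ordinary annotation.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Illustrative only: all names below are hypothetical.
class SameSyntaxSample {

    // Feature 5: more precise rethrow. The catch parameter is declared as
    // Exception, yet "throws IOException" is enough for a Java7 compiler,
    // because the try block can only throw IOException or unchecked exceptions.
    static void rethrow(boolean fail) throws IOException {
        try {
            if (fail) {
                throw new IOException("simulated failure");
            }
        } catch (Exception e) {
            throw e; // a Java6 compiler rejects this; the syntax is unchanged
        }
    }

    // Feature 8: simplified varargs method invocation. The annotation moves
    // the unchecked warning from every call site to this declaration.
    @SafeVarargs
    static <T> List<T> listOf(T... items) {
        return Arrays.asList(items);
    }

    // Small helper so that callers need not handle the checked exception.
    static String describe(boolean fail) {
        try {
            rethrow(fail);
            return "ok";
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(false) + ", " + describe(true));
        System.out.println(listOf("a", "b"));
    }
}
```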

Actions to Print Feature-Name and Line-Number

Simple actions have been added to the grammar that print out a short message giving the feature-name and line-number whenever any of the new Java7 language features is recognized. The darkened rectangular areas in Figure-1 below (in which the VisualLangLab GUI is being used as a run-time environment for the Java7 grammar) illustrate some such output. There are no such actions for features 5 and 8 (see list above) since the grammar cannot specifically distinguish these features.

The output of the actions for a file (darkened areas in Figure-1 below) appears before the status line of that file. Thus, in Figure-1 below, the four Diamond announcements belong to the file PlatformComponent.java, while the next set of announcements (two Multi-Catch and one Diamond) belong to the file ManagementFactory.java.


Figure-1. Action output indicating new language feature use and location

The last part of this article (Where are the Actions?) describes how you can locate and inspect the action-code functions that produce the highlighted output in Figure-1 above.

Grammar Changes

The following changes (to the contents of Chapter-18. Syntax) were required to make the grammar accept all source files of the Oracle JDK 7u3. The changed grammar rules are reproduced below. Additions to the original grammar are underlined, while deletions are struck out. Certain changes were made even after this blog was first published; notes within the grammar identify these changes. The attached grammar file (jls-se7-NN.txt, see below) has also been updated as needed.

Changes to rule Primary reversed in jls-se7-38.vll (2012-MAR-09 08:40 IST)
  
Expression3:
( Type ) Expression3
PrefixOp Expression3
( Expression | Type ) Expression3
Primary { Selector } { PostfixOp }
 
ForInit:
StatementExpression { , StatementExpression }
LocalVariableDeclaration
 
CatchType:
QualifiedIdentifier { | QualifiedIdentifier }
 
CreatedName:
( BasicType | ReferenceType ) Identifier [ TypeArgumentsOrDiamond ]
{ . ( BasicType | ReferenceType ) Identifier [ TypeArgumentsOrDiamond ] }
 
Expression:
Expression1 { [ AssignmentOperator Expression1 } ]
 
Expression2:
Expression3 { [ Expression2Rest } ]
 
Expression2Rest:
{ InfixOp Expression3 }
instanceof Type
 
ForControl:
ForVarControl
[ ForInit ] ; [ Expression ] ; [ ForUpdate ]
 
TypeArgument:
( ReferenceType { [] } | BasicType [] { [] } )
? [ ( extends | super ) ( ReferenceType { [] } | BasicType [] { [] } ) ]
 
The following changes were incorporated in jls-se7-37.vll (2012-MAR-06 10:40 IST)
 
BlockStatement:
LocalVariableDeclarationStatement ;
ClassOrInterfaceDeclaration
[ Identifier : ] Statement
 
LocalVariableDeclarationStatement:
{ VariableModifier } Type VariableDeclarators ;
 
 

Try it Yourself

To try it yourself, proceed as follows:

  1. Download the latest version of VisualLangLab: VLL4J.jar (you must have version 10.37 (20th June 2012) or later, as earlier versions do not recover from Java bug 5050507).
    VisualLangLab is started just by double-clicking VLL4J.jar. You must have a JRE (6.0 or later) installed, and users on Linux, UNIX, Mac OS, etc. will need to enable execution (chmod +x ...) first.
  2. Get the VisualLangLab Java7 grammar: jls-se7-40_0.txt. After the file has been downloaded, rename it to jls-se7-40_0.vll (".vll" is the standard file-extension for VisualLangLab grammar files, but java.net blogs do not permit attachment of such files). Within VisualLangLab, open the grammar file by clicking the Open button (near the red "1" in Figure-2 below) or invoking File -> Open from the main menu, selecting jls-se7-40_0.vll in the file-chooser dialog presented, and clicking the Open button. (Grammar file updated 20th June 2012 17:00 IST)
  3. Unzip the file src.zip from the Oracle JDK into a directory with a well-known name.
  4. Within VisualLangLab, click the Parse file button (near the red "2" in Figure-2 below) or select Test -> Parse file from the main menu, select the directory containing the JDK source files (from step 3 above), and click the Open button. 

VisualLangLab collects all the files contained in the chosen directory tree (last step above), and parses them one by one. You should see a growing/scrolling list of status information (one line per file) in the Parser Log area (bottom right of GUI), as in Figure-3 and Figure-4 below. The time taken to complete parsing of all 7485 files will vary depending on the power of your computer. On my desktop computer (Pentium Dual-Core E5700 @ 3.00 GHz with 2 GB of memory, running Ubuntu 10.10) it takes approximately 11 minutes.
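
Before starting a long parsing run, it can help to know what the directory tree actually contains. The following is a minimal sketch, independent of VisualLangLab, that walks a directory and counts files by extension using the Java7 NIO.2 API. The class name and the "(none)" label are my own choices, and the directory to scan is passed as the first command-line argument.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: counts the files under a directory, grouped by extension.
class SourceTreeCensus {

    static Map<String, Integer> countByExtension(Path root) throws IOException {
        final Map<String, Integer> counts = new TreeMap<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                String name = file.getFileName().toString();
                int dot = name.lastIndexOf('.');
                String extension = dot < 0 ? "(none)" : name.substring(dot);
                Integer previous = counts.get(extension);
                counts.put(extension, previous == null ? 1 : previous + 1);
                return FileVisitResult.CONTINUE;
            }
        });
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the current directory when no argument is given.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Map.Entry<String, Integer> entry : countByExtension(root).entrySet()) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```

Any extension other than .java in the output points to files that a Java grammar is likely to reject.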

Important note: The top-level parser rule CompilationUnit must be selected in the toolbar's dropdown-list (as in Figure-2 below) when parsing is started (step 4 above).


Figure-2. VisualLangLab buttons

Analyzing the Results

When parsing the Oracle JDK 7u3's source files you should see 16 failures (see the status line at the bottom of the GUI after all files are parsed). A group of 14 failures occurs because the files contain C (source and header) code belonging to Java's launcher. The red status lines in Figure-3 below show this group.


Figure-3. Parse failures of C source and header files under directory launcher

In addition, you may see a few more failures that occur as a consequence of Java bug 5050507 within VisualLangLab's lexical-analyzer. Figure-4 below shows some such failures. The number of these failures is not consistent, as it depends on the amount of memory available to the JRE, the JRE version, etc.


Figure-4. Parsing failures due to Java bug 5050507

Which Java7 Features are Used in JDK 7u3?

For greater flexibility in analyzing the Parser Log information, you should copy it into a text-file first. The logged information can be copied to the clipboard by clicking the Copy log button (near the red "3" in Figure-2 above) or selecting Log -> Copy log from the main menu. You can then paste (Edit -> Paste in most editors) the copied information into an empty text file.

Source-files that failed to parse can be found by searching for the string ": ERROR" (without the quote marks, and with one blank between the colon and the 'E'). Source-files that use specific Java7 language features can be found by searching for the following strings:

  • multi-catch
  • try-with-resource
  • case-with-string
  • diamond
  • underscore-numeric-literal
  • binary-literal
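
If you prefer not to search by hand, the counting can be scripted. The sketch below is a minimal example that assumes only what is stated above: the log has been pasted into a text file, and the search strings appear verbatim (same letter case) in the relevant lines. The class name is hypothetical, and the program counts matching lines, not distinct source files.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: tallies log lines containing each search string.
class ParserLogTally {

    // The search strings listed in the article, plus the failure marker.
    static final String[] MARKERS = {
        ": ERROR", "multi-catch", "try-with-resource", "case-with-string",
        "diamond", "underscore-numeric-literal", "binary-literal"
    };

    static Map<String, Integer> tally(List<String> logLines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String marker : MARKERS) {
            counts.put(marker, 0);
        }
        for (String line : logLines) {
            for (String marker : MARKERS) {
                if (line.contains(marker)) {
                    counts.put(marker, counts.get(marker) + 1);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ParserLogTally <saved-log-file>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        for (Map.Entry<String, Integer> entry : tally(lines).entrySet()) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```

To count files rather than announcements, the program would also need to know how the status lines name each file, which depends on the exact log layout.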

My own analysis of the Parser Log produced the following results:

  1. diamond - most used Java7 language feature
  2. try-with-resource - used in 7 files
  3. multi-catch - used in 5 files
  4. case-with-string - used in just 1 file
  5. underscore-numeric-literal - not used in Oracle JDK 7u3
  6. binary-literal - not used in Oracle JDK 7u3

Where are the Actions?

If you want to locate and understand the actions that produce the messages shown above, this section is for you.

Parser-rules that contain one or more actions are distinguished with a small, green icon with a white arrow shape as in Figure-5 below (above the red "1" in the figure). After selecting such a parser-rule, look for rule-tree nodes with the action annotation (like the one above the red "2" in Figure-5). Selecting (clicking on) such a node causes its action-code function to be displayed under the Action Code panel (top right of GUI, at red "3" in the figure). 


Figure-5. Inspecting action-code functions

Action-code functions are explained fully in Action Code Design. The action-code functions used with this grammar vary widely in complexity. The structure/complexity of the action-code reflects the structure and complexity of the AST produced by the parser (which is explained in AST Structure and Action Code). 

A tutorial that explains parser development with VisualLangLab can be found in VisualLangLab - A Quick Tour. If you are a Scala user, you may also find Rapid Prototyping for Scala Parser Combinators interesting. 

Attachment                       Size
VisualLangLab-Buttons.png        18.07 KB
Launcher-Files-Errors.png        83.42 KB
Action-Output.png                72.01 KB
Java-Bug-5050507-Errors.png      92.6 KB
Inspecting-Actions.png           84.42 KB
jls-se7-35.txt                   49.57 KB
jls-se7-37.txt                   49.12 KB
jls-se7-38.txt                   49.09 KB
jls-se7-40.txt                   52.61 KB