Java 1.5 Parser - Grammar Rules - Part 5


Creating Grammar Rules

This part focuses on the syntax for Java 1.5 as given in the Java Specification. A Coco/R grammar must be written in EBNF, whereas the specification is, more or less, in BNF.

The Java syntax contains a lot of grammar rules. Instead of adding them all at once, compiling, and trying to cope with all the warnings, we will add them in small portions, compiling and fixing any warnings each time. We start at the top with the rule containing the token package.

CompilationUnit

The Java Specification syntax rules begin with the rule CompilationUnit.

Java Syntax Rule
CompilationUnit:
[[Annotations] package QualifiedIdentifier ; ] {ImportDeclaration}
{TypeDeclaration}

Note that package and the ;-sign are tokens in the definition above. The CompilationUnit will be named Java15 in the Coco/R grammar.

Coco/R EBNF Rule
PRODUCTIONS

Java15 =
[[Annotations] Package QualifiedIdentifier SemiColon] {ImportDeclaration}
{TypeDeclaration}.

Compiling the above rules will give the following errors:

checking
  Java15 deletable
  No production for Annotations
  No production for QualifiedIdentifier
  No production for ImportDeclaration
  No production for TypeDeclaration
5 errors detected

A few "dummy" rules are needed to make this compile:

Coco/R EBNF Rule
PRODUCTIONS

Annotations =
Equal.

QualifiedIdentifier =
Identifier.

ImportDeclaration =
Import.

TypeDeclaration =
Assignment.

Different tokens are used to minimise conflicts. Compiling the grammar file now will display

checking
  Java15 deletable
parser generated
0 errors detected

Java15 deletable is displayed because the token stream may contain no tokens, and an empty file is a legal Java file. The Java15 deletable message can therefore be ignored.

This is what the grammar looks like at this starting point:

java_15_parser_70.png

Abstract Syntax Tree

This UML class diagram shows how the grammar rules are represented in the AST.

sourceFile_a.png

Source File

To be able to test our grammar, a minimal Abstract Syntax Tree must be built. The top node of the tree is the SourceFile node. It contains the package name, import statements, annotations and defined types such as classes, interfaces and enumerations. SourceFile implements ISourceFile.

This minimal SourceFile only contains package name and import statements.

org.structuredparsing.java15grammar.ast.ISourceFile:

package org.structuredparsing.java15grammar.ast;
 
import java.util.List;
 
public interface ISourceFile {
   public IQualifiedIdentifier getPackageContent();
   public List< IImportStatement > getImportList();
}

org.structuredparsing.java15grammar.ast.SourceFile:

package org.structuredparsing.java15grammar.ast;
 
import java.util.ArrayList;
import java.util.List;
 
public class SourceFile implements ISourceFile {
   private QualifiedIdentifier packageContent;
   private List< IImportStatement > importList;
 
   public SourceFile() {
      importList = new ArrayList< IImportStatement >();
   }
 
   @Override
   public IQualifiedIdentifier getPackageContent() {
      return this.packageContent;
   }
 
   public void setPacakgeContent( QualifiedIdentifier packageContent ) {
     this.packageContent = packageContent;
   }
 
   @Override
   public List< IImportStatement > getImportList() {
      return this.importList;
   }
 
   public void addImportContent( IImportStatement importContent ) {
      importList.add( importContent );
   }
}

IImportStatement and ImportStatement

These definitions hold information about import statements.

org.structuredparsing.java15grammar.ast.IImportStatement:

package org.structuredparsing.java15grammar.ast;
 
public interface IImportStatement {
   public IQualifiedIdentifier getImportURL();
   public boolean isImportAllInFolderSet();
   public boolean isStatic();
}

org.structuredparsing.java15grammar.ast.ImportStatement:

package org.structuredparsing.java15grammar.ast;
 
public class ImportStatement implements IImportStatement {
   private QualifiedIdentifier url;
   private boolean bImportAllInFolder;
   private boolean bStatic;
 
   public ImportStatement() {
      bStatic = false;
   }
 
   @Override
   public IQualifiedIdentifier getImportURL() {
      return url;
   }
 
   public void setImportUrl( QualifiedIdentifier url ) {
      this.url = url;
   }
 
   @Override
   public boolean isImportAllInFolderSet() {
      return bImportAllInFolder;
   }
 
   public void setImportAllInFolder( boolean bImportAllInFolder ) {
      this.bImportAllInFolder = bImportAllInFolder;
   }
 
   @Override
   public boolean isStatic() {
      return bStatic;
   }
 
   public void setStatic( boolean bStatic ) {
      this.bStatic = bStatic;
   }
}

IQualifiedIdentifier and QualifiedIdentifier

org.structuredparsing.java15grammar.ast.IQualifiedIdentifier:

package org.structuredparsing.java15grammar.ast;
 
import java.util.List;
 
public interface IQualifiedIdentifier {
   public List< String > getIdentifierList();
}

org.structuredparsing.java15grammar.ast.QualifiedIdentifier:

package org.structuredparsing.java15grammar.ast;
 
import java.util.ArrayList;
import java.util.List;
 
public class QualifiedIdentifier implements IQualifiedIdentifier, IExpression {
   private List< String > qualifiedIdentifier;
 
   public QualifiedIdentifier() {
      qualifiedIdentifier = new ArrayList< String >();
   }
 
   public void addIdentifier( String sIdentifier ) {
      qualifiedIdentifier.add( sIdentifier );
   }
 
   public List< String > getIdentifierList() {
      return qualifiedIdentifier;
   }
}
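
The AST above stores a qualified identifier as a list of strings, while source text uses a single dotted name. As a minimal sketch of moving between the two forms, the helper below splits and joins dotted names. The class name QualifiedNames and its methods are hypothetical and not part of the original project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the original AST package.
final class QualifiedNames {
    private QualifiedNames() {
    }

    // Splits a dotted name such as "org.junit" into its identifiers.
    // A name without periods, such as "Student", yields a single element.
    static List<String> split(String dottedName) {
        return new ArrayList<String>(Arrays.asList(dottedName.split("\\.")));
    }

    // Joins identifiers back into a dotted name.
    static String join(List<String> identifiers) {
        StringBuilder sb = new StringBuilder();
        for (String id : identifiers) {
            if (sb.length() > 0) {
                sb.append('.');
            }
            sb.append(id);
        }
        return sb.toString();
    }
}
```

Each element returned by split could be passed to addIdentifier, and join could be applied to the result of getIdentifierList.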

Update the grammar file for the AST

The grammar file must be updated so that a SourceFile object is created. A method must also be defined so it is possible to get hold of the SourceFile object (or even better, an ISourceFile object) from the parser object.

Coco/R EBNF Rule
package org.structuredparsing.java15grammar.cocor.parser_jflex_scanner;

import org.structuredparsing.java15grammar.ast.ISourceFile;
import org.structuredparsing.java15grammar.ast.SourceFile;
import org.structuredparsing.java15grammar.ast.ImportStatement;

COMPILER Java15

private SourceFile sourceFile = new SourceFile();

ISourceFile getSourceFile() {
return sourceFile;
}

Unit test

An empty Java source file should be legal:

    private static final String sEmptyString = null;
 
    @Test
    public void testParser_empty_java_file() throws UnsupportedEncodingException {
        System.out.println("testParser_empty_java_file");
        // Initialize
        String sContent = "";
        InputStream is = new ByteArrayInputStream(sContent.getBytes("UTF-8"));
        Scanner scanner = new Scanner(is);
        Parser instance = new Parser( scanner );
        // Test
        instance.Parse();
        ISourceFile sourceFile = instance.getSourceFile();
        // Validate
        assertEquals( sEmptyString, sourceFile.getPackageContent() );
        assertTrue( sourceFile.getImportList().isEmpty() );
    }
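
Every test in this chapter repeats the same lines to turn a string into an input stream. As a sketch, and assuming a Java 7 or later toolchain, a small helper can do the conversion with StandardCharsets, which also avoids the checked UnsupportedEncodingException. The name TestInput is hypothetical and not part of the original project.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical test helper, not part of the original project.
final class TestInput {
    private TestInput() {
    }

    // Wraps the given source text in an InputStream of its UTF-8 bytes.
    static InputStream fromString(String content) {
        return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
    }
}
```

With such a helper, a test could create its scanner with new Scanner( TestInput.fromString( sContent ) ) and drop the throws clause.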

QualifiedIdentifier

Examples of QualifiedIdentifier:

org.structuredparsing.parser
org.junit
Student

This is defined as

Java Syntax Rule
Identifier:
IDENTIFIER

QualifiedIdentifier:
Identifier { . Identifier }

This is defined in Coco/R grammar

Coco/R EBNF Rule
PRODUCTIONS

QualifiedIdentifier =
Identifier { Period Identifier }.

Identifier and Period are tokens defined in the JFlex grammar.

Building the AST for QualifiedIdentifier

Actions must be added to the QualifiedIdentifier rule. However, the content of the QualifiedIdentifier must be stored somewhere. A perfect job for a StringBuilder object.

Coco/R EBNF Rule
PRODUCTIONS

QualifiedIdentifier<out StringBuilder sbId> =
Identifier (. sbId = new StringBuilder(); sbId.append( t.val ); .)
{ Period Identifier (. sbId.append( "." + t.val ); .)
}.
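
To make concrete what this rule and its actions do, here is a hand-written sketch of the same logic. It is not the code Coco/R generates. It assumes the tokens are already available as plain strings, where "." stands for the Period token and anything starting with an identifier character counts as an Identifier. The class name QualifiedIdentifierSketch is hypothetical.

```java
import java.util.List;

// Hand-written illustration of: QualifiedIdentifier = Identifier { Period Identifier }.
final class QualifiedIdentifierSketch {
    private final List<String> tokens;
    private int pos = 0;

    QualifiedIdentifierSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    // Returns the current token without consuming it, or null at end of input.
    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : null;
    }

    // Consumes one identifier token or fails.
    private String expectIdentifier() {
        String tok = peek();
        if (tok == null || tok.isEmpty() || !Character.isJavaIdentifierStart(tok.charAt(0))) {
            throw new IllegalStateException("identifier expected at token " + pos);
        }
        pos++;
        return tok;
    }

    // Collects the dotted name in a StringBuilder, as the grammar actions do.
    String parse() {
        StringBuilder sbId = new StringBuilder();
        sbId.append(expectIdentifier());
        while (".".equals(peek())) {
            pos++; // consume the period
            sbId.append('.').append(expectIdentifier());
        }
        return sbId.toString();
    }
}
```

The loop corresponds to the { Period Identifier } repetition: it continues for as long as the next token is a period.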

Before unit testing, Java15 must be updated with actions:

Coco/R EBNF Rule
PRODUCTIONS

Java15 = (. StringBuilder sbId = null; .)
[[Annotations] Package
QualifiedIdentifier<out sbId>
(. sourceFile.setPacakgeContent( sbId.toString() ); .)
SemiColon] {ImportDeclaration} {TypeDeclaration}.

Unit testing QualifiedIdentifier

This is verified with a combination of different qualified identifiers:

    @Test
    public void testParser_package_simple_url() throws UnsupportedEncodingException {
        System.out.println("testParser_package_simple_url");
        // Initialize
        String sUrl = "parser";
        String sContent = "package " + sUrl + ";";
        InputStream is = new ByteArrayInputStream(sContent.getBytes("UTF-8"));
        Scanner scanner = new Scanner(is);
        Parser instance = new Parser( scanner );
        // Test
        instance.Parse();
        ISourceFile sourceFile = instance.getSourceFile();
        // Validate
        assertEquals( sUrl, sourceFile.getPackageContent() );
        assertTrue( sourceFile.getImportList().isEmpty() );
    }
 
    @Test
    public void testParser_package_complex_url() throws UnsupportedEncodingException {
        System.out.println("testParser_package_complex_url");
        // Initialize
        String sUrl = "org.structuredparsing.java15grammar.parser";
        String sContent = "package " + sUrl + ";";
        InputStream is = new ByteArrayInputStream(sContent.getBytes("UTF-8"));
        Scanner scanner = new Scanner(is);
        Parser instance = new Parser( scanner );
        // Test
        instance.Parse();
        ISourceFile sourceFile = instance.getSourceFile();
        // Validate
        assertEquals( sUrl, sourceFile.getPackageContent() );
        assertTrue( sourceFile.getImportList().isEmpty() );
    }

ImportDeclaration

Examples of ImportDeclaration:

import java.util.*;
import org.structuredparsing.cocor.visualizer.Rule;

This is defined as

Java Syntax Rule
ImportDeclaration:
import [ static] Identifier { . Identifier } [ . * ] ;

This is defined in Coco/R grammar

Coco/R EBNF Rule
PRODUCTIONS

ImportDeclaration =
Import [ Static ] Identifier { Period Identifier } [ Period Asterix ] SemiColon.

Import, Static, Identifier, Period, Asterix and SemiColon are tokens defined in the JFlex grammar.

Compiling this grammar will display

checking
  Java15 deletable
  LL1 warning in ImportDeclaration: Period is start & successor of deletable structure
parser generated
0 errors detected

No errors, but one warning. Consider the ImportDeclaration rule. Coco/R generates an LL(1) parser, which means that it uses one look-ahead token to decide which path through a rule to take. Say the parser parses the following import statement:

import org.structuredparsing.cocor.visualizer.*;

It has parsed import and org (import is an Import token, org is an Identifier token). Now it is about to parse the Period token. The problem is that, with only one look-ahead token, it cannot decide whether to use { Period Identifier } or [ Period Asterix ]. If nothing is done, it will use the first alternative, { Period Identifier }, and the parser will fail when a .* is encountered. The solution is to use the Coco/R IF-clause:

Coco/R EBNF Rule
PRODUCTIONS

ImportDeclaration =
Import [ Static ] Identifier { IF(isPeriodFollowedByIdentifier()) Period
Identifier } [ Period Asterix ] SemiColon.

Compiling will display

checking
  Java15 deletable
parser generated
0 errors detected

However, the IF(isPeriodFollowedByIdentifier()) clause requires some Java code to be written:

Coco/R EBNF Rule
COMPILER Java15

boolean isPeriodFollowedByIdentifier() {
Token la_next = scanner.Peek();
return la.kind == Parser._Period && la_next.kind == Parser._Identifier;
}

PRODUCTIONS
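
The effect of the resolver can be illustrated outside Coco/R with a small hand-written sketch. It assumes the tokens are plain strings ("import", "static", ".", "*", ";" and identifiers) and uses a hypothetical class named ImportLookaheadSketch. It is not the generated parser, only an illustration of looking two tokens ahead before entering the repetition.

```java
import java.util.List;

// Hand-written illustration of the ImportDeclaration rule with a two-token look-ahead.
final class ImportLookaheadSketch {
    private final List<String> tokens;
    private int pos = 0;

    ImportLookaheadSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    // Returns the token at the given index, or null past the end of input.
    private String tokenAt(int index) {
        return index < tokens.size() ? tokens.get(index) : null;
    }

    // Treats anything that starts like a Java identifier, except the two keywords, as an identifier.
    private boolean isIdentifier(String tok) {
        return tok != null && !tok.isEmpty()
                && Character.isJavaIdentifierStart(tok.charAt(0))
                && !tok.equals("import") && !tok.equals("static");
    }

    // Same idea as the resolver: the next token is a period and the one after it is an identifier.
    private boolean isPeriodFollowedByIdentifier() {
        return ".".equals(tokenAt(pos)) && isIdentifier(tokenAt(pos + 1));
    }

    // Consumes the expected token or fails.
    private void expect(String expected) {
        if (!expected.equals(tokenAt(pos))) {
            throw new IllegalStateException(expected + " expected at token " + pos);
        }
        pos++;
    }

    // Consumes one identifier token or fails.
    private String expectIdentifier() {
        String tok = tokenAt(pos);
        if (!isIdentifier(tok)) {
            throw new IllegalStateException("identifier expected at token " + pos);
        }
        pos++;
        return tok;
    }

    // Parses one import declaration and returns the imported name, including ".*" if present.
    String parseImport() {
        expect("import");
        if ("static".equals(tokenAt(pos))) {
            pos++;
        }
        StringBuilder sb = new StringBuilder(expectIdentifier());
        while (isPeriodFollowedByIdentifier()) {
            pos++; // consume the period
            sb.append('.').append(expectIdentifier());
        }
        if (".".equals(tokenAt(pos))) {
            pos++; // consume the period before the asterisk
            expect("*");
            sb.append(".*");
        }
        expect(";");
        return sb.toString();
    }
}
```

If the loop condition only checked for a period, the input import java.util.*; would enter the repetition at the last period and then fail on the asterisk, which is the situation the LL(1) warning points at.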

Adding Actions to the rule ImportDeclaration

To be able to test ImportDeclaration, actions must be added to store the information in the AST:

Coco/R EBNF Rule
PRODUCTIONS

ImportDeclaration = (. StringBuilder sbImportContent = new StringBuilder();
ImportStatement importStmt = new ImportStatement(); .)
Import [ Static (. importStmt.setStatic( true ); .)
] Identifier (. sbImportContent.append( t.val ); .)
{ IF(isPeriodFollowedByIdentifier()) Period Identifier
(. sbImportContent.append( "." + t.val ); .)
} [ Period Asterix (. sbImportContent.append( "." + t.val );
importStmt.setImportAllInFolder( true ); .)
] SemiColon (. importStmt.setImportUrl( sbImportContent.toString() );
sourceFile.addImportContent( importStmt ); .)
.

Unit testing ImportDeclaration

This is verified with a combination of different import statements, with and without asterisk and static:

    @Test
    public void testParser_import_simple_url() throws UnsupportedEncodingException {
        System.out.println("testParser_import_simple_url");
        // Initialize
        String sUrl = "Scanner";
        String sContent = "import " + sUrl + ";";
        InputStream is = new ByteArrayInputStream(sContent.getBytes("UTF-8"));
        Scanner scanner = new Scanner(is);
        Parser instance = new Parser( scanner );
        // Test
        instance.Parse();
        ISourceFile sourceFile = instance.getSourceFile();
        // Validate
        assertEquals( sEmptyString, sourceFile.getPackageContent() );
        assertEquals( 1, sourceFile.getImportList().size() );
        assertEquals( sUrl, sourceFile.getImportList().get( 0 ).getImportURL() );
        assertFalse( sourceFile.getImportList().get( 0 ).isStatic() );
    }
 
    @Test
    public void testParser_import_complex_url() throws UnsupportedEncodingException {
        System.out.println("testParser_import_complex_url");
        // Initialize
        String sUrl = "org.structuredparsing.java15grammar.Scanner";
        String sContent = "import " + sUrl + ";";
        InputStream is = new ByteArrayInputStream(sContent.getBytes("UTF-8"));
        Scanner scanner = new Scanner(is);
        Parser instance = new Parser( scanner );
        // Test
        instance.Parse();
        ISourceFile sourceFile = instance.getSourceFile();
        // Validate
        assertEquals( sEmptyString, sourceFile.getPackageContent() );
        assertEquals( 1, sourceFile.getImportList().size() );
        assertEquals( sUrl, sourceFile.getImportList().get( 0 ).getImportURL() );
        assertFalse( sourceFile.getImportList().get( 0 ).isStatic() );
    }
 
    @Test
    public void testParser_import_simple_url_with_asterix() throws UnsupportedEncodingException {
        System.out.println("testParser_import_simple_url_with_asterix");
        // Initialize
        String sUrl = "java15grammar.*";
        String sContent = "import " + sUrl + ";";
        InputStream is = new ByteArrayInputStream(sContent.getBytes("UTF-8"));
        Scanner scanner = new Scanner(is);
        Parser instance = new Parser( scanner );
        // Test
        instance.Parse();
        ISourceFile sourceFile = instance.getSourceFile();
        // Validate
        assertEquals( sEmptyString, sourceFile.getPackageContent() );
        assertEquals( 1, sourceFile.getImportList().size() );
        assertEquals( sUrl, sourceFile.getImportList().get( 0 ).getImportURL() );
        assertFalse( sourceFile.getImportList().get( 0 ).isStatic() );
    }
 
    @Test
    public void testParser_import_complex_url_with_asterix() throws UnsupportedEncodingException {
        System.out.println("testParser_import_complex_url_with_asterix");
        // Initialize
        String sUrl = "org.structuredparsing.java15grammar.*";
        String sContent = "import " + sUrl + ";";
        InputStream is = new ByteArrayInputStream(sContent.getBytes("UTF-8"));
        Scanner scanner = new Scanner(is);
        Parser instance = new Parser( scanner );
        // Test
        instance.Parse();
        ISourceFile sourceFile = instance.getSourceFile();
        // Validate
        assertEquals( sEmptyString, sourceFile.getPackageContent() );
        assertEquals( 1, sourceFile.getImportList().size() );
        assertEquals( sUrl, sourceFile.getImportList().get( 0 ).getImportURL() );
        assertFalse( sourceFile.getImportList().get( 0 ).isStatic() );
    }
 
    @Test
    public void testParser_import_static_simple_url() throws UnsupportedEncodingException {
        System.out.println("testParser_import_static_simple_url");
        // Initialize
        String sUrl = "Scanner";
        String sContent = "import static " + sUrl + ";";
        InputStream is = new ByteArrayInputStream(sContent.getBytes("UTF-8"));
        Scanner scanner = new Scanner(is);
        Parser instance = new Parser( scanner );
        // Test
        instance.Parse();
        ISourceFile sourceFile = instance.getSourceFile();
        // Validate
        assertEquals( sEmptyString, sourceFile.getPackageContent() );
        assertEquals( 1, sourceFile.getImportList().size() );
        assertEquals( sUrl, sourceFile.getImportList().get( 0 ).getImportURL() );
        assertTrue( sourceFile.getImportList().get( 0 ).isStatic() );
    }
 
    @Test
    public void testParser_import_static_complex_url() throws UnsupportedEncodingException {
        System.out.println("testParser_import_static_complex_url");
        // Initialize
        String sUrl = "org.structuredparsing.java15grammar.Scanner";
        String sContent = "import static " + sUrl + ";";
        InputStream is = new ByteArrayInputStream(sContent.getBytes("UTF-8"));
        Scanner scanner = new Scanner(is);
        Parser instance = new Parser( scanner );
        // Test
        instance.Parse();
        ISourceFile sourceFile = instance.getSourceFile();
        // Validate
        assertEquals( sEmptyString, sourceFile.getPackageContent() );
        assertEquals( 1, sourceFile.getImportList().size() );
        assertEquals( sUrl, sourceFile.getImportList().get( 0 ).getImportURL() );
        assertTrue( sourceFile.getImportList().get( 0 ).isStatic() );
    }
 
    @Test
    public void testParser_several_import_statements() throws UnsupportedEncodingException {
        System.out.println("testParser_several_import_statements");
        // Initialize
        String sUrl_1 = "org.structuredparsing.java15grammar.Scanner";
        String sUrl_2 = "org.structuredparsing.*";
        String sUrl_3 = "java.lang.*";
        String sContent = "import " + sUrl_1 + ";\nimport " + sUrl_2 + ";\nimport " + sUrl_3 + ";";
        InputStream is = new ByteArrayInputStream(sContent.getBytes("UTF-8"));
        Scanner scanner = new Scanner(is);
        Parser instance = new Parser( scanner );
        int nIndex = 0;
        // Test
        instance.Parse();
        ISourceFile sourceFile = instance.getSourceFile();
        // Validate
        assertEquals( sEmptyString, sourceFile.getPackageContent() );
        assertEquals( 3, sourceFile.getImportList().size() );
        assertEquals( sUrl_1, sourceFile.getImportList().get( nIndex ).getImportURL() );
        assertFalse( sourceFile.getImportList().get( nIndex ).isStatic() );
        ++nIndex;
        assertEquals( sUrl_2, sourceFile.getImportList().get( nIndex ).getImportURL() );
        assertFalse( sourceFile.getImportList().get( nIndex ).isStatic() );
        ++nIndex;
        assertEquals( sUrl_3, sourceFile.getImportList().get( nIndex ).getImportURL() );
        assertFalse( sourceFile.getImportList().get( nIndex ).isStatic() );
    }


Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License