Java 1.5 Parser - Scanner and Lexer - Part 3

<- back

Next Chapter Scanner and Lexer - part 4

Literals: Integer and Floating-Point

Integer literals

Section 3.10.1 (Integer Literals) of the Java Language Specification defines integer literals. An integer literal may be written in base 10 (decimal), base 16 (hexadecimal) or base 8 (octal):

Java Syntax Rule
IntegerLiteral:
DecimalIntegerLiteral
HexIntegerLiteral
OctalIntegerLiteral

DecimalIntegerLiteral:
DecimalNumeral IntegerTypeSuffixopt

HexIntegerLiteral:
HexNumeral IntegerTypeSuffixopt

OctalIntegerLiteral:
OctalNumeral IntegerTypeSuffixopt

IntegerTypeSuffix: one of
l L

A decimal integer literal is defined as:

Java Syntax Rule
DecimalNumeral:
0
NonZeroDigit Digitsopt

Digits:
Digit
Digits Digit

Digit:
0
NonZeroDigit

NonZeroDigit: one of
1 2 3 4 5 6 7 8 9

A hexadecimal literal is defined as:

Java Syntax Rule
HexNumeral:
0 x HexDigits
0 X HexDigits

HexDigits:
HexDigit
HexDigit HexDigits

HexDigit: one of
0 1 2 3 4 5 6 7 8 9 a b c d e f A B C D E F

An octal literal is defined as:

Java Syntax Rule
OctalNumeral:
0 OctalDigits

OctalDigits:
OctalDigit
OctalDigit OctalDigits

OctalDigit: one of
0 1 2 3 4 5 6 7

These rules are converted into the following JFlex grammar:

JFlex grammar Rule
%%

DecimalIntegerLiteral = (0 | [1-9][0-9]*)(l|L)?
HexIntegerLiteral = 0(x|X)[0-9a-fA-F]+(l|L)?
OctalIntegerLiteral = 0[0-7]+(l|L)?
%%
<YYINITIAL> {
{DecimalIntegerLiteral} { return new Token(Parser._IntigerLiteral, yycolumn + 1, yyline + 1, yychar, yytext()); }
{HexIntegerLiteral} { return new Token(Parser._IntigerLiteral, yycolumn + 1, yyline + 1, yychar, yytext()); }
{OctalIntegerLiteral} { return new Token(Parser._IntigerLiteral, yycolumn + 1, yyline + 1, yychar, yytext()); }
}
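To see which strings these three patterns accept, they can be tried outside the scanner with java.util.regex. The following is only a sketch: the class name IntegerLiteralCheck and the method classify are made up for illustration and are not part of the generated scanner. Note that matches() tests the whole string, whereas the JFlex scanner picks the longest matching prefix of the input.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the JFlex/Coco/R project.
public class IntegerLiteralCheck {
    // The same patterns as the JFlex macros, written for java.util.regex.
    static final Pattern DECIMAL = Pattern.compile("(0|[1-9][0-9]*)[lL]?");
    static final Pattern HEX     = Pattern.compile("0[xX][0-9a-fA-F]+[lL]?");
    static final Pattern OCTAL   = Pattern.compile("0[0-7]+[lL]?");

    /** Returns "decimal", "hex", "octal" or "none" for the whole input string. */
    static String classify(String s) {
        if (DECIMAL.matcher(s).matches()) return "decimal";
        if (HEX.matcher(s).matches())     return "hex";
        if (OCTAL.matcher(s).matches())   return "octal";
        return "none";
    }

    public static void main(String[] args) {
        // "089" and "0x" are not valid integer literals as a whole.
        for (String s : new String[] {"0", "42L", "0x1F", "017l", "089", "0x"}) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```

A single "0" is classified as decimal, while "017l" is only matched by the octal pattern because the decimal pattern does not allow a leading zero followed by more digits.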

The Coco/R parser must also be updated:

Coco/R EBNF Rule
TOKENS
IntigerLiteral

Unit tests

A good approach is to test with random numbers of random length. Verifying every combination of the l and L suffixes with the 0x and 0X prefixes requires quite a few tests.

Here are some examples:

    @Test
    public void testScan_token_IntigerLiteral_0() throws UnsupportedEncodingException {
        System.out.println("testScan_token_IntigerLiteral_0");
        // Initialize
        String sContent = "0";
        InputStream is = new ByteArrayInputStream(sContent.getBytes("UTF-8"));
        Scanner instance = new Scanner(is);
        Token expected = new Token( Parser._IntigerLiteral, 0, 0, 0, sContent );
        // Test
        Token result = instance.Scan();
        // Validate
        assertNotNull( result );
        assertEquals( expected.kind, result.kind );
    }
 
    @Test
    public void testScan_token_IntigerLiteral_random_decimal_literal() throws UnsupportedEncodingException {
        System.out.println("testScan_token_IntigerLiteral_random_decimal_literal");
        // Initialize
        String sContent = createRandomIntigerNumber( eNUMBER_OF_DIGITS.E_AT_LEAST_ONE );
        System.out.println("  Random integer literal: " + sContent );
        InputStream is = new ByteArrayInputStream(sContent.getBytes("UTF-8"));
        Scanner instance = new Scanner(is);
        Token expected = new Token( Parser._IntigerLiteral, 0, 0, 0, sContent );
        // Test
        Token result = instance.Scan();
        // Validate
        assertNotNull( result );
        assertEquals( expected.kind, result.kind );
    }
 
    @Test
    public void testScan_token_IntigerLiteral_random_decimal_literal_l() throws UnsupportedEncodingException {
        System.out.println("testScan_token_IntigerLiteral_random_decimal_literal_l");
        // Initialize
        String sContent = createRandomIntigerNumber( eNUMBER_OF_DIGITS.E_AT_LEAST_ONE ) + "l";
        System.out.println("  Random integer literal: " + sContent );
        InputStream is = new ByteArrayInputStream(sContent.getBytes("UTF-8"));
        Scanner instance = new Scanner(is);
        Token expected = new Token( Parser._IntigerLiteral, 0, 0, 0, sContent );
        // Test
        Token result = instance.Scan();
        // Validate
        assertNotNull( result );
        assertEquals( expected.kind, result.kind );
    }
 
    private enum eNUMBER_OF_DIGITS { E_AT_LEAST_ONE, E_MAY_BE_ZERO };
 
    private String createRandomIntigerNumber( eNUMBER_OF_DIGITS eNumberOfDigits ) {
        StringBuilder sb = new StringBuilder();
        char n = Character.forDigit(randomGenerator.nextInt(9) + 1, 10);
        if ( eNumberOfDigits == eNUMBER_OF_DIGITS.E_AT_LEAST_ONE ) {
            sb.append(n);
        }
        int maxDigits = randomGenerator.nextInt(6);
        for ( int i=0; i<maxDigits; ++i ) {
            n = Character.forDigit(randomGenerator.nextInt(10), 10);
            sb.append(n);
        }
        return sb.toString();
    }

The same approach can be applied to hexadecimal and octal literals.
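As a rough sketch of how that could look, the generators below build random hexadecimal and octal literals. The class RandomLiterals and the methods randomDigits, createRandomHexLiteral and createRandomOctalLiteral are assumptions introduced here for illustration; they do not appear in the original test class. The prefix and suffix are passed in so that each combination of 0x, 0X, l and L can get its own test.

```java
import java.util.Random;

// Hypothetical generators, modelled on createRandomIntigerNumber above.
public class RandomLiterals {
    private static final Random randomGenerator = new Random();
    private static final String HEX_DIGITS = "0123456789abcdefABCDEF";
    private static final String OCTAL_DIGITS = "01234567";

    /** Builds 1 to 6 random characters taken from the given alphabet. */
    static String randomDigits(String alphabet) {
        int count = randomGenerator.nextInt(6) + 1;
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; ++i) {
            sb.append(alphabet.charAt(randomGenerator.nextInt(alphabet.length())));
        }
        return sb.toString();
    }

    /** prefix is "0x" or "0X"; suffix is "", "l" or "L". */
    static String createRandomHexLiteral(String prefix, String suffix) {
        return prefix + randomDigits(HEX_DIGITS) + suffix;
    }

    /** An octal literal is a leading 0 followed by at least one octal digit. */
    static String createRandomOctalLiteral(String suffix) {
        return "0" + randomDigits(OCTAL_DIGITS) + suffix;
    }

    public static void main(String[] args) {
        System.out.println(createRandomHexLiteral("0x", ""));
        System.out.println(createRandomHexLiteral("0X", "L"));
        System.out.println(createRandomOctalLiteral("l"));
    }
}
```

The generated string would then be fed to the scanner exactly as in the decimal tests, with Parser._IntigerLiteral as the expected token kind.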

Floating-point literals

Section 3.10.2 (Floating-Point Literals) of the Java Language Specification describes floating-point literals:

Java Syntax Rule
FloatingPointLiteral:
DecimalFloatingPointLiteral
HexadecimalFloatingPointLiteral

DecimalFloatingPointLiteral:
Digits . Digitsopt ExponentPartopt FloatTypeSuffixopt
. Digits ExponentPartopt FloatTypeSuffixopt
Digits ExponentPart FloatTypeSuffixopt
Digits ExponentPartopt FloatTypeSuffix

ExponentPart:
ExponentIndicator SignedInteger

ExponentIndicator: one of
e E

SignedInteger:
Signopt Digits

Sign: one of
+ -

FloatTypeSuffix: one of
f F d D

HexadecimalFloatingPointLiteral:
HexSignificand BinaryExponent FloatTypeSuffixopt

HexSignificand:
HexNumeral
HexNumeral .
0x HexDigitsopt . HexDigits
0X HexDigitsopt . HexDigits

BinaryExponent:
BinaryExponentIndicator SignedInteger

BinaryExponentIndicator: one of
p P

These rules are converted into the following JFlex grammar:

JFlex grammar Rule
%%

FloatingPointLiteral = [0-9]+\.[0-9]*((e|E)(\+|-)?[0-9]+)?(f|F|d|D)? |
\.[0-9]+((e|E)(\+|-)?[0-9]+)?(f|F|d|D)? |
[0-9]+((e|E)(\+|-)?[0-9]+)(f|F|d|D)? |
[0-9]+((e|E)(\+|-)?[0-9]+)?(f|F|d|D) |
(0(x|X)[0-9a-fA-F]+\.?|0(x|X)[0-9a-fA-F]*\.[0-9a-fA-F]+)(p|P)(\+|-)?[0-9]+(f|F|d|D)?
%%
<YYINITIAL> {
{FloatingPointLiteral} { return new Token(Parser._FloatingPointLiteral, yycolumn + 1, yyline + 1, yychar, yytext()); }
}
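The five alternatives of the FloatingPointLiteral pattern can be tried in isolation with java.util.regex. This is a sketch under assumptions: the class FloatingPointLiteralCheck and the method isFloatingPointLiteral are hypothetical names, and matches() tests the whole string rather than the longest prefix as the scanner does.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the JFlex/Coco/R project.
public class FloatingPointLiteralCheck {
    private static final String EXP = "([eE][+-]?[0-9]+)";
    private static final String SUFFIX = "[fFdD]";

    // One alternative per line, in the same order as the JFlex macro.
    static final Pattern FLOAT = Pattern.compile(
          "[0-9]+\\.[0-9]*" + EXP + "?" + SUFFIX + "?"   // Digits . Digits(opt) ...
        + "|\\.[0-9]+" + EXP + "?" + SUFFIX + "?"        // . Digits ...
        + "|[0-9]+" + EXP + SUFFIX + "?"                 // exponent is mandatory
        + "|[0-9]+" + EXP + "?" + SUFFIX                 // suffix is mandatory
        + "|(0[xX][0-9a-fA-F]+\\.?|0[xX][0-9a-fA-F]*\\.[0-9a-fA-F]+)"
        + "[pP][+-]?[0-9]+" + SUFFIX + "?");             // hexadecimal form

    static boolean isFloatingPointLiteral(String s) {
        return FLOAT.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"1.", ".5e-3f", "1e10", "1f", "0x1.8p1", "1", "0x1.8"};
        for (String s : samples) {
            System.out.println(s + " -> " + isFloatingPointLiteral(s));
        }
    }
}
```

A plain "1" is rejected because it has neither a decimal point, an exponent nor a suffix, and "0x1.8" is rejected because a hexadecimal floating-point literal requires a binary exponent.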

The Coco/R parser must also be updated:

Coco/R EBNF Rule
TOKENS
FloatingPointLiteral

Unit tests

The tests below exercise the rules [0-9]+((e|E)(\+|-)?[0-9]+)(f|F|d|D)? and [0-9]+((e|E)(\+|-)?[0-9]+)?(f|F|d|D).

@Test
    public void testScan_token_FloatingPointLiteral_digits_exp() throws UnsupportedEncodingException {
        System.out.println("testScan_token_FloatingPointLiteral_digits_exp");
        // Initialize
        String sFirstDigits = createRandomIntigerNumber( eNUMBER_OF_DIGITS.E_AT_LEAST_ONE );
        String sContent = sFirstDigits + randomExponent();
        System.out.println("  Random floating point literal: " + sContent );
        InputStream is = new ByteArrayInputStream(sContent.getBytes("UTF-8"));
        Scanner instance = new Scanner(is);
        Token expected = new Token( Parser._FloatingPointLiteral, 0, 0, 0, sContent );
        // Test
        Token result = instance.Scan();
        // Validate
        assertNotNull( result );
        assertEquals( expected.kind, result.kind );
    }
 
    @Test
    public void testScan_token_FloatingPointLiteral_digits_exp_suffix() throws UnsupportedEncodingException {
        System.out.println("testScan_token_FloatingPointLiteral_digits_exp_suffix");
        // Initialize
        String sFirstDigits = createRandomIntigerNumber( eNUMBER_OF_DIGITS.E_AT_LEAST_ONE );
        String sContent = sFirstDigits + randomExponent() + randomFloatTypeSuffix();
        System.out.println("  Random floating point literal: " + sContent );
        InputStream is = new ByteArrayInputStream(sContent.getBytes("UTF-8"));
        Scanner instance = new Scanner(is);
        Token expected = new Token( Parser._FloatingPointLiteral, 0, 0, 0, sContent );
        // Test
        Token result = instance.Scan();
        // Validate
        assertNotNull( result );
        assertEquals( expected.kind, result.kind );
    }
 
    @Test
    public void testScan_token_FloatingPointLiteral_digits_suffix() throws UnsupportedEncodingException {
        System.out.println("testScan_token_FloatingPointLiteral_digits_suffix");
        // Initialize
        String sFirstDigits = createRandomIntigerNumber( eNUMBER_OF_DIGITS.E_AT_LEAST_ONE );
        String sContent = sFirstDigits + randomFloatTypeSuffix();
        System.out.println("  Random floating point literal: " + sContent );
        InputStream is = new ByteArrayInputStream(sContent.getBytes("UTF-8"));
        Scanner instance = new Scanner(is);
        Token expected = new Token( Parser._FloatingPointLiteral, 0, 0, 0, sContent );
        // Test
        Token result = instance.Scan();
        // Validate
        assertNotNull( result );
        assertEquals( expected.kind, result.kind );
    }
 
    private String randomExponent() {
        StringBuilder sb = new StringBuilder();
        String sExp = null;
        int nExp = randomGenerator.nextInt(2);
        switch( nExp ) {
            case 0:
                sExp = "e";
                break;
            default:
                sExp = "E";
        }
        sb.append(sExp);
        String sSign = null;
        int nSign = randomGenerator.nextInt(3);
        switch( nSign ) {
            case 0:
                sSign = "+";
                break;
            case 1:
                sSign = "-";
                break;
            default:
                sSign = "";
        }
        sb.append(sSign);
        sb.append( createRandomIntigerNumber( eNUMBER_OF_DIGITS.E_AT_LEAST_ONE ) );
        return sb.toString();
    }
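The tests above also call randomFloatTypeSuffix(), which is not shown on this page. A minimal sketch of what such a helper might look like follows; the body is an assumption and the wrapper class FloatSuffixHelper exists only to make the example self-contained.

```java
import java.util.Random;

// Hypothetical stand-in for the helper used by the tests above.
public class FloatSuffixHelper {
    private static final Random randomGenerator = new Random();
    private static final String[] SUFFIXES = { "f", "F", "d", "D" };

    /** Returns one of the four FloatTypeSuffix characters, chosen at random. */
    static String randomFloatTypeSuffix() {
        return SUFFIXES[randomGenerator.nextInt(SUFFIXES.length)];
    }

    public static void main(String[] args) {
        System.out.println("1" + randomFloatTypeSuffix());
    }
}
```

Because the suffix is always present here, a string such as digits followed by this suffix is matched by the rule in which the suffix is mandatory.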


Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License