jfun.parsec

Class Lexers

public final class Lexers extends Object

Provides predefined basic lexer objects. A lexer is a character-level parser that returns a token based on the recognized character range.

Author: Ben Yu Dec 19, 2004

Method Summary
static Parser<Tok> allInteger()
Returns a lexer that parses decimal, hex, and octal numbers and converts the matched string to a Long token.
static Parser<Tok> allInteger(String name)
Returns a lexer that parses decimal, hex, and octal numbers and converts the matched string to a Long token.
static Parser<Tok> charLiteral()
Returns a lexer that parses a single-quoted character literal (escaped by '\') and converts it to a Character.
static Parser<Tok> charLiteral(String name)
Returns a lexer that parses a single-quoted character literal (escaped by '\') and converts it to a Character.
static Parser<Tok> decimal()
Returns a lexer that parses a decimal number (valid patterns are: 1, 2.3, 000, 0., .23) and converts the matched string to a decimal-typed token.
static Parser<Tok> decimal(String name)
Returns a lexer that parses a decimal number (valid patterns are: 1, 2.3, 000, 0., .23) and converts the matched string to a decimal-typed token.
static Parser<Tok> decInteger()
Returns a lexer that parses a decimal integer (valid patterns are: 1, 10, 123) and converts the matched string to a Long token.
static Parser<Tok> decInteger(String name)
Returns a lexer that parses a decimal integer (valid patterns are: 1, 10, 123) and converts the matched string to a Long token.
static Words getCaseInsensitive(String[] ops, String[] keywords)
Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case-insensitively.
static Words getCaseInsensitive(Parser<?> wscanner, String[] ops, String[] keywords)
Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case-insensitively.
static Words getCaseInsensitive(Parser<?> wscanner, String[] ops, String[] keywords, FromString<?> toWord)
Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case-insensitively.
static Words getCaseSensitive(String[] ops, String[] keywords)
Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case-sensitively.
static Words getCaseSensitive(Parser<?> wscanner, String[] ops, String[] keywords)
Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case-sensitively.
static Words getCaseSensitive(Parser<?> wscanner, String[] ops, String[] keywords, FromString<?> toWord)
Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case-sensitively.
static Words getOperators(String... ops)
Creates a Words object for lexing the operators with names specified in ops.
static Parser<Tok> hexInteger()
Returns a lexer that parses a hex integer (valid patterns are: 0x1, 0Xff, 0xFe1, etc.) and converts the matched string to a Long token. A hex number must start with either 0x or 0X.
static Parser<Tok> hexInteger(String name)
Returns a lexer that parses a hex integer (valid patterns are: 0x1, 0Xff, 0xFe1, etc.) and converts the matched string to a Long token. A hex number must start with either 0x or 0X.
static Parser<Tok> integer()
Returns a lexer that parses an integer (valid patterns are: 0, 00, 1, 10) and converts the matched string to an integer-typed token.
static Parser<Tok> integer(String name)
Returns a lexer that parses an integer (valid patterns are: 0, 00, 1, 10) and converts the matched string to an integer-typed token.
static Parser<Tok> lexDecLong()
Returns a lexer that parses a decimal integer (valid patterns are: 1, 10, 123) and converts the matched string to a Long token.
static Parser<Tok> lexDecLong(String name)
Returns a lexer that parses a decimal integer (valid patterns are: 1, 10, 123) and converts the matched string to a Long token.
static Parser<Tok[]> lexeme(String name, Parser<?> delim, Parser<Tok> s)
Greedily runs Parser s repeatedly, ignoring the pattern recognized by Parser delim before and after each s.
static Parser<Tok[]> lexeme(Parser<?> delim, Parser<Tok> s)
Greedily runs Parser s repeatedly, ignoring the pattern recognized by Parser delim before and after each s.
static Parser<Tok> lexer(String name, Parser<?> s, Tokenizer tn)
Transforms the character range recognized by scanner s into a token object using a Tokenizer.
static Parser<Tok> lexer(Parser<?> s, Tokenizer tn)
Transforms the character range recognized by scanner s into a token object using a Tokenizer.
static Parser<Tok> lexer(Parser<?> s, Tokenizer tn, String err)
Transforms the character range recognized by scanner s into a token object using a Tokenizer.
static Parser<Tok> lexer(String name, Parser<?> s, Tokenizer tn, String err)
Transforms the character range recognized by scanner s into a token object using a Tokenizer.
static Parser<Tok> lexHexLong()
Returns a lexer that parses a hex integer (valid patterns are: 0x1, 0Xff, 0xFe1, etc.) and converts the matched string to a Long token. A hex number must start with either 0x or 0X.
static Parser<Tok> lexHexLong(String name)
Returns a lexer that parses a hex integer (valid patterns are: 0x1, 0Xff, 0xFe1, etc.) and converts the matched string to a Long token. A hex number must start with either 0x or 0X.
static Parser<Tok> lexLong()
Returns a lexer that parses decimal, hex, and octal numbers and converts the matched string to a Long token.
static Parser<Tok> lexLong(String name)
Returns a lexer that parses decimal, hex, and octal numbers and converts the matched string to a Long token.
static Parser<Tok> lexOctLong()
Returns a lexer that parses an octal integer (valid patterns are: 0, 07, 017, 0371, etc.) and converts the matched string to a Long token. An octal number must start with 0.
static Parser<Tok> lexOctLong(String name)
Returns a lexer that parses an octal integer (valid patterns are: 0, 07, 017, 0371, etc.) and converts the matched string to a Long token. An octal number must start with 0.
static Parser<Tok> lexSimpleStringLiteral()
Returns a lexer that parses a double-quoted string literal (escaped by '\') and converts it to a String token.
static Parser<Tok> lexSimpleStringLiteral(String name)
Returns a lexer that parses a double-quoted string literal (escaped by '\') and converts it to a String token.
static Parser<Tok> octInteger()
Returns a lexer that parses an octal integer (valid patterns are: 0, 07, 017, 0371, etc.) and converts the matched string to a Long token. An octal number must start with 0.
static Parser<Tok> octInteger(String name)
Returns a lexer that parses an octal integer (valid patterns are: 0, 07, 017, 0371, etc.) and converts the matched string to a Long token. An octal number must start with 0.
static Parser<Tok> quoted(String name, char open, char close)
Creates a lexer that parses a string literal quoted by open and close and converts it to a TokenQuoted token instance.
static Parser<Tok> quoted(char open, char close)
Creates a lexer that parses a string literal quoted by open and close and converts it to a TokenQuoted token instance.
static Parser<Tok> sqlStringLiteral()
Returns a lexer that parses a single-quoted string literal (a single quote is escaped with another single quote) and converts it to a String token.
static Parser<Tok> sqlStringLiteral(String name)
Returns a lexer that parses a single-quoted string literal (a single quote is escaped with another single quote) and converts it to a String token.
static Parser<Tok> stringLiteral()
Returns a lexer that parses a double-quoted string literal (escaped by '\') and converts it to a String token.
static Parser<Tok> stringLiteral(String name)
Returns a lexer that parses a double-quoted string literal (escaped by '\') and converts it to a String token.
static Parser<Tok> word()
Returns a lexer that parses any word and converts the matched string to a TokenWord.
static Parser<Tok> word(String name)
Returns a lexer that parses any word and converts the matched string to a TokenWord.

Method Detail

allInteger

public static Parser<Tok> allInteger()

Deprecated: Use lexLong.

Returns a lexer that parses decimal, hex, and octal numbers and converts the matched string to a Long token.

Returns: the lexer.

allInteger

public static Parser<Tok> allInteger(String name)

Deprecated: Use lexLong.

Returns a lexer that parses decimal, hex, and octal numbers and converts the matched string to a Long token.

Parameters: name - the lexer name.

Returns: the lexer.

charLiteral

public static Parser<Tok> charLiteral()
Returns a lexer that parses a single-quoted character literal (escaped by '\') and converts it to a Character.

Returns: the lexer.

charLiteral

public static Parser<Tok> charLiteral(String name)
Returns a lexer that parses a single-quoted character literal (escaped by '\') and converts it to a Character.

Parameters: name - the lexer name.

Returns: the lexer.

decimal

public static Parser<Tok> decimal()
Returns a lexer that parses a decimal number (valid patterns are: 1, 2.3, 000, 0., .23) and converts the matched string to a decimal-typed token.

Returns: the lexer.

decimal

public static Parser<Tok> decimal(String name)
Returns a lexer that parses a decimal number (valid patterns are: 1, 2.3, 000, 0., .23) and converts the matched string to a decimal-typed token.

Parameters: name - the lexer name.

Returns: the lexer.
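
The valid patterns listed for decimal() can be written down as a regular expression. The following is a minimal sketch in plain Java, independent of jfun.parsec and not its implementation; the class name DecimalPatternSketch and the method isDecimal are hypothetical, and rejecting a lone "." is an assumption of this sketch rather than something the documentation states.

```java
import java.util.regex.Pattern;

final class DecimalPatternSketch {
    // Either digits with an optional fractional part ("1", "2.3", "000", "0."),
    // or a leading dot followed by at least one digit (".23").
    private static final Pattern DECIMAL =
            Pattern.compile("[0-9]+(\\.[0-9]*)?|\\.[0-9]+");

    /** Returns true if the whole text matches one of the decimal shapes above. */
    static boolean isDecimal(String text) {
        return DECIMAL.matcher(text).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"1", "2.3", "000", "0.", ".23", ".", "1.2.3"}) {
            System.out.println(s + " -> " + isDecimal(s));
        }
    }
}
```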

decInteger

public static Parser<Tok> decInteger()

Deprecated: Use lexDecLong.

Returns a lexer that parses a decimal integer (valid patterns are: 1, 10, 123) and converts the matched string to a Long token. Unlike integer(), decInteger() does not allow a number starting with 0.

Returns: the lexer.

decInteger

public static Parser<Tok> decInteger(String name)

Deprecated: Use lexDecLong.

Returns a lexer that parses a decimal integer (valid patterns are: 1, 10, 123) and converts the matched string to a Long token. Unlike integer(), decInteger() does not allow a number starting with 0.

Parameters: name - the lexer name.

Returns: the lexer.

getCaseInsensitive

public static Words getCaseInsensitive(String[] ops, String[] keywords)
Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case-insensitively. Keywords and operators are lexed as TokenReserved. Words that are not among the keywords are lexed as TokenWord. A word is defined as an alphanumeric string that starts with [_a-zA-Z], followed by zero or more [0-9_a-zA-Z].

Parameters: ops - the operator names; keywords - the keyword names.

Returns: the Words instance.

getCaseInsensitive

public static Words getCaseInsensitive(Parser<?> wscanner, String[] ops, String[] keywords)
Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case-insensitively. Keywords and operators are lexed as TokenReserved. Words that are not among the keywords are lexed as TokenWord.

Parameters: wscanner - the scanner for a word in the language; ops - the operator names; keywords - the keyword names.

Returns: the Words instance.

getCaseInsensitive

public static Words getCaseInsensitive(Parser<?> wscanner, String[] ops, String[] keywords, FromString<?> toWord)
Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case-insensitively. Keywords and operators are lexed as TokenReserved. Words that are not among the keywords are lexed as TokenWord.

Parameters: wscanner - the scanner for a word in the language; ops - the operator names; keywords - the keyword names; toWord - the FromString object used to create a token for words recognized by wscanner that are not keywords.

Returns: the Words instance.

getCaseSensitive

public static Words getCaseSensitive(String[] ops, String[] keywords)
Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case-sensitively. Keywords and operators are lexed as TokenReserved. Words that are not among the keywords are lexed as TokenWord. A word is defined as an alphanumeric string that starts with [_a-zA-Z], followed by zero or more [0-9_a-zA-Z].

Parameters: ops - the operator names; keywords - the keyword names.

Returns: the Words instance.
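
The word definition and the keyword rule above can be illustrated without the library. The following is a minimal sketch in plain Java, not the Words implementation: the class WordClassifierSketch, the enum Kind, and the method classify are hypothetical names, operators are left out, and the two flag values stand in for the case-sensitive and case-insensitive factory methods.

```java
import java.util.regex.Pattern;

final class WordClassifierSketch {
    /** Hypothetical result kinds: a keyword, a plain word, or not a word at all. */
    enum Kind { RESERVED, WORD, NOT_A_WORD }

    // A word starts with [_a-zA-Z], followed by zero or more [0-9_a-zA-Z].
    private static final Pattern WORD = Pattern.compile("[_a-zA-Z][0-9_a-zA-Z]*");

    /** Classifies text against the keyword list, honoring or ignoring case. */
    static Kind classify(String text, String[] keywords, boolean caseSensitive) {
        if (!WORD.matcher(text).matches()) {
            return Kind.NOT_A_WORD;
        }
        for (String keyword : keywords) {
            boolean same = caseSensitive
                    ? keyword.equals(text)
                    : keyword.equalsIgnoreCase(text);
            if (same) {
                return Kind.RESERVED;
            }
        }
        return Kind.WORD;
    }

    public static void main(String[] args) {
        String[] keywords = {"select", "from"};
        System.out.println(classify("SELECT", keywords, false));
        System.out.println(classify("SELECT", keywords, true));
        System.out.println(classify("1abc", keywords, true));
    }
}
```

With the same keyword list, "SELECT" is reserved only when case is ignored; under case-sensitive matching it falls through to an ordinary word.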

getCaseSensitive

public static Words getCaseSensitive(Parser<?> wscanner, String[] ops, String[] keywords)
Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case-sensitively. Keywords and operators are lexed as TokenReserved. Words that are not among the keywords are lexed as TokenWord.

Parameters: wscanner - the scanner for a word in the language; ops - the operator names; keywords - the keyword names.

Returns: the Words instance.

getCaseSensitive

public static Words getCaseSensitive(Parser<?> wscanner, String[] ops, String[] keywords, FromString<?> toWord)
Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case-sensitively. Keywords and operators are lexed as TokenReserved. Words that are not among the keywords are lexed as TokenWord.

Parameters: wscanner - the scanner for a word in the language; ops - the operator names; keywords - the keyword names; toWord - the FromString object used to create a token for words recognized by wscanner that are not keywords.

Returns: the Words instance.

getOperators

public static Words getOperators(String... ops)
Creates a Words object for lexing the operators with names specified in ops. Operators are lexed as TokenReserved.

Parameters: ops - the operator names.

Returns: the Words instance.

hexInteger

public static Parser<Tok> hexInteger()

Deprecated: Use lexHexLong.

Returns a lexer that parses a hex integer (valid patterns are: 0x1, 0Xff, 0xFe1, etc.) and converts the matched string to a Long token. A hex number must start with either 0x or 0X.

Returns: the lexer.

hexInteger

public static Parser<Tok> hexInteger(String name)

Deprecated: Use lexHexLong.

Returns a lexer that parses a hex integer (valid patterns are: 0x1, 0Xff, 0xFe1, etc.) and converts the matched string to a Long token. A hex number must start with either 0x or 0X.

Parameters: name - the lexer name.

Returns: the lexer.

integer

public static Parser<Tok> integer()
Returns a lexer that parses an integer (valid patterns are: 0, 00, 1, 10) and converts the matched string to an integer-typed token. Unlike decInteger(), integer() allows a number starting with 0.

Returns: the lexer.

integer

public static Parser<Tok> integer(String name)
Returns a lexer that parses an integer (valid patterns are: 0, 00, 1, 10) and converts the matched string to an integer-typed token. Unlike decInteger(), integer() allows a number starting with 0.

Parameters: name - the lexer name.

Returns: the lexer.

lexDecLong

public static Parser<Tok> lexDecLong()
Returns a lexer that parses a decimal integer (valid patterns are: 1, 10, 123) and converts the matched string to a Long token. Unlike integer(), lexDecLong() (like the deprecated decInteger()) does not allow a number starting with 0.

Returns: the lexer.

lexDecLong

public static Parser<Tok> lexDecLong(String name)
Returns a lexer that parses a decimal integer (valid patterns are: 1, 10, 123) and converts the matched string to a Long token. Unlike integer(), lexDecLong() (like the deprecated decInteger()) does not allow a number starting with 0.

Parameters: name - the lexer name.

Returns: the lexer.

lexeme

public static Parser<Tok[]> lexeme(String name, Parser<?> delim, Parser<Tok> s)
Greedily runs Parser s repeatedly, ignoring the pattern recognized by Parser delim before and after each s. Parser s must be a lexer object that returns a Tok object. The resulting Tok objects are collected and returned in a Tok[] array.

Parameters: name - the name of the new Parser object; delim - the delimiter Parser object; s - the Parser object.

Returns: the new Parser object.

lexeme

public static Parser<Tok[]> lexeme(Parser<?> delim, Parser<Tok> s)
Greedily runs Parser s repeatedly, ignoring the pattern recognized by Parser delim before and after each s. Parser s must be a lexer object that returns a Tok object. The resulting Tok objects are collected and returned in a Tok[] array.

Parameters: delim - the delimiter Parser object; s - the Parser object.

Returns: the new Parser object.
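
The greedy "skip delimiter, read token, skip delimiter" loop described for lexeme can be sketched with regular expressions standing in for the delim and s parsers. This is a minimal illustration in plain Java, not the library's implementation: the class LexemeSketch and the method tokenize are hypothetical, tokens are plain strings rather than Tok objects, and stopping silently at the first unmatched character is a simplification chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LexemeSketch {
    /** Collects tokens greedily, ignoring delim before and after each token. */
    static String[] tokenize(String input, Pattern delim, Pattern token) {
        List<String> tokens = new ArrayList<>();
        int pos = skip(input, 0, delim);            // delimiter before the first token
        while (pos < input.length()) {
            Matcher m = token.matcher(input).region(pos, input.length());
            if (!m.lookingAt() || m.end() == pos) {
                break;                              // no further token at this position
            }
            tokens.add(m.group());
            pos = skip(input, m.end(), delim);      // delimiter after the token
        }
        return tokens.toArray(new String[0]);
    }

    /** Returns the position after an optional delimiter match starting at pos. */
    private static int skip(String input, int pos, Pattern delim) {
        Matcher m = delim.matcher(input).region(pos, input.length());
        return m.lookingAt() ? m.end() : pos;
    }

    public static void main(String[] args) {
        String[] tokens = tokenize("  a1 b2  c3 ",
                Pattern.compile("\\s+"), Pattern.compile("[a-z0-9]+"));
        System.out.println(String.join(",", tokens));
    }
}
```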

lexer

public static Parser<Tok> lexer(String name, Parser<?> s, Tokenizer tn)
Transforms the character range recognized by scanner s into a token object using a Tokenizer. If Tokenizer.toToken() returns null, the scan fails.

Parameters: name - the name of the new Scanner; s - the scanner to transform; tn - the Tokenizer object.

Returns: the new Scanner.

lexer

public static Parser<Tok> lexer(Parser<?> s, Tokenizer tn)
Transforms the character range recognized by scanner s into a token object using a Tokenizer. If Tokenizer.toToken() returns null, the scan fails.

Parameters: s - the scanner to transform; tn - the Tokenizer object.

Returns: the new Scanner.

lexer

public static Parser<Tok> lexer(Parser<?> s, Tokenizer tn, String err)
Transforms the character range recognized by scanner s into a token object using a Tokenizer. If Tokenizer.toToken() returns null, the scan fails.

Parameters: s - the scanner to transform; tn - the Tokenizer object; err - the error message used when the tokenizer returns null.

Returns: the new Scanner.

lexer

public static Parser<Tok> lexer(String name, Parser<?> s, Tokenizer tn, String err)
Transforms the character range recognized by scanner s into a token object using a Tokenizer. If Tokenizer.toToken() returns null, the scan fails.

Parameters: name - the name of the new Scanner; s - the scanner to transform; tn - the Tokenizer object; err - the error message used when the tokenizer returns null.

Returns: the new Scanner.
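
The rule that a null token makes the scan fail can be illustrated with a small stand-alone model. This is a sketch under stated assumptions, not jfun.parsec code: the interface RangeTokenizer is a hypothetical stand-in (the real Tokenizer signature is not shown on this page), a regular expression plays the role of the scanner, and failure is modeled as an IllegalArgumentException.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LexerSketch {
    /** Hypothetical stand-in for a tokenizer: maps a matched range to a token, or null. */
    interface RangeTokenizer {
        Object toToken(CharSequence source, int from, int length);
    }

    /** Matches scanner at offset from, then converts the matched range to a token. */
    static Object lex(CharSequence source, int from, Pattern scanner,
                      RangeTokenizer tn, String err) {
        Matcher m = scanner.matcher(source).region(from, source.length());
        if (!m.lookingAt()) {
            throw new IllegalArgumentException("scanner did not match at offset " + from);
        }
        Object token = tn.toToken(source, m.start(), m.end() - m.start());
        if (token == null) {
            // A null token is treated as a failed scan, reported with err.
            throw new IllegalArgumentException(err);
        }
        return token;
    }

    public static void main(String[] args) {
        // Example tokenizer: accepts at most nine digits, otherwise returns null.
        RangeTokenizer toInt = (src, at, len) ->
                len <= 9 ? Integer.valueOf(src.subSequence(at, at + len).toString()) : null;
        Pattern digits = Pattern.compile("[0-9]+");
        System.out.println(lex("x=42;", 2, digits, toInt, "number too long"));
    }
}
```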

lexHexLong

public static Parser<Tok> lexHexLong()
Returns a lexer that parses a hex integer (valid patterns are: 0x1, 0Xff, 0xFe1, etc.) and converts the matched string to a Long token. A hex number must start with either 0x or 0X.

Returns: the lexer.

lexHexLong

public static Parser<Tok> lexHexLong(String name)
Returns a lexer that parses a hex integer (valid patterns are: 0x1, 0Xff, 0xFe1, etc.) and converts the matched string to a Long token. A hex number must start with either 0x or 0X.

Parameters: name - the lexer name.

Returns: the lexer.

lexLong

public static Parser<Tok> lexLong()
Returns a lexer that parses decimal, hex, and octal numbers and converts the matched string to a Long token.

Returns: the lexer.

lexLong

public static Parser<Tok> lexLong(String name)
Returns a lexer that parses decimal, hex, and octal numbers and converts the matched string to a Long token.

Parameters: name - the lexer name.

Returns: the lexer.
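
The three number shapes that lexLong combines (decimal without a leading 0, hex with a 0x or 0X prefix, octal with a leading 0) can be illustrated in plain Java. This is a minimal sketch, not the library's implementation: the class LongLiteralSketch and the method parse are hypothetical, a failed match is modeled as a null return, and values outside the long range would raise NumberFormatException from Long.parseLong.

```java
import java.util.regex.Pattern;

final class LongLiteralSketch {
    // Hex: must start with 0x or 0X, followed by at least one hex digit.
    private static final Pattern HEX = Pattern.compile("0[xX][0-9a-fA-F]+");
    // Octal: must start with 0, followed by zero or more octal digits.
    private static final Pattern OCT = Pattern.compile("0[0-7]*");
    // Decimal integer: must not start with 0.
    private static final Pattern DEC = Pattern.compile("[1-9][0-9]*");

    /** Returns the value of the literal, or null if no pattern matches. */
    static Long parse(String text) {
        if (HEX.matcher(text).matches()) {
            return Long.parseLong(text.substring(2), 16);  // drop the 0x prefix
        }
        if (OCT.matcher(text).matches()) {
            return Long.parseLong(text, 8);
        }
        if (DEC.matcher(text).matches()) {
            return Long.parseLong(text, 10);
        }
        return null;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"123", "0xFe1", "0Xff", "017", "0", "08"}) {
            System.out.println(s + " -> " + parse(s));
        }
    }
}
```

In this sketch "08" matches none of the three shapes: it starts with 0, so it is not a decimal integer, and 8 is not an octal digit.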

lexOctLong

public static Parser<Tok> lexOctLong()
Returns a lexer that parses an octal integer (valid patterns are: 0, 07, 017, 0371, etc.) and converts the matched string to a Long token. An octal number must start with 0.

Returns: the lexer.

lexOctLong

public static Parser<Tok> lexOctLong(String name)
Returns a lexer that parses an octal integer (valid patterns are: 0, 07, 017, 0371, etc.) and converts the matched string to a Long token. An octal number must start with 0.

Parameters: name - the lexer name.

Returns: the lexer.

lexSimpleStringLiteral

public static Parser<Tok> lexSimpleStringLiteral()
Returns a lexer that parses a double-quoted string literal (escaped by '\') and converts it to a String token.

Returns: the lexer.

lexSimpleStringLiteral

public static Parser<Tok> lexSimpleStringLiteral(String name)
Returns a lexer that parses a double-quoted string literal (escaped by '\') and converts it to a String token.

Parameters: name - the lexer name.

Returns: the lexer.

octInteger

public static Parser<Tok> octInteger()

Deprecated: Use lexOctLong.

Returns a lexer that parses an octal integer (valid patterns are: 0, 07, 017, 0371, etc.) and converts the matched string to a Long token. An octal number must start with 0.

Returns: the lexer.

octInteger

public static Parser<Tok> octInteger(String name)

Deprecated: Use lexOctLong.

Returns a lexer that parses an octal integer (valid patterns are: 0, 07, 017, 0371, etc.) and converts the matched string to a Long token. An octal number must start with 0.

Parameters: name - the lexer name.

Returns: the lexer.

quoted

public static Parser<Tok> quoted(String name, char open, char close)
Creates a lexer that parses a string literal quoted by open and close and converts it to a TokenQuoted token instance.

Parameters: name - the lexer name; open - the opening character; close - the closing character.

Returns: the lexer.

quoted

public static Parser<Tok> quoted(char open, char close)
Creates a lexer that parses a string literal quoted by open and close and converts it to a TokenQuoted token instance.

Parameters: open - the opening character; close - the closing character.

Returns: the lexer.

sqlStringLiteral

public static Parser<Tok> sqlStringLiteral()
Returns a lexer that parses a single-quoted string literal (a single quote is escaped with another single quote) and converts it to a String token.

Returns: the lexer.

sqlStringLiteral

public static Parser<Tok> sqlStringLiteral(String name)
Returns a lexer that parses a single-quoted string literal (a single quote is escaped with another single quote) and converts it to a String token.

Parameters: name - the lexer name.

Returns: the lexer.
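
The SQL escaping rule (a single quote inside the literal is written as two single quotes) can be shown with a small stand-alone routine. This is a minimal sketch in plain Java, not the library's implementation; the class SqlStringSketch and the method unquote are hypothetical, and a malformed literal is modeled as a null return.

```java
final class SqlStringSketch {
    /** Returns the unescaped content, or null if text is not a well-formed literal. */
    static String unquote(String text) {
        int n = text.length();
        if (n < 2 || text.charAt(0) != '\'' || text.charAt(n - 1) != '\'') {
            return null;                       // must be wrapped in single quotes
        }
        StringBuilder out = new StringBuilder();
        for (int i = 1; i < n - 1; i++) {
            char c = text.charAt(i);
            if (c == '\'') {
                // A quote inside the body must be followed by a second quote.
                if (i + 1 >= n - 1 || text.charAt(i + 1) != '\'') {
                    return null;
                }
                i++;                           // skip the second quote of the pair
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unquote("'it''s'"));
        System.out.println(unquote("'a'b'"));
    }
}
```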

stringLiteral

public static Parser<Tok> stringLiteral()

Deprecated: Use lexSimpleStringLiteral.

Returns a lexer that parses a double-quoted string literal (escaped by '\') and converts it to a String token.

Returns: the lexer.

stringLiteral

public static Parser<Tok> stringLiteral(String name)

Deprecated: Use lexSimpleStringLiteral.

Returns a lexer that parses a double-quoted string literal (escaped by '\') and converts it to a String token.

Parameters: name - the lexer name.

Returns: the lexer.

word

public static Parser<Tok> word()
Returns a lexer that parses any word and converts the matched string to a TokenWord. A word starts with an alphabetic character, followed by zero or more alphanumeric characters.

Returns: the lexer.

word

public static Parser<Tok> word(String name)
Returns a lexer that parses any word and converts the matched string to a TokenWord. A word starts with an alphabetic character, followed by zero or more alphanumeric characters.

Parameters: name - the lexer name.

Returns: the lexer.