Parsing Engine

danbikel.lisp
Class WordTokenizer

java.lang.Object
  extended by danbikel.lisp.WordTokenizer
Direct Known Subclasses:
SexpTokenizer

public class WordTokenizer
extends Object

A simple tokenizer for words only: it does not parse numbers, and end-of-line characters are not significant. Many of the methods and some of the data members of StreamTokenizer exist in this class, but at present this class does not extend StreamTokenizer. Comments are recognized only if a comment character has been specified (see commentChar(int)); a comment is a line whose first non-whitespace character is the comment character.


Field Summary
 StreamTokenizer javadocHack
          Included as a public data member so that javadoc can resolve external links to members of the StreamTokenizer class.
 String sval
          Contains the most recent word tokenized by this tokenizer.
 int ttype
          The type of the last token read, using the type definitions in StreamTokenizer.
 
Constructor Summary
WordTokenizer(Reader inStream)
          Creates a new tokenizer object.
 
Method Summary
 void close()
          Closes the underlying stream.
 void commentChar(int ch)
          Specifies a character to be treated as the start of a comment on the current line.
 int lineno()
          Returns the line number of the underlying character stream.
 int nextToken()
          Reads the next token from the underlying character stream and returns its type, which is also stored in ttype.
 long numCharsRead()
          Returns the number of characters read from the underlying reader for this word tokenizer.
 void ordinaryChar(char ch)
          Specifies a character to be treated as a token delimiter, to be contained in ttype after it is read.
 void ordinaryChars(int low, int hi)
          Specifies a range of characters to be treated as token delimiters, each to be contained in ttype after it is read.
 void pushBack()
          Causes the most recent token read (either a word or ordinary character) to be pushed back, so that it is the next token returned by nextToken().
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

javadocHack

public StreamTokenizer javadocHack
Included as a public data member so that javadoc can resolve external links to members of the StreamTokenizer class.


ttype

public int ttype
The type of the last token read, using the type definitions in StreamTokenizer. Because this tokenizer only reads ordinary characters and words, the value of ttype will only ever be StreamTokenizer.TT_EOF, StreamTokenizer.TT_WORD or an ordinary character.

See Also:
ordinaryChar(char), ordinaryChars(int, int)

sval

public String sval
Contains the most recent word tokenized by this tokenizer.

Constructor Detail

WordTokenizer

public WordTokenizer(Reader inStream)
Creates a new tokenizer object.

Parameters:
inStream - the stream to be tokenized.

Method Detail

commentChar

public void commentChar(int ch)
Specifies a character to be treated as the start of a comment on the current line. The comment character must have an integer value greater than 0.

Parameters:
ch - the character to be treated as the start of a single-line comment
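
The comment rule stated in the class description (a comment is a line whose first non-whitespace character is the comment character) can be illustrated with a small standalone check. This is only a sketch of the rule as documented, not the class's actual implementation; the class name CommentLineSketch and the method isCommentLine are hypothetical.

```java
class CommentLineSketch {
    // Hypothetical helper: returns true if the first non-whitespace
    // character of the line equals the given comment character.
    static boolean isCommentLine(String line, int commentChar) {
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (!Character.isWhitespace(c)) {
                return c == commentChar;
            }
        }
        return false; // blank line: there is no first non-whitespace character
    }

    public static void main(String[] args) {
        System.out.println(isCommentLine("   ; a note", ';'));
        System.out.println(isCommentLine("word ; trailing text", ';'));
    }
}
```

Under this reading of the rule, a comment character that appears after other text on a line does not start a comment.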

ordinaryChar

public void ordinaryChar(char ch)
Specifies a character to be treated as a token delimiter, to be contained in ttype after it is read. The character must be in the range 0 <= ch <= Byte.MAX_VALUE.

Parameters:
ch - the character to be treated as a token delimiter

ordinaryChars

public void ordinaryChars(int low,
                          int hi)
Specifies a range of characters to be treated as token delimiters, each to be contained in ttype after it is read. The characters must be in the range 0 <= ch <= Byte.MAX_VALUE.

Parameters:
low - the lowest-valued character in a range to be treated as token delimiters
hi - the highest-valued character in a range to be treated as token delimiters

nextToken

public int nextToken()
              throws IOException
Reads the next token from the underlying character stream and returns its type, which is also stored in ttype.

Returns:
the type of the token that was just read (also stored in ttype)
Throws:
IOException - if there was a problem reading the next token from the underlying stream
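
A typical read loop calls nextToken() until it returns StreamTokenizer.TT_EOF, inspecting ttype and sval after each call. The sketch below uses java.io.StreamTokenizer as a stand-in, because WordTokenizer shares its method and field names; it is not WordTokenizer itself, and the syntax-table setup (resetSyntax, wordChars, whitespaceChars) is an assumption made to approximate a words-only tokenizer. The class name WordLoopSketch and the method tokens are hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class WordLoopSketch {
    // Collects every token: words as-is, ordinary characters as
    // one-character strings.
    static List<String> tokens(String input) throws IOException {
        // Stand-in for WordTokenizer (assumption): StreamTokenizer
        // configured to read words only, with no number parsing.
        StreamTokenizer tok = new StreamTokenizer(new StringReader(input));
        tok.resetSyntax();            // make every character ordinary
        tok.wordChars(33, 255);       // printable characters form words
        tok.whitespaceChars(0, ' ');  // control characters and space separate tokens
        tok.ordinaryChar('(');        // parentheses act as token delimiters
        tok.ordinaryChar(')');

        List<String> out = new ArrayList<>();
        while (tok.nextToken() != StreamTokenizer.TT_EOF) {
            if (tok.ttype == StreamTokenizer.TT_WORD) {
                out.add(tok.sval);                         // a word
            } else {
                out.add(String.valueOf((char) tok.ttype)); // an ordinary character
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(tokens("(S (NP dogs))"));
    }
}
```

With a real WordTokenizer the loop body would presumably look the same, since ttype only ever holds TT_EOF, TT_WORD or an ordinary character.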

pushBack

public void pushBack()
Causes the most recent token read (either a word or ordinary character) to be pushed back, so that it is the next token returned by nextToken().
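
Pushing a token back is the usual way to peek one token ahead. The sketch below again uses java.io.StreamTokenizer as a stand-in for WordTokenizer (an assumption; the two share the pushBack() and nextToken() names, but their behavior may differ in detail). The class name PushBackSketch and the method peekThenRead are hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;

class PushBackSketch {
    // Reads a word, pushes it back, reads it again, then reads the
    // following word. Returns the three values of sval that were observed.
    static String[] peekThenRead(String input) throws IOException {
        // Stand-in for WordTokenizer (assumption): words only.
        StreamTokenizer tok = new StreamTokenizer(new StringReader(input));
        tok.resetSyntax();
        tok.wordChars(33, 255);
        tok.whitespaceChars(0, ' ');

        tok.nextToken();
        String peeked = tok.sval;    // first word
        tok.pushBack();              // un-read it

        tok.nextToken();
        String reread = tok.sval;    // same word, returned again

        tok.nextToken();
        String following = tok.sval; // the word after it
        return new String[] { peeked, reread, following };
    }

    public static void main(String[] args) throws IOException {
        for (String s : peekThenRead("alpha beta")) {
            System.out.println(s);
        }
    }
}
```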


lineno

public int lineno()
Returns the line number of the underlying character stream.

Returns:
the line number of the underlying character stream.

close

public void close()
           throws IOException
Closes the underlying stream.

Throws:
IOException - if the underlying stream throws an IOException while being closed

numCharsRead

public long numCharsRead()
Returns the number of characters read from the underlying reader for this word tokenizer.

Returns:
the number of characters read

Author: Dan Bikel.