Parsing Engine

danbikel.parser.lang
Class AbstractWordFeatures

java.lang.Object
  extended by danbikel.parser.lang.AbstractWordFeatures
All Implemented Interfaces:
WordFeatures, Serializable
Direct Known Subclasses:
SimpleWordFeatures, SimpleWordFeatures, SimpleWordFeatures, WordFeatures, WordFeatures

public abstract class AbstractWordFeatures
extends Object
implements WordFeatures, Serializable

Provides a default abstract implementation of the WordFeatures interface.

See Also:
Serialized Form

Field Summary
protected static Symbol unknownWordSym
          The unique symbol to represent unknown words.
 
Constructor Summary
protected AbstractWordFeatures()
          Default constructor, to be called by subclasses (usually implicitly).
 
Method Summary
abstract  Symbol defaultFeatureVector()
          The symbol that represents the case where none of the features fires for a particular word.
 Symbol features(Symbol word, boolean firstWord)
          Returns a symbol representing the orthographic and/or morphological features of the specified word.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

unknownWordSym

protected static Symbol unknownWordSym
The unique symbol used to represent unknown words. The default value is the return value of Symbol.add("+unknown+"). If "+unknown+" is an actual word in a particular language or Treebank, a subclass should reassign this data member to a symbol that cannot collide with a real word.
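The reassignment described above can be sketched as follows. This is a minimal illustration, not the parser's own code: the nested Symbol and AbstractWordFeatures classes are simplified stand-ins that mirror only the members documented on this page, and TreebankFeatures and the replacement string "+UNK+" are hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class UnknownSymSketch {
    // Simplified stand-in for danbikel.parser.Symbol (assumption: symbols are
    // interned, so Symbol.add returns the same object for the same string).
    static final class Symbol {
        private static final Map<String, Symbol> table = new HashMap<>();
        private final String name;
        private Symbol(String name) { this.name = name; }
        static synchronized Symbol add(String name) {
            return table.computeIfAbsent(name, Symbol::new);
        }
        @Override public String toString() { return name; }
    }

    // Stand-in mirroring the members documented for AbstractWordFeatures.
    abstract static class AbstractWordFeatures {
        protected static Symbol unknownWordSym = Symbol.add("+unknown+");
        public Symbol features(Symbol word, boolean firstWord) {
            return unknownWordSym;
        }
        public abstract Symbol defaultFeatureVector();
    }

    // Hypothetical subclass for a Treebank in which "+unknown+" is a real word.
    static class TreebankFeatures extends AbstractWordFeatures {
        // Runs when this subclass is initialized (for example, on first
        // instantiation). The field is static, so the new value is shared by
        // every AbstractWordFeatures subclass in the same JVM.
        static { unknownWordSym = Symbol.add("+UNK+"); }

        @Override public Symbol defaultFeatureVector() {
            return Symbol.add("none");
        }
    }

    public static void main(String[] args) {
        AbstractWordFeatures wf = new TreebankFeatures();
        // The inherited default features() now returns the reassigned symbol.
        System.out.println(wf.features(Symbol.add("zyzzyva"), false)); // +UNK+
    }
}
```

Because the field is static rather than per-instance, a sketch like this changes the unknown-word symbol globally, which is worth keeping in mind if several language packages are loaded together.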

Constructor Detail

AbstractWordFeatures

protected AbstractWordFeatures()
Default constructor, to be called by subclasses (usually implicitly).

Method Detail

features

public Symbol features(Symbol word,
                       boolean firstWord)
Returns a symbol representing the orthographic and/or morphological features of the specified word. This default implementation does not inspect the word; it simply returns the unknown word symbol, unknownWordSym.

Specified by:
features in interface WordFeatures
Parameters:
word - the word whose features are to be computed
firstWord - whether word is the first word in the sentence (useful when computing capitalization features for certain languages, such as English)
Returns:
a symbol representing the orthographic and/or morphological features of word
See Also:
unknownWordSym
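A subclass would typically override features to compute something more informative than the unknown word symbol. The sketch below shows one way this might look for a single capitalization feature. It is an illustration under stated assumptions: the nested Symbol and AbstractWordFeatures classes are simplified stand-ins for the real ones, and CapitalizationFeatures and the feature names "cap" and "none" are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class FeaturesSketch {
    // Simplified stand-in for danbikel.parser.Symbol (assumption: symbols are
    // interned, so equal strings map to the same Symbol object).
    static final class Symbol {
        private static final Map<String, Symbol> table = new HashMap<>();
        private final String name;
        private Symbol(String name) { this.name = name; }
        static synchronized Symbol add(String name) {
            return table.computeIfAbsent(name, Symbol::new);
        }
        @Override public String toString() { return name; }
    }

    // Stand-in mirroring the members documented for AbstractWordFeatures.
    abstract static class AbstractWordFeatures {
        protected static Symbol unknownWordSym = Symbol.add("+unknown+");
        public Symbol features(Symbol word, boolean firstWord) {
            return unknownWordSym;
        }
        public abstract Symbol defaultFeatureVector();
    }

    // Hypothetical subclass with one orthographic feature: capitalization.
    static class CapitalizationFeatures extends AbstractWordFeatures {
        private static final Symbol NONE = Symbol.add("none");
        private static final Symbol CAP = Symbol.add("cap");

        @Override public Symbol features(Symbol word, boolean firstWord) {
            String s = word.toString();
            boolean capitalized =
                !s.isEmpty() && Character.isUpperCase(s.charAt(0));
            // A sentence-initial capital carries no information in English,
            // so the feature fires only for non-initial words.
            return (capitalized && !firstWord) ? CAP : defaultFeatureVector();
        }

        @Override public Symbol defaultFeatureVector() {
            return NONE;
        }
    }

    public static void main(String[] args) {
        AbstractWordFeatures wf = new CapitalizationFeatures();
        System.out.println(wf.features(Symbol.add("Paris"), false)); // cap
        System.out.println(wf.features(Symbol.add("The"), true));    // none
    }
}
```

Returning defaultFeatureVector() when no feature fires keeps the "nothing fired" case in one place, which matches the role this page describes for that method.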

defaultFeatureVector

public abstract Symbol defaultFeatureVector()
The symbol that represents the case where none of the features fires for a particular word.

Specified by:
defaultFeatureVector in interface WordFeatures

Author: Dan Bikel.