Parsing Engine

danbikel.parser.lang
Class AbstractTreebank

java.lang.Object
  extended by danbikel.parser.lang.AbstractTreebank
All Implemented Interfaces:
Treebank, Serializable
Direct Known Subclasses:
BrokenTreebank, Treebank, Treebank, Treebank

public abstract class AbstractTreebank
extends Object
implements Treebank, Serializable

A collection of mostly-abstract methods to be implemented by a language-specific subclass. A Treebank implementation provides data and methods specific to the structures found in a particular Treebank.

See Also:
Serialized Form

Field Summary
protected  BitSet augmentationDelimSet
          A BitSet indexed by character (that is, whose size is Character.MAX_VALUE), where for each character c of the string returned by augmentationDelimiters(), augmentationDelimSet.get(c) returns true.
protected  Symbol canonicalAugDelimSym
          A Symbol created from the first character of Treebank.augmentationDelimiters().
protected  Symbol[] nonterminalExceptionSet
          A set of nonterminal labels (Symbol objects) that defaultParseNonterminal(Symbol,Nonterminal) should use when determining the base nonterminal label.
 
Constructor Summary
AbstractTreebank()
          No-arg constructor, to be called by all subclasses of this abstract class.
 
Method Summary
 void addAugmentation(Nonterminal nonterminal, Symbol augmentation)
          Adds the specified augmentation to the end of the (possibly empty) augmentation list of the specified Nonterminal object.
abstract  String augmentationDelimiters()
          Returns a string whose characters are the set of delimiters for complex nonterminal labels.
abstract  Symbol baseNPLabel()
          Returns the symbol with which Training.addBaseNPs(Sexp) will relabel core NPs.
 char canonicalAugDelimiter()
          Returns the first character of the string returned by augmentationDelimiters(), which will be considered the "canonical" augmentation delimiter when adding new augmentations, such as the argument augmentations added by implementations of Training.identifyArguments(Sexp).
 Sexp constructPreterminal(Word word)
          Converts a Word object into a preterminal subtree.
 boolean containsAugmentation(Symbol nonterminal, Symbol augmentation)
          Provides an efficient, thread-safe method for testing whether the specified nonterminal contains the specified augmentation (without parsing the nonterminal).
 void defaultParseNonterminal(Symbol label, Nonterminal nonterminal)
          Fills in the specified Nonterminal object to represent all the components of a complex nonterminal annotation: the base label, any augmentations and any index.
abstract  Symbol getCanonical(Symbol label)
          Returns a canonical mapping for the specified nonterminal label; if label already is in canonical form, it is returned.
abstract  Symbol getCanonical(Symbol label, boolean stripAugmentations)
          Returns a canonical version of the specified nonterminal label; if label already is in canonical form, it is returned.
 Symbol getTag(Sexp preterminal)
          Gets the component of the preterminal tree that corresponds to the part of speech tag.
 int getTraceIndex(Sexp preterm, Nonterminal nonterminal)
          Returns the index of a trace for the specified null element preterminal.
 boolean isAugDelim(Sexp sexp)
          Returns whether the specified S-expression is a symbol that is an augmentation delimiter for a complex nonterminal label.
 boolean isBaseNP(Symbol label)
          Returns whether the specified label is for a base NP.
abstract  boolean isComma(Symbol word)
          Returns true if the specified word is a comma.
abstract  boolean isConjunction(Symbol label)
          Returns true if the canonical version of the specified label is a conjunction tag or nonterminal in a particular Treebank.
abstract  boolean isLeftParen(Symbol word)
          Returns true if the specified word is a left parenthesis.
abstract  boolean isNP(Symbol label)
          Returns true if the canonical version of the specified label is an NP for this language’s Treebank.
abstract  boolean isNullElementPreterminal(Sexp tree)
          Returns true if the specified S-expression represents a preterminal whose terminal element is the null element for this language’s Treebank.
abstract  boolean isPossessivePreterminal(Sexp tree)
          Returns true if the specified S-expression represents a preterminal that is the possessive part of speech.
abstract  boolean isPreterminal(Sexp tree)
          Returns whether tree represents a preterminal subtree in the parse trees for this language's Treebank.
abstract  boolean isPuncToRaise(Sexp preterm)
          Returns true if the specified S-expression represents a preterminal and a part-of-speech tag that indicates punctuation to be raised when running Training.raisePunctuation(Sexp).
abstract  boolean isPunctuation(Symbol tag)
          Returns true if the specified part of speech tag is one for which isPuncToRaise(Sexp) would return true.
abstract  boolean isRightParen(Symbol word)
          Returns true if the specified word is a right parenthesis.
abstract  boolean isSentence(Symbol label)
          Returns true if the specified nonterminal label represents a sentence in this language’s Treebank.
abstract  boolean isVerb(Sexp preterminal)
          Returns true if the specified preterminal is that of a verb.
abstract  boolean isVerbTag(Symbol tag)
          Returns true if the specified symbol is the part of speech tag of a verb.
abstract  boolean isWHNP(Symbol label)
          Returns true if the canonical version of the specified label is an NP that undergoes WH-movement in a particular Treebank.
 Word makeWord(Sexp preterminal)
          Constructs a Word object from the specified preterminal subtree.
 char nonTreebankDelimiter()
          Returns a delimiter not already in use by the current treebank, for use when constructing lexicalized nonterminals when the Settings.decoderOutputHeadLexicalizedLabels is true.
 char nonTreebankLeftBracket()
          Returns a left-bracket character that is not an existing metacharacter in the current treebank, for use when constructing lexicalized nonterminals when the Settings.decoderOutputHeadLexicalizedLabels is true.
 char nonTreebankRightBracket()
          Returns a right-bracket character that is not an existing metacharacter in the current treebank, for use when constructing lexicalized nonterminals when the Settings.decoderOutputHeadLexicalizedLabels is true.
abstract  Symbol NPLabel()
          Returns the symbol that Training.addBaseNPs(Sexp) should add as a parent if a base NP is not dominated by an NP.
 Nonterminal parseNonterminal(Symbol label)
          Returns a Nonterminal object to represent all the components of a complex nonterminal annotation: the base label, any augmentations and any index.
abstract  Nonterminal parseNonterminal(Symbol label, Nonterminal nonterminal)
          Identical to parseNonterminal(Symbol), except that instead of returning a newly-created Nonterminal object, this method merely modifies the specified Nonterminal object.
 boolean removeAugmentation(Nonterminal nonterminal, Symbol augmentation)
          Removes the specified augmentation from the augmentation list of the specified Nonterminal object, and the previous augmentation delimiter.
 Sexp removeAugmentation(Sexp sexp, Nonterminal nonterminal, Symbol augmentation)
          Removes the specified nonterminal augmentation from the specified S-expression, using the specified Nonterminal object for temporary storage.
abstract  Symbol sentenceLabel()
          Returns the canonical label for a sentence, for de-transforming sentences that were transformed via Training.relabelSubjectlessSentences(Sexp).
 Symbol stripAllButIndex(Symbol label)
          Returns a symbol identical to the specified label, except all augmentations other than the index will be removed.
 Symbol stripAllButIndex(Symbol label, Nonterminal nonterminal)
          Identical to stripAllButIndex(Symbol), except that instead of creating a new Nonterminal object for use by parseNonterminal(Symbol,Nonterminal), this method uses the specified nonterminal object.
 Symbol stripAugmentation(Symbol label)
          Returns the Symbol created by stripping off all augmentations, that is all characters after and including the first character that appears in the string returned by augmentationDelimiters().
 Symbol stripIndex(Symbol label)
          Returns label, but stripped of any index augmentation.
 Symbol stripIndex(Symbol label, Nonterminal nonterminal)
          Identical to stripIndex(Symbol), except that instead of creating a new Nonterminal object for use by parseNonterminal(Symbol,Nonterminal), this method simply passes the specified nonterminal object.
abstract  Symbol subjectAugmentation()
          Returns the symbol that is used to augment nonterminals to indicate matrix subjects in this language’s Treebank.
abstract  Symbol subjectlessSentenceLabel()
          Returns the symbol with which Training.relabelSubjectlessSentences(Sexp) will relabel sentences when they have no subjects.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

augmentationDelimSet

protected BitSet augmentationDelimSet
A BitSet indexed by character (that is, whose size is Character.MAX_VALUE), where for each character c of the string returned by augmentationDelimiters(),
 augmentationDelimSet.get(c)
 
returns true. The default constructor of this abstract class will appropriately initialize this data member.
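
The following is a minimal sketch of how such a character-indexed BitSet could be filled from a delimiter string. It is not the class's actual constructor code: the class name DelimSetSketch, the method buildDelimSet and the delimiter string "-=|" are all assumptions made for illustration.

```java
import java.util.BitSet;

// Hypothetical stand-in for illustration; not part of danbikel.parser.
class DelimSetSketch {
  /** Returns a BitSet in which the bit for each delimiter character is set. */
  static BitSet buildDelimSet(String delimiters) {
    BitSet set = new BitSet(Character.MAX_VALUE);
    for (int i = 0; i < delimiters.length(); i++) {
      set.set(delimiters.charAt(i));   // the char widens to an int bit index
    }
    return set;
  }

  public static void main(String[] args) {
    BitSet delims = buildDelimSet("-=|");   // assumed delimiter string
    System.out.println(delims.get('-'));    // prints true
    System.out.println(delims.get('N'));    // prints false
  }
}
```

A lookup in the set is a single bit test, which is presumably why a BitSet is used here rather than a repeated String.indexOf call.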


canonicalAugDelimSym

protected final Symbol canonicalAugDelimSym
A Symbol created from the first character of Treebank.augmentationDelimiters().


nonterminalExceptionSet

protected Symbol[] nonterminalExceptionSet
A set of nonterminal labels (Symbol objects) that defaultParseNonterminal(Symbol,Nonterminal) should use when determining the base nonterminal label. If this behavior is desired, this array should be assigned in the constructor of a subclass. This hook into the behavior of defaultParseNonterminal is primarily intended for the unfortunate case when Treebank designers have nonterminal labels that contain the delimiters used for augmenting nonterminal labels (as is the case with the English Treebank in the form of -LRB- and -RRB-).

Constructor Detail

AbstractTreebank

public AbstractTreebank()
No-arg constructor, to be called by all subclasses of this abstract class. This constructor fills in the data member augmentationDelimSet based on the string returned by augmentationDelimiters().

Method Detail

isPreterminal

public abstract boolean isPreterminal(Sexp tree)
Returns whether tree represents a preterminal subtree in the parse trees for this language's Treebank. Typically, preterminals are part-of-speech tags.

Specified by:
isPreterminal in interface Treebank

getTag

public Symbol getTag(Sexp preterminal)
Gets the component of the preterminal tree that corresponds to the part of speech tag. This default implementation returns the symbol that is returned by
 preterminal.list().get(0).symbol();
 
If this is not appropriate for a particular Treebank, then this method should be overridden.

Specified by:
getTag in interface Treebank
Parameters:
preterminal - a tree that is assumed to be a preterminal
Returns:
the symbol in preterminal that is a part of speech

makeWord

public Word makeWord(Sexp preterminal)
Constructs a Word object from the specified preterminal subtree. This default implementation creates a Word object from the terminal and preterminal symbols (word and part-of-speech tag symbols):
 Words.get(preterminal.list().get(1).symbol(),
               preterminal.list().get(0).symbol());
 
If a particular Treebank requires a different type of word object to be constructed, or has a different preterminal tree structure, this method should be overridden.

Specified by:
makeWord in interface Treebank
Parameters:
preterminal - a tree that is assumed to be a preterminal
Returns:
a Word object constructed from the word and part-of-speech tag symbols of preterminal

constructPreterminal

public Sexp constructPreterminal(Word word)
Converts a Word object into a preterminal subtree. This default implementation creates a tree whose sole nonterminal is the part of speech of the specified word object and whose terminal is the word component of the specified word object.

Specified by:
constructPreterminal in interface Treebank
Parameters:
word - the word object from which to create a preterminal subtree
Returns:
a preterminal subtree constructed from word
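
The default preterminal layout assumed by getTag, makeWord and constructPreterminal, with the tag at position 0 and the word at position 1, can be illustrated with plain lists. This is only a sketch: PreterminalSketch and its methods are hypothetical, and Strings in a java.util.List stand in for the library's Symbol, Word and Sexp types.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model: a preterminal is the two-element list (tag word).
class PreterminalSketch {
  /** Element 0 is the part-of-speech tag, as in the default getTag. */
  static String getTag(List<String> preterminal) {
    return preterminal.get(0);
  }

  /** Element 1 is the word itself. */
  static String getWord(List<String> preterminal) {
    return preterminal.get(1);
  }

  /** Rebuilds the (tag word) list from its two parts. */
  static List<String> constructPreterminal(String word, String tag) {
    return Arrays.asList(tag, word);
  }

  public static void main(String[] args) {
    List<String> preterminal = constructPreterminal("dog", "NN");
    System.out.println(preterminal);            // prints [NN, dog]
    System.out.println(getTag(preterminal));    // prints NN
    System.out.println(getWord(preterminal));   // prints dog
  }
}
```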

getCanonical

public abstract Symbol getCanonical(Symbol label)
Returns a canonical mapping for the specified nonterminal label; if label already is in canonical form, it is returned. This method is intended to be used by implementations of HeadFinder.findHead(Sexp).

Specified by:
getCanonical in interface Treebank
Parameters:
label - the label to be canonicalized
See Also:
HeadFinder.findHead(Sexp)

getCanonical

public abstract Symbol getCanonical(Symbol label,
                                    boolean stripAugmentations)
Description copied from interface: Treebank
Returns a canonical version of the specified nonterminal label; if label already is in canonical form, it is returned.

Specified by:
getCanonical in interface Treebank
Parameters:
label - the label to be canonicalized
stripAugmentations - indicates whether to strip any augmentations from the specified label before attempting to get its canonical form
Returns:
the canonical version of the specified label

isSentence

public abstract boolean isSentence(Symbol label)
Returns true if the specified nonterminal label represents a sentence in this language’s Treebank. This method is intended to be used by implementations of Training.relabelSubjectlessSentences(Sexp).

Specified by:
isSentence in interface Treebank

sentenceLabel

public abstract Symbol sentenceLabel()
Returns the canonical label for a sentence, for de-transforming sentences that were transformed via Training.relabelSubjectlessSentences(Sexp).

Specified by:
sentenceLabel in interface Treebank

subjectlessSentenceLabel

public abstract Symbol subjectlessSentenceLabel()
Returns the symbol with which Training.relabelSubjectlessSentences(Sexp) will relabel sentences when they have no subjects.

Specified by:
subjectlessSentenceLabel in interface Treebank

subjectAugmentation

public abstract Symbol subjectAugmentation()
Returns the symbol that is used to augment nonterminals to indicate matrix subjects in this language’s Treebank.

Specified by:
subjectAugmentation in interface Treebank
See Also:
Training.relabelSubjectlessSentences(Sexp)

isNullElementPreterminal

public abstract boolean isNullElementPreterminal(Sexp tree)
Returns true if the specified S-expression represents a preterminal whose terminal element is the null element for this language’s Treebank. This method is intended to be used by implementations of Training.relabelSubjectlessSentences(Sexp).

Specified by:
isNullElementPreterminal in interface Treebank
See Also:
Training.relabelSubjectlessSentences(Sexp)

getTraceIndex

public int getTraceIndex(Sexp preterm,
                         Nonterminal nonterminal)
Returns the index of a trace for the specified null element preterminal. This default implementation assumes trace indices are marked on trace terminals that can be parsed by parseNonterminal(Symbol,Nonterminal). If this is not true for a particular Treebank, this method should be overridden. If preterm is not a null element preterminal (that is, a preterminal for which isNullElementPreterminal(Sexp) returns false), the semantics of this method are undefined. This method is used by the default implementation of AbstractTraining.hasGap(Sexp,Sexp,ArrayList), which is a helper method for the default implementation of Training.addGapInformation(Sexp).

Specified by:
getTraceIndex in interface Treebank
Parameters:
preterm - the null element preterminal whose trace index is to be returned
nonterminal - the object used as the second argument to parseNonterminal(Symbol,Nonterminal)
Returns:
the index of the trace of the terminal contained in preterm, or -1 if the null element does not have an index
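
As an illustration of the return contract, the sketch below extracts a trailing numeric index from a trace terminal given as a plain String, returning -1 when there is none. It does not go through parseNonterminal as the default implementation does; TraceIndexSketch, its method and the sample terminal "*T*-1" are assumptions chosen for the example.

```java
// Hypothetical stand-in for illustration; not part of danbikel.parser.
class TraceIndexSketch {
  /**
   * Returns the number after the last delimiter in terminal,
   * or -1 if there is no delimiter or the tail is not all digits.
   */
  static int traceIndex(String terminal, String delimiters) {
    int last = -1;
    for (int i = terminal.length() - 1; i >= 0; i--) {
      if (delimiters.indexOf(terminal.charAt(i)) >= 0) {
        last = i;
        break;
      }
    }
    if (last < 0 || last == terminal.length() - 1) {
      return -1;   // no delimiter, or nothing after it
    }
    for (int i = last + 1; i < terminal.length(); i++) {
      if (!Character.isDigit(terminal.charAt(i))) {
        return -1; // final piece is not an index
      }
    }
    return Integer.parseInt(terminal.substring(last + 1));
  }

  public static void main(String[] args) {
    System.out.println(traceIndex("*T*-1", "-"));   // prints 1
    System.out.println(traceIndex("*T*", "-"));     // prints -1
  }
}
```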

isPuncToRaise

public abstract boolean isPuncToRaise(Sexp preterm)
Returns true if the specified S-expression represents a preterminal and a part-of-speech tag that indicates punctuation to be raised when running Training.raisePunctuation(Sexp). If punctuation raising is not desirable for a particular language package, this method may be implemented simply to return false.

Specified by:
isPuncToRaise in interface Treebank
Parameters:
preterm - the preterminal to test
See Also:
Training.raisePunctuation(Sexp)

isPunctuation

public abstract boolean isPunctuation(Symbol tag)
Returns true if the specified part of speech tag is one for which isPuncToRaise(Sexp) would return true.

Specified by:
isPunctuation in interface Treebank
Parameters:
tag - the part of speech to test
See Also:
isPuncToRaise(Sexp)

isPossessivePreterminal

public abstract boolean isPossessivePreterminal(Sexp tree)
Returns true if the specified S-expression represents a preterminal that is the possessive part of speech. This method is intended to be used by implementations of Training.addBaseNPs(Sexp).

Specified by:
isPossessivePreterminal in interface Treebank
See Also:
Training.addBaseNPs(Sexp)

isNP

public abstract boolean isNP(Symbol label)
Returns true if the canonical version of the specified label is an NP for this language’s Treebank.

Specified by:
isNP in interface Treebank
Parameters:
label - the label to test
See Also:
Training.addBaseNPs(Sexp)

baseNPLabel

public abstract Symbol baseNPLabel()
Returns the symbol with which Training.addBaseNPs(Sexp) will relabel core NPs.

Specified by:
baseNPLabel in interface Treebank
See Also:
Training.addBaseNPs(Sexp)

isBaseNP

public boolean isBaseNP(Symbol label)
Returns whether the specified label is for a base NP. The default implementation here simply tests for object equality between the specified label and the label returned by baseNPLabel(). If a particular language package can have various types of base NP labels (such as those bearing node augmentations), then this method should be overridden.

Specified by:
isBaseNP in interface Treebank
Parameters:
label - the label to test
Returns:
whether the specified label is for a base NP.

isWHNP

public abstract boolean isWHNP(Symbol label)
Returns true if the canonical version of the specified label is an NP that undergoes WH-movement in a particular Treebank. This method is used by Training.addGapInformation(Sexp). If a particular language package does not require gap information, then this method may be implemented simply to return false.

Specified by:
isWHNP in interface Treebank
See Also:
Training.addGapInformation(Sexp)

NPLabel

public abstract Symbol NPLabel()
Returns the symbol that Training.addBaseNPs(Sexp) should add as a parent if a base NP is not dominated by an NP.

Specified by:
NPLabel in interface Treebank
See Also:
Training.addBaseNPs(Sexp)

isConjunction

public abstract boolean isConjunction(Symbol label)
Returns true if the canonical version of the specified label is a conjunction tag or nonterminal in a particular Treebank.

Specified by:
isConjunction in interface Treebank

isVerb

public abstract boolean isVerb(Sexp preterminal)
Returns true if the specified preterminal is that of a verb. This method is used by HeadTreeNode to determine if a particular subtree contains a verb, which is in turn used by Trainer to calculate the distance metric, which depends on whether a verb occurs in the subtrees of the previous modifiers. It is the responsibility of the caller to ensure that preterminal is a Sexp object for which isPreterminal(Sexp) returns true.

Specified by:
isVerb in interface Treebank
See Also:
HeadTreeNode, Trainer

isVerbTag

public abstract boolean isVerbTag(Symbol tag)
Returns true if the specified symbol is the part of speech tag of a verb. This method should return true for exactly the same parts of speech for which isVerb(Sexp) returns true, and is used to calculate the distance metric while decoding.

Specified by:
isVerbTag in interface Treebank
See Also:
CKYItem.containsVerb(), Decoder
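
One way a subclass might keep isVerb(Sexp) and isVerbTag(Symbol) in agreement is to have the former delegate to the latter. The sketch below shows that idea with Strings and lists standing in for Symbol and Sexp. The class VerbTestSketch is hypothetical, and the tag inventory is an assumption based on the Penn Treebank verb tags, not something this page specifies.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for illustration; not part of danbikel.parser.
class VerbTestSketch {
  // Assumed verb tag inventory (Penn Treebank style).
  private static final Set<String> VERB_TAGS =
      new HashSet<>(Arrays.asList("VB", "VBD", "VBG", "VBN", "VBP", "VBZ"));

  static boolean isVerbTag(String tag) {
    return VERB_TAGS.contains(tag);
  }

  /** Delegates to isVerbTag so the two tests cannot drift apart. */
  static boolean isVerb(List<String> preterminal) {
    return isVerbTag(preterminal.get(0));   // tag is element 0 of (tag word)
  }

  public static void main(String[] args) {
    System.out.println(isVerb(Arrays.asList("VBD", "slept")));   // prints true
    System.out.println(isVerb(Arrays.asList("NN", "dog")));      // prints false
  }
}
```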

isComma

public abstract boolean isComma(Symbol word)
Returns true if the specified word is a comma. This method is used by the Decoder class when performing the comma constraint on chart items.

Specified by:
isComma in interface Treebank
Parameters:
word - the word to test
See Also:
Settings.decoderUseCommaConstraint

isLeftParen

public abstract boolean isLeftParen(Symbol word)
Returns true if the specified word is a left parenthesis. This method is used by the Decoder class when performing the comma constraint on chart items.

Specified by:
isLeftParen in interface Treebank
Parameters:
word - the word to test
See Also:
Settings.decoderUseCommaConstraint

isRightParen

public abstract boolean isRightParen(Symbol word)
Returns true if the specified word is a right parenthesis. This method is used by the Decoder class when performing the comma constraint on chart items.

Specified by:
isRightParen in interface Treebank
Parameters:
word - the word to test
See Also:
Settings.decoderUseCommaConstraint

augmentationDelimiters

public abstract String augmentationDelimiters()
Returns a string whose characters are the set of delimiters for complex nonterminal labels.

Specified by:
augmentationDelimiters in interface Treebank
See Also:
stripAugmentation(Symbol), defaultParseNonterminal(Symbol,Nonterminal)

canonicalAugDelimiter

public char canonicalAugDelimiter()
Returns the first character of the string returned by augmentationDelimiters(), which will be considered the "canonical" augmentation delimiter when adding new augmentations, such as the argument augmentations added by implementations of Training.identifyArguments(Sexp).

Specified by:
canonicalAugDelimiter in interface Treebank

nonTreebankLeftBracket

public char nonTreebankLeftBracket()
Returns a left-bracket character that is not an existing metacharacter in the current treebank, for use when constructing lexicalized nonterminals when the Settings.decoderOutputHeadLexicalizedLabels is true. The default implementation here returns '['.

Specified by:
nonTreebankLeftBracket in interface Treebank
Returns:
a left-bracket character that is not an existing metacharacter in the current treebank

nonTreebankRightBracket

public char nonTreebankRightBracket()
Returns a right-bracket character that is not an existing metacharacter in the current treebank, for use when constructing lexicalized nonterminals when the Settings.decoderOutputHeadLexicalizedLabels is true. The default implementation here returns ']'.

Specified by:
nonTreebankRightBracket in interface Treebank
Returns:
a right-bracket character that is not an existing metacharacter in the current treebank

nonTreebankDelimiter

public char nonTreebankDelimiter()
Description copied from interface: Treebank
Returns a delimiter not already in use by the current treebank, for use when constructing lexicalized nonterminals when the Settings.decoderOutputHeadLexicalizedLabels is true.

Specified by:
nonTreebankDelimiter in interface Treebank
Returns:
a delimiter not already in use by the current treebank

stripAugmentation

public Symbol stripAugmentation(Symbol label)
Returns the Symbol created by stripping off all augmentations, that is all characters after and including the first character that appears in the string returned by augmentationDelimiters().

Specified by:
stripAugmentation in interface Treebank
Parameters:
label - the potentially-complex nonterminal label to be stripped
Returns:
a version of label with all augmentations removed
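
Read literally, the description amounts to truncating the label at its first delimiter. A minimal sketch on plain Strings follows; StripAugmentationSketch is hypothetical, and the delimiter string passed in is an assumption.

```java
// Hypothetical stand-in for illustration; not part of danbikel.parser.
class StripAugmentationSketch {
  /** Returns label cut off at the first character found in delimiters. */
  static String stripAugmentation(String label, String delimiters) {
    for (int i = 0; i < label.length(); i++) {
      if (delimiters.indexOf(label.charAt(i)) >= 0) {
        return label.substring(0, i);
      }
    }
    return label;   // no delimiter: nothing to strip
  }

  public static void main(String[] args) {
    System.out.println(stripAugmentation("NP-SBJ-1", "-="));   // prints NP
    System.out.println(stripAugmentation("NP", "-="));         // prints NP
  }
}
```

Under this literal reading, a label that itself begins with a delimiter, such as -LRB-, would come out empty. The sketch does not attempt to model how the class itself treats such labels.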

stripIndex

public Symbol stripIndex(Symbol label)
Returns label, but stripped of any index augmentation. This method assumes that the index will always be the final augmentation in a complex nonterminal label.
N.B.: This method will create a new Nonterminal object, to be filled in by stripIndex(Symbol,Nonterminal).

Specified by:
stripIndex in interface Treebank
Parameters:
label - the nonterminal to be stripped of any possible index
Returns:
a Symbol that is identical to label, except that all characters after and including the final delimiter are removed if the final augmentation is composed entirely of digits
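
The return contract above can be sketched directly on Strings: find the final delimiter and drop it, together with what follows, only when what follows is all digits. StripIndexSketch is a hypothetical helper, it skips the Nonterminal parsing that the real method performs, and the delimiter string is an assumption.

```java
// Hypothetical stand-in for illustration; not part of danbikel.parser.
class StripIndexSketch {
  /** Removes a trailing delimiter-plus-digits suffix, if there is one. */
  static String stripIndex(String label, String delimiters) {
    int last = -1;
    for (int i = label.length() - 1; i >= 0; i--) {
      if (delimiters.indexOf(label.charAt(i)) >= 0) {
        last = i;
        break;
      }
    }
    if (last < 0 || last == label.length() - 1) {
      return label;   // no delimiter, or nothing after the final one
    }
    for (int i = last + 1; i < label.length(); i++) {
      if (!Character.isDigit(label.charAt(i))) {
        return label; // final augmentation is not an index
      }
    }
    return label.substring(0, last);
  }

  public static void main(String[] args) {
    System.out.println(stripIndex("NP-SBJ-1", "-="));   // prints NP-SBJ
    System.out.println(stripIndex("NP-SBJ", "-="));     // prints NP-SBJ
  }
}
```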

stripIndex

public Symbol stripIndex(Symbol label,
                         Nonterminal nonterminal)
Identical to stripIndex(Symbol), except that instead of creating a new Nonterminal object for use by parseNonterminal(Symbol,Nonterminal), this method simply passes the specified nonterminal object. In a sequential run, this method provides maximum efficiency, as only one Nonterminal object need be created at the beginning of the run.

Specified by:
stripIndex in interface Treebank

stripAllButIndex

public Symbol stripAllButIndex(Symbol label)
Returns a symbol identical to the specified label, except all augmentations other than the index will be removed. If label had no index to begin with, then this method is functionally identical to stripAugmentation(Symbol).

Specified by:
stripAllButIndex in interface Treebank
Parameters:
label - the nonterminal label to strip of non-index augmentations
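
A String-based sketch of the described behavior: keep the base label, and keep the final augmentation only if it is an index. StripAllButIndexSketch is hypothetical. Reusing the delimiter that originally preceded the index is an assumption of this sketch, since the description does not say which delimiter appears in the result.

```java
// Hypothetical stand-in for illustration; not part of danbikel.parser.
class StripAllButIndexSketch {
  static String stripAllButIndex(String label, String delimiters) {
    int first = -1;
    int last = -1;
    for (int i = 0; i < label.length(); i++) {
      if (delimiters.indexOf(label.charAt(i)) >= 0) {
        if (first < 0) {
          first = i;
        }
        last = i;
      }
    }
    if (first < 0) {
      return label;   // no augmentations at all
    }
    String base = label.substring(0, first);
    String tail = label.substring(last + 1);
    boolean isIndex = !tail.isEmpty() && tail.chars().allMatch(Character::isDigit);
    // Assumption: the index keeps the delimiter that preceded it in label.
    return isIndex ? base + label.charAt(last) + tail : base;
  }

  public static void main(String[] args) {
    System.out.println(stripAllButIndex("NP-SBJ-1", "-="));   // prints NP-1
    System.out.println(stripAllButIndex("NP-SBJ", "-="));     // prints NP
  }
}
```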

stripAllButIndex

public Symbol stripAllButIndex(Symbol label,
                               Nonterminal nonterminal)
Identical to stripAllButIndex(Symbol), except that instead of creating a new Nonterminal object for use by parseNonterminal(Symbol,Nonterminal), this method uses the specified nonterminal object. In a sequential run, this method provides maximum efficiency, as only one Nonterminal object need be created at the beginning of the run.

Specified by:
stripAllButIndex in interface Treebank

parseNonterminal

public Nonterminal parseNonterminal(Symbol label)
Returns a Nonterminal object to represent all the components of a complex nonterminal annotation: the base label, any augmentations and any index. If there are no augmentations, the augmentations field of the returned object will contain a list with zero elements; if there is no index, the value of index will be -1. A final requirement of the contract of this method is to represent all the delimiters in the list of augmentations; this requirement is met, for example, by the helper method defaultParseNonterminal(Symbol,Nonterminal).
Efficiency note: This method creates and returns a new Nonterminal object with every invocation.

Specified by:
parseNonterminal in interface Treebank
Parameters:
label - a (possibly complex) nonterminal label from a Treebank
Returns:
a Nonterminal object representing any and all components of the specified complex nonterminal
See Also:
Nonterminal

parseNonterminal

public abstract Nonterminal parseNonterminal(Symbol label,
                                             Nonterminal nonterminal)
Identical to parseNonterminal(Symbol), except that instead of returning a newly-created Nonterminal object, this method merely modifies the specified Nonterminal object. This method may be used for efficiency: in a sequential training run, a single Nonterminal object can be created once and then passed repeatedly to this method for modification.

Specified by:
parseNonterminal in interface Treebank
Parameters:
label - a (possibly complex) nonterminal label from a Treebank
nonterminal - the representation of any and all components present in label

defaultParseNonterminal

public void defaultParseNonterminal(Symbol label,
                                    Nonterminal nonterminal)
Fills in the specified Nonterminal object to represent all the components of a complex nonterminal annotation: the base label, any augmentations and any index. If there are no augmentations, the augmentations field of the specified object will contain a list with no elements; if there is no index, the value of index will be -1. Augmentation delimiters are the characters in the string returned by augmentationDelimiters().
N.B.: This method assumes that the index, if one exists for the specified nonterminal, will always be the final augmentation in the label.
This method is intended to be used by implementations of parseNonterminal(Symbol,Nonterminal).

Specified by:
defaultParseNonterminal in interface Treebank
Parameters:
label - a (possibly complex) nonterminal label from a Treebank
See Also:
Nonterminal
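
The sketch below illustrates the kind of decomposition described here, including the role of an exception set such as nonterminalExceptionSet. It is not the library's implementation. ParsedLabelSketch and ParseNonterminalSketch are hypothetical, Strings replace Symbol objects, and the choice to leave the delimiter that precedes the index in the augmentation list is an assumption inferred from the description of addAugmentation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of a parsed nonterminal label.
class ParsedLabelSketch {
  String base;
  final List<String> augmentations = new ArrayList<>();   // delimiters included
  int index = -1;
}

class ParseNonterminalSketch {
  /** Fills in out from label; out may be reused across calls. */
  static void parse(String label, String delims, String[] exceptions,
                    ParsedLabelSketch out) {
    out.augmentations.clear();
    out.index = -1;
    int pos = -1;
    // A label starting with an exception (e.g. -LRB-) keeps it whole as base.
    for (String ex : exceptions) {
      if (label.startsWith(ex)) {
        pos = ex.length();
        break;
      }
    }
    if (pos < 0) {
      pos = 0;
      while (pos < label.length() && delims.indexOf(label.charAt(pos)) < 0) {
        pos++;
      }
    }
    out.base = label.substring(0, pos);
    // Split the rest into alternating delimiter and augmentation tokens.
    while (pos < label.length()) {
      int end = pos + 1;
      if (delims.indexOf(label.charAt(pos)) < 0) {
        while (end < label.length() && delims.indexOf(label.charAt(end)) < 0) {
          end++;
        }
      }
      out.augmentations.add(label.substring(pos, end));
      pos = end;
    }
    // An all-digit final augmentation is the index; its delimiter stays.
    int n = out.augmentations.size();
    if (n >= 2 && out.augmentations.get(n - 1).chars().allMatch(Character::isDigit)) {
      out.index = Integer.parseInt(out.augmentations.remove(n - 1));
    }
  }

  public static void main(String[] args) {
    ParsedLabelSketch nt = new ParsedLabelSketch();
    parse("NP-SBJ-1", "-=", new String[] {"-LRB-", "-RRB-"}, nt);
    System.out.println(nt.base + " " + nt.augmentations + " " + nt.index);
  }
}
```

Without the exception entries, a label such as -LRB- would be split at its leading delimiter and end up with an empty base, which is the situation the exception set is meant to avoid.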

containsAugmentation

public boolean containsAugmentation(Symbol nonterminal,
                                    Symbol augmentation)
Provides an efficient, thread-safe method for testing whether the specified nonterminal contains the specified augmentation (without parsing the nonterminal).

N.B.: This method assumes that the augmentation is preceded by the canonical augmentation delimiter. To search for an augmentation preceded by any of the possible augmentation delimiters (as defined by augmentationDelimiters()), use

 parseNonterminal(nonterminal).augmentations.contains(augmentation)
 

Specified by:
containsAugmentation in interface Treebank
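
A test of this kind can be done as a substring search, with no parsing and no shared state. The sketch below shows one possible approach on plain Strings. ContainsAugmentationSketch is hypothetical, and the requirement that a match be followed by a delimiter or the end of the label is an assumption added to avoid matching a mere prefix of a longer augmentation.

```java
// Hypothetical stand-in for illustration; not part of danbikel.parser.
class ContainsAugmentationSketch {
  /**
   * Looks for canonicalDelim followed by aug, ending either at the
   * end of label or at another delimiter. Uses only local variables,
   * so concurrent calls do not interfere with one another.
   */
  static boolean containsAugmentation(String label, String aug,
                                      char canonicalDelim, String delims) {
    String needle = canonicalDelim + aug;
    int at = label.indexOf(needle);
    while (at >= 0) {
      int end = at + needle.length();
      if (end == label.length() || delims.indexOf(label.charAt(end)) >= 0) {
        return true;
      }
      at = label.indexOf(needle, at + 1);
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(containsAugmentation("NP-SBJ-1", "SBJ", '-', "-="));   // prints true
    System.out.println(containsAugmentation("NP=SBJ", "SBJ", '-', "-="));     // prints false
  }
}
```

The second call returns false because the augmentation is preceded by a non-canonical delimiter, which is the limitation the N.B. above describes.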

addAugmentation

public void addAugmentation(Nonterminal nonterminal,
                            Symbol augmentation)
Adds the specified augmentation to the end of the (possibly empty) augmentation list of the specified Nonterminal object. This method takes care to add the canonical augmentation delimiter before adding the augmentation itself, and also takes care to add these two elements before a final delimiter between the main augmentations and the index, if one exists.

Specified by:
addAugmentation in interface Treebank
Parameters:
nonterminal - the nonterminal to which to add an augmentation
augmentation - the augmentation to add to nonterminal's augmentation list
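
The placement rule can be sketched on a plain list of Strings in which delimiters and augmentations alternate. AddAugmentationSketch is hypothetical, and the representation, in which the delimiter before an index is the last list element while the index itself is stored elsewhere, is an assumption drawn from the wording above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration; not part of danbikel.parser.
class AddAugmentationSketch {
  /**
   * Inserts canonicalDelim and aug at the end of the list, or just
   * before the trailing delimiter when the label carries an index.
   */
  static void addAugmentation(List<String> augmentations, boolean hasIndex,
                              String canonicalDelim, String aug) {
    int at = (hasIndex && !augmentations.isEmpty())
        ? augmentations.size() - 1
        : augmentations.size();
    augmentations.add(at, canonicalDelim);
    augmentations.add(at + 1, aug);
  }

  public static void main(String[] args) {
    // Models NP-SBJ-1: the final "-" separates the augmentations from the index.
    List<String> augs = new ArrayList<>(Arrays.asList("-", "SBJ", "-"));
    addAugmentation(augs, true, "-", "A");
    System.out.println(augs);   // prints [-, SBJ, -, A, -]
  }
}
```

With the index 1 appended, the resulting list corresponds to the label NP-SBJ-A-1, so the index remains the final element of the label.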

removeAugmentation

public boolean removeAugmentation(Nonterminal nonterminal,
                                  Symbol augmentation)
Removes the specified augmentation from the augmentation list of the specified Nonterminal object, and the previous augmentation delimiter. If the specified augmentation is not preceded by an augmentation delimiter, meaning it is the base label itself, then it is not removed.

Specified by:
removeAugmentation in interface Treebank
Parameters:
nonterminal - the nonterminal from which to remove an augmentation
augmentation - the augmentation to remove from nonterminal
Returns:
true if augmentation and a preceding augmentation delimiter were removed from nonterminal's augmentation list, or false otherwise
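
On a list in which delimiters and augmentations alternate, the removal described here might look like the following sketch. RemoveAugmentationSketch and the list-of-Strings representation are assumptions made for illustration, not the library's data structures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration; not part of danbikel.parser.
class RemoveAugmentationSketch {
  private static boolean isDelim(String s, String delims) {
    return s.length() == 1 && delims.indexOf(s.charAt(0)) >= 0;
  }

  /**
   * Removes the first occurrence of aug that directly follows a
   * delimiter, together with that delimiter. Returns whether
   * anything was removed.
   */
  static boolean removeAugmentation(List<String> augmentations,
                                    String delims, String aug) {
    for (int i = 1; i < augmentations.size(); i++) {
      if (augmentations.get(i).equals(aug)
          && isDelim(augmentations.get(i - 1), delims)) {
        augmentations.remove(i);       // the augmentation
        augmentations.remove(i - 1);   // its preceding delimiter
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    List<String> augs = new ArrayList<>(Arrays.asList("-", "SBJ", "-", "A"));
    System.out.println(removeAugmentation(augs, "-=", "SBJ"));   // prints true
    System.out.println(augs);                                    // prints [-, A]
  }
}
```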

removeAugmentation

public Sexp removeAugmentation(Sexp sexp,
                               Nonterminal nonterminal,
                               Symbol augmentation)
Description copied from interface: Treebank
Removes the specified nonterminal augmentation from the specified S-expression, using the specified Nonterminal object for temporary storage. If the specified S-expression is a list, then each element will be destructively replaced with the return value of this method; otherwise, if the specified S-expression is a symbol, its augmentation is removed and the new symbol is returned.

N.B.: While the description of the behavior of this method on lists is recursive, a concrete implementation need not use a recursive algorithm.

Specified by:
removeAugmentation in interface Treebank
Parameters:
sexp - the S-expression containing symbols whose augmentations are to be removed
nonterminal - an object used for temporary storage during the invocation of this method
augmentation - the augmentation to be removed from all symbols in the specified S-expression
Returns:
the specified S-expression, but with all symbols changed so that none has the specified augmentation
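
The list and symbol cases described above can be sketched with nested java.util.List objects standing in for Sexp lists and Strings standing in for symbols. SexpRemoveSketch and its helpers are hypothetical. For brevity the sketch assumes a single delimiter character and omits the Nonterminal scratch object.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration; not part of danbikel.parser.
class SexpRemoveSketch {
  /** Convenience builder for a mutable list. */
  static List<Object> list(Object... items) {
    return new ArrayList<>(Arrays.asList(items));
  }

  /** Lists are modified in place; symbols are returned as new Strings. */
  static Object removeAugmentation(Object sexp, char delim, String aug) {
    if (sexp instanceof List) {
      @SuppressWarnings("unchecked")
      List<Object> elements = (List<Object>) sexp;
      for (int i = 0; i < elements.size(); i++) {
        elements.set(i, removeAugmentation(elements.get(i), delim, aug));
      }
      return elements;
    }
    return removeFromSymbol((String) sexp, delim, aug);
  }

  /** Drops delim + aug where it ends the label or precedes another delim. */
  static String removeFromSymbol(String label, char delim, String aug) {
    String needle = delim + aug;
    int at = label.indexOf(needle);
    while (at >= 0) {
      int end = at + needle.length();
      if (end == label.length() || label.charAt(end) == delim) {
        return label.substring(0, at) + label.substring(end);
      }
      at = label.indexOf(needle, at + 1);
    }
    return label;
  }

  public static void main(String[] args) {
    List<Object> tree = list("S",
        list("NP-SBJ", list("NNP", "John")),
        list("VP", list("VBD", "slept")));
    System.out.println(removeAugmentation(tree, '-', "SBJ"));
    // prints [S, [NP, [NNP, John]], [VP, [VBD, slept]]]
  }
}
```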

isAugDelim

public final boolean isAugDelim(Sexp sexp)
Description copied from interface: Treebank
Returns whether the specified S-expression is a symbol that is an augmentation delimiter for a complex nonterminal label.

Specified by:
isAugDelim in interface Treebank
Parameters:
sexp - the S-expression to be tested
Returns:
whether the specified S-expression is a symbol that is an augmentation delimiter.
See Also:
Treebank.augmentationDelimiters()

Author: Dan Bikel.