Parsing Engine

danbikel.parser.chinese
Class Training

java.lang.Object
  extended by danbikel.parser.lang.AbstractTraining
      extended by danbikel.parser.chinese.Training
All Implemented Interfaces:
Training, Serializable
Direct Known Subclasses:
NoNPBTraining, NPArgThreadTraining

public class Training
extends AbstractTraining

Provides methods for language-specific processing of Chinese training parse trees. This class uses all the defaults provided by the superclass AbstractTraining, except that it overrides AbstractTraining.relabelSubjectlessSentences(Sexp).

See Also:
Serialized Form

Field Summary
 
Fields inherited from class danbikel.parser.lang.AbstractTraining
addGapInfo, argAugmentations, argContexts, argNonterminals, baseNP, canonicalAugDelimSym, defaultArgAugmentation, delimAndGapStr, delimAndGapStrLen, gapAugmentation, headFinder, headPostSym, headPreSym, headSym, metadataPropertyPrefix, nodesToPrune, NP, prunedPreterms, prunedPunctuation, relabelHeadChildrenAsArgs, repairBaseNPs, semTagArgStopSet, traceTag, treebank, wordsToPrune
 
Constructor Summary
Training()
          The default constructor, to be invoked by Language.
 
Method Summary
protected  Sexp combineRightSiblingsOfDe5(Sexp tree)
          A method to create a new node if a DEG or DEC preterminal has more than one right sibling.
static void main(String[] args)
          Test driver for this class.
 Sexp relabelSubjectlessSentences(Sexp tree)
          We override AbstractTraining.relabelSubjectlessSentences(Sexp) so that we can make the definition of a subjectless sentence slightly more restrictive: a subjectless sentence not only must have a null-element child that is marked with the subject augmentation, but also its head must be a VP (this is Mike Collins’ definition of a subjectless sentence).
protected  Sexp unrepairBaseNPs(Sexp tree)
          A method to un-do the transformation provided by AbstractTraining.repairBaseNPs(Sexp) (for inclusion in an overridden definition of AbstractTraining.postProcess(Sexp), but currently unused by this class).
 
Methods inherited from class danbikel.parser.lang.AbstractTraining
addArgAugmentation, addBaseNPs, addGapInformation, argNonterminals, canonicalizeNonterminals, collectPreterms, createArgAugmentationsList, createArgNonterminalsSet, defaultArgAugmentation, gapAugmentation, getCanonicalArg, getCanonicalArg, getPrunedPreterms, getPrunedPunctuation, hasGap, hasGap, hasPossessiveChild, headPostSym, headPreSym, headSym, identifyArguments, isAllNodesToPrune, isArgument, isArgument, isArgument, isArgumentFast, isCoordinatedPhrase, isTypeOfSentence, isValidTree, needToAddNormalNPLevel, postProcess, preProcess, preProcessTest, printMetadata, prune, raisePunctuation, readMetadata, readMetadataHook, relabelArgChildren, removeArgAugmentation, removeArgAugmentation, removeGapAugmentation, removeNullElements, removeOnlyChildBaseNPs, removeWord, repairBaseNPs, repairBaseNPs, setUpFastArgMap, skip, startSym, startWord, staticSetUpFastArgMap, stopSym, stopWord, stripAugmentations, stripAugmentations, stripAugmentations, threadNPArgAugmentations, topSym, topWord, traceTag, transformSubjectNTs, unaryProductionsToNull
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Training

public Training()
         throws IOException
The default constructor, to be invoked by Language. This constructor looks for a resource named by the property metadataPropertyPrefix + language, where metadataPropertyPrefix is the value of the constant AbstractTraining.metadataPropertyPrefix and language is the value of Settings.get(Settings.language). For example, the property for English is "parser.training.metadata.english".

Throws:
IOException - if there is a problem reading the metadata resource
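
The property lookup described above can be sketched as follows. This is a minimal illustration, not the parser's own code: the prefix value is inferred from the English example in the text, java.util.Properties stands in for the Settings class, and the language key "parser.language" and the resource name "chinese-metadata.lisp" are hypothetical.

```java
import java.util.Properties;

final class MetadataPropertySketch {
    // Assumed value, inferred from the example "parser.training.metadata.english".
    static final String METADATA_PROPERTY_PREFIX = "parser.training.metadata.";

    /** Builds the property name as metadataPropertyPrefix + language. */
    static String metadataPropertyName(String language) {
        return METADATA_PROPERTY_PREFIX + language;
    }

    /**
     * Looks up the metadata resource name in two steps: first the language,
     * then the property named after it. Properties stands in for Settings,
     * and languageKey is a hypothetical key.
     */
    static String metadataResource(Properties settings, String languageKey) {
        String language = settings.getProperty(languageKey);
        if (language == null) {
            throw new IllegalStateException("no language set under " + languageKey);
        }
        return settings.getProperty(metadataPropertyName(language));
    }

    public static void main(String[] args) {
        Properties settings = new Properties();
        settings.setProperty("parser.language", "chinese");          // hypothetical key
        settings.setProperty("parser.training.metadata.chinese",
                             "chinese-metadata.lisp");               // hypothetical resource
        System.out.println(metadataResource(settings, "parser.language"));
    }
}
```

For Chinese, the same rule would presumably yield the property name "parser.training.metadata.chinese".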
Method Detail

relabelSubjectlessSentences

public Sexp relabelSubjectlessSentences(Sexp tree)
We override AbstractTraining.relabelSubjectlessSentences(Sexp) so that we can make the definition of a subjectless sentence slightly more restrictive: a subjectless sentence not only must have a null-element child that is marked with the subject augmentation, but also its head must be a VP (this is Mike Collins’ definition of a subjectless sentence).

Specified by:
relabelSubjectlessSentences in interface Training
Overrides:
relabelSubjectlessSentences in class AbstractTraining
Parameters:
tree - the parse tree in which to relabel subjectless sentences
Returns:
the same tree that was passed in, with subjectless sentence nodes relabeled
See Also:
Treebank.isSentence(Symbol), Treebank.subjectAugmentation(), Treebank.isNullElementPreterminal(Sexp), Treebank.subjectlessSentenceLabel()
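
The more restrictive definition can be sketched as below. This is an illustration under stated assumptions, not the class's implementation: Node is a hypothetical stand-in for Sexp, the labels "IP", "-SBJ", "-NONE-" and "IPG" are assumed values for what Treebank would supply, and the head finder is passed in as a function (the demo uses a toy "last child is the head" rule).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class SubjectlessSentenceSketch {
    /** Hypothetical minimal tree node; a stand-in for the parser's Sexp type. */
    static final class Node {
        String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) children.add(k);
        }
    }

    // Assumed labels, for illustration only; the real values come from Treebank.
    static final String SENTENCE = "IP";
    static final String SUBJECT_AUGMENTATION = "-SBJ";
    static final String NULL_ELEMENT_PRETERMINAL = "-NONE-";
    static final String SUBJECTLESS_SENTENCE = "IPG";
    static final String VERB_PHRASE = "VP";

    /** True if the child carries the subject augmentation and dominates only a null element. */
    static boolean isNullSubject(Node child) {
        return child.label.endsWith(SUBJECT_AUGMENTATION)
            && child.children.size() == 1
            && child.children.get(0).label.equals(NULL_ELEMENT_PRETERMINAL);
    }

    /** Toy head rule (an assumption): the last child is the head. */
    static Node lastChildAsHead(Node n) {
        return n.children.get(n.children.size() - 1);
    }

    /** Relabels, in place, sentences that have a null subject AND a VP head. */
    static Node relabelSubjectlessSentences(Node tree, Function<Node, Node> headFinder) {
        if (tree.children.isEmpty()) return tree;  // a word: nothing to relabel
        if (tree.label.equals(SENTENCE)) {
            boolean hasNullSubject =
                tree.children.stream().anyMatch(SubjectlessSentenceSketch::isNullSubject);
            Node head = headFinder.apply(tree);
            if (hasNullSubject && head.label.equals(VERB_PHRASE)) {
                tree.label = SUBJECTLESS_SENTENCE;
            }
        }
        for (Node c : tree.children) relabelSubjectlessSentences(c, headFinder);
        return tree;
    }

    public static void main(String[] args) {
        Node tree = new Node("IP",
            new Node("NP-SBJ", new Node("-NONE-", new Node("*pro*"))),
            new Node("VP", new Node("VV", new Node("go"))));
        relabelSubjectlessSentences(tree, SubjectlessSentenceSketch::lastChildAsHead);
        System.out.println(tree.label);
    }
}
```

A sentence whose null subject is present but whose head is not a VP keeps its original label in this sketch, which is the extra restriction the override introduces.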

unrepairBaseNPs

protected Sexp unrepairBaseNPs(Sexp tree)
A method to un-do the transformation provided by AbstractTraining.repairBaseNPs(Sexp) (for inclusion in an overridden definition of AbstractTraining.postProcess(Sexp), but currently unused by this class).

Parameters:
tree - the tree in which sentences that are right siblings of base NP nodes are to be re-inserted as the rightmost children of their respective base NP nodes
Returns:
the specified tree, having been modified in-place
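
The transformation described by the parameter can be sketched as below. This is a rough illustration only: Node is a hypothetical stand-in for Sexp, and the base NP label "NPB" and sentence label "IP" used in the demo are assumptions, so both are passed in as arguments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class UnrepairBaseNPsSketch {
    /** Hypothetical minimal tree node; a stand-in for the parser's Sexp type. */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) children.add(k);
        }

        @Override public String toString() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(").append(label);
            for (Node c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    /**
     * Moves a sentence that is the immediate right sibling of a base NP
     * back inside that base NP, as its rightmost child. Modifies the tree in place.
     */
    static Node unrepairBaseNPs(Node tree, String baseNP, Predicate<String> isSentence) {
        // Process subtrees first; this never changes tree.children itself.
        for (Node c : tree.children) unrepairBaseNPs(c, baseNP, isSentence);
        for (int i = 0; i + 1 < tree.children.size(); i++) {
            Node left = tree.children.get(i);
            Node right = tree.children.get(i + 1);
            if (left.label.equals(baseNP) && isSentence.test(right.label)) {
                tree.children.remove(i + 1);  // detach the sentence from this level
                left.children.add(right);     // re-attach it as the rightmost child
            }
        }
        return tree;
    }

    public static void main(String[] args) {
        Node tree = new Node("NP",
            new Node("NPB", new Node("NN", new Node("claim"))),
            new Node("IP", new Node("VP", new Node("VV", new Node("x")))));
        System.out.println(unrepairBaseNPs(tree, "NPB", "IP"::equals));
    }
}
```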

combineRightSiblingsOfDe5

protected Sexp combineRightSiblingsOfDe5(Sexp tree)
A method to create a new node if a DEG or DEC preterminal has more than one right sibling. The new node will be a new parent to all the right siblings of that DEG/DEC node, and will therefore be the sole right sibling of that DEG/DEC node.

Parameters:
tree - the tree in which to combine right siblings of DEG or DEC nodes into a newly-created parent
Returns:
the specified tree, having been modified in situ
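
The restructuring described above can be sketched as follows. This is a minimal illustration, not the class's implementation: Node is a hypothetical stand-in for Sexp, and because the text does not say what label the new parent receives, the label is a parameter (the demo passes the placeholder "XP").

```java
import java.util.ArrayList;
import java.util.List;

final class CombineDeSiblingsSketch {
    /** Hypothetical minimal tree node; a stand-in for the parser's Sexp type. */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) children.add(k);
        }

        /** A preterminal has exactly one child, and that child is a word. */
        boolean isPreterminal() {
            return children.size() == 1 && children.get(0).children.isEmpty();
        }

        @Override public String toString() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(").append(label);
            for (Node c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    static boolean isDe(Node n) {
        return n.isPreterminal() && (n.label.equals("DEG") || n.label.equals("DEC"));
    }

    /**
     * If a DEG or DEC preterminal has more than one right sibling, wraps all
     * of those siblings in a new node, which becomes the sole right sibling.
     * Modifies the tree in place; newLabel is a hypothetical parameter.
     */
    static Node combineRightSiblingsOfDe(Node tree, String newLabel) {
        for (Node c : tree.children) combineRightSiblingsOfDe(c, newLabel);
        List<Node> kids = tree.children;
        for (int i = 0; i < kids.size(); i++) {
            if (isDe(kids.get(i)) && kids.size() - (i + 1) > 1) {
                List<Node> right = kids.subList(i + 1, kids.size());
                Node parent = new Node(newLabel);
                parent.children.addAll(right);  // copy the siblings under the new parent
                right.clear();                  // then drop them from this level
                kids.add(parent);
                break;                          // the DE node now has one right sibling
            }
        }
        return tree;
    }

    public static void main(String[] args) {
        Node tree = new Node("NP",
            new Node("NP", new Node("NN", new Node("a"))),
            new Node("DEG", new Node("de")),
            new Node("ADJP", new Node("JJ", new Node("b"))),
            new Node("NN", new Node("c")));
        System.out.println(combineRightSiblingsOfDe(tree, "XP"));
    }
}
```

A DEG or DEC node with zero or one right sibling is left untouched, since the condition in the description is "more than one right sibling".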

main

public static void main(String[] args)
Test driver for this class.

Parameters:
args - usage: [-risan] <filename> where
          -r  raise punctuation
          -i  identify arguments
          -s  relabel subjectless sentences
          -a  strip nonterminal augmentations
          -n  add/relabel base NPs

Author: Dan Bikel.