Parsing Engine
java.lang.Object
  danbikel.parser.lang.AbstractTraining
    danbikel.parser.english.BrokenTraining
public class BrokenTraining
extends AbstractTraining
Provides methods for language-specific processing of training parse trees.
This class' primary purpose is simply to fill in the AbstractTraining.argContexts, AbstractTraining.semTagArgStopSet and AbstractTraining.nodesToPrune data members using a metadata resource. If this capability is desired in another language package, this class may be subclassed.
Training.addBaseNPs(Sexp), with an important change that is possibly only relevant to the Penn Treebank.
Important note: This class is similar to Training, except that it is "broken" in the sense that, instead of emulating Collins' parsing model as closely as possible, it uses only details found in Collins' published papers. See Intricacies of Collins' Parsing Model for details.
Field Summary |
---|
Fields inherited from class danbikel.parser.lang.AbstractTraining |
---|
addGapInfo, argAugmentations, argContexts, argNonterminals, baseNP, canonicalAugDelimSym, defaultArgAugmentation, delimAndGapStr, delimAndGapStrLen, gapAugmentation, headFinder, headPostSym, headPreSym, headSym, metadataPropertyPrefix, nodesToPrune, NP, prunedPreterms, prunedPunctuation, relabelHeadChildrenAsArgs, repairBaseNPs, semTagArgStopSet, traceTag, treebank, wordsToPrune |
Constructor Summary | |
---|---|
BrokenTraining() | The default constructor, to be invoked by Language. |
Method Summary | |
---|---|
Sexp | fixSubjectlessSentences(Sexp tree): This method has been written to do nothing to the specified tree. |
Sexp | identifyArguments(Sexp tree): Marks certain nodes as arguments by appending a suffix to their respective labels. |
protected boolean | isTypeOfSentence(Symbol label): Unlike Mike's definition of a sentence for the purpose of relabeling subjectless sentences, which includes any label that starts with 'S', this method requires that the label be exactly S, or S with some augmentations. |
static void | main(String[] args): Test driver for this class. |
protected boolean | needToAddNormalNPLevel(Sexp grandparent, int parentIdx, Sexp tree): This method has been overridden so that the two unpublished conditions under which one needs to add a normal NP level are ignored. |
void | postProcess(Sexp tree): Post-processes a parse tree after decoding, essentially undoing the steps performed in preprocessing. |
Sexp | preProcess(Sexp tree): The method to call before counting events in a training parse tree. |
protected Sexp | unrepairBaseNPs(Sexp tree): De-transforms NPs that were transformed by the Training.repairBaseNPs(Sexp) method. |
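
The isTypeOfSentence entry in the summary above contrasts two definitions of a sentence label: any label that starts with 'S', versus exactly S or S with augmentations. The following is a minimal sketch of that contrast, not the class's actual implementation. It works on plain strings rather than Symbol objects, and it assumes '-' as the augmentation delimiter (as in Penn Treebank labels such as S-TPC); the class and method names are hypothetical.

```java
final class SentenceLabelDemo {
    // Assumption: '-' separates a base label from its augmentations.
    static final char AUG_DELIM = '-';

    // Loose definition: any label that starts with 'S' (so SBAR and SQ qualify).
    static boolean looseIsSentence(String label) {
        return label.startsWith("S");
    }

    // Strict definition: exactly "S", or "S" followed by the delimiter and augmentations.
    static boolean strictIsSentence(String label) {
        return label.equals("S") || label.startsWith("S" + AUG_DELIM);
    }

    public static void main(String[] args) {
        for (String label : new String[] {"S", "S-TPC", "SBAR", "NP"}) {
            System.out.println(label + " loose=" + looseIsSentence(label)
                    + " strict=" + strictIsSentence(label));
        }
    }
}
```

Under these assumptions, SBAR satisfies the loose test but not the strict one, while S and S-TPC satisfy both.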
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public BrokenTraining() throws FileNotFoundException, IOException
The default constructor, to be invoked by Language.
This constructor looks for a resource named by the property metadataPropertyPrefix + language, where metadataPropertyPrefix is the value of the constant AbstractTraining.metadataPropertyPrefix and language is the value of Settings.get(Settings.language). For example, the property for English is "parser.training.metadata.english".
Throws:
FileNotFoundException
IOException
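
The property-name construction described for the constructor can be sketched as follows. This is an illustration only: the prefix value is inferred from the English example rather than read from AbstractTraining, java.util.Properties stands in for the Settings lookup, and the class name, method names and resource file name are hypothetical.

```java
import java.util.Properties;

final class MetadataPropertyDemo {
    // Assumption: prefix inferred from the example "parser.training.metadata.english".
    static final String METADATA_PROPERTY_PREFIX = "parser.training.metadata.";

    // Builds the property name as prefix + language.
    static String metadataPropertyName(String language) {
        return METADATA_PROPERTY_PREFIX + language;
    }

    // Looks up the resource name for a language; returns null if the property is unset.
    static String metadataResource(Properties settings, String language) {
        return settings.getProperty(metadataPropertyName(language));
    }

    public static void main(String[] args) {
        Properties settings = new Properties();
        // Hypothetical resource name, used only for this demonstration.
        settings.setProperty("parser.training.metadata.english", "training-metadata.lisp");
        System.out.println(metadataPropertyName("english"));
        System.out.println(metadataResource(settings, "english"));
    }
}
```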
Method Detail |
---|
public Sexp preProcess(Sexp tree)
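Description copied from class: AbstractTraining
The method to call before counting events in a training parse tree. The preprocessing methods, in the order in which they are run:
1. AbstractTraining.prune(Sexp)
2. AbstractTraining.addBaseNPs(Sexp)
3. AbstractTraining.repairBaseNPs(Sexp)
4. AbstractTraining.addGapInformation(Sexp)
5. AbstractTraining.relabelSubjectlessSentences(Sexp)
6. AbstractTraining.removeNullElements(Sexp)
7. AbstractTraining.raisePunctuation(Sexp)
8. AbstractTraining.identifyArguments(Sexp)
9. AbstractTraining.stripAugmentations(Sexp)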
The order matters for the following reasons:
- AbstractTraining.addGapInformation(Sexp) should be run after methods that introduce new nodes, which in this case is AbstractTraining.addBaseNPs(Sexp), as these new nodes may need to be used to thread the gap feature.
- AbstractTraining.relabelSubjectlessSentences(Sexp) should be run after AbstractTraining.addGapInformation(Sexp), because only those sentences whose empty subjects are not the result of WH-movement should be relabeled.
- AbstractTraining.removeNullElements(Sexp) should be run after any methods that depend on the presence of null elements, such as
  - AbstractTraining.relabelSubjectlessSentences(Sexp), because a sentence cannot be determined to be subjectless unless a null element is present as a child of a subject-marked node
  - AbstractTraining.addGapInformation(Sexp), because the determination of the location of a trace requires the presence of indexed null elements
- AbstractTraining.raisePunctuation(Sexp) should be run after AbstractTraining.removeNullElements(Sexp), because a null element that is a leftmost or rightmost child can block detection of a punctuation element that needs to be raised after removal of the null element (if a punctuation element is the next-to-leftmost or next-to-rightmost child of an interior node).
- AbstractTraining.stripAugmentations(Sexp) should be run after all methods that may depend upon the presence of nonterminal augmentations: AbstractTraining.identifyArguments(Sexp), AbstractTraining.relabelSubjectlessSentences(Sexp) and AbstractTraining.addGapInformation(Sexp).
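
The "should be run after" statements above can be checked mechanically against the listed order. The sketch below is not part of the parsing engine: it represents each preprocessing step by its method name only, and the class, field and helper names are hypothetical.

```java
import java.util.List;

final class PreprocessOrderDemo {
    // Step names in the order given in the text.
    static final List<String> STEPS = List.of(
        "prune", "addBaseNPs", "repairBaseNPs", "addGapInformation",
        "relabelSubjectlessSentences", "removeNullElements",
        "raisePunctuation", "identifyArguments", "stripAugmentations");

    // Each pair {later, earlier} encodes "later should be run after earlier".
    static final String[][] RUN_AFTER = {
        {"addGapInformation", "addBaseNPs"},
        {"relabelSubjectlessSentences", "addGapInformation"},
        {"removeNullElements", "relabelSubjectlessSentences"},
        {"removeNullElements", "addGapInformation"},
        {"raisePunctuation", "removeNullElements"},
        {"stripAugmentations", "identifyArguments"},
        {"stripAugmentations", "relabelSubjectlessSentences"},
        {"stripAugmentations", "addGapInformation"},
    };

    // Returns true if every constraint holds for the given order.
    static boolean satisfiesConstraints(List<String> order) {
        for (String[] constraint : RUN_AFTER) {
            int later = order.indexOf(constraint[0]);
            int earlier = order.indexOf(constraint[1]);
            if (later < 0 || earlier < 0 || later <= earlier) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(satisfiesConstraints(STEPS)); // prints true
    }
}
```

A subclass that reorders or replaces steps could run a check of this kind in a unit test to confirm that the documented dependencies still hold.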
Specified by:
preProcess in interface Training
Overrides:
preProcess in class AbstractTraining
Parameters:
tree - the parse tree to pre-process
Returns:
tree having been pre-processed
public Sexp identifyArguments(Sexp tree)
Specified by:
identifyArguments in interface Training
Overrides:
identifyArguments in class AbstractTraining
Parameters:
tree - the tree in which to identify argument nodes
See Also:
Treebank.canonicalAugDelimiter()
protected boolean isTypeOfSentence(Symbol label)
Overrides:
isTypeOfSentence in class AbstractTraining
Parameters:
label - the nonterminal label to test
Returns:
true if the specified nonterminal represents a sentence, false otherwise
public Sexp fixSubjectlessSentences(Sexp tree)
protected Sexp unrepairBaseNPs(Sexp tree)
De-transforms NPs that were transformed by the Training.repairBaseNPs(Sexp) method. This method is currently unused.
Parameters:
tree - the tree whose NPs are to be de-transformed
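public void postProcess(Sexp tree)
Description copied from interface: Training
Post-processes a parse tree after decoding, essentially undoing the steps performed in preprocessing.
Specified by:
postProcess in interface Training
Overrides:
postProcess in class AbstractTraining
Parameters:
tree - the tree to be post-processed
protected boolean needToAddNormalNPLevel(Sexp grandparent, int parentIdx, Sexp tree)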
Overrides:
needToAddNormalNPLevel in class AbstractTraining
Parameters:
grandparent - the parent of the "parent" that is a base NP
parentIdx - the index of the child of grandparent that is the base NP (that is, grandparent.list().get(parentIdx) == tree)
tree - the base NP, whose parent is grandparent
public static void main(String[] args)