|
Parsing Engine | |||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | |||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |
java.lang.Objectdanbikel.parser.lang.AbstractHeadFinder
public abstract class AbstractHeadFinder
Provides a default abstract implementation of the HeadFinder
interface. Subclasses are encouraged to make use of the defaultFindHead(Symbol,SexpList)
method, which finds a head according to
head rules that are gotten from a resource specified in the parser's
settings file.
Nested Class Summary | |
---|---|
protected static class |
AbstractHeadFinder.HeadFindInstruction
Data structure for specifying a way to search for a head in a grammar production: a set of symbols to scan for and the direction of that scan. |
Field Summary | |
---|---|
protected static Symbol |
defaultSym
The wildcard character used in default head-finding instructions. |
protected static String |
fallbackDefaultHeadTableResource
The fallback default in case the property for specifying the path to the head table file or resource is not set. |
protected HashMap |
headFindInstructions
The map of parent nonterminals to their arrays of AbstractHeadFinder.HeadFindInstruction . |
static String |
headSuffix
The augmentation for new head nodes added by addHeadInformation(danbikel.lisp.Sexp) . |
protected static boolean |
LEFT
Constant to indicate a left-to-right scan (makes for more readable code for this class and its subclasses). |
protected static Symbol |
leftSym
The character from a head table's head-finding instruction that indicates a left-to-right scan. |
static String |
postHeadSuffix
The augmentation for new post-head nodes added by addHeadInformation(danbikel.lisp.Sexp) . |
static String |
preHeadSuffix
The augmentation for new pre-head nodes added by addHeadInformation(danbikel.lisp.Sexp) . |
protected double |
probRandom
The probability that the head child of a production will be chosen at random. |
protected Random |
rand
This class’ random number generator. |
protected static boolean |
RIGHT
Constant to indicate a right-to-left scan (makes for more readable code for this class and its subclasses). |
protected static Symbol |
rightSym
The character from a head table's head-finding instruction that indicates a right-to-left scan. |
protected boolean |
useRand
Set to true if probRandom is greater than 0.0; otherwise,
set to false. |
protected boolean |
warnDefaultRule
The value of Settings.headFinderWarnDefaultRule , cached here
for readability and convenience. |
Constructor Summary | |
---|---|
protected |
AbstractHeadFinder()
Constructs a head-finding object, getting the name of the head table from the value of Settings.get(Settings.headTablePrefix + language) ,
where language is the value of
Settings.get(Settings.language) . |
|
AbstractHeadFinder(Sexp headTableSexp)
Constructs a head-finding object with the specified head table. |
Method Summary | |
---|---|
Sexp |
addHeadInformation(Sexp tree)
Perform head-finding in tree , augmenting nodes that
are the head children of their respective parents. |
protected int |
defaultFindHead(Symbol lhs,
SexpList rhs)
Provides a default mechanism to use the head table to find a head in the specified grammar production. |
int |
findHead(Sexp tree)
Finds the head for the production at the root of the specified subtree. |
abstract int |
findHead(Sexp tree,
Symbol lhs,
SexpList rhs)
Finds the head for the grammar production lhs -> rhs . |
String |
headSuffix()
Returns the string "-HEAD" . |
protected void |
readHeadTable(Sexp headTableSexp)
Reads the head table contained in the specified S-expression. |
protected int |
scan(boolean direction,
SexpList rhs,
Symbol[] matchTags)
Scans the RHS of a production in the specified direction. |
protected int |
scanLeftToRight(SexpList rhs,
Symbol[] matchTags)
Scans the RHS of a production from left to right, returning the index of the first nonterminal that is in the matchTags array. |
protected int |
scanRightToLeft(SexpList rhs,
Symbol[] matchTags)
Scans the RHS of a production from right to left, returning the index of the first nonterminal that is in the matchTags array. |
protected boolean |
tagMatches(Symbol tag,
Symbol[] matchTags)
A helper method that returns true if any of the nonterminals
in matchTags is tag and returns
false otherwise. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected static final String fallbackDefaultHeadTableResource
protected static final boolean LEFT
protected static final boolean RIGHT
protected static final Symbol defaultSym
readHeadTable(danbikel.lisp.Sexp)
protected static final Symbol leftSym
readHeadTable(danbikel.lisp.Sexp)
protected static final Symbol rightSym
readHeadTable(danbikel.lisp.Sexp)
public static final String preHeadSuffix
addHeadInformation(danbikel.lisp.Sexp)
.
public static final String postHeadSuffix
addHeadInformation(danbikel.lisp.Sexp)
.
public static final String headSuffix
addHeadInformation(danbikel.lisp.Sexp)
.
protected HashMap headFindInstructions
AbstractHeadFinder.HeadFindInstruction
. When a head is being
found, each HeadFindInstruction
is applied in order until
one succeeds.
readHeadTable(danbikel.lisp.Sexp)
protected boolean warnDefaultRule
Settings.headFinderWarnDefaultRule
, cached here
for readability and convenience.
protected double probRandom
Settings.headFinderRandomProb
protected boolean useRand
probRandom
is greater than 0.0; otherwise,
set to false.
protected Random rand
Constructor Detail |
---|
protected AbstractHeadFinder() throws IOException, FileNotFoundException
Settings.get(Settings.headTablePrefix + language)
,
where language
is the value of
Settings.get(Settings.language)
.
The named head table is searched for in the locations that are searched
by the method Settings.getFileOrResourceAsStream(Class,String)
.
See readHeadTable(danbikel.lisp.Sexp)
for a BNF description of the syntax of head
table files.
This constructor will necessarily be invoked whenever the default
constructor of a language-specific HeadFinder
is invoked,
which is done upon initialization of the Language
class.
IOException
FileNotFoundException
readHeadTable(danbikel.lisp.Sexp)
,
Language
,
Settings.getFileOrResourceAsStream(java.lang.Class, java.lang.String)
public AbstractHeadFinder(Sexp headTableSexp)
Method Detail |
---|
protected int defaultFindHead(Symbol lhs, SexpList rhs)
N.B.: The list rhs
is, as are all SexpList
objects, 0-indexed; however, the
return value of this method is meant to map to the children of a node in an
S-expression tree, where child indices begin at 1. Therefore,
this method returns a 1-based index.
lhs
- the left-hand side of a grammar productionrhs
- a list of symbols representing the right-hand side of a
grammar production
rhs
that
is the head, according to the head table of this head-finding objectSettings.headFinderWarnDefaultRule
public int findHead(Sexp tree)
findHead(Sexp,Symbol,SexpList)
.
SexpList
containing the labels of the children of the root of
tree
and then calls findHead(Sexp,Symbol,SexpList)
.
findHead
in interface HeadFinder
tree
- the subtree for whose root production to find the head
findHead(Sexp,Symbol,SexpList)
public abstract int findHead(Sexp tree, Symbol lhs, SexpList rhs)
lhs -> rhs
. This
method may destructively modify rhs
.
findHead
in interface HeadFinder
tree
- the original subtree in which to find the head child, or
null
if the subtree is not availablelhs
- the nonterminal label that is the left-hand side of a grammar
productionrhs
- a list of symbols that is the right-hand side of a grammar
production
rhs
public Sexp addHeadInformation(Sexp tree)
tree
, augmenting nodes that
are the head children of their respective parents.
This method is useful for head-finding debugging.
addHeadInformation
in interface HeadFinder
tree
objectheadSuffix()
public String headSuffix()
"-HEAD"
. If this conflicts
with an existing nonterminal augmentation for a particular treebank or
an augmentation added by Training
during preprocessing, this
method should be overridden.
headSuffix
in interface HeadFinder
"-HEAD"
protected void readHeadTable(Sexp headTableSexp)
(<headrule>+)where
For a particular <parent>, each <instruction> is applied in the order in which it appears in the head table, looking for the first element of the right-hand side of a grammar production that is a member of the scanelement list, and the head is the result of the first instruction that succeeds. If the
<headrule> ::= (<parent> <instruction>+) <parent> ::= the parent whose head child is to be found or the symbol "*"
(which indicates a default rule)<instruction> ::= (<direction> <scanelement>*) <direction> ::= l | r
where l indicates left-to-right, r right-to-left<scanelement> ::= a child nonterminal label to scan for in the specified direction
scanelement
list is empty, then the rule indicates to
choose the first element in its scan (a default instruction). If none
of the head-finding instructions succeeds for a particular parent,
an implicit default instruction is added that is of the form (d),
where d is the direction specified by the last instruction for
the parent.
Bugs: We eventually need to replace default instructions such as (l) with a syntax that includes the wildcard character, such as (l *), which will be more consistent with the default rule, where the wildcard matches any parent. Also, we should possibly require default instructions, instead of allowing them to be implicit.
findHead(Sexp,Symbol,SexpList)
protected int scanLeftToRight(SexpList rhs, Symbol[] matchTags)
matchTags
array.
rhs
- a list of symbols representing the right-hand side (RHS) of
a grammar productionmatchTags
- an array of symbols representing the set of nonterminal
labels to try to match in rhs
rhs
that is in
the set of nonterminals contained in matchTags
when scanning
from left to right, or -1 if there is no such matching nonterminalprotected int scanRightToLeft(SexpList rhs, Symbol[] matchTags)
matchTags
array.
rhs
- a list of symbols representing the right-hand side (RHS) of
a grammar productionmatchTags
- an array of symbols representing the set of nonterminal
labels to try to match in rhs
rhs
that is in
the set of nonterminals contained in matchTags
when scanning
from right to left, or -1 if there is no such matching nonterminalprotected int scan(boolean direction, SexpList rhs, Symbol[] matchTags)
direction
- the direction of scan: LEFT
for left-to-right,
RIGHT
for right-to-left.rhs
- a list of symbols representing the right-hand side (RHS) of
a grammar productionmatchTags
- an array of symbols representing the set of nonterminal
labels to try to match in rhs
rhs
that is
in the set of nonterminals contained in matchTags
, or -1
if there is no such matching nonterminalprotected boolean tagMatches(Symbol tag, Symbol[] matchTags)
true
if any of the nonterminals
in matchTags
is tag
and returns
false
otherwise.
tag
- the tag to matchmatchTags
- an array of symbols
|
Parsing Engine | |||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | |||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |