Parsing Engine

danbikel.parser.ms
Class BrokenModWordModelStructure

java.lang.Object
  extended by danbikel.parser.ProbabilityStructure
      extended by danbikel.parser.ms.BrokenModWordModelStructure
All Implemented Interfaces:
Serializable

public class BrokenModWordModelStructure
extends ProbabilityStructure

Provides the complete back-off structure for the submodel that generates the head words of modifying nonterminals. This class is just like ModWordModelStructure2, but is “broken” in the sense that it includes side information when generating histories for the last back-off level. Collins’ thesis specifies this behavior, but his actual thesis parser did not implement it: that parser collapsed all words when computing p(w | t).

The specific back-off structure provided by this class is as follows. If the parent P is not a base NP (NPB), then the back-off structure provided by this class is

If the parent P is a base NP (NPB), then the back-off structure provided by this class is

Please consult one of the following two references for an explanation of the notation used above.

See Also:
Serialized Form

Field Summary
 
Fields inherited from class danbikel.parser.ProbabilityStructure
additionalData, defaultModelClassName, defaultModelConstructor, doPruning, estimates, futureList, futures, futuresWithSubcats, histories, historiesWithSubcats, historyList, lambdas, prevHistCount, topLevelCacheSize, transitions
 
Constructor Summary
BrokenModWordModelStructure()
          Constructs a new instance.
 
Method Summary
 ProbabilityStructure copy()
          Returns a copy of this object.
 boolean doCleanup()
          Returns true, indicating that the Model that owns an instance of this class ought to call its Model.cleanup() method at the end of execution of its deriveCounts method.
 Event getFuture(TrainerEvent trainerEvent, int backOffLevel)
          Returns an event whose sole component is the word being generated as the head of some modifier nonterminal.
 Event getHistory(TrainerEvent trainerEvent, int backOffLevel)
          Returns the history event corresponding to the specified back-off level.
 int maxEventComponents()
          Returns 10.
 int numLevels()
          Returns 3.
 boolean removeHistory(int backOffLevel, Event history)
          In order to gather statistics for words that appear as the head of the entire sentence when estimating p̂(w | t), the trainer “fakes” a modifier event, as though the root node of the observed tree was seen to modify the magical +TOP+ node.
 
Methods inherited from class danbikel.parser.ProbabilityStructure
cacheSize, defaultSmoothingParamsFilename, dontAddNewParameters, doPruning, getAdditionalData, getTopLevelCacheSize, getTransition, jointModel, lambdaFudge, lambdaFudgeTerm, lambdaPenalty, newModel, priorLevel, removeFuture, removeTransition, saveSmoothingParameters, setAdditionalData, smoothingParametersFile, useSmoothingParameters
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BrokenModWordModelStructure

public BrokenModWordModelStructure()
Constructs a new instance.

Method Detail

maxEventComponents

public int maxEventComponents()
Returns 10.

Overrides:
maxEventComponents in class ProbabilityStructure
Returns:
10
See Also:
MutableEvent.ensureCapacity(int)

numLevels

public int numLevels()
Returns 3.

Specified by:
numLevels in class ProbabilityStructure

getHistory

public Event getHistory(TrainerEvent trainerEvent,
                        int backOffLevel)
Returns the history event corresponding to the specified back-off level. The back-off structure differs depending on whether or not the parent P is a base NP (NPB); see the class description.

Specified by:
getHistory in class ProbabilityStructure
Parameters:
trainerEvent - the maximal-context event from which to derive the history contexts used by the probability structure provided by this class
backOffLevel - the back-off level for which to return a history context
Returns:
the history context for the specified back-off level
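As a rough illustration of what "a history context per back-off level" means, the sketch below derives progressively coarser histories from one maximal context. Everything in it is hypothetical: the class BackOffHistorySketch, the prefix-truncation rule, and the placeholder components c0 through c4 are assumptions made for the example and do not reproduce the actual back-off structure of this class. Only the number of levels (3) is taken from this page.

```java
import java.util.List;

/** Hypothetical illustration only; not part of danbikel.parser. */
final class BackOffHistorySketch {
  /** Matches the documented numLevels() value of 3. */
  static final int NUM_LEVELS = 3;

  /** Placeholder prefix lengths per level; an assumption, not the real contexts. */
  private static final int[] KEEP = {5, 3, 1};

  /**
   * Returns a coarser view of the maximal context for each successive level.
   * Level 0 keeps the most components; the last level keeps the fewest.
   */
  static List<String> getHistory(List<String> maximalContext, int backOffLevel) {
    if (backOffLevel < 0 || backOffLevel >= NUM_LEVELS) {
      throw new IllegalArgumentException("invalid back-off level: " + backOffLevel);
    }
    int keep = Math.min(KEEP[backOffLevel], maximalContext.size());
    return maximalContext.subList(0, keep);
  }

  public static void main(String[] args) {
    // Placeholder components standing in for the elements of a trainer event.
    List<String> context = List.of("c0", "c1", "c2", "c3", "c4");
    for (int level = 0; level < NUM_LEVELS; level++) {
      System.out.println(level + ": " + getHistory(context, level));
    }
  }
}
```

The real method takes a TrainerEvent and returns an Event; plain string lists are used here only to keep the sketch self-contained.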

getFuture

public Event getFuture(TrainerEvent trainerEvent,
                       int backOffLevel)
Returns an event whose sole component is the word being generated as the head of some modifier nonterminal.

Specified by:
getFuture in class ProbabilityStructure
Parameters:
trainerEvent - the maximal-context event for which to get a future
backOffLevel - the level of back-off for which a probability is being computed
Returns:
an event whose sole component is the word being generated as the head of some modifier nonterminal

doCleanup

public boolean doCleanup()
Returns true, indicating that the Model that owns an instance of this class ought to call its Model.cleanup() method at the end of execution of its deriveCounts method.

Overrides:
doCleanup in class ProbabilityStructure
Returns:
true
See Also:
ProbabilityStructure.removeHistory(int,Event), ProbabilityStructure.removeFuture(int,Event), ProbabilityStructure.removeTransition(int,Transition), Model.deriveCounts(CountsTable,danbikel.util.Filter,double,danbikel.util.FlexibleMap), Model.cleanup()

removeHistory

public boolean removeHistory(int backOffLevel,
                             Event history)
In order to gather statistics for words that appear as the head of the entire sentence when estimating p̂(w | t), the trainer “fakes” a modifier event, as though the root node of the observed tree was seen to modify the magical +TOP+ node. For back-off levels 0 and 1, we will never use the derived counts whose history contexts contain +TOP+. This method allows for the removal of these “unnecessary” counts, which will never be used when decoding.

Overrides:
removeHistory in class ProbabilityStructure
Parameters:
backOffLevel - the back-off level of the history context being tested for removal
history - the history context being tested for removal
See Also:
Model.deriveCounts(CountsTable,danbikel.util.Filter,double,danbikel.util.FlexibleMap), Model.cleanup()
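The pruning rule described above can be sketched as a small self-contained program. This is a minimal sketch under stated assumptions, not the library's implementation: the class TopHistoryCleanupSketch, the use of string lists as history contexts, the per-level count maps, and the convention that a true return value means "remove this history" are all hypothetical. Only the rule itself (histories containing +TOP+ are unused at back-off levels 0 and 1) comes from the description.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the cleanup pass; not part of danbikel.parser. */
final class TopHistoryCleanupSketch {
  static final String TOP = "+TOP+";

  /**
   * Assumed convention: returns true when the history should be removed.
   * Histories containing +TOP+ are never used at levels 0 and 1.
   */
  static boolean removeHistory(int backOffLevel, List<String> history) {
    return backOffLevel < 2 && history.contains(TOP);
  }

  /** Drops removable histories from each level's table; returns how many were removed. */
  static int cleanup(List<Map<List<String>, Integer>> countsByLevel) {
    int removed = 0;
    for (int level = 0; level < countsByLevel.size(); level++) {
      final int lvl = level; // effectively final copy for the lambda
      Map<List<String>, Integer> counts = countsByLevel.get(level);
      int before = counts.size();
      counts.keySet().removeIf(history -> removeHistory(lvl, history));
      removed += before - counts.size();
    }
    return removed;
  }

  /** Builds toy count tables for three levels; all labels are placeholders. */
  static List<Map<List<String>, Integer>> exampleCounts() {
    Map<List<String>, Integer> level0 = new HashMap<>();
    level0.put(List.of("NN", "S", TOP), 2);
    level0.put(List.of("NN", "NP", "VP"), 5);
    Map<List<String>, Integer> level1 = new HashMap<>();
    level1.put(List.of("NN", TOP), 2);
    level1.put(List.of("NN", "VP"), 5);
    Map<List<String>, Integer> level2 = new HashMap<>();
    level2.put(List.of("NN", TOP), 2); // kept: the rule covers only levels 0 and 1
    level2.put(List.of("NN"), 5);
    List<Map<List<String>, Integer>> tables = new ArrayList<>();
    tables.add(level0);
    tables.add(level1);
    tables.add(level2);
    return tables;
  }

  public static void main(String[] args) {
    List<Map<List<String>, Integer>> tables = exampleCounts();
    System.out.println("removed " + cleanup(tables) + " histories");
  }
}
```

In this toy setup the +TOP+ history at the last level survives, which reflects the point of the faked modifier event: its counts are still needed there for the estimate of the word given its tag.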

copy

public ProbabilityStructure copy()
Returns a copy of this object.

Specified by:
copy in class ProbabilityStructure

Author: Dan Bikel.