Parsing Engine

danbikel.parser
Class JointModel

java.lang.Object
  extended by danbikel.parser.Model
      extended by danbikel.parser.JointModel
All Implemented Interfaces:
Serializable

public class JointModel
extends Model

Provides a mechanism for grouping related Model objects in order to estimate the probability of some joint event. A probability estimate delivered by this class is the product of the individually smoothed probability estimates delivered by this Model and by each of its contained Model objects. Crucially, this means that this class and all contained Model objects must be coherent, in the sense that all internal estimates of the elements of a joint event are derived from the same TrainerEvent object. Typically, this class provides the means to estimate a joint event via the chain rule, where it is desirable that the separate estimates comprising the product be smoothed independently.

Note that a joint event may be estimated via a standard Model instance, simply by having the ProbabilityStructure.getFuture(danbikel.parser.TrainerEvent, int) method return an event that is a collection of elements (for example, a nonterminal symbol and a part-of-speech tag symbol). The crucial feature enabled by this class is that the probability estimate for each element of a joint event is smoothed individually.
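As a rough illustration of the distinction drawn above, the following self-contained sketch estimates a joint event (a nonterminal and a tag) via the chain rule, smoothing each factor on its own before multiplying. Everything in it is hypothetical: ChainRuleSketch, interpolate, the lambda weights and the input numbers are not part of danbikel.parser, and two-level linear interpolation is only assumed here as a stand-in for whatever smoothing Model actually performs.

```java
// Hypothetical sketch; none of these names exist in danbikel.parser.
final class ChainRuleSketch {
  private ChainRuleSketch() {}

  /** Two-level linear interpolation: lambda * specific + (1 - lambda) * backoff. */
  static double interpolate(double lambda, double specific, double backoff) {
    return lambda * specific + (1.0 - lambda) * backoff;
  }

  /**
   * Chain rule: P(nt, tag | h) = P(nt | h) * P(tag | nt, h).
   * Each factor gets its own smoothing weight, so a sparse tag distribution
   * can back off heavily without affecting the nonterminal estimate.
   */
  static double jointProb(double lambdaNt, double ntSpecific, double ntBackoff,
                          double lambdaTag, double tagSpecific, double tagBackoff) {
    double pNt = interpolate(lambdaNt, ntSpecific, ntBackoff);
    double pTag = interpolate(lambdaTag, tagSpecific, tagBackoff);
    return pNt * pTag;
  }

  public static void main(String[] args) {
    // Assumed, made-up estimates purely for demonstration.
    System.out.println(jointProb(0.8, 0.5, 0.25, 0.5, 0.9, 0.7));
  }
}
```

A single Model whose future is the pair (nonterminal, tag) would instead apply one set of smoothing weights to the pair as a whole, which is the case this class is meant to avoid.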

Implementation note: An instance of this class (itself an instance of Model) will contain an internal collection of other Model objects whose probability structures are determined via the ProbabilityStructure.jointModel() method. The internal Model objects used by this class can be accessed via the getModel(int) method. Note that any of these internal Model instances may themselves be JointModel instances (although, for efficiency reasons, such nesting should generally be avoided).

See Also:
ProbabilityStructure.jointModel(), Serialized Form

Field Summary
protected  int numOtherModels
           
protected  Model[] otherModels
           
 
Fields inherited from class danbikel.parser.Model
backOffMap, cache, cacheAccesses, cacheHits, canonicalEvents, counts, createHistBackOffMap, deleteCountsWhenPrecomputingProbs, dontAddNewParams, doPruning, globalDoPruning, histBackOffMap, historiesToPrune, lambdaFudge, lambdaFudgeTerm, lambdaPenalty, logOneMinusLambdaPenalty, numLevels, precomputedLambdas, precomputedNPBProbCalls, precomputedNPBProbHits, precomputedProbCalls, precomputedProbHits, precomputedProbs, precomputeProbs, printPrunedEvents, printUnprunedEvents, pruningThreshold, saveBackOffMap, saveHistBackOffMap, saveSmoothingParams, shortStructureClassName, smoothingParams, smoothingParamsFile, structure, structureClassName, topLevelCache, transitionsToPrune, useCache, useSmoothingParams, verbose, warnSmoothingHasHistoryNotInTraining
 
Constructor Summary
JointModel(ProbabilityStructure structure)
           
 
Method Summary
 void canonicalize(FlexibleMap map)
          Canonicalizes the objects of this Model, as well as all internal Model instances.
 void deriveCounts(CountsTable trainerCounts, Filter filter, double threshold, FlexibleMap canonical)
          Derives counts for this Model, as well as for all internal Model instances.
 void deriveCounts(CountsTable trainerCounts, Filter filter, double threshold, FlexibleMap canonical, boolean deriveOtherModelCounts)
          Derives counts for this Model and optionally for all internal Model instances.
 double estimateLogProb(int id, TrainerEvent event)
          Estimates a conditional probability in log-space from the specified maximal-context trainer event.
 double estimateNonJointLogProb(int id, TrainerEvent event)
          Estimates the log-probability of the specified event under this Model without adding the log-probabilities of the internal Model objects.
 double estimateNonJointProb(int id, TrainerEvent event)
          Estimates the probability of the specified event under this Model without multiplying the probabilities of the internal Model objects.
 double estimateProb(int id, TrainerEvent event)
          Estimates a conditional probability from the specified maximal-context trainer event.
 String getCacheStats()
          Returns a string representing the cache statistics for this Model and all of its internal Model objects.
 Model getModel(int idx)
          Returns this or any of the internal Model instances used to produce joint probability estimates.
 ProbabilityStructure getProbStructure()
          Returns the primary probability structure of this joint model, which is that used by this Model instance (as opposed to one of the internal Model instances).
 ProbabilityStructure getProbStructure(int idx)
          Returns a probability structure of this joint model, which is either that used by this Model instance, or a structure used by one of the internal Model instances.
 int numModels()
          Returns the number of models used to produce a joint probability estimate, including this Model instance.
 void precomputeProbs()
          Precomputes probabilities and smoothing values for this Model and for all internal Model instances.
 void setCanonicalEvents(FlexibleMap canonical)
          Sets the Model.canonicalEvents member of this object to be the specified FlexibleMap, as well as setting the same member of all internal Model objects.
 
Methods inherited from class danbikel.parser.Model
beQuiet, beVerbose, canonicalize, canonicalizeEvent, cleanup, computeHistoriesAndTransitionsToPrune, deriveDiversityCounts, deriveHistories, estimateLogProbUsingPrecomputed, estimateLogProbUsingPrecomputed, estimateProb, estimateProbOld, getCanonical, getTransitions, initializeSmoothingParams, precomputeProbs, precomputeProbs, precomputeProbs, pruneHistoriesAndTransitions, pruneHistoriesAndTransitionsOld, readSmoothingParams, readSmoothingParams, savePrecomputeData, share, storePrecomputedProbs, writeSmoothingParams, writeSmoothingParams
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

numOtherModels

protected int numOtherModels

otherModels

protected Model[] otherModels
Constructor Detail

JointModel

public JointModel(ProbabilityStructure structure)
Method Detail

canonicalize

public void canonicalize(FlexibleMap map)
Canonicalizes the objects of this Model, as well as all internal Model instances.

Overrides:
canonicalize in class Model
Parameters:
map - the map to be used for canonicalization

setCanonicalEvents

public void setCanonicalEvents(FlexibleMap canonical)
Sets the Model.canonicalEvents member of this object to be the specified FlexibleMap, as well as setting the same member of all internal Model objects.

Overrides:
setCanonicalEvents in class Model
Parameters:
canonical - the reflexive map of canonical Event objects
See Also:
ModelCollection.internalReadObject(java.io.ObjectInputStream)

deriveCounts

public void deriveCounts(CountsTable trainerCounts,
                         Filter filter,
                         double threshold,
                         FlexibleMap canonical)
Derives counts for this Model, as well as for all internal Model instances.

Overrides:
deriveCounts in class Model
Parameters:
trainerCounts - a map from TrainerEvent objects to their counts (as doubles) from which to derive counts
filter - used to filter out TrainerEvent objects whose counts should not be derived for this model
threshold - a (currently unused) count cut-off threshold
canonical - a reflexive map used to canonicalize objects created when deriving counts

deriveCounts

public void deriveCounts(CountsTable trainerCounts,
                         Filter filter,
                         double threshold,
                         FlexibleMap canonical,
                         boolean deriveOtherModelCounts)
Derives counts for this Model and optionally for all internal Model instances.

Overrides:
deriveCounts in class Model
Parameters:
trainerCounts - a map from TrainerEvent objects to their counts (as doubles) from which to derive counts
filter - used to filter out TrainerEvent objects whose counts should not be derived for this model
threshold - a (currently unused) count cut-off threshold
canonical - a reflexive map used to canonicalize objects created when deriving counts
deriveOtherModelCounts - indicates whether to derive counts for the internal Model instances contained in this joint model

precomputeProbs

public void precomputeProbs()
Precomputes probabilities and smoothing values for this Model and for all internal Model instances.

Overrides:
precomputeProbs in class Model
See Also:
precomputeProbs(MapToPrimitive.Entry, ...), storePrecomputedProbs

estimateLogProb

public double estimateLogProb(int id,
                              TrainerEvent event)
Estimates a conditional probability in log-space from the specified maximal-context trainer event. The estimate will use sub-contexts of the specified trainer event. The estimate returned will be the sum of the log probabilities returned by this and all contained Model instances.

Overrides:
estimateLogProb in class Model
Parameters:
id - the id of the decoding client calling this method
event - the maximal-context event from which to produce a conditional probability estimate of some element(s) of that context
Returns:
a log probability estimate of some joint event

estimateNonJointLogProb

public double estimateNonJointLogProb(int id,
                                      TrainerEvent event)
Estimates the log-probability of the specified event under this Model without adding the log-probabilities of the internal Model objects.

Parameters:
id - the id of the caller requesting the log-probability
event - the event containing the history context and future from which to estimate a conditional log-probability
Returns:
the log-probability of the specified event under this Model without adding the log-probabilities of the internal Model objects
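
The relationship between estimateLogProb and estimateNonJointLogProb described above can be sketched with a small stand-alone composite. This is only an assumed shape, not the library's implementation: LogEstimator, JointLogSketch and the use of a String as the event type are hypothetical stand-ins for Model, JointModel and TrainerEvent, and the id parameter is omitted.

```java
// Hypothetical sketch; these types only mimic the behavior described in the text.
interface LogEstimator {
  double logProb(String event);
}

final class JointLogSketch implements LogEstimator {
  private final LogEstimator own;       // stands in for this Model's own estimate
  private final LogEstimator[] others;  // stands in for the internal Model objects

  JointLogSketch(LogEstimator own, LogEstimator... others) {
    this.own = own;
    this.others = others.clone();
  }

  /** Analogue of estimateNonJointLogProb: this model's term only. */
  double nonJointLogProb(String event) {
    return own.logProb(event);
  }

  /** Analogue of estimateLogProb: own term plus every internal model's term. */
  @Override
  public double logProb(String event) {
    double sum = nonJointLogProb(event);
    for (LogEstimator other : others) {
      sum += other.logProb(event);
    }
    return sum;
  }
}
```

Summing in log-space is equivalent to multiplying the corresponding probabilities, so exponentiating the joint value yields the product of the individual estimates.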

estimateProb

public double estimateProb(int id,
                           TrainerEvent event)
Estimates a conditional probability from the specified maximal-context trainer event. The estimate will use sub-contexts of the specified trainer event. The estimate returned will be the product of the probabilities returned by this and all contained Model instances.

Overrides:
estimateProb in class Model
Parameters:
id - the id of the decoding client calling this method
event - the maximal-context event from which to derive a conditional probability estimate
Returns:
a conditional probability estimate of some joint event given some history, where both the joint event and history context are derived from the specified maximal-context event

estimateNonJointProb

public double estimateNonJointProb(int id,
                                   TrainerEvent event)
Estimates the probability of the specified event under this Model without multiplying the probabilities of the internal Model objects.

Parameters:
id - the id of the caller requesting the probability
event - the event containing the history context and future from which to estimate a conditional probability
Returns:
the probability of the specified event under this Model without multiplying the probabilities of the internal Model objects

numModels

public int numModels()
Returns the number of models used to produce a joint probability estimate, including this Model instance.

Overrides:
numModels in class Model
Returns:
the number of models used to produce a joint probability estimate, including this Model instance

getModel

public Model getModel(int idx)
Returns this or any of the internal Model instances used to produce joint probability estimates.

Overrides:
getModel in class Model
Parameters:
idx - the index of the Model to return
Returns:
the Model at the specified index; if the specified index is 0, then this Model instance is returned; otherwise, one of the internal Model instances is returned
Throws:
ArrayIndexOutOfBoundsException - if the specified index is greater than numModels() - 1
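
The indexing convention shared by numModels() and getModel(int) can be illustrated with the following minimal sketch. IndexedModelSketch is a hypothetical class written for this example; it assumes the internal models are held in an array analogous to the otherModels field, and it says nothing about how the real class counts nested joint models.

```java
// Hypothetical sketch of the indexing convention; not the library's code.
class IndexedModelSketch {
  private final String name;
  private final IndexedModelSketch[] otherModels;

  IndexedModelSketch(String name, IndexedModelSketch... otherModels) {
    this.name = name;
    this.otherModels = otherModels.clone();
  }

  String name() {
    return name;
  }

  /** Counts this model plus the internal ones. */
  int numModels() {
    return otherModels.length + 1;
  }

  /**
   * Index 0 is this model; indices 1..numModels()-1 are the internal models.
   * An out-of-range index surfaces as the array's own
   * ArrayIndexOutOfBoundsException.
   */
  IndexedModelSketch getModel(int idx) {
    return idx == 0 ? this : otherModels[idx - 1];
  }

  public static void main(String[] args) {
    IndexedModelSketch joint =
        new IndexedModelSketch("nonterminal", new IndexedModelSketch("tag"));
    // Visit every model that contributes to the joint estimate.
    for (int i = 0; i < joint.numModels(); i++) {
      System.out.println(i + ": " + joint.getModel(i).name());
    }
  }
}
```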

getProbStructure

public ProbabilityStructure getProbStructure()
Returns the primary probability structure of this joint model, which is that used by this Model instance (as opposed to one of the internal Model instances).

Overrides:
getProbStructure in class Model
Returns:
the primary probability structure of this joint model, which is that used by this Model instance

getProbStructure

public ProbabilityStructure getProbStructure(int idx)
Returns a probability structure of this joint model, which is either that used by this Model instance, or a structure used by one of the internal Model instances.

Parameters:
idx - the index of the probability structure to return
Returns:
if the specified index is 0, then the probability structure of this Model instance is returned; otherwise, the probability structure of one of the internal Model instances is returned
Throws:
ArrayIndexOutOfBoundsException - if the specified index is greater than numModels() - 1

getCacheStats

public String getCacheStats()
Returns a string representing the cache statistics for this Model and all of its internal Model objects.

Overrides:
getCacheStats in class Model
Returns:
a string representing the cache statistics for this Model and all of its internal Model objects


Author: Dan Bikel.