Parsing Engine
java.lang.Object
  danbikel.parser.AnalyzeDisns
public class AnalyzeDisns
An analysis and debugging class to analyze the probability distributions of all Models in a ModelCollection. It is important that Model.precomputeProbs was true, Model.deleteCountsWhenPrecomputingProbs was false and Model.createHistBackOffMap was true when the ModelCollection to be analyzed was created.
See Also:
Model.deleteCountsWhenPrecomputingProbs, Model.createHistBackOffMap
Field Summary | |
---|---|
static int | toPrevIdx: The BiCountsTable index for retrieving the Jensen-Shannon divergence from a history context (distribution) at a particular back-off level to its corresponding previous back-off level (greater context) history context. |
static int | toZeroIdx: The BiCountsTable index for retrieving the Jensen-Shannon divergence from a history context (distribution) at a particular back-off level to its corresponding zeroeth back-off level (maximal context) history context. |
Method Summary | |
---|---|
static void | analyzeModWordDisn(ModelCollection mc, String eventStr): A debugging method for analyzing a particular event in the modifier word model. |
static void | computeEntropyAndJSStats(Model model, CountsTable[] entropy, BiCountsTable[] js): A method invoked by Model when Settings.modelDoPruning is true: entropy values and JS divergence values are used in the parameter-pruning method. |
static CountsTable[] | computeModelEntropies(Model model): A method to compute a model's entropy statistics for all estimated distributions. |
static CountsTable[] | computeModelEntropies(Model model, CountsTable[] entropy): A method to compute a model's entropy statistics for all estimated distributions. |
static double | entropy(double[] disn): Returns the entropy of the specified distribution. |
static double | entropy(double[] disn, int endIdx): Returns the entropy of the specified distribution. |
static double | entropyFromLogProbs(double[] disn): Returns the entropy of the specified distribution of log-probabilities. |
static double | entropyFromLogProbs(double[] disn, int endIdx): Returns the entropy of the specified distribution of log-probabilities. |
static Set | getFutures(Set futures, Model model, int level): Returns all possible futures for the specified model at the specified back-off level, using the specified set for storage (the specified set is first cleared before futures are stored). |
static double[] | getLogProbDisn(Model model, int level, Event hist, Set futures, double[] disn, Transition tmpTrans): Returns the smoothed log-probability distribution for the specified history at the specified back-off level in the specified model. |
static double | klDistFromLogProbs(double[] disnP, double[] disnQ): Returns D(disnP || disnQ), where D is the Kullback-Leibler divergence (relative entropy), and where each of the specified arguments is a distribution of log-probabilities. |
static void | main(String[] args): Analyzes and saves information about every distribution in every Model contained in a ModelCollection. |
static CountsTable[] | newEntropyCountsTables(Model model): Returns an array of CountsTable instances in which to store the entropy of every history at every back-off level. |
static BiCountsTable[] | newJSCountsTables(Model model): Returns an array of BiCountsTable instances in which to store the JS divergence of every history at every back-off level, both to the previous back-off level and to the zeroeth back-off level. |
static void | outputHistories(Model model): A debugging method that outputs all histories of the specified model to System.out. |
static void | writeKLDistStats(Model model): Creates two files named after the probability structure of the specified model, and writes Kullback-Leibler divergences (relative entropies) between the zeroeth-level back-off distributions and the other back-off distributions to one file and writes Jensen-Shannon divergences between zeroeth-level back-off distributions and the other back-off distributions to the other file. |
static void | writeModelStats(Model model): Creates a file named after the probability structure class of the specified model and writes information about every distribution contained in that model. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final int toZeroIdx
The BiCountsTable index for retrieving the Jensen-Shannon divergence from a history context (distribution) at a particular back-off level to its corresponding zeroeth back-off level (maximal context) history context.
public static final int toPrevIdx
The BiCountsTable index for retrieving the Jensen-Shannon divergence from a history context (distribution) at a particular back-off level to its corresponding previous back-off level (greater context) history context.
Method Detail |
---|
public static double entropy(double[] disn)
Returns the entropy of the specified distribution.
Parameters:
disn - an array containing the probabilities of a distribution
public static double entropy(double[] disn, int endIdx)
Returns the entropy of the specified distribution.
Parameters:
disn - an array containing the probabilities of a distribution
endIdx - the last index plus one in the specified array of probabilities
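The role of endIdx can be illustrated with a minimal sketch of a Shannon entropy computation over the first endIdx entries of an array. This is not the library's source: the class name EntropySketch is hypothetical, and the use of natural logarithms (entropy in nats) is an assumption, since this page does not state the logarithm base.

```java
/** Hypothetical illustration; not part of danbikel.parser. */
final class EntropySketch {
  private EntropySketch() {}

  /**
   * Entropy of disn[0] through disn[endIdx - 1], in nats (assumed base).
   * Entries with zero probability contribute nothing to the sum.
   */
  static double entropy(double[] disn, int endIdx) {
    double h = 0.0;
    for (int i = 0; i < endIdx; i++) {
      double p = disn[i];
      if (p > 0.0) {
        h -= p * Math.log(p);
      }
    }
    return h;
  }

  /** Convenience overload that uses the whole array. */
  static double entropy(double[] disn) {
    return entropy(disn, disn.length);
  }

  public static void main(String[] args) {
    // A reusable buffer in which only the first three slots hold a distribution.
    double[] buffer = {0.5, 0.25, 0.25, 0.0, 0.0};
    System.out.println(entropy(buffer, 3));
    System.out.println(entropy(new double[] {0.25, 0.25, 0.25, 0.25}));
  }
}
```

Passing an explicit endIdx lets a caller reuse one large array for distributions of different sizes, which is presumably why the two-argument overload exists.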
public static double entropyFromLogProbs(double[] disn)
Returns the entropy of the specified distribution of log-probabilities.
Parameters:
disn - an array containing the log-probabilities of a distribution
public static double entropyFromLogProbs(double[] disn, int endIdx)
Returns the entropy of the specified distribution of log-probabilities.
Parameters:
disn - an array containing the log-probabilities of a distribution
endIdx - the last index plus one in the specified array of log-probabilities
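When the input holds log-probabilities, the entropy can be computed as the negated sum of exp(lp) * lp. The sketch below shows one way to do this under stated assumptions: the class name LogEntropySketch is hypothetical, the log-probabilities are assumed to be natural logarithms, and zero probabilities are assumed to be encoded as Double.NEGATIVE_INFINITY. None of these details is confirmed by this page.

```java
/** Hypothetical illustration; not part of danbikel.parser. */
final class LogEntropySketch {
  private LogEntropySketch() {}

  /**
   * Entropy, in nats, of the distribution whose natural-log probabilities
   * are stored at logDisn[0] through logDisn[endIdx - 1].
   */
  static double entropyFromLogProbs(double[] logDisn, int endIdx) {
    double h = 0.0;
    for (int i = 0; i < endIdx; i++) {
      double lp = logDisn[i];
      // exp(-inf) * -inf would be NaN, so zero-probability entries are skipped.
      if (lp != Double.NEGATIVE_INFINITY) {
        h -= Math.exp(lp) * lp;
      }
    }
    return h;
  }

  /** Convenience overload that uses the whole array. */
  static double entropyFromLogProbs(double[] logDisn) {
    return entropyFromLogProbs(logDisn, logDisn.length);
  }

  public static void main(String[] args) {
    double[] logDisn = {Math.log(0.5), Math.log(0.5), Double.NEGATIVE_INFINITY};
    System.out.println(entropyFromLogProbs(logDisn));
  }
}
```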
public static double klDistFromLogProbs(double[] disnP, double[] disnQ)
Returns D(disnP || disnQ), where D is the Kullback-Leibler divergence (relative entropy), and where each of the specified arguments is a distribution of log-probabilities.
Parameters:
disnP - a distribution of log-probabilities
disnQ - a distribution of log-probabilities
Returns:
D(disnP || disnQ)
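For reference, the Kullback-Leibler divergence D(P || Q) is the sum over outcomes of P(x) * (log P(x) - log Q(x)). A minimal sketch of that formula over arrays of log-probabilities follows. The class name KlSketch, the natural-log assumption and the length check are illustrative choices, not details taken from the library.

```java
/** Hypothetical illustration; not part of danbikel.parser. */
final class KlSketch {
  private KlSketch() {}

  /**
   * Returns D(P || Q) in nats, where logP and logQ hold natural-log
   * probabilities over the same outcomes, in the same order.
   */
  static double klFromLogProbs(double[] logP, double[] logQ) {
    if (logP.length != logQ.length) {
      throw new IllegalArgumentException("distributions must have equal length");
    }
    double kl = 0.0;
    for (int i = 0; i < logP.length; i++) {
      double lp = logP[i];
      // Outcomes with P(x) = 0 contribute nothing, by the usual convention.
      if (lp != Double.NEGATIVE_INFINITY) {
        // If Q(x) = 0 while P(x) > 0, this term (and the result) is +infinity.
        kl += Math.exp(lp) * (lp - logQ[i]);
      }
    }
    return kl;
  }

  public static void main(String[] args) {
    double[] logP = {Math.log(0.5), Math.log(0.5)};
    double[] logQ = {Math.log(0.25), Math.log(0.75)};
    System.out.println(klFromLogProbs(logP, logQ));
    System.out.println(klFromLogProbs(logQ, logP)); // generally differs: KL is asymmetric
  }
}
```

Because the divergence is asymmetric and becomes infinite when Q assigns zero probability to an outcome that P supports, the order of the two arguments matters.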
public static void analyzeModWordDisn(ModelCollection mc, String eventStr) throws IOException
A debugging method for analyzing a particular event in the modifier word model.
Parameters:
mc - the model collection from which to access the modifier word model
eventStr - the string to be converted to an S-expression that represents the TrainerEvent to be analyzed
Throws:
IOException
public static void outputHistories(Model model)
A debugging method that outputs all histories of the specified model to System.out.
Parameters:
model - the model whose histories are to be output
public static Set getFutures(Set futures, Model model, int level)
Returns all possible futures for the specified model at the specified back-off level, using the specified set for storage (the specified set is first cleared before futures are stored).
Parameters:
futures - the set in which to store futures
model - the model from which to collect possible futures
level - the back-off level at which to collect possible futures (should normally be irrelevant)
Returns:
the specified Set, having been destructively modified to contain possible futures for the specified model at the specified back-off level
public static double[] getLogProbDisn(Model model, int level, Event hist, Set futures, double[] disn, Transition tmpTrans)
Returns the smoothed log-probability distribution for the specified history at the specified back-off level in the specified model.
Parameters:
model - the model from which to get a distribution of smoothed log-probability estimates
level - the back-off level of the specified history
hist - the history for which a distribution is to be obtained
futures - the set of possible futures for the specified history
disn - the array in which to store all smoothed log-probability estimates
tmpTrans - a temporary Transition object, to be used during the estimation of smoothed log-probabilities
Returns:
the specified array of double, having been modified to contain a distribution of log-probabilities at indices 0 through futures.size() - 1
Throws:
ArrayIndexOutOfBoundsException - if the specified array of double (the disn parameter) is of length less than futures.size()
public static CountsTable[] computeModelEntropies(Model model)
A method to compute a model's entropy statistics for all estimated distributions.
Parameters:
model - the model whose entropy statistics are to be computed
public static CountsTable[] computeModelEntropies(Model model, CountsTable[] entropy)
A method to compute a model's entropy statistics for all estimated distributions.
Parameters:
model - the model whose entropy statistics are to be computed
entropy - an array of length model.getProbStructure().numLevels() in which to store entropy statistics for every history of every back-off level of the specified model
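public static void writeModelStats(Model model) throws IOException
Creates a file named after the probability structure class of the specified model and writes information about every distribution contained in that model.
Parameters:
model - the model whose distributions are to be analyzed
Throws:
IOException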
public static CountsTable[] newEntropyCountsTables(Model model)
Returns an array of CountsTable instances in which to store the entropy of every history at every back-off level. The array will necessarily be of length model.getProbStructure().numLevels().
Parameters:
model - the model for which entropies are to be computed
Returns:
an array of CountsTable instances in which to store the entropy of every history at every back-off level
public static BiCountsTable[] newJSCountsTables(Model model)
Returns an array of BiCountsTable instances in which to store the JS divergence of every history at every back-off level, both to the previous back-off level and to the zeroeth back-off level. The array will necessarily be of length model.getProbStructure().numLevels().
Parameters:
model - the model for which JS divergences are to be computed
Returns:
an array of BiCountsTable instances in which to store the JS divergence of every history at every back-off level
See Also:
toPrevIdx, toZeroIdx
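The two-slot layout described for newJSCountsTables (one JS divergence to the previous back-off level, one to the zeroeth) can be sketched as follows. This uses the standard definition JS(P, Q) = 0.5 * D(P || M) + 0.5 * D(Q || M) with M = (P + Q) / 2, which this page does not spell out. The class JsDivergenceSketch, the constants TO_PREV and TO_ZERO (stand-ins for toPrevIdx and toZeroIdx, whose actual values are not given here), and the plain double[][] storage in place of BiCountsTable are all assumptions made for illustration.

```java
/** Hypothetical illustration; not part of danbikel.parser. */
final class JsDivergenceSketch {
  private JsDivergenceSketch() {}

  // Stand-ins for toPrevIdx and toZeroIdx; the real values are not documented here.
  static final int TO_PREV = 0;
  static final int TO_ZERO = 1;

  /** Jensen-Shannon divergence, in nats, between two natural-log distributions. */
  static double jsFromLogProbs(double[] logP, double[] logQ) {
    if (logP.length != logQ.length) {
      throw new IllegalArgumentException("distributions must have equal length");
    }
    double js = 0.0;
    for (int i = 0; i < logP.length; i++) {
      double p = Math.exp(logP[i]);
      double q = Math.exp(logQ[i]);
      double m = 0.5 * (p + q); // mixture distribution M
      if (p > 0.0) {
        js += 0.5 * p * Math.log(p / m);
      }
      if (q > 0.0) {
        js += 0.5 * q * Math.log(q / m);
      }
    }
    return js;
  }

  /**
   * For one chain of histories (index 0 = zeroeth level, maximal context),
   * fills js[level][TO_PREV] and js[level][TO_ZERO] for every level above 0.
   * Row 0 is left at zero because the zeroeth level has no previous level.
   */
  static double[][] jsAlongBackOffChain(double[][] logDisnByLevel) {
    double[][] js = new double[logDisnByLevel.length][2];
    for (int level = 1; level < logDisnByLevel.length; level++) {
      js[level][TO_PREV] = jsFromLogProbs(logDisnByLevel[level], logDisnByLevel[level - 1]);
      js[level][TO_ZERO] = jsFromLogProbs(logDisnByLevel[level], logDisnByLevel[0]);
    }
    return js;
  }

  public static void main(String[] args) {
    double[][] chain = {
      {0.0, Double.NEGATIVE_INFINITY},   // level 0: all mass on the first future
      {Math.log(0.5), Math.log(0.5)},    // level 1: uniform
      {Double.NEGATIVE_INFINITY, 0.0}    // level 2: all mass on the second future
    };
    double[][] js = jsAlongBackOffChain(chain);
    System.out.println(js[2][TO_PREV] + " " + js[2][TO_ZERO]);
  }
}
```

Unlike the KL divergence, the JS divergence is symmetric and always finite, which makes it convenient for comparing a back-off distribution with a higher-context one that may assign zero probability to some futures.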
public static void computeEntropyAndJSStats(Model model, CountsTable[] entropy, BiCountsTable[] js)
A method invoked by Model when Settings.modelDoPruning is true: entropy values and JS divergence values are used in the parameter-pruning method.
Parameters:
model - the model whose entropies and JS divergence statistics are to be computed
entropy - the array of counts tables in which to store the entropies of the specified model's distributions
js - the array of BiCountsTable objects in which to store the JS divergence statistics of the specified model's distributions
public static void writeKLDistStats(Model model) throws IOException
Creates two files named after the probability structure of the specified model. One file receives the Kullback-Leibler divergences (relative entropies) between the zeroeth-level back-off distributions and the other back-off distributions; the other file receives the Jensen-Shannon divergences between the zeroeth-level back-off distributions and the other back-off distributions.
Specifically, the KL divergence file will contain one line for each zeroeth-level (maximal-context) history with the following elements, separated by tab characters:
For example, if a model has three back-off levels (a zeroeth, maximal-context level and two more levels, each with less context), then each line will contain 11 elements separated by tab characters: the first element is the S-expression of the zeroeth back-off level history, followed by five elements for each of the other two back-off levels.
The JS divergence file will contain one line for each non-zeroeth-level history with the following four elements, separated by tab characters:
Parameters:
model - the model whose distributions are to be analyzed
Throws:
IOException
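The element count in the writeKLDistStats example follows from simple arithmetic: one element for the zeroeth-level history plus five for each remaining back-off level, so three levels give 1 + 5 * 2 = 11. The sketch below generalizes that count and checks a tab-separated line against it. The class KlLineLayoutSketch and its method names are hypothetical, and the generalization to other numbers of levels is an inference from the three-level example, not something this page states.

```java
/** Hypothetical illustration; not part of danbikel.parser. */
final class KlLineLayoutSketch {
  private KlLineLayoutSketch() {}

  /** Elements written per non-zeroeth back-off level, per the three-level example. */
  static final int ELEMENTS_PER_LOWER_LEVEL = 5;

  /** One element for the zeroeth-level history, plus five per remaining level. */
  static int expectedElementsPerLine(int numLevels) {
    if (numLevels < 1) {
      throw new IllegalArgumentException("numLevels must be at least 1");
    }
    return 1 + ELEMENTS_PER_LOWER_LEVEL * (numLevels - 1);
  }

  /** Checks that a tab-separated line has the expected number of elements. */
  static boolean hasExpectedElementCount(String line, int numLevels) {
    // A limit of -1 keeps trailing empty strings, so empty final fields still count.
    return line.split("\t", -1).length == expectedElementsPerLine(numLevels);
  }

  public static void main(String[] args) {
    System.out.println(expectedElementsPerLine(3)); // prints 11
  }
}
```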
public static void main(String[] args)
Analyzes and saves information about every distribution in every Model contained in a ModelCollection. It is important that Model.precomputeProbs was true, Model.deleteCountsWhenPrecomputingProbs was false and Model.createHistBackOffMap was true when the ModelCollection to be analyzed was created.
usage: <derived data file>
where <derived data file> was produced by Trainer.
Parameters:
args - an array containing at least one element that is the name of a model collection (derived data file) as produced by a Trainer instance
See Also:
Trainer, Trainer.writeModelCollection(ObjectOutputStream,String,String), Trainer.loadModelCollection(String)