Parsing Engine

Uses of Class
danbikel.lisp.SexpList

Packages that use SexpList
danbikel.lisp Provides classes to create, read and manipulate symbolic expressions (S-expressions), including interned symbols. 
danbikel.parser Provides the core framework of this extensible statistical parsing engine. 
danbikel.parser.arabic Provides language-specific classes necessary to parse Arabic. 
danbikel.parser.chinese Provides language-specific classes necessary to parse Chinese. 
danbikel.parser.english Provides language-specific classes necessary to parse English. 
danbikel.parser.lang Provides default abstract base classes for the required interfaces of a language package. 
danbikel.parser.util Utility classes for displaying and manipulating parse trees. 
 

Uses of SexpList in danbikel.lisp
 

Subclasses of SexpList in danbikel.lisp
static class SexpList.HashCache
          A subclass of SexpList where a precomputed, cached hash value is stored with every instance.
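
The idea behind SexpList.HashCache, storing a precomputed hash value with every instance, can be sketched with a stand-in class. CachedHashList below is hypothetical and is not part of danbikel.lisp; it assumes the elements are fixed at construction time, so the cached value cannot go stale.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in, not the danbikel.lisp implementation.
final class CachedHashList {
  private final List<String> elements;
  private final int cachedHash; // computed once, in the constructor

  CachedHashList(List<String> initialElements) {
    // Defensive copy so later changes to the argument cannot invalidate the hash.
    this.elements = Collections.unmodifiableList(new ArrayList<>(initialElements));
    this.cachedHash = this.elements.hashCode();
  }

  List<String> elements() { return elements; }

  // Constant time: no walk over the elements on each lookup.
  @Override public int hashCode() { return cachedHash; }

  @Override public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof CachedHashList)) return false;
    CachedHashList other = (CachedHashList) o;
    // Comparing cached hashes first rejects most unequal lists cheaply.
    return cachedHash == other.cachedHash && elements.equals(other.elements);
  }
}
```

A cached hash pays off when the same list is used as a hash-table key many times, at the cost of one extra int per instance.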
 

Fields in danbikel.lisp declared as SexpList
static SexpList SexpList.emptyList
          An immutable object to represent the empty list.
 

Methods in danbikel.lisp that return SexpList
 SexpList SexpList.add(int index, Sexp sexp)
          Adds sexp at position index, shifting all elements to the right by one position to make room (an O(n) operation).
 SexpList SexpList.add(Sexp sexp)
          Appends sexp to the end of this list.
static SexpList SexpList.getCanonical(SexpList list)
          A simple canonicalization method that returns the unique object representing the empty list if the specified list contains no elements.
 SexpList Sexp.list()
          Returns this object cast to a SexpList.
 SexpList SexpList.listAt(int index)
          Returns the list at the specified index.
 SexpList SexpList.reverse()
          Performs an in-place reversal of the elements in this list.
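
The canonicalization described for SexpList.getCanonical(SexpList) can be illustrated with plain java.util lists. In this sketch EMPTY and getCanonical are hypothetical stand-ins for SexpList.emptyList and the real method, and returning a non-empty list unchanged is an assumption, since the description above only covers the empty case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in, not the danbikel.lisp implementation.
final class CanonicalEmptySketch {
  // Single shared, immutable object standing in for SexpList.emptyList.
  static final List<String> EMPTY = Collections.emptyList();

  // Maps every empty list to the one shared object.
  // Assumption: non-empty lists are returned as is.
  static List<String> getCanonical(List<String> list) {
    return list.isEmpty() ? EMPTY : list;
  }

  public static void main(String[] args) {
    List<String> noMods = new ArrayList<>();
    List<String> someMods = new ArrayList<>(Arrays.asList("NP", "PP"));
    System.out.println(getCanonical(noMods) == EMPTY);      // prints true
    System.out.println(getCanonical(someMods) == someMods); // prints true
  }
}
```

After canonicalization, empty lists can be compared by identity and do not each occupy their own object.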
 

Methods in danbikel.lisp with parameters of type SexpList
 boolean SexpList.addAll(int index, SexpList elementsToAdd)
          Adds all the elements in elementsToAdd to this list at the specified index.
 boolean SexpList.addAll(SexpList elementsToAdd)
          Appends all the elements in elementsToAdd to the end of this list.
static SexpList SexpList.getCanonical(SexpList list)
          A simple canonicalization method that returns the unique object representing the empty list if the specified list contains no elements.
 

Constructors in danbikel.lisp with parameters of type SexpList
SexpList(SexpList initialElements)
          Constructs a SexpList whose initial elements are those of initialElements.
 

Uses of SexpList in danbikel.parser
 

Subclasses of SexpList in danbikel.parser
 class SubcatList
          Implements subcats where requirements need to be met in the order in which they are added to this subcat (the strictest form of a subcat).
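
One way to picture a subcat whose requirements must be met in the order they were added is a queue in which only the front requirement can be satisfied. OrderedSubcat below is a hypothetical sketch under that reading; it is not the SubcatList implementation, and its method names are illustrative only.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in, not danbikel.parser.SubcatList.
final class OrderedSubcat {
  private final Deque<String> requirements = new ArrayDeque<>();

  // Requirements are queued in the order they are added.
  void add(String requirement) { requirements.addLast(requirement); }

  // Satisfies a requirement only if it is the next one in insertion order.
  boolean remove(String requirement) {
    if (!requirements.isEmpty() && requirements.peekFirst().equals(requirement)) {
      requirements.removeFirst();
      return true;
    }
    return false;
  }

  // True once every requirement has been met.
  boolean empty() { return requirements.isEmpty(); }
}
```

With requirements added as NP-A then SBAR-A (labels chosen for illustration), an attempt to remove SBAR-A first is refused, which is the sense in which this is the strictest form of a subcat.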
 

Fields in danbikel.parser declared as SexpList
 SexpList Nonterminal.augmentations
          A list of symbols representing any augmentations and delimiters.
protected  SexpList ProbabilityStructure.futureList
          Deprecated. Ever since the Event and MutableEvent interfaces were re-worked to include methods to add and iterate over event components and the SexpEvent class was retrofitted to these new specifications, this object became superfluous, as SexpEvent objects can now be efficiently constructed directly, by using the SexpEvent.add(Object) method.
protected  SexpList ProbabilityStructure.historyList
          Deprecated. Ever since the Event and MutableEvent interfaces were re-worked to include methods to add and iterate over event components and the SexpEvent class was retrofitted to these new specifications, this object became superfluous, as SexpEvent objects can now be efficiently constructed directly, by using the SexpEvent.add(Object) method.
protected  SexpList CKYItem.leftPrevMods
          The previous modifiers generated on the left of the head child.
protected  SexpList Decoder.originalSentence
          The original sentence, before preprocessing.
protected  SexpList Decoder.originalTags
          The original tag list, before preprocessing.
protected  SexpList Decoder.originalWords
          The original sentence, but with words removed to match preprocessing.
protected  SexpList Decoder.parentHeadSideLookupList
          A reusable object used for constructing parent-head-side triples when employing the simpler of two methods for determining whether a particular modifier is possible in the context of a particular parent-head-side combination.
protected  SexpList Decoder.partiallyLexedModLookupList
          A reusable object used for constructing a partially-lexicalized modifier nonterminal when employing the simpler of two methods for determining whether a particular modifier is possible in the context of a particular parent-head-side combination.
protected  SexpList Decoder.prevModLookupList
          A reusable object for constructing previous modifier lists for chart items.
protected  SexpList CKYItem.rightPrevMods
          The previous modifiers generated on the right of the head child.
protected  SexpList Parser.sent
          The current sentence being processed.
protected  SexpList Decoder.sentence
          The current sentence.
protected  SexpList Decoder.startList
          A list containing only Training.startSym(), which is the type of list that should be used when there are zero real previous modifiers (to start the Markov modifier process).
 

Methods in danbikel.parser that return SexpList
 SexpList CachingDecoderServer.convertUnknownWords(SexpList sentence)
          Replaces all unknown words in the specified sentence with three-element lists, where the first element is the word itself, the second element is a word-feature vector, as determined by the implementation of WordFeatures.features(Symbol,boolean), and the third element is Constants.trueSym if this word was never observed during training or Constants.falseSym if it was observed at least once during training.
 SexpList DecoderServer.convertUnknownWords(SexpList sentence)
          Replaces all unknown words in the specified sentence with three-element lists, where the first element is the word itself, the second element is a word-feature vector, as determined by the implementation of WordFeatures.features(Symbol,boolean), and the third element is Constants.trueSym if this word was never observed during training or Constants.falseSym if it was observed at least once during training.
 SexpList DecoderServerRemote.convertUnknownWords(SexpList sentence)
          Replaces all unknown words in the specified sentence with three-element lists.
static SexpList Trainer.getCanonicalList(Map map, SexpList list)
          Returns a canonical version of the specified list from the specified reflexive map.
protected  SexpList Decoder.getPrevMods(CKYItem item, SLNode modChildren)
          Creates a new previous-modifier list given the specified current list and the last modifier on a particular side.
protected  SexpList Parser.getTagLists(SexpList sent)
          Returns a new list of the tag lists for each word when the specified sentence is in the format described in the comments for the Parser.sentContainsWordsAndTags(SexpList) method.
protected  SexpList Parser.getTagListsFromTree(Sexp tree)
          Collects a list of symbols that are the part-of-speech tags (preterminals) of the specified tree.
protected  SexpList Decoder.getTagSet(SexpList tags, int wordIdx, Symbol word, boolean wordIsUnknown, Symbol origWord, HashSet tmpSet)
          Gets the set of possible part-of-speech tags for a word in the sentence to be parsed.
protected  SexpList Parser.getWords(SexpList sent)
          Returns a new list containing only the words of the sentence to be parsed when the sentence is in the format described in the comment for the Parser.sentContainsWordsAndTags(SexpList) method.
protected  SexpList Parser.getWordsFromTree(Sexp tree)
          Returns a new list containing the word symbols from the specified tree.
protected  SexpList Parser.getWordsFromTree(SexpList wordList, Sexp tree)
          Gets the words of the sentence to be parsed from the specified parse tree.
 SexpList CKYItem.leftPrevMods()
          Returns a list of previously-generated unlexicalized modifiers on the left side of the head child in this item's set of derivations.
static SexpList Trainer.newStartList()
          Creates and returns a new start list.
protected  SexpList EMParser.preProcess(Sexp tree)
          Instead of simply invoking the Training.preProcess(Sexp) method, this method selectively invokes only some of the preprocessing methods of Training, so as to leave the rest of the transformations unconstrained.
 SexpList Training.preProcessTest(SexpList sentence, SexpList originalWords, SexpList tags)
          Preprocesses the specified test sentence and its coordinated list of tags.
 SexpList ModifierEvent.previousMods()
          Returns a list of modifiers that have already been generated.
 SexpList CKYItem.prevMods(boolean side)
          Returns the previous modifiers on the specified side of this item's head child.
 SexpList CKYItem.rightPrevMods()
          Returns a list of previously-generated unlexicalized modifiers on the right side of the head child in this item's set of derivations.
protected  SexpList Decoder.setUnion(SexpList l1, SexpList l2, Set tmpSet)
          Returns a new list that is the union of the two specified lists.
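
The three-element replacement described for convertUnknownWords(SexpList) above can be sketched as follows. Everything in this example is an assumption made for illustration: the count map, the threshold that decides whether a word is unknown, the toy features function (standing in for WordFeatures.features(Symbol,boolean)), and the strings "true" and "false" (standing in for Constants.trueSym and Constants.falseSym).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the DecoderServer implementation.
final class UnknownWordSketch {
  // Toy word-feature function: flags initial capitalization and digits.
  static String features(String word) {
    boolean cap = !word.isEmpty() && Character.isUpperCase(word.charAt(0));
    boolean digit = word.chars().anyMatch(Character::isDigit);
    return (cap ? "C" : "c") + (digit ? "D" : "d");
  }

  static List<Object> convertUnknownWords(List<String> sentence,
                                          Map<String, Integer> trainingCounts,
                                          int unknownThreshold) {
    List<Object> result = new ArrayList<>();
    for (String word : sentence) {
      int count = trainingCounts.getOrDefault(word, 0);
      if (count >= unknownThreshold) {
        result.add(word); // known word: left as is
      } else {
        // Unknown word: (word, feature vector, never-observed flag).
        result.add(Arrays.asList(word, features(word), count == 0 ? "true" : "false"));
      }
    }
    return result;
  }
}
```

The third element separates words that never occurred in training from words that occurred at least once but are still treated as unknown, which is the distinction the method description draws.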
 

Methods in danbikel.parser with parameters of type SexpList
 boolean BrokenSubcatBag.addAll(SexpList list)
          Adds each of the symbols of list to this subcat bag, effectively calling BrokenSubcatBag.add(Symbol) for each element of list.
 boolean Subcat.addAll(SexpList list)
          Adds the specified list of nonterminals (symbols) to the required arguments of this subcat frame.
 boolean SubcatBag.addAll(SexpList list)
          Adds each of the symbols of list to this subcat bag, effectively calling SubcatBag.add(Symbol) for each element of list.
 boolean SubcatList.addAll(SexpList list)
          Adds the requirements (Symbol objects) of list to this subcat list.
 SexpList CachingDecoderServer.convertUnknownWords(SexpList sentence)
          Replaces all unknown words in the specified sentence with three-element lists, where the first element is the word itself, the second element is a word-feature vector, as determined by the implementation of WordFeatures.features(Symbol,boolean), and the third element is Constants.trueSym if this word was never observed during training or Constants.falseSym if it was observed at least once during training.
 SexpList DecoderServer.convertUnknownWords(SexpList sentence)
          Replaces all unknown words in the specified sentence with three-element lists, where the first element is the word itself, the second element is a word-feature vector, as determined by the implementation of WordFeatures.features(Symbol,boolean), and the third element is Constants.trueSym if this word was never observed during training or Constants.falseSym if it was observed at least once during training.
 SexpList DecoderServerRemote.convertUnknownWords(SexpList sentence)
          Replaces all unknown words in the specified sentence with three-element lists.
 int HeadFinder.findHead(Sexp tree, Symbol lhs, SexpList rhs)
          Finds the head for the grammar production lhs → rhs.
 Subcat BrokenSubcatBagFactory.get(SexpList list)
          Returns a SubcatBag initialized with the requirements contained in the specified list.
 Subcat SubcatBagFactory.get(SexpList list)
          Returns a SubcatBag initialized with the requirements contained in the specified list.
 Subcat SubcatFactory.get(SexpList list)
          Returns a Subcat object created with its one-argument constructor, using the specified list.
 Subcat SubcatListFactory.get(SexpList list)
          Returns a SubcatList initialized with the requirements contained in the specified list.
static Subcat Subcats.get(SexpList list)
          Returns a Subcat object created with its one-argument constructor, using the specified list.
static SexpList Trainer.getCanonicalList(Map map, SexpList list)
          Returns a canonical version of the specified list from the specified reflexive map.
protected  SexpList Parser.getTagLists(SexpList sent)
          Returns a new list of the tag lists for each word when the specified sentence is in the format described in the comments for the Parser.sentContainsWordsAndTags(SexpList) method.
protected  SexpList Decoder.getTagSet(SexpList tags, int wordIdx, Symbol word, boolean wordIsUnknown, Symbol origWord, HashSet tmpSet)
          Gets the set of possible part-of-speech tags for a word in the sentence to be parsed.
protected  SexpList Parser.getWords(SexpList sent)
          Returns a new list containing only the words of the sentence to be parsed when the sentence is in the format described in the comment for the Parser.sentContainsWordsAndTags(SexpList) method.
protected  SexpList Parser.getWordsFromTree(SexpList wordList, Sexp tree)
          Gets the words of the sentence to be parsed from the specified parse tree.
protected  void Decoder.initialize(SexpList sentence)
          Initializes the chart for parsing the specified sentence.
protected  void Decoder.initialize(SexpList sentence, SexpList tags)
          Initializes the chart for parsing the specified sentence, using the specified coordinated list of part-of-speech tags when assigning parts of speech to unknown words.
static WordList WordListFactory.newList(SexpList list)
          Returns a new WordList object containing Word objects constructed from the elements of the specified list, using the Word.Word(Sexp) constructor.
protected  Sexp Decoder.parse(SexpList sentence)
          Parses the specified sentence.
 Sexp Parser.parse(SexpList sent)
          Parses the specified sentence, which can be in one of three formats.
protected  Sexp Decoder.parse(SexpList sentence, SexpList tags)
          Parses the specified sentence using the supplied list of part-of-speech tags.
protected  Sexp Decoder.parse(SexpList sentence, SexpList tags, ConstraintSet constraints)
          Parses the specified sentence using the supplied list of part-of-speech tags and the supplied set of parsing constraints.
protected  CountsTable EMDecoder.parseAndCollectEventCounts(SexpList sentence)
          Constrain-parses the specified sentence and computes expected top-level (maximal context) event counts.
 CountsTable EMParser.parseAndCollectEventCounts(SexpList sent)
          Collect expected counts for the specified partial parse tree/sentence.
protected  CountsTable EMDecoder.parseAndCollectEventCounts(SexpList sentence, SexpList tags)
          Constrain-parses the specified sentence and computes expected top-level (maximal context) event counts.
protected  CountsTable EMDecoder.parseAndCollectEventCounts(SexpList sentence, SexpList tags, ConstraintSet constraints)
          Constrain-parses the specified sentence and computes expected top-level (maximal context) event counts.
protected  void Decoder.preProcess(SexpList sentence, SexpList tags)
          Performs all preprocessing to the specified coordinated lists of words and part-of-speech tags of the sentence that is about to be parsed.
 SexpList Training.preProcessTest(SexpList sentence, SexpList originalWords, SexpList tags)
          Preprocesses the specified test sentence and its coordinated list of tags.
 void Trainer.readStatsHook(SexpList event)
          A hook for subclasses to read an event of a newly-defined type (called by Trainer.readStats(SexpTokenizer)).
protected  void Decoder.removeWord(SexpList sentence, SexpList tags, int i)
          A helper method used by Decoder.preProcess(danbikel.lisp.SexpList, danbikel.lisp.SexpList) that removes words from the specified sentence and Decoder.originalWords lists, and also from the specified tags list, if it is not null.
 boolean Training.removeWord(Symbol word, Symbol tag, int idx, SexpList sentence, SexpList tags, SexpList originalTags, Set prunedPretermsPosSet, Map prunedPretermsPosMap)
          Invoked by the decoder as the first step in preprocessing (prior to the invocation of Training.preProcessTest(danbikel.lisp.SexpList, danbikel.lisp.SexpList, danbikel.lisp.SexpList)).
protected  void Decoder.seedChart(Symbol word, int wordIdx, Symbol features, boolean neverObserved, SexpList tagSet, boolean wordIsUnknown, Symbol origWord, ConstraintSet constraints)
          Adds a chart item for every possible part of speech for the specified word at the specified index in the current sentence.
protected  void EMDecoder.seedChart(Symbol word, int wordIdx, Symbol features, boolean neverObserved, SexpList tagSet, boolean wordIsUnknown, Symbol origWord, ConstraintSet constraints)
           
protected  boolean Parser.sentContainsWordsAndTags(SexpList sent)
          A method to determine if the sentence to be parsed is in the format where part-of-speech tags are supplied along with the words.
 void CKYItem.set(Symbol label, Word headWord, Subcat leftSubcat, Subcat rightSubcat, CKYItem headChild, SLNode leftChildren, SLNode rightChildren, SexpList leftPrevMods, SexpList rightPrevMods, int start, int end, boolean leftVerb, boolean rightVerb, boolean stop, double logTreeProb, double logPrior, double logProb)
          Sets all of the data members of this chart item.
 void EMItem.set(Symbol label, Word headWord, Subcat leftSubcat, Subcat rightSubcat, CKYItem headChild, SLNode leftChildren, SLNode rightChildren, SexpList leftPrevMods, SexpList rightPrevMods, int start, int end, boolean leftVerb, boolean rightVerb, boolean stop, double logTreeProb, double logPrior, double logProb)
          This method simply throws an UnsupportedOperationException, as the log probabilities of the superclass are not used by this class.
 void EMItem.set(Symbol label, Word headWord, Subcat leftSubcat, Subcat rightSubcat, CKYItem headChild, SLNode leftChildren, SLNode rightChildren, SexpList leftPrevMods, SexpList rightPrevMods, int start, int end, boolean leftVerb, boolean rightVerb, boolean stop, int unaryLevel, double insideProb)
          Sets all the data for this EM chart item.
 void CKYItem.setPrevMods(boolean side, SexpList prevMods)
          Sets the previous modifier list on the specified side of this chart item's head child.
 void CKYItem.setSideInfo(boolean side, Subcat subcat, SLNode children, SexpList prevMods, int edgeIndex, boolean verb)
          Sets all the side-specific information for one side of this chart item.
protected  SexpList Decoder.setUnion(SexpList l1, SexpList l2, Set tmpSet)
          Returns a new list that is the union of the two specified lists.
 void BaseNPAwareShifter.shift(TrainerEvent event, SexpList list, Sexp prevMod)
          The previous modifier is not shifted into the history if the current parent (as determined by TrainerEvent.parent()) is a base NP and the previous modifier is punctuation.
 void DefaultShifter.shift(TrainerEvent event, SexpList list, Sexp prevMod)
           
 void Shift.shift(TrainerEvent event, SexpList list, Sexp prevMod)
          Shifts the previously-generated modifier label into the history.
static void Shifter.shift(TrainerEvent event, SexpList list, Sexp prevMod)
          Uses the internal Shifter instance to shift the newly-generated (and therefore previously-generated) modifier into the history, which is a SexpList.
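
The shift operations above can be pictured as updating a fixed-length history list. The sketch below assumes that the newest modifier enters at the front and the oldest falls off the end; that ordering, the base-NP label "NPB", and the punctuation set are illustrative assumptions rather than details confirmed by this page.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not the danbikel.parser Shift implementations.
final class ShiftSketch {
  // Assumed labels, for illustration only.
  static final String BASE_NP = "NPB";
  static final Set<String> PUNCTUATION = new HashSet<>(Arrays.asList(",", ":"));

  // Default behavior: prevMod becomes the most recent entry; length is preserved.
  static void shift(List<String> history, String prevMod) {
    if (history.isEmpty()) return;
    history.remove(history.size() - 1); // drop the oldest entry
    history.add(0, prevMod);            // newest entry goes first
  }

  // Base-NP-aware variant: punctuation inside a base NP leaves the history unchanged.
  static void baseNPAwareShift(String parent, List<String> history, String prevMod) {
    if (BASE_NP.equals(parent) && PUNCTUATION.contains(prevMod)) return;
    shift(history, prevMod);
  }
}
```

The history passed in must be a mutable list, for example an ArrayList seeded with start symbols.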
 

Constructors in danbikel.parser with parameters of type SexpList
BrokenSubcatBag(SexpList list)
          Constructs a subcat bag containing the number of occurrences of the symbols of list.
CKYItem(Symbol label, Word headWord, Subcat leftSubcat, Subcat rightSubcat, CKYItem headChild, SLNode leftChildren, SLNode rightChildren, SexpList leftPrevMods, SexpList rightPrevMods, int start, int end, boolean leftVerb, boolean rightVerb, boolean stop, double logTreeProb, double logPrior, double logProb)
          Constructs a CKY chart item with the specified data.
HeadEvent(Word headWord, Symbol parent, Symbol head, SexpList leftSubcat, SexpList rightSubcat)
          Constructs a new HeadEvent object, setting all its data members to the specified values.
ModifierEvent(Word modHeadWord, Word headWord, Symbol modifier, SexpList previousMods, WordList previousWords, Symbol parent, Symbol head, SexpList subcat, boolean verbIntervening, boolean side)
          Constructs a new ModifierEvent object, setting its data members to the values specified.
ModifierEvent(Word modHeadWord, Word headWord, Symbol modifier, SexpList previousMods, WordList previousWords, Symbol parent, Symbol head, Subcat subcat, boolean verbIntervening, boolean side)
          Constructs a new ModifierEvent object, setting its data members to the values specified.
ModifierEvent(Word modHeadWord, Word headWord, Symbol modifier, SexpList previousMods, WordList previousWords, Symbol parent, Symbol head, Subcat subcat, Word prevPunc, Word prevConj, boolean isConjPConj, boolean verbIntervening, boolean headAdjacent, boolean side)
          Constructs a new ModifierEvent object for use when outputting training events in the format of Mike Collins’ parser, setting its data members to the values specified.
Nonterminal(Symbol base, SexpList augmentations, int index)
          Sets the data members of this new object to the specified values.
SubcatBag(SexpList list)
          Constructs a subcat bag containing the number of occurrences of the symbols of list.
SubcatList(SexpList list)
          Constructs a new subcat list from the requirements in the specified SexpList.
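
In contrast to an ordered subcat list, the SubcatBag(SexpList) constructor above is described as storing the number of occurrences of each symbol. A bag of that kind can be sketched with a count map; BagSketch and its methods are hypothetical and only illustrate the counting idea.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, not danbikel.parser.SubcatBag.
final class BagSketch {
  private final Map<String, Integer> counts = new HashMap<>();

  // Counts each symbol of the list; order of the list is not retained.
  BagSketch(List<String> symbols) {
    for (String s : symbols) counts.merge(s, 1, Integer::sum);
  }

  int count(String symbol) { return counts.getOrDefault(symbol, 0); }

  // Removes one occurrence, in any order; returns false if none remain.
  boolean remove(String symbol) {
    Integer c = counts.get(symbol);
    if (c == null) return false;
    if (c == 1) counts.remove(symbol); else counts.put(symbol, c - 1);
    return true;
  }
}
```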
 

Uses of SexpList in danbikel.parser.arabic
 

Methods in danbikel.parser.arabic that return SexpList
 SexpList Training.preProcessTest(SexpList sentence, SexpList originalWords, SexpList tags)
          Preprocesses the specified test sentence and its coordinated list of part-of-speech tags, leaving the original sentence untouched but providing a modified version of the coordinated list of tags, where each tag has been mapped using the value of the original word and the original tag using TagMap.transformTag(Word).
 

Methods in danbikel.parser.arabic with parameters of type SexpList
 int HeadFinder.findHead(Sexp tree, Symbol lhs, SexpList rhs)
          Finds the head for the grammar production lhs -> rhs.
 SexpList Training.preProcessTest(SexpList sentence, SexpList originalWords, SexpList tags)
          Preprocesses the specified test sentence and its coordinated list of part-of-speech tags, leaving the original sentence untouched but providing a modified version of the coordinated list of tags, where each tag has been mapped using the value of the original word and the original tag using TagMap.transformTag(Word).
protected  void Training.readMetadataHook(Symbol dataType, int metadataLen, SexpList metadata)
          Reads the tag map metadata if the specified data type is equal to Training.tagMapSym.
 

Uses of SexpList in danbikel.parser.chinese
 

Methods in danbikel.parser.chinese with parameters of type SexpList
 int HeadFinder.findHead(Sexp tree, Symbol lhs, SexpList rhs)
          Finds the head for the grammar production lhs -> rhs.
 

Uses of SexpList in danbikel.parser.english
 

Methods in danbikel.parser.english with parameters of type SexpList
 int BrokenHeadFinder.findHead(Sexp tree, Symbol lhs, SexpList rhs)
          Finds the head for the grammar production lhs -> rhs.
 int HeadFinder.findHead(Sexp tree, Symbol lhs, SexpList rhs)
          Finds the head for the grammar production lhs -> rhs.
 boolean Training.removeWord(Symbol word, Symbol tag, int idx, SexpList sentence, SexpList tags, SexpList originalTags, Set prunedPretermsPosSet, Map prunedPretermsPosMap)
           
 

Uses of SexpList in danbikel.parser.lang
 

Fields in danbikel.parser.lang declared as SexpList
protected  SexpList AbstractTraining.argAugmentations
          A list representing the set of all argument augmentations.
 

Methods in danbikel.parser.lang that return SexpList
 SexpList AbstractTraining.preProcessTest(SexpList sentence, SexpList originalWords, SexpList tags)
          Preprocesses the specified test sentence and its coordinated list of tags.
 

Methods in danbikel.parser.lang with parameters of type SexpList
protected  int AbstractHeadFinder.defaultFindHead(Symbol lhs, SexpList rhs)
          Provides a default mechanism to use the head table to find a head in the specified grammar production.
abstract  int AbstractHeadFinder.findHead(Sexp tree, Symbol lhs, SexpList rhs)
          Finds the head for the grammar production lhs -> rhs.
 SexpList AbstractTraining.preProcessTest(SexpList sentence, SexpList originalWords, SexpList tags)
          Preprocesses the specified test sentence and its coordinated list of tags.
protected  void AbstractTraining.readMetadataHook(Symbol dataType, int metadataLen, SexpList metadata)
          A hook for subclasses to have their own custom metadata types.
protected  void AbstractTraining.relabelArgChildren(SexpList treeList, int headIdx, SexpList candidatePatterns)
          Relabels as arguments all immediately-dominated children in the specified subtree according to the specified argument-finding patterns.
 boolean AbstractTraining.removeWord(Symbol word, Symbol tag, int idx, SexpList sentence, SexpList tags, SexpList originalTags, Set prunedPretermsPosSet, Map prunedPretermsPosMap)
           
protected  int AbstractHeadFinder.scan(boolean direction, SexpList rhs, Symbol[] matchTags)
          Scans the RHS of a production in the specified direction.
protected  int AbstractHeadFinder.scanLeftToRight(SexpList rhs, Symbol[] matchTags)
          Scans the RHS of a production from left to right, returning the index of the first nonterminal that is in the matchTags array.
protected  int AbstractHeadFinder.scanRightToLeft(SexpList rhs, Symbol[] matchTags)
          Scans the RHS of a production from right to left, returning the index of the first nonterminal that is in the matchTags array.
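
The two directional scans described above can be sketched over a plain list of labels. This is an illustration, not the AbstractHeadFinder code: it uses 0-based indices and returns -1 when nothing matches, and both of those conventions are assumptions.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not danbikel.parser.lang.AbstractHeadFinder.
final class ScanSketch {
  // Index of the first RHS label, scanning left to right, found in matchTags.
  static int scanLeftToRight(List<String> rhs, String[] matchTags) {
    List<String> tags = Arrays.asList(matchTags);
    for (int i = 0; i < rhs.size(); i++) {
      if (tags.contains(rhs.get(i))) return i;
    }
    return -1; // assumed "no match" value
  }

  // Index of the first RHS label, scanning right to left, found in matchTags.
  static int scanRightToLeft(List<String> rhs, String[] matchTags) {
    List<String> tags = Arrays.asList(matchTags);
    for (int i = rhs.size() - 1; i >= 0; i--) {
      if (tags.contains(rhs.get(i))) return i;
    }
    return -1; // assumed "no match" value
  }
}
```

For a right-hand side DT JJ NN NN and match tags NN and NNS, the left-to-right scan stops at the first NN and the right-to-left scan at the last, which is why the scan direction matters when choosing a head.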
 

Uses of SexpList in danbikel.parser.util
 

Methods in danbikel.parser.util with parameters of type SexpList
static void DebugChart.findConstituents(boolean downcaseWords, Chart chart, CKYItem topRankedItem, SexpList sentence, Sexp goldTree)
          Prints out to System.err which constituents of the specified gold-standard parse tree were found by the parser, according to the specified chart.
static void DebugChart.findConstituents(String prefix, boolean downcaseWords, Chart chart, CKYItem topRankedItem, SexpList sentence, Sexp goldTree)
          Prints out to System.err which constituents of the specified gold-standard parse tree were found by the parser, according to the specified chart.
static Sexp DebugChart.removePreterms(SexpList words, Sexp tree, int wordIdx)
          Removes preterminals from the specified tree that are not found in the specified list of words.
static void DebugChart.replaceWords(boolean downcaseWords, Sexp tree, SexpList sentence)
           
 



Author: Dan Bikel.