Parsing Engine
java.lang.Object
java.rmi.server.RemoteObject
java.rmi.server.RemoteServer
danbikel.switchboard.AbstractSwitchboardUser
danbikel.switchboard.AbstractClient
danbikel.parser.Parser
public class Parser
A parsing client. This class parses sentences by implementing the AbstractClient.process(Object) method of its superclass. All top-level probabilities are computed by a DecoderServerRemote object, which is either local or is a stub whose methods are invoked via RMI. The actual parsing is implemented in the Decoder class.
See Also: AbstractClient, DecoderServerRemote, DecoderServer, Decoder, Serialized Form
Nested Class Summary |
---|
Nested classes/interfaces inherited from class danbikel.switchboard.AbstractSwitchboardUser |
---|
AbstractSwitchboardUser.Alive, AbstractSwitchboardUser.SBUserRetry |
Field Summary | |
---|---|
protected Decoder |
decoder
The internal Decoder that performs the actual parsing. |
protected static String |
derivedDataFilename
The derived data filename specified on the command line. |
protected PrintWriter |
err
A PrintWriter object wrapped around System.err for
printing in the proper character encoding. |
protected static boolean |
grabSBSettings
Indicates whether the user specified on the command line for this client to grab its settings from the switchboard. |
protected static String |
inputFilename
The input filename specified on the command line. |
protected String |
internalInputFilename
The name of the input file to be processed (only used when this parser is in stand-alone mode, not using the Switchboard). |
protected String |
internalOutputFilename
The name of the output file to be processed (only used when this parser is in stand-alone mode, not using the Switchboard). |
protected static Class[] |
intTypeArr
An array of types containing a single element, Integer.TYPE. |
protected static String |
invocationTargetExceptionMsg
|
protected boolean |
keepAllWords
Cached value of Settings.keepAllWords , for efficiency and
convenience. |
protected boolean |
localServer
Indicates whether the DecoderServerRemote instance is local
or remote (an RMI stub). |
protected static Class[] |
newDecoderTypeArr
An array of types used for fetching the constructor of Decoder
that takes two arguments of type int and of type
DecoderServerRemote . |
protected static int |
numClients
The number of parsing clients to create in this JVM. |
protected static String |
outputFilename
The output filename specified on the command line. |
static String |
outputFilenameSuffix
The suffix to attach to input files by default when creating their associated output files. |
protected static Class |
parserClass
The subclass of Parser to be constructed by
the main(String[]) method of this class. |
protected SexpList |
sent
The current sentence being processed. |
protected DecoderServerRemote |
server
The server for the internal Decoder to use when parsing. |
protected static String |
settingsFilename
The settings file to use specified on the command line. |
protected static boolean |
standAlone
Indicates whether this is a stand-alone client, or is using a switchboard, as specified on the command line. |
protected static Class[] |
stringTypeArr
An array of types containing a single element, String.class. |
protected static String |
switchboardName
The bound RMI name of the Switchboard specified on the command line
(defaults to Switchboard.defaultBindingName ). |
Fields inherited from class danbikel.switchboard.AbstractClient |
---|
defaultNextObjectInterval, failover, faultTolerant, nextObjectInterval, rand, retries, serverId, sleepTime |
Fields inherited from class danbikel.switchboard.AbstractSwitchboardUser |
---|
aliveSynch, aliveTimeout, defaultMaxSwitchboardTries, defaultTimeout, dieSynch, id, infiniteTries, maxSwitchboardTries, registered, switchboard, timeout, timeToDie |
Fields inherited from class java.rmi.server.RemoteObject |
---|
ref |
Constructor Summary | |
---|---|
Parser(DecoderServerRemote server)
Constructs a new parsing client where the internal Decoder will
use the specified server. |
|
Parser(int timeout)
Constructs a new parsing client with the specified timeout value for its sockets (not needed with recent RMI implementations from Sun). |
|
Parser(int timeout,
int port)
Constructs a new parsing client with the specified timeout value for its sockets (not needed with recent RMI implementations from Sun) and with the specified listening port for receiving remote method invocations. |
|
Parser(int port,
RMIClientSocketFactory csf,
RMIServerSocketFactory ssf)
Constructs a new parsing client with the specified RMI port and client and server socket factories. |
|
Parser(String derivedDataFilename)
Constructs a new Parser instance that will construct an internal
DecoderServer for its Decoder to use when parsing. |
Method Summary | |
---|---|
protected static void |
checkSettings(Properties sbSettings)
Checks the specified settings and issues warnings to System.err when a current setting differs. |
protected Sexp |
convertUnknownWords(Sexp tree,
IntCounter currWordIdx)
Converts certain words (leaves) in the specified tree to their associated word-feature vectors. |
protected ConstraintSet |
getConstraintsFromTree(Sexp tree)
After converting unknown words in the specified parse tree, this method constructs a constraint set using the method ConstraintSets.get(Object) . |
protected static boolean |
getFailover(boolean defaultValue)
Returns the boolean value of Settings.serverFailover , or the
specified fallback default value if that property does not exist. |
static File |
getFile(String filename)
Returns a new File object for the specified filename, or
null if the specified file does not exist. |
static File |
getFile(String filename,
boolean verbose)
Returns a new File object for the specified filename, or
null if the specified file does not exist. |
protected Decoder |
getNewDecoder(int id,
DecoderServerRemote server)
|
protected static DecoderServer |
getNewDecoderServer(String derivedDataFilename)
Gets a new decoder server for use when creating a stand-alone parsing client (i.e., a parsing client that creates its own DecoderServerRemote instance). |
protected static Parser |
getNewParser(int timeout)
Returns a new parsing client constructed via its single- int
constructor using the specified timeout value as the argument. |
protected static Parser |
getNewParser(String derivedDataFilename)
Returns a new parsing client constructed via its single- String
constructor using the specified derived data filename as the argument. |
protected static int |
getRetries(int defaultValue)
Returns the integer value of Settings.serverMaxRetries , or the
specified fallback default value if that property does not exist. |
protected static int |
getRetrySleep(int defaultValue)
Returns the integer value of Settings.serverRetrySleep , or the
specified fallback default value if that property does not exist. |
protected void |
getServer()
Unless it is time to die, this method continually tries the switchboard until it can assign this client a server. |
protected SexpList |
getTagLists(SexpList sent)
Returns a new list of the tag lists for each word when the specified sentence is in the format described in the comments for the sentContainsWordsAndTags(SexpList) method. |
protected SexpList |
getTagListsFromTree(Sexp tree)
Collects a list of symbols that are the part-of-speech tags (preterminals) of the specified tree. |
protected static int |
getTimeout()
Obtains the timeout from Settings . |
protected SexpList |
getWords(SexpList sent)
Returns a new list containing only the words of the sentence to be parsed when the sentence is in the format described in the comment for the sentContainsWordsAndTags(SexpList) method. |
protected SexpList |
getWordsFromTree(Sexp tree)
Returns a new list containing the word symbols from the specified tree. |
protected SexpList |
getWordsFromTree(SexpList wordList,
Sexp tree)
Gets the words of the sentence to be parsed from the specified parse tree. |
static void |
main(String[] args)
Contacts the switchboard, registers this parsing client and gets sentences from the switchboard, parses them and returns them, until the switchboard indicates there are no more sentences to process. |
Sexp |
parse(SexpList sent)
Parses the specified sentence, which can be in one of three formats. |
protected Object |
process(Object obj)
Parses the specified object, which must be a SexpList . |
protected void |
processInputFile(String inputFilename,
String outputFilename)
Parses the sentences contained in the specified input file, outputting the results to the specified output file. |
void |
run()
Runs this parsing client within its enclosing thread: if internalInputFilename is null , then this method invokes
AbstractClient.processObjectsThenDie() ; otherwise, this method processes internalInputFilename and outputs to internalOutputFilename by
invoking processInputFile(String,String) . |
protected boolean |
sentContainsWordsAndTags(SexpList sent)
A method to determine if the sentence to be parsed is in the format where part-of-speech tags are supplied along with the words. |
protected void |
setInternalFilenames(String inputFilename,
String outputFilename)
Sets the internalInputFilename and internalOutputFilename
members of this parsing client to the specified values. |
static void |
setSettingsFromSwitchboard(SwitchboardRemote sb)
Grabs the settings from the Switchboard instance and sets them to be the current run-time settings. |
protected void |
switchboardFailure()
Prints the sentence currently being parsed to System.err
as an emergency backup (in case processing took a long time and
it is highly undesirable to lose the work). |
protected void |
tolerateFaults(int retries,
int sleepTime,
boolean failover)
Wraps the current server in proxies that ensure the fault tolerance of calls to that server. |
void |
update(Map<String,String> changedSettings)
Invoked by this class to notify the requesting class that one or more settings have changed. |
protected boolean |
wordTagList(Sexp sexp)
Returns whether the specified S-expression is a list of length two whose first element is a symbol and whose second element is a list of one or more symbols. |
Methods inherited from class danbikel.switchboard.AbstractClient |
---|
cleanup, disableHttp, getFaultTolerantServer, processObjects, processObjectsThenDie, register, reRegister, serverDown, setNextObjectInterval, setPolicyFile, setPolicyFile, sleepRandom |
Methods inherited from class danbikel.switchboard.AbstractSwitchboardUser |
---|
alive, die, disableHttp, getAliveTimeout, getSwitchboard, getSwitchboard, getSwitchboard, getSwitchboard, getSwitchboard, getSwitchboard, host, id, nonZeroTimeout, setPolicyFile, setPolicyFile, startAliveThread, unexportWhenDead |
Methods inherited from class java.rmi.server.RemoteServer |
---|
getClientHost, getLog, setLog |
Methods inherited from class java.rmi.server.RemoteObject |
---|
equals, getRef, hashCode, toString, toStub |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Methods inherited from interface danbikel.switchboard.Client |
---|
serverDown |
Methods inherited from interface danbikel.switchboard.SwitchboardUser |
---|
alive, die, host, id |
Field Detail |
---|
protected static String invocationTargetExceptionMsg
protected static final Class[] stringTypeArr
An array of types containing a single element, String.class. Used for fetching constructors of classes that take a single argument of type String.
See Also: getNewParser(String), getNewDecoderServer(String)
protected static final Class[] intTypeArr
An array of types containing a single element, Integer.TYPE. Used when fetching the appropriate constructor of this class in the static “named constructor” getNewParser(int).
protected static final Class[] newDecoderTypeArr
An array of types used for fetching the constructor of Decoder that takes two arguments of type int and of type DecoderServerRemote.
See Also: getNewDecoder(int,DecoderServerRemote)
protected boolean keepAllWords
Cached value of Settings.keepAllWords, for efficiency and convenience.
public static final String outputFilenameSuffix
protected static Class parserClass
The subclass of Parser to be constructed by the main(String[]) method of this class.
protected DecoderServerRemote server
The server for the internal Decoder to use when parsing.
protected SexpList sent
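protected Decoder decoder
The internal Decoder that performs the actual parsing.
protected boolean localServer
Indicates whether the DecoderServerRemote instance is local or remote (an RMI stub).
protected String internalInputFilename
The name of the input file to be processed (only used when this parser is in stand-alone mode, not using the Switchboard).
protected String internalOutputFilename
The name of the output file to be processed (only used when this parser is in stand-alone mode, not using the Switchboard).
protected PrintWriter err
A PrintWriter object wrapped around System.err for printing in the proper character encoding.
protected static String switchboardName
The bound RMI name of the Switchboard specified on the command line (defaults to Switchboard.defaultBindingName).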
protected static String derivedDataFilename
protected static String inputFilename
protected static String outputFilename
protected static String settingsFilename
protected static int numClients
protected static boolean standAlone
protected static boolean grabSBSettings
Constructor Detail |
---|
public Parser(String derivedDataFilename) throws RemoteException, ClassNotFoundException, NoSuchMethodException, InvocationTargetException, IllegalAccessException, InstantiationException
Constructs a new Parser instance that will construct an internal DecoderServer for its Decoder to use when parsing.
Parameters:
derivedDataFilename - the name of the derived data file to pass to the constructor of the DecoderServer class when creating an internal instance
Throws:
RemoteException
ClassNotFoundException - if getNewDecoderServer(String) throws this exception
NoSuchMethodException - if getNewDecoderServer(String) throws this exception
InvocationTargetException - if getNewDecoderServer(String) throws this exception
IllegalAccessException - if getNewDecoderServer(String) throws this exception
InstantiationException - if getNewDecoderServer(String) throws this exception
public Parser(DecoderServerRemote server) throws RemoteException
Constructs a new parsing client where the internal Decoder will use the specified server.
Parameters:
server - the server for the Decoder to use
Throws:
RemoteException
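public Parser(int timeout) throws RemoteException
Constructs a new parsing client with the specified timeout value for its sockets (not needed with recent RMI implementations from Sun).
Parameters:
timeout - the timeout value for RMI client and server sockets
Throws:
RemoteException
public Parser(int timeout, int port) throws RemoteException
Constructs a new parsing client with the specified timeout value for its sockets (not needed with recent RMI implementations from Sun) and with the specified listening port for receiving remote method invocations.
Parameters:
timeout - the timeout value for RMI client and server sockets
port - the port for this RMI server
Throws:
RemoteException
public Parser(int port, RMIClientSocketFactory csf, RMIServerSocketFactory ssf) throws RemoteException
Constructs a new parsing client with the specified RMI port and client and server socket factories.
Parameters:
port - the port on which to receive remote method invocations
csf - the client socket factory for this RMI client
ssf - the server socket factory for this RMI server
Throws:
RemoteException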
Method Detail |
---|
public void update(Map<String,String> changedSettings)
Description copied from interface: Settings.Change
Invoked by this class to notify the requesting class that one or more settings have changed.
update in interface Settings.Change
Parameters:
changedSettings - the keys of this map are the settings that have changed since the last time this method was invoked, and the values are the old values for those changed settings
See Also: Settings.register(Class,Settings.Change,Set), Settings.register(Settings.Change)
protected Decoder getNewDecoder(int id, DecoderServerRemote server)
protected static DecoderServer getNewDecoderServer(String derivedDataFilename) throws ClassNotFoundException, NoSuchMethodException, InvocationTargetException, IllegalAccessException, InstantiationException
Gets a new decoder server for use when creating a stand-alone parsing client (i.e., a parsing client that creates its own DecoderServerRemote instance).
Parameters:
derivedDataFilename - the name of the derived data file to pass to the constructor of DecoderServer that takes this name as an argument
Throws:
ClassNotFoundException - if the class specified by Settings.decoderServerClass cannot be found
NoSuchMethodException - if the class specified by Settings.decoderServerClass has no constructor that takes a single String
InvocationTargetException - if the constructor of the class specified by Settings.decoderServerClass (the invocation target) throws an exception (which will be wrapped in this type of exception)
IllegalAccessException - if this class is not allowed to access the single-string constructor of the class specified by Settings.decoderServerClass
InstantiationException - if there's a problem instantiating a new instance of the class specified by Settings.decoderServerClass
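protected void getServer() throws RemoteException
Unless it is time to die, this method continually tries the switchboard until it can assign this client a server.
getServer in class AbstractClient
Throws:
RemoteException
See Also: AbstractClient.server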
protected void tolerateFaults(int retries, int sleepTime, boolean failover)
Description copied from class: AbstractClient
Wraps the current server in proxies that ensure the fault tolerance of calls to that server. The specified arguments are also stored in the AbstractClient.retries, AbstractClient.sleepTime and AbstractClient.failover data members. This allows the AbstractClient.reRegister() method to properly re-wrap a new server when it gets one after a switchboard failure. If this method is called with retries == 0 and failover == false, then it simply returns, having done no proxy wrapping.
N.B.: This method re-assigns the protected data member server. If subclasses have cached a reference to the server in a local data member, they should override this method such that this implementation is called (super.tolerateFaults(...)) and then the server re-cached, as shown in the following example code:
    public class MyClient extends AbstractClient {
      // local reference to the server of a type implemented
      // by the actual (concrete) servers in a particular distributed system
      private MyServerInterface server;

      protected void tolerateFaults(int retries, int sleepTime, boolean failover)
          throws RemoteException {
        super.tolerateFaults(retries, sleepTime, failover);
        server = (MyServerInterface)super.server;
      }
      ...
    }
tolerateFaults in class AbstractClient
Parameters:
retries - the number of times to re-try the server in the event of failure; a value of Retry.retryIndefinitely will cause the proxy to re-try indefinitely
sleepTime - the time, in milliseconds, to sleep between retries
failover - indicates whether to wrap the server in a failover proxy
See Also: AbstractClient.server
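The retry half of the proxy wrapping described for tolerateFaults can be pictured with a standard java.lang.reflect.Proxy. The following is only a minimal sketch under stated assumptions: EchoServer and RetryProxySketch are hypothetical names, failures are modeled as unchecked exceptions rather than RemoteException, and the failover proxy and Retry.retryIndefinitely are left out. It is not the switchboard package's actual implementation.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

/** Hypothetical server interface (an assumption; not part of danbikel.switchboard). */
interface EchoServer {
    String echo(String msg);
}

final class RetryProxySketch {
    /**
     * Wraps target in a dynamic proxy that re-invokes a failed call up to
     * 'retries' more times, sleeping 'sleepTime' milliseconds between attempts.
     * With retries == 0 the target is returned unwrapped.
     */
    @SuppressWarnings("unchecked")
    static <T> T withRetries(Class<T> iface, T target, int retries, long sleepTime) {
        if (retries == 0) {
            return target;
        }
        InvocationHandler handler = (proxy, method, args) -> {
            Throwable last = null;
            for (int attempt = 0; attempt <= retries; attempt++) {
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    // The wrapped cause is the exception thrown by the server call.
                    last = e.getCause();
                    if (attempt < retries) {
                        Thread.sleep(sleepTime);
                    }
                }
            }
            throw last; // every attempt failed
        };
        return (T) Proxy.newProxyInstance(
            iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }
}
```

A caller would hold only the wrapped reference, which is why a subclass that caches its own server reference has to re-cache it after wrapping.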
public Sexp parse(SexpList sent) throws RemoteException
sentContainsWordsAndTags(SexpList)
.
Settings.constraintSetFactoryClass
setting.
Parameters:
sent - the sentence to be parsed
Returns:
a parse of the specified sentence, or null if no parse could be produced using the current model
Throws:
RemoteException
protected Sexp convertUnknownWords(Sexp tree, IntCounter currWordIdx)
Converts certain words (leaves) in the specified tree to their associated word-feature vectors. If Settings.keepAllWords is true, then only words that were never observed in training are converted; otherwise, only words that were observed less than the unknown word threshold are converted. This method intentionally performs the same conversion of words as would be performed by DecoderServerRemote.convertUnknownWords(danbikel.lisp.SexpList). In fact, it uses DecoderServerRemote.convertUnknownWord(danbikel.lisp.Symbol,int) as a helper method.
Parameters:
tree - the tree whose low-frequency or unobserved words are to be converted to feature vectors
currWordIdx - the threaded word index of the sentence
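The conversion rule described for convertUnknownWords can be illustrated on a flat word list. This is a sketch only: UnknownWordSketch, the trainingCounts map, and the "+UNKNOWN+" placeholder are assumptions made for illustration. The real method operates on a Sexp tree and substitutes a word-feature vector obtained from the DecoderServerRemote, not a fixed token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class UnknownWordSketch {
    /** Hypothetical placeholder standing in for a word-feature vector. */
    static final String UNKNOWN = "+UNKNOWN+";

    /**
     * Returns a copy of words in which the words selected by the rule are
     * replaced. When keepAllWords is true, only words never observed in
     * training are replaced; otherwise, words observed fewer than
     * unknownWordThreshold times are replaced.
     */
    static List<String> convert(List<String> words,
                                Map<String, Integer> trainingCounts,
                                boolean keepAllWords,
                                int unknownWordThreshold) {
        List<String> out = new ArrayList<>();
        for (String word : words) {
            int count = trainingCounts.getOrDefault(word, 0);
            boolean convert = keepAllWords
                ? count == 0
                : count < unknownWordThreshold;
            out.add(convert ? UNKNOWN : word);
        }
        return out;
    }
}
```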
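protected ConstraintSet getConstraintsFromTree(Sexp tree)
After converting unknown words in the specified parse tree, this method constructs a constraint set using the method ConstraintSets.get(Object).
Parameters:
tree - the tree from which to construct a set of parsing constraints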
protected boolean sentContainsWordsAndTags(SexpList sent)
A method to determine if the sentence to be parsed is in the format where part-of-speech tags are supplied along with the words. The format is ( <wordTagList>* ) where
<wordTagList> | ::= | ( <word> <tagList> ) |
<word> | ::= | a symbol representing a word in the sentence to be parsed |
<tagList> | ::= | ( <tag>+ ) |
<tag> | ::= | a symbol that represents a possible part-of-speech tag for the word with which it is associated |
Example: ((John (NNP)) (sat (VBD VB)) (. (.)))
Typically, only a single tag is supplied with each word.
Parameters:
sent - the sentence to be tested
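protected boolean wordTagList(Sexp sexp)
Returns whether the specified S-expression is a list of length two whose first element is a symbol and whose second element is a list of one or more symbols.
Parameters:
sexp - the S-expression to be tested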
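The two format checks above can be sketched without the danbikel.lisp classes by modeling a symbol as a String and a list as a java.util.List. WordTagFormatSketch and this modeling are assumptions for illustration. In particular, accepting an empty sentence follows from the grammar ( <wordTagList>* ) and may not match the real method's behavior.

```java
import java.util.List;

final class WordTagFormatSketch {
    /**
     * True if x is a list of length two whose first element is a symbol
     * (String) and whose second element is a list of one or more symbols.
     */
    static boolean wordTagList(Object x) {
        if (!(x instanceof List)) {
            return false;
        }
        List<?> pair = (List<?>) x;
        if (pair.size() != 2 || !(pair.get(0) instanceof String)) {
            return false;
        }
        if (!(pair.get(1) instanceof List)) {
            return false;
        }
        List<?> tags = (List<?>) pair.get(1);
        if (tags.isEmpty()) {
            return false;
        }
        for (Object tag : tags) {
            if (!(tag instanceof String)) {
                return false;
            }
        }
        return true;
    }

    /** True if every element of sent is a word/tag-list pair. */
    static boolean sentContainsWordsAndTags(List<?> sent) {
        for (Object element : sent) {
            if (!wordTagList(element)) {
                return false;
            }
        }
        return true;
    }
}
```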
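protected SexpList getWords(SexpList sent)
Returns a new list containing only the words of the sentence to be parsed when the sentence is in the format described in the comment for the sentContainsWordsAndTags(SexpList) method.
Parameters:
sent - the sentence whose words are to be extracted into a new list
Returns:
a new list containing only the words of the sentence to be parsed when the sentence is in the format described in the comment for the sentContainsWordsAndTags(SexpList) method
protected SexpList getWordsFromTree(Sexp tree)
Returns a new list containing the word symbols from the specified tree.
Parameters:
tree - the tree from which to extract a list of its words
protected SexpList getWordsFromTree(SexpList wordList, Sexp tree)
Gets the words of the sentence to be parsed from the specified parse tree.
Parameters:
wordList - the list to which the words of the specified tree are to be added
tree - the tree from which to extract word symbols
Returns:
the specified wordList object, having been modified so that each word symbol from the specified tree was added to its end
protected SexpList getTagListsFromTree(Sexp tree)
Collects a list of symbols that are the part-of-speech tags (preterminals) of the specified tree.
Parameters:
tree - the tree from which to extract a list of part-of-speech tags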
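Extracting words and part-of-speech tags from a parse tree amounts to a traversal that stops at preterminals. The sketch below models a tree as nested java.util.List objects with String symbols, and treats a two-element list of two symbols as a preterminal such as (NNP John). TreeLeavesSketch and this modeling are assumptions for illustration, not the danbikel.lisp API.

```java
import java.util.List;

final class TreeLeavesSketch {
    /**
     * Appends each word of the tree to words and each preterminal label
     * to tags, in left-to-right order.
     */
    static void collect(Object tree, List<String> words, List<String> tags) {
        if (!(tree instanceof List)) {
            return; // a bare symbol outside a preterminal carries no word
        }
        List<?> node = (List<?>) tree;
        boolean preterminal = node.size() == 2
            && node.get(0) instanceof String
            && node.get(1) instanceof String;
        if (preterminal) {
            tags.add((String) node.get(0));
            words.add((String) node.get(1));
            return;
        }
        // Element 0 is the nonterminal label; the rest are children.
        for (int i = 1; i < node.size(); i++) {
            collect(node.get(i), words, tags);
        }
    }
}
```

Passing the output lists in as arguments mirrors the two-argument getWordsFromTree, which appends to a caller-supplied list.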
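protected SexpList getTagLists(SexpList sent)
Returns a new list of the tag lists for each word when the specified sentence is in the format described in the comments for the sentContainsWordsAndTags(SexpList) method.
Parameters:
sent - the sentence from which to extract tag lists
See Also: sentContainsWordsAndTags(SexpList)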
protected Object process(Object obj) throws RemoteException
Parses the specified object, which must be a SexpList.
process in class AbstractClient
Parameters:
obj - the SexpList to parse
Returns:
a parse of the specified SexpList, or null if no parse was possible under the current model
Throws:
RemoteException
See Also: parse(SexpList)
protected void switchboardFailure()
Prints the sentence currently being parsed to System.err as an emergency backup (in case processing took a long time and it is highly undesirable to lose the work).
switchboardFailure in class AbstractClient
protected static int getTimeout()
Obtains the timeout from Settings.
See Also: Settings.sbUserTimeout
protected static int getRetries(int defaultValue)
Returns the integer value of Settings.serverMaxRetries, or the specified fallback default value if that property does not exist.
Parameters:
defaultValue - the fallback default value to return if Settings.serverMaxRetries does not exist
Returns:
the integer value of Settings.serverMaxRetries, or the specified fallback default value if that property does not exist
protected static int getRetrySleep(int defaultValue)
Returns the integer value of Settings.serverRetrySleep, or the specified fallback default value if that property does not exist.
Parameters:
defaultValue - the fallback default to return if Settings.serverRetrySleep does not exist
Returns:
the integer value of Settings.serverRetrySleep, or the specified fallback default value if that property does not exist
protected static boolean getFailover(boolean defaultValue)
Returns the boolean value of Settings.serverFailover, or the specified fallback default value if that property does not exist.
Parameters:
defaultValue - the fallback default value to return if Settings.serverFailover does not exist
Returns:
the boolean value of Settings.serverFailover, or the specified fallback default value if that property does not exist
public void run()
Runs this parsing client within its enclosing thread: if internalInputFilename is null, then this method invokes AbstractClient.processObjectsThenDie(); otherwise, this method processes internalInputFilename and outputs to internalOutputFilename by invoking processInputFile(String,String).
run in interface Runnable
protected void setInternalFilenames(String inputFilename, String outputFilename)
Sets the internalInputFilename and internalOutputFilename members of this parsing client to the specified values.
Parameters:
inputFilename - the name of the input file to process
outputFilename - the name of the output file to create
protected void processInputFile(String inputFilename, String outputFilename) throws IOException
Parses the sentences contained in the specified input file, outputting the results to the specified output file.
Parameters:
inputFilename - the input file to process
outputFilename - the output file to create
Throws:
IOException - if there is a problem creating the input file stream or writing to the created output file stream
protected static void checkSettings(Properties sbSettings)
Checks the specified settings and issues warnings to System.err when a current setting differs.
Parameters:
sbSettings - the settings to compare with the current run-time settings
public static void setSettingsFromSwitchboard(SwitchboardRemote sb) throws RemoteException
Grabs the settings from the Switchboard instance and sets them to be the current run-time settings.
Parameters:
sb - the switchboard from which to grab settings for this client
Throws:
RemoteException
protected static Parser getNewParser(String derivedDataFilename) throws NoSuchMethodException, InvocationTargetException, IllegalAccessException, InstantiationException
Returns a new parsing client constructed via its single-String constructor using the specified derived data filename as the argument. The run-time type of the returned parsing client will be equal to the value of the parserClass member.
Parameters:
derivedDataFilename - the derived data filename with which to construct a new parsing client instance
Throws:
NoSuchMethodException - if the class specified by parserClass does not have a constructor that accepts a single String as its argument
InvocationTargetException - if the constructor of the class specified by parserClass throws an exception
IllegalAccessException - if the constructor of the class specified by parserClass cannot be accessed
InstantiationException - if there is a problem instantiating a new instance of the class specified by parserClass
protected static Parser getNewParser(int timeout) throws NoSuchMethodException, InvocationTargetException, IllegalAccessException, InstantiationException
Returns a new parsing client constructed via its single-int constructor using the specified timeout value as the argument. The run-time type of the returned parsing client will be equal to the value of the parserClass member.
Parameters:
timeout - the timeout value for the client- and server-side sockets of this RMI object
Throws:
NoSuchMethodException - if the class specified by parserClass does not have a constructor that accepts a single int as its argument
InvocationTargetException - if the constructor of the class specified by parserClass throws an exception
IllegalAccessException - if the constructor of the class specified by parserClass cannot be accessed
InstantiationException - if there is a problem instantiating a new instance of the class specified by parserClass
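The "named constructor" pattern behind getNewParser, together with the four exceptions it declares, follows from ordinary Java reflection: a constructor is looked up by its parameter-type array and then invoked. The sketch below uses hypothetical names (SketchClient, NamedConstructorSketch, STRING_TYPE_ARR) and is an assumption about the general shape of the technique, not the Parser source.

```java
import java.lang.reflect.Constructor;

/** Hypothetical client class with a public single-String constructor. */
class SketchClient {
    final String derivedDataFilename;

    public SketchClient(String derivedDataFilename) {
        this.derivedDataFilename = derivedDataFilename;
    }
}

final class NamedConstructorSketch {
    /** Parameter-type array with a single element, in the spirit of stringTypeArr. */
    static final Class<?>[] STRING_TYPE_ARR = { String.class };

    /**
     * Looks up the public single-String constructor of cls and invokes it.
     * ReflectiveOperationException is the common superclass of
     * NoSuchMethodException, InvocationTargetException,
     * IllegalAccessException and InstantiationException.
     */
    static Object newInstance(Class<?> cls, String arg)
            throws ReflectiveOperationException {
        Constructor<?> ctor = cls.getConstructor(STRING_TYPE_ARR);
        return ctor.newInstance(arg);
    }
}
```

Because the class to instantiate is a run-time value, a subclass can be substituted without changing the code that creates clients, which is the role the parserClass member plays here.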
public static File getFile(String filename)
Returns a new File object for the specified filename, or null if the specified file does not exist. An error message will be output to System.err if the specified file does not exist.
Parameters:
filename - the filename for which to create a new File object
Returns:
a new File object for the specified filename, or null if the specified file does not exist
public static File getFile(String filename, boolean verbose)
Returns a new File object for the specified filename, or null if the specified file does not exist.
Parameters:
filename - the filename for which to create a new File object
verbose - indicates whether to output an error message to System.err if the specified file does not exist
Returns:
a new File object for the specified filename, or null if the specified file does not exist
public static void main(String[] args)
Contacts the switchboard, registers this parsing client and gets sentences from the switchboard, parses them and returns them, until the switchboard indicates there are no more sentences to process. Run
    java danbikel.parser.Parser -help
for complete usage information.
Input file formats: Input files processed by this class when it is in its stand-alone mode must contain a series of S-expressions, where each S-expression represents a sentence to be parsed. There are three acceptable input formats for these S-expressions, described in the documentation for the parse(SexpList) method.