Using Utool

3 Using Utool

As you have seen in the tutorial, the Ubench GUI is rather straightforward to use. The tutorial has walked you through most of its functionality, and while it is by no means a complete documentation and there are various other aspects that we have not mentioned, it is probably enough for you to find your way around the rest by yourself.

However, the command-line and server modes of Utool are less self-explanatory. This chapter documents how to use them in more detail.

In the current release, Utool supports six commands: solve, solvable, classify, convert, server, and display. Each of these commands can be used from the command-line or in the Utool Server (but running the server command in the Utool Server doesn't do anything). In addition, Utool supports several auxiliary pseudo-commands, which display help information.

3.1 Command-line interface

When Utool is run from the command-line, it executes the single command you specify on the command line and then exits. For instance, the following call executes one solvable command and outputs some information about it:

$ utool solvable -s examples/chain3.clls

Commands typically require arguments (in the example, the filename chain3.clls of the USR that we want to solve), and accept certain options (here, the -s option, which instructs Utool to display statistical information). These arguments and options are written after the command on the command line.

3.2 Server mode

Alternatively, you can run Utool in server mode. In this mode, it will not execute any commands at first. Instead, it will accept network connections on a specific socket. Each time it is sent a command on this socket, it will execute this command, send the results back over the socket, and then close the socket. But the same Utool process keeps running and can execute many commands during its lifetime.

You start Utool in server mode by making Utool execute the server command:

$ utool server

Alternatively, you can also click the server button in the top right corner of a Ubench window to start or stop a server thread. The server doesn't output anything on the console on which you started it – unless you specified the --logging command-line option and no name for the logfile, in which case it will write some information to standard error.

By default, the server will listen for socket connections on port 2802, but the port can be specified using the --port option. The protocol for communicating with the server is as follows:

The client sends a command in an XML element of the following form:
```
<utool cmd="..." (more options)>
  (arguments of the command)
</utool>
```
The command, options, and arguments in this XML element correspond to the command, options, and arguments of a single run of the command-line tool.
Once you have transmitted a complete utool element (i.e., after the closing </utool> tag), the Utool Server processes the command. You may optionally close your side of the socket using the shutdown function to notify the server that you're finished writing, but this is no longer necessary. Notice that you don't want to close the socket yet at this point.
If the command is executed successfully, the server will respond with a message of the following form:
```
<result .... />
```
The particular attributes of this element depend on the command, and are described below. If an error occurred, the response will be a message of the following form instead:
```
<error code="..." explanation="..." />
```
Here the “code” attribute will be a numeric error code, and the “explanation” attribute will be a plain-text explanation of the error that occurred.
The server closes the socket.

If you just use Utool for a single call or experimentation, the command-line mode will typically be easier to use. However, if you need to process many Utool commands from a programme, it can be dramatically more efficient to keep a single Utool Server process running and send the commands to the server (see Section ??). We present a demo client for the Utool Server in Section 5.242Writing your own client for the Utool Serversubsection.5.2.

Notice that you'll need to use XML character entities when you send your USR to the server, and conversely decode the result that you get back. See Section ?? for some more hints on this.

3.3 Passing USRs

Most commands require the user to specify an USR that should be processed.

In command-line mode, you pass a specification of the USR as a command-line argument. If this specification starts with ex:, it is taken to represent one of the USRs that come bundled with Utool. For instance, when we used the specification ex:holesemantics-14.hs.pl in the tutorial, this instructed Utool to look for an example with the name holesemantics-14.hs.pl in the Jar file.

If the specification doesn't start with ex:, then it is up to the input codec how to interpret it. Most codecs take it to be a filename, and will attempt to read the USR from the file with this name. This is what happens when you unpack the Jar file and then execute utool solvable examples/chain3.clls. The only codec that is currently distributed with Utool that interprets its input specification differently is the chain codec, which interprets the string as the numeric chain length and not as a filename.

In server mode, you can't pass a filename because the server may run on a different machine than the client and may not have access to its filesystem. Instead, you pass the USR directly as an attribute of a usr element that is embedded into the utool element, like so:

<utool cmd="solvable">
  <usr codec="domcon-oz" string="[label(x f(y)) ...]" />
</utool>

The ex: syntax is not available in server mode. Notice that the attribute values must be valid XML attribute strings. This means that you must replace special characters by their respective character entities (see also Section 540Utool in Practicesection.5 for tips on this).

3.4 Exit codes

Each execution of a Utool command returns an exit code. The command-line version of Utool will return this as the programme exit code, which you can access e.g. in the $? variable in a Bash shell. The server version will return the exit code in the code attribute in the responses for many commands (and always when it reports an error).

Exit codes are numbers between 0 and 255. They are split up into ranges with different meanings as follows:

exit codes	meaning
0 – 127	command was executed successfully
128 – 191	an error occurred in the main programme
192 – 223	an error occurred in the input codec
224 – 255	an error occurred in the output codec

The exit codes for successful termination of a command are documented with the commands below. Among the codec error codes, the code 192 is special because it always signifies a parsing error in the input codec. The codes between 193 and 223 denote semantic errors in the input codec; they and all output codec error codes are documented with the codecs in Section 428Codecssection.4. The error codes for the main programme are as follows:

exit code	meaning
128	file I/O error
129	network I/O error
130	Ubench encountered an error while laying out a graph
140	error while configuring an (XML) parser
141	the command you specified was not recognised
142	error while registering a codec
143	error while parsing a builtin example
150	you didn't specify a USR, but the command requires one
151	you didn't specify an input codec, and Utool cannot guess it
152	there is no input codec of the name you specified
153	the input graph is not weakly normal or not compactifiable
160	you didn't specify an output codec, and Utool cannot guess it
161	there is no output codec of the name you specified
162	the specified output codec can't output multiple USRs
170	error while parsing an equivalence specification

3.5 The commands supported by Utool

We will now go through the six main commands and describe what each command does and what options it takes.

3.5.1 Solvable

This command converts the input USR into a dominance graph and checks whether this graph is solvable, i.e. has any solved forms. Linguistically, this corresponds to checking whether the sentence has any readings (ideally, it should!).

Solvability is determined by computing a dominance chart as described in [[]Koller and Thater2005]. This is typically much more efficient than actually solving the graph, i.e. enumerating its solved forms, because the chart is exponentially smaller than the set of all solved forms. This command only makes a yes/no decision about solvability and thus doesn't have to enumerate all the solved forms; it does, however, compute the total number of solved forms based on the chart. If you want the individual solved forms, see the “solve” command below.

Result.

In command-line mode, the Utool process will terminate with an exit code of 1 if the graph was solvable. It will terminate with an exit code of 0 if it wasn't.

In server mode, Utool will send a reply of the following form:

<result solvable='true' fragments='7' count='5' chartsize='10' time='30' />

The solvable attribute contains the string true if the graph was solvable, and false otherwise. The other attributes contain statistical information: the number of fragments of the graph, the number of solved forms, the number of splits in the chart, and the time in milliseconds it took to compute the chart.

Options.

The solvable command can take the following options:

input-codec: You can specify an input codec for this command. If you don't do this, the command-line version (but not the server) will try to guess the appropriate input codec from the filename extension if possible.

In command-line mode, specify the input codec with the option --input-codec <codecname> or -I <codecname>. In server mode, specify it as the codec attribute of the usr element.
input-codec-options: Some input codecs will accept options. For instance, the mrs-prolog input codec has an option that tells it whether you want to normalise the dominance graph it computes from an USR in MRS format. You typically want this, but you can switch off the normalisation by passing the value none in the normalisation codec option.

In command-line mode, you can specify input codec options with the option --input-codec-options <options>, where <options> is a comma-separated list of option-value pairs. In the example of the MRS codec above, you could pass the option --input-codec-options normalisation=none to switch off the normalisation. In server mode, you can set codec options by passing this comma-separated option list in the codec-options attribtue of the usr element.
display-statistics: You can make Utool display more detailed statistics when you call it on the command-line by passing the --display-statistics or -s option. All such statistics information will be written to standard error. This will do nothing in server mode; the server transmits statistical information in any case.
nochart: If you are only interested in checking solvability of an USR, you can speed up the answer by passing the --nochart option on the command line. In server mode, you achieve the same effect by passing true in the nochart attribute of the main utool element. This will make Utool compute an incomplete chart which is sufficient for determining solvability, but not for computing the number of solved forms. As a consequence, Utool will not print or return any statistics about chart size or number of solved forms.

3.5.2 Solve

This command converts the input USR into a dominance graph, computes the solved forms of this graph, and outputs them using the given output codec. It computes a dominance chart, and if the graph was solvable, proceeds to enumerate all solved forms.

Result.

In command-line mode, Utool will output all solved forms. By default, it will write them to standard output, but there is a command-line option for redirecting them into a file. This command will always terminate with an exit code of 0 if no errors occurred.

In server mode, Utool will send a reply of the following form:

<result solvable='true' fragments='7' count='5' chartsize='10'
        time-chart='30' time-extraction='100'>
  <solution string='....' />
  <solution string='....' />
</result>

The attributes of the result element are as for the solvable command, except that the runtime is now reported separately for computing the chart and for enumerating (extracting) the solved forms. The solutions are returned in attributes of solution elements below the result element. Notice that you may need to resolve XML character entities that were used in the attribute value strings.

Options.

The solve command can take the following options:

input-codec: see solvable
input-codec-options: see solvable
output-codec: You can specify the output codec which should be used to encode the solved forms. If you don't specify the output codec explicitly, the command-line tool will first try to guess the output codec from the output filename if you specify one. If this doesn't work, it will try to use the output codec with the same name as the input codec, if it exists. The server will not make such guesses and expects you to specify the codec explicitly in any case.

In command-line mode, specify the output codec with the option --output-codec <codecname> or -O <codecname>. In server mode, specify it by passing the codec name as the output-codec attribute of the main utool element.

Notice that the solve command potentially outputs more than one solved form. This means that the output codec you specify here must support the output of multiple solved forms. The utool -d command shows you for each output codec whether it does this.
output-codec-options: Some output codecs will accept options, in a way that works exactly the same as for input codecs. You can specify output codec options on the command line using the option --output-codec-options <options>. In server mode, you can pass the output codec options in the output-codec-options attribute of the main utool element.
output: By default, the command-line tool will write the encoded solved forms to standard output. You can override this by specifying an output file with the option --output <filename> or -o <filename>. This option is not applicable in server mode because it doesn't write into files anyway.
no-output: You can instruct Utool not to output the actual solved forms by specifying the “no-output” option. Utool will still compute all solved forms in this case, it will simply not output them. This can be useful for runtime measurements. Specify the option --no-output or -n on the command line. In server mode, you get this effect if you simply don't specify any output codec in the utool element.
display-statistics: see solvable

3.5.3 Convert

This command reads a USR and outputs it again. The point about this operation is that the input and output codecs may be different. This means that you can use it to convert USRs from one underspecification formalism to another (to the extent that this is supported by theory).

Result.

In command-line mode, Utool will output the USR using the specified output codec. By default, it will write it to standard output, but you can again redirect the output to a file. The command always returns an exit code 0 on successful completion.

In server mode, Utool will send a reply of the following form:

<result usr='....' />

The string in the usr attribute is the converted USR. Remember that you may need to resolve XML character entities that were used in the attribute value strings.

Options.

The convert command can take the following options:

input-codec: see solvable
input-codec-options: see solvable
output-codec: see solve, except that convert doesn't require that the output codec supports the output of multiple solved forms
output-codec-options: see solve
output: see solve
no-output: see solve
display-statistics: see solvable

3.5.4 Classify

The classify command checks whether a dominance graph belongs to certain particularly well-behaved classes. It currently recongises the following classes:

weakly normal: A dominance graph is weakly normal [[]Bodirsky etal.2004] if the tree edges form a forest and all dominance edges go into roots.
normal: A weakly normal dominance graph is normal [[]Althaus etal.2003] if all dominance edges also start in (unlabelled) holes.
compact: A weakly normal dominance graph is compact if all fragments have depth zero or one, i.e. no node has both incoming and outgoing tree edges.
compactifiable: Many weakly normal graphs that are not compact can nevertheless be made compact by removing internal nodes and labelled leaves of fragments, and adding tree edges from the roots to the holes. In particular, all normal graphs are compactifiable. If a graph is compactifiable, it is guaranteed that there is a one-to-one correspondence between the solved forms of the compactified graph and the solved forms of the original graph.
hypernormally connected: A normal graph is hypernormally connected [[]Koller etal.2003, []Koller2004] if each pair of nodes is connected by a simple hypernormal path, where a hypernormal path is an undirected path that uses no two dominance edges that are adjacent to the same hole. This graph class is a bit abstract, but immensely useful because these graphs (a) have a lot of nice structural properties, and (b) we believe that all graphs that are currently used in underspecification are or should be hypernormally connected [[]Fuchss etal.2004]. The translations of MRS and Hole Semantics into dominance graphs are only defined for USRs that translate into hypernormally connected graphs because they rely on the structural properties mentioned in (a).
leaf-labelled: A weakly normal graph is leaf-labelled [[]Koller etal.2003] if every (unlabelled) hole has an outgoing dominance edge. If a graph is both hypernormally connected and leaf-labelled, each of its solved forms has a configuration.

Result.

The classes to which a graph belongs are encoded as the bit-wise OR of the following values:

class	bit value
weakly normal	1
normal	2
compact	4
compactifiable	8
hypernormally connected	16
leaf-labelled	32

The command-line tool will return this value as its exit code upon successful completion.

The Utool Server will return a message of the following form:

<result code='63' weaklynormal='true' normal='true'
        compact='true' compactifiable='true'
        hypernormallyconnected='true' leaflabelled='true' />

Here code is the exit code described above, and the other attributes are either true or false.

Options.

The classify command can take the following options:

input-codec: see solvable
input-codec-options: see solvable
display-statistics: see solvable

3.5.5 Server

The server command starts a new Utool Server. By default, this server listens to new connections on port 2802, but you can specify a different port using the --port option. This command is ignored if Utool is already running in server mode; it can only be run from the command line.

Result.

This command doesn't terminate by itself; you have to shut down the server process by hand. Alternatively, if you have an open Ubench window (by running the display command), you can shut the server down by clicking on the server button in the top right corner.

Options.

The server command can take the following options:

port: By default, the server will listen on a socket on port 2802. You can specify a different port using the option --port <number> (or -p <number>), where <number> is the port number you want.
logging: Using this option, you can make the server log information about incoming commands and its responses. If you specify the option --logging (or -l) by itself, the log messages will be written to standard error. Alternatively, you can log the messages into a file by specifying the filename: --logging <filename> or -l <filename>.
warmup: If you pass the option --warmup to the server, it will solve a number of dominance graphs before doing anything else. The effect of this is to encourage the Java Virtual Machine to compile parts of the solver to native machine code, so when you solve USRs that you are actually interested in, Utool will do this faster and in a more predictable time. Utool will keep you informed about the warmup process, and will print the message “Utool is now warmed up” when the warmup is finished.

3.5.6 Display

The display command instructs Utool to display the input USR in the Underspecification Workbench (Ubench). If the Utool process has opened a Ubench window before, it will display the USR in a new tab of the same window; otherwise it will open a new Ubench window first.

Unlike the previous commands which accept USRs as input, the input USR argument is optional in the display command. This means that you may specify an input USR (in which case it is displayed right away), or you can call display without arguments. In this case, an empty Ubench window is displayed; you can still open USRs from the File/Open menu or by sending further display commands to the Utool Server.

Result.

This command doesn't terminate by itself; you have to quit Ubench by choosing the File/Quit menu entry or closing the main window.

Options.

The display command can take the following options:

input-codec: see solvable
input-codec-options: see solvable

3.6 Pseudo-commands

In addition to the six main commands, the command-line version of Utool will accept a number of “pseudo-commands”, which display help information. If you call Utool with a pseudo-command, it executes the pseudo-command and terminates immediately. You cannot specify both a command and a pseudo-command at the same time.

The following pseudo-commands exist:

help [command]: If you specify a command, this will display a brief help message for that command. Otherwise, it will display an overview over the possible commands.
--version: Displays the version of Utool.
--display-codecs or -d: Displays the installed codecs.
--help-options: Displays an overview over some frequently used options.

3.7 Advanced options

3.7.1 Redundancy Elimination

One classical challenge when working with scope ambiguities is that structurally different readings may be logically equivalent. This problem of “spurious ambiguity” is illustrated by sentences like the following:

A researcher of some company saw a sample of a product. For travellers going to Finnmark there is a bus service from Oslo to Alta through Sweden. (Rondane 1262) We quickly put up the tents in the lee of a small hillside and cook for the first time in the open. (Rondane 892)

Example (??) is an ambiguous sentence with fourteen different readings, but these readings are all logically equivalent because existential quantifiers in predicate logic can be permuted with each other without changing the interpretation. Examples (18Redundancy eliminationItem.6) and (424Redundancy EliminationItem.13) are sentences from the Rondane Treebank. The English Resource Grammar analyses them as having 3960 and 480 scope readings, respectively; but the first sentence is intuitively unambiguous, and the second one has two readings that differ in the relative scope of “the lee” and “a small hillside”. This surprisingly high number of readings comes from the fact that the ERG analyses all kinds of noun phrases, including proper names and pronouns, as scope bearing operators, and many of these don't in fact take scope. But the basic problem of spurious ambiguity has haunted semanticists ever since Montague's Quantifier Raising analysis.

Utool is able to efficiently eliminate spurious ambiguities to a limited degree. It implements the redundancy elimination algorithm described by [[]Koller and Thater2006], which assumes that (logical) equivalence is approximated as equivalence with respect to a system of term equations. For efficiency reasons, Utool when run as a main programme won't eliminate splits from the chart. Instead, it won't even add eliminable splits to the chart in the first place. However, the distribution includes classes for running a redundancy elimination on the complete chart (see the de.saar.chorus.domgraph.equivalence package).

In order to make Utool eliminate equivalences in an invocation from the commandline, you must pass it the --equivalences <equivfile> or -e <equivfile> option. Here <equivfile> is the name of a file which contains a specification of the equation system. In server mode, you pass an element <eliminate equations="..." /> as a child of the utool element, where the value of the equations attribute is the specification of the equation system.

An equation system specification is an XML document of the following form:

<?xml version="1.0" ?>
<equivalences style="(some name)">   
  .....
</equivalences>

You may nest two kinds of elements below the root element:

equivalencegroup: This element specifies a set of label-hole pairs that can all be permuted with each other. For instance, if you want to specify that existential quantifier in first-order logic can be permuted with each other, regardless of whether they are in each other's scopes or restrictions, you can use the following specification:
```
<equivalencegroup>
  <quantifier label="exists" hole="0" />
  <quantifier label="exists" hole="1" />
</equivalencegroup>
```
By contrast, universal quantifiers only permute with each other if they are plugged into each other's scope:
```
<equivalencegroup>
  <quantifier label="every" hole="1" />
</equivalencegroup>
```
You may have as many equivalencegroup elements as you like, and you may have as many entries in each equivalencegroup as you like.
permutesWithEverything: This element expresses that a certain quantifier permutes with every other quantifier. For instance, you can state that a proper name as analysed by the ERG permutes with every other quantifier as follows:
```
<permutesWithEverything label="proper_q" hole="1" />
```

The hole attribute specifies the index of the child that is used in the permutation system, starting at 0 for the leftmost child. Notice that redundancy elimination is only defined for compact dominance graphs in [[]Koller and Thater2006]. Utool automatically keeps track of the hole indices when it compactifies the input graph, so the compactification is transparent to the user. However, this translation step is not well-defined in the case of multiple holes below the same child of the labelled root.

An example equivalence system, which is appropriate for the MRSs generated by the ERG, and which we used for the evaluation in [[]Koller and Thater2006], is part of the Utool distribution (in examples/erg-equivalences.xml).

3.8 Options overview

We conclude this section with an overview over all options.

--input-codec <codecname> or -I <codecname>
(Server mode: codec attribute of the usr elements)
(applies to: solvable, solve, classify, convert)

Specify the input codec.
--input-codec-options <options>
(Server mode: codec-options attribute of the usr elements)
(applies to: solvable, solve, classify, convert)

If the selected input codec accepts options, use this option to specify them.
--output-codec <codecname> or -O <codecname>
(Server mode: output-codec attribute of the utool element)
(applies to: solve, convert)

Specify the output codec.
--output-codec-options <options>
(Server mode: output-codec-options attribute of the utool element)
(applies to: solve, convert)

If the selected output codec accepts options, use this option to specify them.
--no-output or -n
(Server mode: assumes this option if no output codec is specified)
(applies to: solve, convert)

Compute the output, but don't display it (useful for runtime measurements). If you specify this option, you don't need to specify the output codec (and if you do, it is ignored).
--output <filename> or -o <filename>
(Server mode: not applicable: the server doesn't write to an output file)
(applies to: solve, convert)

Write the output to the specified file, rather than to standard output. If you use this option and don't specify an output codec explicitly, Utool will try to guess the appropriate output codec from the filename extension.
--display-statistics or -s
(Server mode: not applicable: the server reports statistics information anyway)
(applies to: solvable, solve, classify, convert)

Display statistics information while executing the command, such as information about the graph classification, chart size, and runtimes. All statistics information is written to standard error.
--equivalences <equivfile> or -e <equivfile>
(Server mode: specify an element of the form <eliminate equations="..." /> as a child of the utool element)
(applies to: solvable, solve)

Run a redundancy elimination algorithm on the chart (see Section 3.7.124Redundancy Eliminationsubsubsection.3.7.1).
--nochart
(Server mode: pass the attribute nochart="true" in the main utool element)
(applies to: solvable)

Don't compute a complete chart to determine solvability. This is much faster, but makes it impossible for Utool to count solved forms.
--warmup
(Server mode: n/a)
(applies to: server)

Warm up the JVM by solving a number of USRs before accepting commands.