|
|
|
This memo discusses some problems (and solutions) related to network I/O:
Broken pipe signals - this section should be read by everyone who writes programs that perform network I/O.
Non-blocking I/O - this section describes a new TPOCC library package, QIO_UTIL, that makes asynchronous, non-blocking network I/O relatively painless. The QIO_UTIL package allows a program to queue output requests to a network socket without having to wait for the actual writes to complete. Most TPOCC programs and TPOCC-based applications need this capability, whether they know it or not. For example, momentary bottlenecks in the events subsystem should not block a real-time, EKG-monitoring program in the middle of logging an event message. At a minimum, this section should be read by everyone who interfaces with the data server.
"I don't feel like telling you everything because I don't feel like telling you anything."- My 5-year old's response to the question, "How was school today?"
If, in the midst of your debugging, you come across a problem that could conceivably occur in programs other than your own, and, better yet, if you come up with a solution, please share it with everyone else.
For some time now, Steve has been having problems with the data server
mysteriously disappearing while he was debugging the display program.
Steve always figured it was the data server's fault; I always figured it
was Steve's fault. Just my luck - Steve gets the gold star this time! We
eventually figured out that, if the data server was in the middle of
writing data to the display program when the display program was terminated
(by exiting the debugger), the data server would exit because of a
SIGPIPE
signal.
More generally, attempting to write to a network connection that has gone
down (e.g., because the process on the other side broke the connection)
causes a SIGPIPE
(broken pipe) signal to be generated.
SIGPIPE
is one of those signals that, if not handled, silently
aborts a program. If your server program must survive broken connections
to clients, it should install a handler function to catch
SIGPIPE
signals. Installing a signal handler is accomplished
by a call to signal(2)
in your main routine:
#ifdef VXWORKS
#    include <sigLib.h>			/* Signal definitions. */
#else
#    include <signal.h>			/* Signal definitions. */
#endif

extern void my_handler () ;		/* External functions. */

...

main ()
{
    ...
    /* Install a handler function to field broken pipe signals. */
    signal (SIGPIPE, my_handler) ;
    ...
}
The signal handler function itself can be very simple:
#ifdef VXWORKS
#    include <sigLib.h>			/* Signal definitions. */
#else
#    include <signal.h>			/* Signal definitions. */
#endif

void my_handler (sig, code, scp, addr)
    int sig, code ;
    struct sigcontext *scp ;
    char *addr ;
{
#ifdef SYSV
    /* Reinstall the signal handler. */
    signal (sig, my_handler) ;
#endif
}
Note that, under System V UNIX (e.g., HP/UX), a signal handler must be reinstated each time its signal is raised.
You shouldn't depend upon SIGPIPE
signals to detect broken
connections. SIGPIPE
s are only generated when a
write(2)
is attempted on a broken connection. Trying to
read(2)
a broken connection doesn't generate a signal; it
simply returns zero bytes of input. This information is currently used by
TPOCC and TPOCC-based programs to detect broken network connections: if a
select(2)
call indicates the connection has data to read, but
read()
can't find anything, the connection must have gone
down.
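In code, the idiom looks something like the following sketch (connection is a hypothetical socket descriptor; error checking is omitted):

#include <sys/types.h>			/* System type definitions. */
#include <sys/time.h>			/* select(2) definitions. */

fd_set read_mask ;
char buffer[512] ;
int length ;

FD_ZERO (&read_mask) ;
FD_SET (connection, &read_mask) ;

if ((select (FD_SETSIZE, &read_mask, NULL, NULL, NULL) > 0) &&
    FD_ISSET (connection, &read_mask)) {
    length = read (connection, buffer, sizeof buffer) ;
    if (length == 0) {
        ... select() said the connection was readable, but read()
            found nothing - the connection has gone down ...
    }
}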
Why hasn't the broken pipe problem been more evident? First, programs
typically spend most of their time waiting for input; consequently, broken
connections are more likely to be detected during a
select()
/read()
sequence than during a
write()
. Second, the problem has probably been occurring more
frequently than we realize. TSTOL has been known to quietly exit on
occasion; in retrospect, the symptoms point to broken pipe signals.
The subject of broken pipes rang a bell in my head - Steve thinks I have
lots of bells ringing in my head - so I ran grep(1)
on the
TPOCC source directory tree, looking for SIGPIPE
. Sure
enough, I found it. The programs in the events subsystem all have broken
pipe signal handlers. A lot of hair-pulling and hand-wringing on Steve's
part could have been avoided (Pete: "Can we vote on this?") had this
information been more widely publicized.
"The seaweed's always greener in someone else's lake."- Sebastian the Crab, in The Little Mermaid.
One of the benefits of working with UNIX is that you find out that the
seaweed really is greener in the VMS lake. (For those of you who
don't know any better, VMS is the operating system of choice for VAX
computers.) We recently discovered that the display program and the data
server are prone to deadlock when several pages are up on the screen and
Display tries to bring up another one. The data server is too busy
outputting DATA
packets to read ADD
requests,
while Display is too busy outputting ADD
requests to read
DATA
packets. They both block on write(2)
s when
their network buffers fill up:
              DATA
   Display <---------- Data
   Process ----------> Server
              ADD
The smug grin spreading across my face was cut short by the realization
that knowing that VMS provides excellent support for asynchronous,
non-blocking I/O doesn't make up for UNIX's shortcomings. To remedy the
situation, a new package of routines, qio_util.c
, has been
added to TPOCC's libutilgen library. This package simulates the VMS
queued I/O (QIO) facility. Although the functions in TPOCC's QIO package
perform raw network I/O, they are easily layered underneath XDR for
applications that communicate using that protocol.
Under VMS, writing to a channel (e.g., a file or a network connection) results in a write request being queued up to the device driver:
Device           I/O       I/O           I/O         Application
Driver  <----  Request   Request  ...  Request  <----  Program
                  ^                       ^
                  |                       |
              Front of                 Rear of
               Queue                    Queue
User-level function calls add read and write requests to the end of the queue. When finished with an I/O request, the device driver pops the next one from the front of the queue and begins processing it.
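In C terms - purely for illustration, these are not the actual VMS or QIO_UTIL data structures - such a queue is just a FIFO linked list:

/* An illustrative I/O request queue - not the real thing. */

typedef struct io_request {
    char *data ;			/* The data to be written (or read). */
    int length ;			/* Number of bytes. */
    struct io_request *next ;		/* Next request toward the rear. */
} io_request ;

typedef struct io_queue {
    io_request *front ;			/* Next request to be processed. */
    io_request *rear ;			/* Most recently queued request. */
} io_queue ;

/* User-level calls add requests at the rear of the queue ... */

void enqueue (queue, request)
    io_queue *queue ;
    io_request *request ;
{
    request->next = NULL ;
    if (queue->rear == NULL)
        queue->front = queue->rear = request ;
    else {
        queue->rear->next = request ;
        queue->rear = request ;
    }
}

/* ... and the "device driver" pops the next request from the front. */

io_request *dequeue (queue)
    io_queue *queue ;
{
    io_request *request = queue->front ;
    if (request != NULL) {
        queue->front = request->next ;
        if (queue->front == NULL)
            queue->rear = NULL ;
    }
    return (request) ;
}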
Once an I/O request is issued by a program (through a FORTRAN
WRITE
statement, for example), the program can do one of two
things:
Synchronous I/O - wait for the I/O request to complete before continuing with processing.
Asynchronous I/O - continue processing immediately.
In the case of asynchronous I/O, a program may be notified of the
completion of an I/O request by the setting of an event flag
(similar to a UNIX semaphore) or by the invocation of a user-specified
asynchronous trap (AST) function. An AST is like a UNIX signal; an
AST function for asynchronous I/O is like a UNIX SIGIO handler. AST
functions are specified on a per-request basis, so a program can have
different AST functions for different channels (or even for different
requests queued to the same channel). UNIX's SIGIO
signal, on
the other hand, is raised when I/O on any channel completes; a
SIGIO
handler must typically poll all of a program's open
channels to find out which one generated the signal.
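By way of contrast, here is a sketch of the polling a UNIX SIGIO handler is reduced to (channels[] and num_channels are hypothetical bookkeeping; error checking is omitted):

void sigio_handler (sig)
    int sig ;
{
    fd_set read_mask ;
    struct timeval poll ;
    int i ;

    poll.tv_sec = 0 ;			/* Zero timeout - don't wait, just poll. */
    poll.tv_usec = 0 ;

    FD_ZERO (&read_mask) ;
    for (i = 0 ; i < num_channels ; i++)
        FD_SET (channels[i], &read_mask) ;

    if (select (FD_SETSIZE, &read_mask, NULL, NULL, &poll) > 0) {
        ... service whichever channels are set in read_mask ...
    }
}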
Okay, now that you know more about VMS than both Steve and Paul put
together, how does this apply to TPOCC? Suppose the data server could queue
up output requests for DATA
packets and go on about its
business, such as reading ADD
commands from Display.
Likewise, suppose Display could queue up ADD
commands and get
back to what it has to do: read and display DATA
packets
received from the data server. The result: no blocking and no deadlock in
the scenario described at the beginning of this section.
The new TPOCC QIO utilities provide some of the aforementioned VMS capabilities and are implemented so that they can be incorporated into existing programs with only minimal changes to the target programs. For example, the data server was modified to use the QIO utilities by:
Adding a call to qio_init() to the main routine.
Adding a call to qio_flush() in the server's select(2) loop.
Adding a call to qio_configure() after a new client's connection request is answered.
Replacing the call to xll_write() in the data server's low-level XDR output routine by a call to qio_write().
15 minutes' work for someone on a VT 320; it may take longer on an X Windows console.
Now, when a client of the data server (e.g., Display) cannot immediately read what the data server is sending, the sampled data packets pile up in the data server's output queue. Once the client begins reading again, the pent-up packets are flushed to the network.
Since the TPOCC QIO utilities are new code, they probably need to be subjected to a formal walkthrough. Just one problem - who is qualified to pass judgement on this VMS-inspired code? Don "DEC? Bleecch!@#?!" Slater and Steve "if not Gould then HP" Gibson are misfits - oops, I mean - unfit for this task. Linda, Miriam, and Pete are possible candidates; Pete would qualify on his Amiga experience alone.
Making use of TPOCC's new QIO utilities is very easy. You need to:
qio_init() - Initialize the QIO package's internal data structures. This step is optional, since these structures are pre-initialized, static variables.
qio_configure() - After you open a connection (via net_answer() or net_call(), for instance) that is to use QIOs, configure the connection for buffered, non-blocking I/O.
qio_write() - Rather than directly writing to a configured connection, queue up a write request on that connection.
qio_flush() - Periodically check for and attempt to complete I/O requests that haven't been completed yet.
For example, the following program queues up and writes 100 messages to the tpocc_display network server:
#include <stdio.h>			/* Standard I/O definitions. */
#include <string.h>			/* strlen(3) and friends. */

main ()
{
    /* Local variables. */
    char buffer[64] ;
    int connection, i ;

    /* Contact the network server and configure the connection
       for buffered, non-blocking I/O. */
    net_call (NULL, "tpocc_display", &connection) ;
    qio_configure (connection, 1, 1, -1) ;

    /* Queue up 100 messages for output. */
    for (i = 0 ; i < 100 ; i++) {
        sprintf (buffer, "%d Motifs do not an Open Look make.", i) ;
        qio_write (connection, buffer, strlen (buffer), NULL, NULL) ;
    }

    /* Wait until all the messages have been output to the network. */
    while (qio_pend () > 0)
        qio_flush () ;
}
qio_pend()
is a function that returns the number of pending
I/O requests. Error checking is not shown in the example above, but most
of the QIO functions (as well as net_call()
) return function
values of zero if no errors occurred and ERRNO if one did.
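Using that convention, an error-checked version of the key calls might look like this (the messages are just illustrative):

int status ;

status = net_call (NULL, "tpocc_display", &connection) ;
if (status) {
    fprintf (stderr, "Error %d contacting tpocc_display.\n", status) ;
    exit (1) ;
}

status = qio_configure (connection, 1, 1, -1) ;
if (status) {
    fprintf (stderr, "Error %d configuring the connection.\n", status) ;
    exit (1) ;
}

...

if (qio_write (connection, buffer, strlen (buffer), NULL, NULL)) {
    fprintf (stderr, "Error queueing write request.\n") ;
    exit (1) ;
}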
Note that qio_flush()
must be called periodically in order to
flush any uncompleted I/O requests. This was no problem for the data
server, which has to wake up every tenth of a second anyway; the call to
qio_flush()
was just added to the data server's
select()
loop. Other programs may find it a little more
difficult.
When you use the QIO functions, be sure and read their prologs (in
qio_util.c
in TPOCC's libutilgen library).
qio_configure()
deserves special mention here. It has 4
arguments:
channel - is the UNIX file descriptor for the output device (e.g., a network connection).
is_nonblocking - specifies whether or not I/O on this channel will block if a read or write is attempted and the channel is not ready.
is_buffered - specifies whether or not qio_write() should buffer the data for your requests.
max_outstanding - specifies the maximum number of outstanding I/O requests the channel's queue will hold; -1 means there is no limit.
If you designate a channel as non-blocking, qio_configure()
will automatically configure the UNIX file descriptor for non-blocking I/O
via an ioctl(2)
system call. The QIO package also works with
blocking I/O, although there is a performance penalty, since
qio_flush()
must call select()
to see if a
connection is ready before attempting a read()
or
write()
. (Incidentally, VxWorks supports non-blocking I/O;
furthermore, the QIO utilities have been tested under VxWorks, HP/UX, and
SunOS.)
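For the curious, marking a file descriptor as non-blocking boils down to something like this sketch (FIONBIO is the usual ioctl request code; whether qio_configure() uses this exact call is an internal detail):

#include <sys/ioctl.h>			/* ioctl(2) definitions. */

int flag = 1 ;				/* 1 = enable non-blocking I/O. */

if (ioctl (channel, FIONBIO, (char *) &flag) == -1)
    perror ("ioctl (FIONBIO)") ;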
If a write request is queued to a channel marked as "buffered",
qio_write()
will malloc(3)
a "system" buffer for
the user's data; the caller can then reuse the buffer passed into
qio_write()
, without having to wait for the QIO to complete.
If a connection is "unbuffered", the user must perform his or her own
buffering. If the dynamics of your system are such that buffering could
lead to excessive memory usage, the max_outstanding
parameter
can be used to limit the number of uncompleted I/O requests in a queue.
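For example, using the argument order just described (the 64-request limit is purely hypothetical):

/* Non-blocking, buffered, unlimited queue - the data server's choice. */
qio_configure (channel, 1, 1, -1) ;

/* Non-blocking, unbuffered, at most 64 outstanding requests -
   a configuration for a memory-conscious application. */
qio_configure (channel, 1, 0, 64) ;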
The sample program shown earlier performed raw
write()
s to the network. Most of the TPOCC-based programs,
however, use the XDR protocol to exchange data. Fortunately, the QIO
package can be "layered in" underneath the XDR calls our programs make. To
show how this is accomplished, the changes to the data server are shown
below.
First, whenever a new network connection is opened, you must call
qio_configure()
to configure the socket for buffered,
non-blocking I/O. In the data server, this is done immediately after the
net_answer()
in new_client.c
:
/* Answer connection request from client. */

if (net_answer (server, 99, &sock, &clnt_sock)) {
    vsend_event (SCKT_ANSW_ERR, "answering", NULL) ;
    return (NULL) ;
}

/* Configure connection for queued I/O. */

if (qio_configure (clnt_sock, 1, 1, -1)) {
    vsend_event (SCKT_ANSW_ERR, "configuring", NULL) ;
    close (clnt_sock) ;
    return (NULL) ;
}
Next, the low-level write routine called by the XDR functions must be
changed to issue QIOs instead of writing directly to the network. The data
server's low-level write function, writedstcp()
(defined in
new_client.c
), originally looked as follows:
static int writedstcp (client, buf, len)
    struct client_data *client ;
    char *buf ;
    int len ;
{
    return (xll_write (client->client_sock, buf, len, 0,
                       &client->client_error.re_status,
                       &client->client_error.re_errno)) ;
}
In the new version of the function, the call to xll_write()
was replaced by a call to qio_write()
:
static int writedstcp (client, buf, len)
    struct client_data *client ;
    char *buf ;
    int len ;
{
    if (qio_write (client->client_sock, buf, len, NULL, NULL)) {
        vperror ("(writedstcp) Error queueing write request for %d bytes to channel %d.\nqio_write: ",
                 len, client->client_sock) ;
        client->client_error.re_status = RPC_CANTSEND ;
        client->client_error.re_errno = errno ;
        return (-1) ;
    }
    client->client_error.re_status = RPC_SUCCESS ;
    client->client_error.re_errno = 0 ;
    return (len) ;
}
Now, whenever an XDR record is ready to be sent, QIOs are issued for the
data. As mentioned earlier, the data server periodically calls
qio_flush()
to actually write the data to the network. The
call to qio_flush()
occurs in the data server's
select()
loop (in data_server.c
):
while (TRUE) {

    ... construct read mask and set timeout for 1/10 second ...

    switch (select (FD_SETSIZE, &read_mask, ...)) {
    case 0:
        ... sample and send any data that is ready to be sent ...
        qio_flush () ;
        continue ;
    case -1:
        ... error ...
    }

    ... check for, read, and process commands from clients ...

}
In addition to the QIO functions shown in the previous sections, the QIO
package has a number of other public functions; a complete list is
presented below. qio_read()
and qio_seek()
were
added for the sake of completeness; they are untested and I'm not sure how
or if qio_read()
could be used in conjunction with XDR.
qio_termc()
should be called when a connection is closed.
qio_init() - initializes QIO_UTIL's internal data structures.
qio_configure() - configures a previously-opened channel for buffered/unbuffered, blocking/non-blocking I/O.
qio_read() - adds a read request to a channel's I/O queue.
qio_seek() - adds a seek request to a channel's I/O queue.
qio_write() - adds a write request to a channel's I/O queue.
qio_flush() - attempts to complete QIOs pending on any channel.
qio_flushc() - attempts to complete pending QIOs in a specific channel's I/O queue.
qio_pend() - returns the number of QIOs pending on all channels.
qio_pendc() - returns the number of QIOs pending on a specific channel.
qio_term() - deletes all of the I/O queues and any pending QIOs in those queues.
qio_termc() - deletes a specific channel's I/O queue and any pending QIOs in that queue.
qio_ast() - is a sample AST function, invoked when an I/O request completes.
qio_dump() - dumps the contents of the I/O queues to standard output.
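For example, a client's connection might be retired as follows (just a sketch, assuming the zero-on-success return convention described earlier):

/* Try to complete the channel's pending QIOs, then delete its queue. */
while ((qio_pendc (channel) > 0) && (qio_flushc (channel) == 0))
    ;
qio_termc (channel) ;
close (channel) ;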
A global debug flag, qio_util_debug
, can be set to enable
debug output from these functions, including a dump of all data read from
or written to the network. qio_init()
resets the debug flag,
so be sure and enable debug after calling qio_init()
.
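For example:

extern int qio_util_debug ;		/* Global debug flag in qio_util.c. */
...
qio_init () ;				/* Resets qio_util_debug ... */
qio_util_debug = 1 ;			/* ... so set the flag afterwards. */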
Under VAX/VMS, when an I/O request completes, either successfully or
unsuccessfully, the program can be notified in one of two ways: by an
asynchronous trap (AST) or by the setting of an event flag. An AST is
basically a software interrupt that invokes an interrupt handler. The
TPOCC QIO package supports an AST-like mechanism by allowing a program to
specify an AST function when qio_read()
,
qio_write()
, or qio_seek()
are called. For
example, a write request is issued by the following call:
extern void AST_function () ;
...
qio_write (channel, buffer, length, AST_function, AST_argument) ;
AST_function is a pointer to an AST function, which can be:
NULL, if no AST function is to be invoked, or
the address of a function to be invoked when the I/O request completes.
AST_argument
is an arbitrary, user-specified argument, cast as
a VOID *
pointer, which will be passed to the AST function.
When the write requested by the call above is completed,
qio_flush()
calls AST_function
, passing it
AST_argument
, as well as several other arguments. An AST
function should be defined as follows:
void AST_function (error_code, channel, operation, buffer, length, AST_argument)
    char *buffer ;
    int channel, error_code, length, operation ;
    void *AST_argument ;
{
    ...
}
Descriptions of the arguments can be found in the prolog for
qio_ast()
, a sample AST function in qio_util.c
which prints out a debug message when an I/O operation completes. An AST
function is free to do pretty nearly anything it wants to do. "Event
flags" can be simulated by specifying an AST function that signals a
semaphore.
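For instance, a crude event flag - just a sketch, using a global variable rather than a real semaphore - might look like this:

static volatile int io_done = 0 ;	/* The simulated event flag. */

void set_flag_ast (error_code, channel, operation, buffer, length, AST_argument)
    char *buffer ;
    int channel, error_code, length, operation ;
    void *AST_argument ;
{
    io_done = 1 ;			/* "Set" the event flag. */
}

...

io_done = 0 ;				/* "Clear" the event flag. */
qio_write (channel, buffer, length, set_flag_ast, NULL) ;
while (!io_done)			/* "Wait" for the event flag. */
    qio_flush () ;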
Clients such as the TPOCC display program interface with the data server
through TPOCC's data services library, libds.
submit_cmd()
is called to send data requests to the data
server; request_data()
is called to receive sampled data from
the data server. Although the data server has been modified to use queued
I/O for outputting data, it might be a good idea if data clients themselves
used QIOs for sending commands to the data server. For example, the
display program will block trying to write ADD
commands if the
data server is not reading them fast enough; this bodes ill for user
responsiveness.
Fortunately, there is a solution. The data services library has been upgraded to allow an application to choose between:
normal, blocking writes to the data server, and
queued, non-blocking writes using the QIO utilities.
Selecting one or the other is as simple as setting a global flag,
libds_use_qios
, to true (non-zero) or false (zero). Of
course, if an application decides to use QIOs for data services, it must
remember to periodically call qio_flush()
.
To illustrate the modifications that need to be made to a program, the
following discussion shows how TPOCC's display program could be modified to
use QIOs when sending commands to the data server. First, in
xtpdsp.c
(Display's main routine), the following lines need to
be added to set the "use QIOs" flag:
extern int libds_use_qios ;
...
libds_use_qios = 1 ;			/* Tell the data services library to use QIOs. */
Next, qio_flush()
must be called periodically.
Periodic()
, in Display's DataComm library, is the
obvious place for this call, nestled in between the calls to
XtpPollData()
and XFlush()
:
void Periodic (display, id)
    ...
{
    XtpPollData () ;
    qio_flush () ;
    ...
    XFlush (display) ;
    ...
}
Done! And in less time than it takes an X Windows programmer to tell you
how busy he is! Note that the call to qio_flush()
in the
periodic function will not affect display programs that don't use QIOs;
since there would be no I/O queues, qio_flush()
would return
immediately.
What about using the QIO utilities with files? They don't work. Actually, the QIO utilities work fine with files; it's just that non-blocking I/O doesn't seem to mean anything with respect to
files. If you queue up many writes to a file, a single call to
qio_flush()
will flush them all. Sorry! (The new HP/UX adds
an asynchronous file I/O capability.)
What else can be done with the QIO utilities? The event logging interface
(vsend_event()
, etc.) ought to be converted to use QIOs for
writing messages to the event logger. There are only two ways of doing
this, as far as I can see:
Have each program periodically call qio_flush().
Have vsend_event() call qio_flush() after "sending" an event message.
Lots of programs would have to be modified to support Option #1. Option #2
would not affect existing programs, but, if an event message could not be
sent on one call to vsend_event()
, it would not get
transmitted until the next call.
Since stream_svr()
, in TPOCC's libutilgen library, is
so widely used, I haven't yet modified it to use QIOs. This function
already has a "non-blocking?" argument which could be used to enable queued
I/O. Any objections? (stream_svr()
's non-blocking feature is
#ifdef
ed for SunOS only; I don't think anyone has ever used
it.)
Bill Stratton wrote some non-blocking XDR functions
(xnb_util.c
) that, unlike the QIO functions, throw away XDR
records that can't be written without blocking. This capability could
possibly be simulated by a function, qio_prune()
, which simply
deletes all I/O requests except the current one from a queue. As with the
XNB utilities, there is no way to guarantee that a write request covers a
whole record and nothing but the record. By the way, Bill, we're glad to
see you doing something useful again - it's been a long time since we've
seen the likes of your flat list and hash routines!
The QIO utility package was written kind of off the cuff, so if you have any suggestions, comments, or questions, please let me know.