sup_util - "Server Up?" Utilities

The SUP_UTIL "Server Up?" functions provide a high-level interface to the VMS Lock Management facility for the purpose of determining if a server process is up.

A VMS lock is basically a shared data structure accessible from any computer in a cluster. With the SUP_UTIL functions, a server process gets and holds exclusive WRITE access (protected write mode in VMS terminology) to a named lock:

    sup_avail ("name", "options") ;

When sup_avail() is granted WRITE access to the lock, it stores the server's host computer name in the lock. The server retains exclusive WRITE access to the lock until it is explicitly released by a call to sup_unavail() or until the server process terminates and VMS automatically releases the lock.

A client process on any computer in the cluster can determine if the server process is up and running by the following call:

    sup_check ("name", &generation, &host) ;

If a host name is returned, the server process is up on that host. If NULL is returned, the server is not up (or isn't ready for "serving"). Any number of client processes can query one or more servers' locks as often as they wish.

A generation count is kept in the lock and incremented whenever a new server comes up. A client can tell if its old server has been replaced by a new server by comparing the generation numbers returned by sup_check():

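    char  *host ;
    int  generation ;
    static  int  old_generation = 0 ;

    /* The !sup_check() test treats a zero return as success; a non-NULL
       host means that a server is currently up. */
    if (!sup_check ("name", &generation, &host) &&
        (host != NULL) && (generation != old_generation)) {
        old_generation = generation ;   /* Remember the server just seen. */
        ... new server ...
    }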

If a client process checks a server's status before the server process is up, the client creates the lock and flags its value (i.e., the server's host name) as invalid. Any client calling sup_check() will be told that the server is not up. When the server finally does come up, the lock value will become valid and be updated with the server's host name.

If a server process that is up goes down (voluntarily or involuntarily!), the lock value is automatically marked by VMS as invalid. Alternatively, a server process can explicitly signal its "down" state:

    sup_unavail ("name") ;
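
The life cycle described above (clients see no host until a server comes up, the generation count rises with each new server, and the host disappears again when the server goes down) can be sketched with a portable, in-memory stand-in for a single lock. This is only a model for reasoning about the protocol, not the VMS implementation: ModelLock, model_avail(), model_check(), and model_unavail() are hypothetical names, and the zero-on-success return convention is assumed from the !sup_check() test shown earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical in-memory stand-in for one named lock; not the VMS code. */
typedef struct {
    int  valid ;            /* 0 = value flagged invalid (no server up). */
    int  generation ;       /* Bumped each time a server comes up. */
    char  host[32] ;        /* Host name stored by the server. */
} ModelLock ;

/* Server comes up: bump the generation, store the host, mark value valid. */
int  model_avail (ModelLock *lock, const char *host)
{
    assert ((lock != NULL) && (host != NULL)) ;
    lock->generation++ ;
    snprintf (lock->host, sizeof lock->host, "%s", host) ;
    lock->valid = 1 ;
    return (0) ;
}

/* Server goes down (explicitly or by exiting): value becomes invalid. */
int  model_unavail (ModelLock *lock)
{
    assert (lock != NULL) ;
    lock->valid = 0 ;
    return (0) ;
}

/* Client query: *host is NULL unless a server currently holds the lock. */
int  model_check (const ModelLock *lock, int *generation, const char **host)
{
    assert ((lock != NULL) && (generation != NULL) && (host != NULL)) ;
    *generation = lock->generation ;
    *host = (lock->valid && (lock->generation > 0)) ? lock->host : NULL ;
    return (0) ;
}
```

Starting from a zero-initialized ModelLock, a model_check() call reports a NULL host and generation 0; after model_avail() it reports the stored host and generation 1; after model_unavail() the host is NULL again, and a second model_avail() raises the generation to 2.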


The server-up mechanism is implemented by a single lock that logically contains three items of information: a "server is up" flag, a generation count, and a host name. The "server is up" flag is not explicitly represented. Instead, if clients come up before the server, a generation count of zero indicates that the server is not yet up. After the first server comes up, the generation count is incremented each time a new server comes up, and the "server is NOT up" condition is represented by VMS flagging the lock's value as invalid (when the server terminates or explicitly dequeues its exclusive access to the lock).
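
As a rough illustration of how the three logical items might be derived from one lock value, the sketch below packs a generation count and a host name into a 16-byte block (the size of a classic VMS lock value block) and maps the count plus a "value invalid" indication onto a server state. The field split (4-byte count, 12-byte host field) and the names SupValue, SupState, and interpret_value() are assumptions made for this example; the actual layout used by the SUP_UTIL functions is not stated here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define  VALUE_BLOCK_SIZE  16       /* Size of a classic VMS lock value block. */

/* Assumed layout (hypothetical): 4-byte count followed by 12-byte host field. */
typedef struct {
    uint32_t  generation ;
    char  host[VALUE_BLOCK_SIZE - sizeof (uint32_t)] ;
} SupValue ;

/* Compile-time check that the assumed layout exactly fills the block. */
typedef char  sup_value_fits[(sizeof (SupValue) == VALUE_BLOCK_SIZE) ? 1 : -1] ;

typedef enum { SERVER_NEVER_UP, SERVER_DOWN, SERVER_UP } SupState ;

/* Derive the implicit "server is up" flag from the generation count and
   from whether the lock manager reported the value as invalid. */
SupState  interpret_value (const SupValue *value, int value_invalid)
{
    assert (value != NULL) ;
    if (value->generation == 0)
        return (SERVER_NEVER_UP) ;  /* Lock created before any server came up. */
    return (value_invalid ? SERVER_DOWN : SERVER_UP) ;
}
```

In this model, a zero generation always means that no server has ever come up, regardless of the validity indication; only a nonzero generation with a valid value counts as "up".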

Note that the single-lock implementation assumes that the lock value is initialized to all zeroes when VMS creates the lock, so that, if a client creates the lock before the first server is up, the client sees a generation count of zero. Test runs exhibited this behavior, but I could find nothing in the VMS documentation that might validate this assumption. Ideally, VMS should (but doesn't) flag a new lock's value as invalid until the first update of the lock's value. Should my assumption prove to be false, I think the server-up mechanism could be reliably implemented using two locks: the first acting as a "server is up" flag and the second holding the generation count and host name.

When requesting a null lock on a resource, sup_check() uses the LCK$M_EXPEDITE modifier to avoid being blocked by outstanding sup_avail() requests for protected write access to the same resource. LCK$M_EXPEDITE was described in the VMS 5.0 System Services Reference Manual, but disappeared from the VMS 5.5 documentation. Although LCK$M_EXPEDITE is not defined in <lckdef.h>, a one-line MACRO program ("$LCKDEF") produced a value of 2048. Warning: LCK$M_EXPEDITE may be obsolete!
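
One way to cope with the missing definition is a guarded fallback, sketched below. SUP_LCK_M_EXPEDITE and with_expedite() are hypothetical names introduced for this example; the fallback value 2048 is the one produced by the MACRO program mentioned above, and whether a given VMS version still honors the modifier remains an open question.

```c
#include <assert.h>

#ifdef __VMS
#  include <lckdef.h>               /* May or may not define LCK$M_EXPEDITE. */
#  ifdef LCK$M_EXPEDITE
#    define  SUP_LCK_M_EXPEDITE  LCK$M_EXPEDITE
#  endif
#endif

#ifndef SUP_LCK_M_EXPEDITE
#  define  SUP_LCK_M_EXPEDITE  2048 /* Value reported by the $LCKDEF program. */
#endif

/* Add the expedite modifier to a caller-supplied set of lock request flags. */
unsigned  long  with_expedite (unsigned long flags)
{
    /* Sanity check: the modifier is expected to be a single-bit flag. */
    assert ((SUP_LCK_M_EXPEDITE & (SUP_LCK_M_EXPEDITE - 1)) == 0) ;
    return (flags | SUP_LCK_M_EXPEDITE) ;
}
```

Keeping the constant behind a wrapper name means that, if the modifier does turn out to be obsolete, only this one definition needs to change.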

Public Procedures

sup_avail() - signals availability of a server process and advertises its host name.
sup_check() - checks the status of a server process and retrieves the host name of the computer on which the server is running.
sup_unavail() - announces that a server process is no longer available.

Source Files


Alex Measday