C Packages

Implementing Ada-style Packages in C

Published in The C Users Journal, June 1992


This article is adapted from my 1991 "Naming Conventions" memo. The techniques suggested in this article were gradually refined into a simple, but successful, form of object-based programming in C, as described briefly in my software library's design and coding conventions. Nothing original on my part, just practices shaped by seeing others' good code (especially Meng Lin's event logging library, as mentioned in my memo) and reading books like Bertrand Meyer's Object-Oriented Software Construction (see my OOP tutorial). The article is also available at the C/C++ Users Journal web site.

In an effort to bring NASA into the space age, our company was asked to build a generic satellite control center, readily adaptable to new missions. The POCC (Payload Operations Control Center) we designed and built makes use of X Windows to provide an up-to-date operator interface and networked UNIX workstations to provide some measure of hardware and software portability. Naturally, the POCC's software is written almost exclusively in C.

Naming conventions are a necessity on a large software project and, as part of the quality assurance program they set up for the customer, the main contractor (we're a subcontractor) established a set of standards for file and function names in our software. The standards can be boiled down into two basic rules: (i) each function has a 3-character prefix that identifies the subsystem of which it is a part and (ii) the name of a source file should match the name of the function within. The standards document, while acknowledging the common C practice of grouping related functions in a single file, encouraged programmers to store only one function per file.

There are some very good reasons, however, for placing more than one C function in a source file, reasons that touch on the "nature of C", as Art Shipman calls it (The C Users Journal, "Questions & Answers", February 1991).

Encapsulation and data hiding are important techniques in software engineering for decreasing the coupling between modules in a program. The weaker the coupling between two modules, the less one will be affected by changes to the other. These techniques are exemplified in Ada, which hides the implementation of a capability in a package, thus shielding clients of the capability from changes to the implementation. An Ada package consists of two parts, a package specification whose declarations constitute the public interface to the package and a package body which hides the package's actual implementation.

A C source file is analogous to an Ada package body. Static, non-local variables (i.e., those not declared in the scope of a function) in a C source file are like variables declared in the body of an Ada package: client modules have no knowledge of such variables and no access to them, except through declared procedures. Static functions in a C source file are like Ada procedures which are defined in the body of a package but not in the package specification: internal to the package, these procedures cannot be called by client modules.

For example, Listing 1 shows an Ada package for managing a symbol table. The table is implemented as a simple list of name/value pairs; an internal variable, SYMBOL_LIST, points to the list.

The representation of the list is not important; it could be a fixed-size array, a dynamically-allocated array, a linear linked list, a binary tree, a hash table, or even a skip list! SYM_ADD() is a procedure that adds a symbol to the symbol table; SYM_DELETE() is a procedure that deletes a symbol from the table. SYM_LOOKUP() is a function that returns the value assigned to a symbol. All three routines call an internal function, SYM_LOCATE(), that locates a symbol in the list and returns a pointer to the symbol's list node. SYM_UTIL_DEBUG is a global debug switch that a program can set to turn on debug output in the SYM_UTIL functions.

A comparable C "package", stored in a single source file, is shown in Listing 2. The symbol_list pointer and the sym_locate() function are declared static, so they are unknown outside of this file. The remaining functions and the global debug flag are all accessible to the public.

Clients (users) of the symbol table package (in either language) cannot reference the symbol_list variable, have no knowledge of the structure of list nodes, and cannot call the internal procedure, sym_locate(). These restrictions are not just a matter of the design methodology you follow - they are enforced by the compiler. Breaking the C functions out into separate files would require that the static variables be made global, that the structure of list nodes be made common knowledge, and that the sym_locate() function become callable from anywhere in a program. You can, of course, trust to people's good intentions and ignore the potential for malicious access to these "hidden" variables and functions, but what about programs that access them out of necessity or as a shortcut? Any changes to the implementation of the symbol table could have a major impact on such closely-coupled software.
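To make the compiler-enforced hiding concrete, here is one way the skeleton in Listing 2 might be filled in. This is only a sketch under several assumptions that are not part of the original package: the table is a singly linked list, sym_locate() returns the address of the link that points at a node (rather than the node itself) so that sym_delete() can unlink it, sym_add() quietly gives up if memory runs out, and sym_lookup() returns 0 for an unknown name.

```c
/* sym_util.c - sketch of a symbol table "package" in one source file. */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int sym_util_debug = 0;                 /* Public: global debug switch. */

/* Private: the node layout and list head are unknown outside this file. */
typedef struct symbol_node {
    char *name;
    int value;
    struct symbol_node *next;
} symbol_node;

static symbol_node *symbol_list = NULL;

/* Private: return the address of the link pointing at NAME's node, or
   the address of the terminating NULL link if NAME is not in the list. */
static symbol_node **sym_locate(const char *name)
{
    symbol_node **link = &symbol_list;
    while (*link != NULL && strcmp((*link)->name, name) != 0)
        link = &(*link)->next;
    return link;
}

/* Public: add NAME to the table, or update its value if already there. */
void sym_add(const char *name, int value)
{
    symbol_node **link, *node;
    assert(name != NULL);
    link = sym_locate(name);
    if (*link == NULL) {                /* New symbol: append a node. */
        node = malloc(sizeof *node);
        if (node == NULL) return;
        node->name = malloc(strlen(name) + 1);
        if (node->name == NULL) { free(node); return; }
        strcpy(node->name, name);
        node->next = NULL;
        *link = node;
    }
    (*link)->value = value;
    if (sym_util_debug) printf("sym_add: %s = %d\n", name, value);
}

/* Public: remove NAME from the table; unknown names are ignored. */
void sym_delete(const char *name)
{
    symbol_node **link, *node;
    assert(name != NULL);
    link = sym_locate(name);
    node = *link;
    if (node != NULL) {
        *link = node->next;
        free(node->name);
        free(node);
    }
}

/* Public: return NAME's value, or 0 if NAME is unknown (an assumption). */
int sym_lookup(const char *name)
{
    symbol_node *node;
    assert(name != NULL);
    node = *sym_locate(name);
    return (node != NULL) ? node->value : 0;
}
```

A client file that declares only sym_add(), sym_delete(), sym_lookup(), and sym_util_debug can use the table freely, but any attempt to name symbol_list or sym_locate() from another file fails at link time because both have internal linkage.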

Particularly effective use of the C package concept is exhibited in our POCC's event logging utilities. An application program's access to the event logging facility is only possible through two routines, evt_init() and evt_send(), found in one file, evt_util.c. evt_init() initializes the interface to the event logger. The fact that evt_init() loads event message information from 3 database files and establishes a network connection to the event logger is immaterial to the application program. evt_init() could just as well be opening a disk file for the event log and the texts of event messages could be hard-coded in a string array. The details of how evt_send() looks up the text of an event message, formats the message arguments, and writes the event packet out on the network are also of no consequence to the application program. The internal implementation of the event logging utilities could be completely revamped without affecting any of the applications software; the applications would have to be relinked to the updated library, but recompilation would not be necessary.
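The alternative implementation mentioned above (a disk file for the log and message texts hard-coded in a string array) might look something like the following. The calling sequences shown here are hypothetical, since the real signatures of evt_init() and evt_send() are not given in this article, and the two message texts are made up for illustration.

```c
/* evt_util.c - sketch of a file-based event logging package. */
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Private: the log stream and message texts are hidden from clients. */
static FILE *evt_log = NULL;

static const char *evt_text[] = {       /* Illustrative messages only. */
    "Process %s started",
    "Telemetry frame %d dropped"
};
#define EVT_COUNT (sizeof evt_text / sizeof evt_text[0])

/* Public: open (or reopen) the event log; 0 on success, -1 on error. */
int evt_init(const char *log_name)
{
    assert(log_name != NULL);
    if (evt_log != NULL) fclose(evt_log);
    evt_log = fopen(log_name, "a");
    return (evt_log == NULL) ? -1 : 0;
}

/* Public: format event number EVENT with its arguments and log it;
   returns 0 on success, -1 if the log is closed or EVENT is unknown. */
int evt_send(int event, ...)
{
    va_list args;
    if (evt_log == NULL || event < 0 || (size_t) event >= EVT_COUNT)
        return -1;
    va_start(args, event);
    vfprintf(evt_log, evt_text[event], args);
    va_end(args);
    fputc('\n', evt_log);
    fflush(evt_log);
    return 0;
}
```

Because an application sees only the two function names, swapping this file for a version that loads message texts from database files and writes packets to a network connection would require relinking but no changes to the application's source.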

Our data services library took the opposite approach. This library, which manages network connections to multiple data servers, stores each of its functions in a different source file. Although the functions' calling sequences shield client applications from implementation details to a certain extent, the "internal" data structures are all global. The lack of a function for building an I/O selection mask for the managed connections forced application programs themselves to scan the library's list of connected servers. This kludge produced an unhealthy dependence of an application on the internals of the data services library. A new function, ds_mask(), was added that obviated the need for the kludge, but tracking down and eliminating the use of such kludges could be a major maintenance headache in some cases.

C packages are, in general, a good thing. However, several caveats should be noted. First, don't lock the door and throw away the key. It's possible for a package to be closed up too tightly. For example, as our project progressed, we found the need for certain applications to switch to a different event logger (on another computer) in mid-stream. Doing so requires the application program to close the network connection to the current event logger and to open a connection to the new event logger. Hidden inside the events package, the file descriptor for the event logger connection is inaccessible to application programs. Adding a new function, evt_reconnect(), easily solved the problem, but software changes may not, for organizational or configuration reasons, always be an available option.

Another word to the wise: don't hide what shouldn't be hidden. The UNIX hashing functions, hsearch(3), for example, manage a single hash table. While the details of the hash table are commendably hidden from the calling program, the calling program cannot have multiple tables in use simultaneously. The program must destroy one table before creating another. Rather than storing the hash table within the package, as it does, hcreate() would be better off returning an opaque, void * pointer to each hash table it creates. This "handle" could then be passed into hsearch() for adding and recalling entries from that particular table. Different hash tables would have different handles and could coexist peacefully.
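The handle-based design suggested above might be sketched like this. The htab_ names, the chained-bucket representation, and the string hash are all hypothetical choices made for the example; this is not the hsearch(3) interface. In a real package the two structure definitions would live only in the package's source file, leaving clients with nothing but the void * handle.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef void *hash_handle;              /* All that a client sees. */

/* Private to the package's source file. */
typedef struct hash_entry {
    char *key;
    void *data;
    struct hash_entry *next;
} hash_entry;

typedef struct hash_table {
    size_t nbuckets;
    hash_entry **bucket;
} hash_table;

static size_t htab_hash(const char *key, size_t nbuckets)
{
    size_t h = 5381;
    while (*key != '\0')
        h = h * 33 + (unsigned char) *key++;
    return h % nbuckets;
}

/* Create an empty table; returns its handle, or NULL on failure. */
hash_handle htab_create(size_t nbuckets)
{
    hash_table *table;
    size_t i;
    if (nbuckets == 0) return NULL;
    table = malloc(sizeof *table);
    if (table == NULL) return NULL;
    table->bucket = malloc(nbuckets * sizeof *table->bucket);
    if (table->bucket == NULL) { free(table); return NULL; }
    for (i = 0; i < nbuckets; i++) table->bucket[i] = NULL;
    table->nbuckets = nbuckets;
    return table;
}

/* Add KEY, or replace its data; returns 0 on success, -1 on failure. */
int htab_add(hash_handle handle, const char *key, void *data)
{
    hash_table *table = handle;
    hash_entry *entry;
    size_t i;
    assert(table != NULL && key != NULL);
    i = htab_hash(key, table->nbuckets);
    for (entry = table->bucket[i]; entry != NULL; entry = entry->next)
        if (strcmp(entry->key, key) == 0) { entry->data = data; return 0; }
    entry = malloc(sizeof *entry);
    if (entry == NULL) return -1;
    entry->key = malloc(strlen(key) + 1);
    if (entry->key == NULL) { free(entry); return -1; }
    strcpy(entry->key, key);
    entry->data = data;
    entry->next = table->bucket[i];
    table->bucket[i] = entry;
    return 0;
}

/* Return the data stored under KEY in this table, or NULL if absent. */
void *htab_find(hash_handle handle, const char *key)
{
    hash_table *table = handle;
    hash_entry *entry;
    assert(table != NULL && key != NULL);
    entry = table->bucket[htab_hash(key, table->nbuckets)];
    while (entry != NULL && strcmp(entry->key, key) != 0)
        entry = entry->next;
    return (entry != NULL) ? entry->data : NULL;
}

/* Release one table without disturbing any other. */
void htab_destroy(hash_handle handle)
{
    hash_table *table = handle;
    hash_entry *entry;
    size_t i;
    if (table == NULL) return;
    for (i = 0; i < table->nbuckets; i++) {
        while ((entry = table->bucket[i]) != NULL) {
            table->bucket[i] = entry->next;
            free(entry->key);
            free(entry);
        }
    }
    free(table->bucket);
    free(table);
}
```

Two calls to htab_create() yield two independent tables, so the same key can map to different data in each. The representation stays hidden, but the package no longer hides how many tables exist.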

Using C packages can be viewed as a primitive form of object-oriented programming. (Ken Pugh alluded to this in his "Questions & Answers" column, CUJ, February 1991.) In the object-oriented approach, a program is composed of objects. An object consists of instance variables, which represent the state of the object, and methods, which are functions used to modify or query the object's state. Objects communicate by passing messages to each other; a message specifies a method to be executed by the receiving object and arguments, if any, for the method.

In a C "object", the static, non-local variables are the instance variables and the functions are the methods. Calling a function is logically equivalent to sending a message to the object, with the function name selecting the method to be executed. Figure 1 illustrates this object-oriented view of our symbol table package. The global debug switch, not shown in the figure, might be considered a class variable; i.e., a variable common to all instances of a particular object type (class).

The preceding discussion has shown how the C language supports good software design principles. Rather than being encouraged to put separate functions in separate files, C programmers should be encouraged to encapsulate functionality and data in C function "packages".

About me: I am a programmer/analyst at Integral Systems, Inc. (Lanham, MD), which builds satellite ground systems for NASA, NOAA, and the Air Force. I've been a professional programmer for about 9 years, developing satellite image processing systems in VAX/VMS FORTRAN, automated test equipment for satellite components in PL/M-286 (I wish C had "based" variables!), and, currently, satellite control center software in C under UNIX. I can be reached at 1100 West Street, Laurel MD 20707. (301) 497-2413.

Listing 1: Ada Symbol Table Package

    package SYM_UTIL is
                                  -- Global debug switch.
        SYM_UTIL_DEBUG: BOOLEAN := FALSE ;
                                  -- Add symbol to table.
        procedure SYM_ADD (NAME: STRING; VALUE: INTEGER) ;
                                  -- Delete symbol from table.
        procedure SYM_DELETE (NAME: STRING) ;
                                  -- Lookup a symbol.
        function SYM_LOOKUP (NAME: STRING) return INTEGER ;
    end SYM_UTIL ;

    package body SYM_UTIL is
                                  -- Internal variables.
        type SYMBOL_NODE is record
        end record ;
        SYMBOL_LIST: access SYMBOL_NODE := null ;
                                  -- Public functions.
        procedure SYM_ADD (NAME: STRING; VALUE: INTEGER) is
            ... adds NAME/VALUE pair to the symbol table ...
        end SYM_ADD ;

        procedure SYM_DELETE (NAME: STRING) is
            ... deletes NAME from the symbol table ...
        end SYM_DELETE ;

        function SYM_LOOKUP (NAME: STRING) return INTEGER is
            ... returns NAME's value from the symbol table ...
        end SYM_LOOKUP ;
                                  -- Internal function called
                                  -- by the other functions.
        function SYM_LOCATE (NAME: STRING)
            return access SYMBOL_NODE is
            ... locates NAME's node in the symbol list ...
        end SYM_LOCATE ;

    end SYM_UTIL ;

Listing 2: C Symbol Table Package

    int  sym_util_debug = 0 ;     /* Global debug switch. */

                                  /* Internal variables. */
    typedef  struct  symbol_node {
    } symbol_node ;
    static  symbol_node  *symbol_list = NULL ;

                                  /* Public functions. */
    void  sym_add (const char *name, int value) ;
    void  sym_delete (const char *name) ;
    int  sym_lookup (const char *name) ;
                                  /* Internal functions. */
    static  symbol_node  *sym_locate (const char *name) ;

    void  sym_add (const char *name, int value)
    {
        /* ... adds NAME/VALUE pair to the symbol table ... */
    }

    void  sym_delete (const char *name)
    {
        /* ... deletes NAME from the symbol table ... */
    }

    int  sym_lookup (const char *name)
    {
        /* ... returns NAME's value from the symbol table ... */
    }
                                  /* Internal function called
                                     by the other functions. */
    static  symbol_node  *sym_locate (const char *name)
    {
        /* ... locates NAME's node in the symbol list ... */
    }

©1992  /  Charles A. Measday