|
|
|
The Thematic Mapper image processing subsystem (TIPS) of the LANDSAT-4/5 remote sensing satellite ground station is responsible for the radiometric and geometric correction of the satellite imagery.
The performance of these tasks requires large amounts of supplementary data extracted from the databases of the mission management facility (MMF) of the ground station. The data, resident on the MMF's DEC-2060 computer, is transmitted to one of two TIPS VAX-11/780 computers via a DECnet link or by magnetic tape (see Figure 1). Because the DEC-2060 is a 36-bit machine and the VAX-11/780 is a 32-bit machine, the binary representations of numbers on the two types of computers are not compatible. All numerical data, therefore, is transferred over the network in ASCII format: MMF converts the binary numbers to ASCII text prior to the transfer; TIPS converts the ASCII text to binary numbers after the transfer.
.----------. ! MMF ! ! DEC-2060 ! `----------' / \ / \ DECnet/Tape / \ / \ .------------. .------------. ! TIPS-1 ! ! TIPS-2 ! ! VAX-11/780 ! ! VAX-11/780 ! `------------' `------------' |
The quantity of data (see Table I) being moved made the file conversion process (incorporation) on the TIPS computers extremely slow, requiring 1.2 minutes of CPU time per scene (the wall-clock time could be considerably worse, depending on other system activity). While only an annoyance with our present throughput of 12 scenes a day, this time factor would be unacceptable when we reach our 100 scenes per day capability by 1985, at which time incorporation would require 2 hours of CPU time per day simply to prepare the data for use. Speeding up the incorporation program was clearly a necessity.
A work order is a request for processing of one or more intervals of imagery. An interval consists of one or more scenes. The numbers below are for the incorporation of a single work order containing one interval of 4 scenes.
.---------------.--------------.------------. ! Scope ! Type ! Quantity ! !---------------+--------------+------------! ! ! ! ! Miscellaneous Parameters ! Work Order ! INTEGER ! 3,993 ! (Primarily radiometric ! ! REAL*4 ! 3,298 ! correction parameters) ! ! REAL*8 ! 9 ! ! ! ! ! Telemetry Data ! Interval ! INTEGER ! 96 ! ! ! REAL*4 ! 3,026 ! ! ! REAL*8 ! 336 ! ! ! ! ! Geometric Correction Data ! Scene ! INTEGER ! 4,610 ! ! ! REAL*4 ! 107,888 ! ! ! REAL*8 ! 1,516 ! ! ! ! ! `---------------`--------------+------------! Total ! 124,772 ! `------------' |
The PROFILE program mentioned in the next paragraph was written by Bob Wilson. I briefly described this incredible program in a 1993 comp.os.vms posting. My understanding from Bob back in the 1980s was that PROFILE attached itself to the VAX's clock interrupt and sampled the program counter. Arthur McClinton's response points out that it could be attached to any interrupt. His company licensed PROFILE way back then and he maintained it up through several newer VMS versions until it finally stopped working.
I guess PROFILE had to be seen to be believed. What the comp.os.vms thread I initiated doesn't capture is that PROFILE wasn't something you only used when you were debugging and testing your program. The first time Bob demonstrated PROFILE to me was at NASA's LANDSAT computer facility (GSFC, Building 28); he attached PROFILE to a CPU-bound program running in the customer's production environment. We had no control over the target program, which was one of a number of subprocesses spawned by a master program, and we couldn't interrupt the production run that was in progress.
PROFILE, a GE-developed high resolution CPU sampling monitor,
was used to determine whether the incorporation program was I/O-bound by file
processing (as was first thought) or CPU-bound by the actual
ASCII-to-binary number conversions. We found that the incorporation
program was saturating the CPU and that most of the processing time was
spent in converting ASCII text to binary REAL*4
numbers. By
concentrating our speed-tuning efforts in this area, we were able to
achieve significant improvements in overall incorporation time.
The incorporation program, as originally written, used FORTRAN formatted
internal READ
statements (i.e., DECODE
) to
convert ASCII numbers to their binary representations. Two alternatives
were apparent for converting ASCII text to binary REAL*4
numbers: the VAX/VMS Run-Time Library conversion routine
OTS$CVT_T_D
(text to D_floating
) or a
user-written assembly-language conversion routine. Benchmark tests showed
that Run-Time Library conversions are about 1.4 times as fast as equivalent
FORTRAN conversions (see Table II). Replacement of
the FORTRAN conversions by Run-Time Library conversions coupled with a
design change in the calling routines produced an incorporation program
that was twice as fast as the original program.
The times below are averages of CPU times reported by
the VAX/VMS Run-Time Library routine LIB$SHOW_TIMER
during
the execution of a benchmark program.
CPU Time in Seconds for 50,000 Conversions .--------------------------------------------. Input ! MACRO Run-Time Library FORTRAN ! ASCII Number ! CONV_STR_REAL OTS$CVT_T_D E14.7 ! .----------------+--------------+--------------+--------------! ! ! ! ! ! ! +0.0000000E+00 ! 4.91 ! 37.66 ! 63.56 ! ! ! ! ! ! ! +0.1111111E+00 ! 5.16 ! 60.17 ! 90.29 ! ! ! ! ! ! ! +0.9999999E+00 ! 5.18 ! 57.07 ! 85.31 ! ! ! ! ! ! ! +0.1234567E+23 ! 5.30 ! 57.84 ! 86.77 ! ! ! ! ! ! ! +0.9876543E+12 ! 5.28 ! 54.32 ! 83.81 ! ! ! ! ! ! ! +0.9876543E-12 ! 5.39 ! 75.24 ! 103.87 ! ! ! ! ! ! ! -0.9876543E+12 ! 5.34 ! 54.76 ! 85.35 ! ! ! ! ! ! ! -0.9876543E-12 ! 5.45 ! 74.74 ! 102.97 ! ! ! ! ! ! `----------------`--------------'--------------'--------------' |
Although an increase in speed by a factor of 2 was desirable, we looked
further into the possibility of writing a MACRO conversion routine that was
even faster than the Run-Time Library. Two factors were in our favor.
First, the ASCII numbers transferred across the MMF-TIPS interface adhere
to strict formats; the majority of the real numbers are represented in
FORTRAN E14.7
format (i.e., "s0.nnnnnnnEsmm
").
Second, the VAX-11/780 has a very useful set of decimal string instructions
that convert ASCII strings to binary integers. The MACRO routine we wrote,
CONV_STR_REAL
, takes advantage of
these two factors, the fixed-format input and the hardware-implemented
conversions. As a result, benchmark tests showed that conversions made
using our assembly-language routine are ten times as fast as equivalent
Run-Time Library conversions (see Table II).
Our assembly-language conversion routine, CONV_STR_REAL
,
differs from the VAX/VMS Run-Time Library conversion routine,
OTS$CVT_T_D
, in three ways: speed, versatility, and accuracy.
Conversion speed is dependent on the versatility and accuracy of the
routine making the conversion. By making acceptable tradeoffs in
versatility and accuracy, it is possible to increase the speed with which
numbers are converted. The need for versatility is dependent on the
application. OTS$CVT_T_D
is a highly versatile routine that
accepts ASCII real numbers in many different formats; this versatility and
its associated overhead (a VAX/VMS DEBUGGER call/branch trace showed over
350 transfers of control during a single call to OTS$CVT_T_D
)
are not required for the majority of real numbers handled by the
incorporation program.
Conversion accuracy is relative. By analyzing roundoff errors, we can
determine the maximum relative error that can occur in converting an ASCII
number to a single or double precision binary number. It is assumed that
the result of an operation involving two real numbers of the same precision
introduces a maximum error of 1/2 least significant bit (LSB) of the
mantissa; n operations introduce a maximum cumulative error of n *
1/2 LSB
of the mantissa. Binary single precision
(F_floating
or REAL*4
) numbers are represented on
the VAX-11/780 using a sign bit; an 8-bit, excess-128 exponent; and a
24-bit, normalized mantissa. Binary double precision
(D_floating
or REAL*8
) numbers are represented
using a sign bit; an 8-bit, excess-128 exponent; and a 56-bit, normalized
mantissa. The range of magnitudes representable by double precision
numbers are almost identical to those of single precision numbers; only the
precision differs.
OTS$CVT_T_D
uses 128-bit arithmetic internally and produces a
REAL*8
result. Because of the extremely high precision of
128-bit arithmetic, it is safe to consider the Run-Time Library conversion
as a single operation with a maximum relative error of 1/2 LSB of the
56-bit REAL*8
mantissa, or one part in 2**57
(1.4 * 10**17
). CONV_STR_REAL
uses
REAL*8
arithmetic internally, produces a REAL*8
result, and converts the result to REAL*4
. Two operations
that may introduce roundoff errors in CONV_STR_REAL
are
representing powers of ten in the exponent table and multiplying the
mantissa by its exponent. The maximum relative error in the
ASCII-REAL*8
conversion is therefore 2 * 1/2 LSB
of the 56-bit REAL*8
mantissa, or one part in
2**56
(7.2 * 10**16
). (Numbers whose exponents
are less than 10**-31
require an additional operation to scale
the mantissa; the maximum relative error is then 3 * 1/2 LSB
,
or 1.5 parts in 2**56
.) For both OTS$CVT_T_D
and
CONV_STR_REAL
, converting the result to single precision may
add an additional error of 1/2 LSB of the 24-bit REAL*4
mantissa, or one part in 2**25
(3.4 * 10**7
).
The TIPS system engineers decided that a REAL*8
accuracy of
approximately one part in ten thousand million million was acceptable.
Extensive, but not exhaustive, benchmark tests showed no differences
between the REAL*4
numbers produced by the two conversion
routines and relative errors no larger than the maximums determined above
for the REAL*8
numbers produced by the routines.
The incorporation program was modified to utilize the new MACRO ASCII-to-real conversion routine. Comparison runs between the old and new programs for several scenes' worth of data yielded a sevenfold increase in speed and no differences in the contents of the data files generated. Incorporation of the data for one scene now only requires 10 seconds of CPU time. Incorporation of the data for 100 scenes will require less than 20 minutes of CPU time, a considerable reduction from the 2 hours required by the old program. Significant performance improvements were realized by exploiting the restrictions of the application and the richness of the hardware's architecture.
Daniel Roy (formerly of General Electric) assisted in the development of the algorithm. Robert Wilson (General Electric), the author of PROFILE, provided assistance during testing.
The purpose of the LANDSAT project is the collection of imagery for use in applications such as land classification, agricultural and geological resource management, etc. The satellite carries two "cameras": a 4-band Multi-Spectral Scanner (MSS) and an experimental, high-resolution, 7-band Thematic Mapper (TM). The satellite ground station, designed and built by General Electric, consists of the various subsystems needed to control the satellite, to receive and process telemetry and video data, and to manage the project's mission.
;******************************************************************************* ; ; CONV_STR_REAL ; ; This routine converts an ASCII representation of a real number to its ; corresponding VAX binary REAL*4 representation. It is optimized for ; the particular input format chosen; it does not accept different real ; number formats as the Run-Time Library routine OTS$CVT_T_D does. ; ; ; INPUTS - ; ; ASCII_NUMBER: ASCII representation of real number in the ; format +/-0.NNNNNNNE+/-MM (E14.7). The mantissa ; need not be normalized. ; ; OUTPUTS - ; ; R4_NUMBER: REAL*4 representation of input number. ; Set to +0.0E+00 if error in conversion. ; ; R0: Function value = ; SS$_NORMAL: No error in conversion. ; SS$_ROPRND: Invalid number format or invalid ; characters in number. ; SS$_FLTUND: Magnitude of number is too small ; for VAX REAL*4 representation; ; i.e., magnitude < +0.2900000E-38. ; SS$_FLTOVF: Magnitude of number is too large ; for VAX REAL*4 representation; ; i.e., magnitude > +0.1700000E+39. ; ; ; To call from a FORTRAN program: ; ; INTEGER*4 CONV_STR_REAL ! Declare function. ; CHARACTER*14 ASCII_number ! Arguments. ; REAL*4 R4_number ; ... ; ... ; status = CONV_STR_REAL (ASCII_number, R4_number) ; ; ;******************************************************************************* ;******************************************************************************* ; ; SYMBOLS - Global Symbols. ; ;******************************************************************************* $DSCDEF ; String descriptor parameters. $SSDEF ; System status codes. .EXTRN CONV_COND_HAND ; External routine - Condition Handler. ;******************************************************************************* ; ; SYMBOLS - Local Symbols. ; ;******************************************************************************* ASCII_NUMBER = 4 ; Byte offsets for argument list. R4_NUMBER = 8 DIGITS_TOTAL = 14 ; Total number of characters required ; in the input string. DIGITS_IN_FRACTION = 7 ; Number of digits to the right of the ; decimal point in the mantissa. DIGITS_IN_EXPONENT = 2 ; Number of digits (excluding sign) in ; the exponent. MAX_DIGITS = 8 ; Maximum number of digits (including ; sign) in the stack string storage. MAX_EXPONENT = 38 ; Range of exponents (10**-m .. 10**+m) ; in the exponent table. ; Temporary storage on the stack. ; Space is allocated for the input ; leading separate numeric string and ; the translated packed numeric string. ; The symbols define the amount of ; storage to be allocated and the ; offsets of the strings from the ; modified stack pointer. STACK_STORAGE = <MAX_DIGITS+1>/2 + MAX_DIGITS SEPARATE_STRING = 0 ; Number of bytes = max_digits. PACKED_STRING = MAX_DIGITS ; Number of bytes = (max_digits+1)/2. ;******************************************************************************* ; ; POWERS_OF_TEN ; ; This module defines the exponent lookup table and the mantissa scale ; factor. The table contains the floating point representation for ; ; m ; 10 for m = -38, -37, ..., +37, +38. ; ; For a given exponent i, the corresponding power of ten is found at ; entry i+38 in the table (entry 0 is 10**(-38)). ; ; The digits to the right of the decimal point in the mantissa are ; converted to a large integer (as if there were no decimal point). ; This number must then be divided by 10**(DIGITS_IN_FRACTION) to ; produce the correct fraction. An equivalent operation is to multiply ; the "mantissa" by 10**(-DIGITS_IN_FRACTION); this latter number is ; called the mantissa scale factor. ; ; Note that the maximum and minimum powers of ten in the exponent table ; are the maximum and minimum powers of ten representable on the VAX. ; The exponents of input ASCII numbers can exceed the maximum exponent ; in the table because of the scaling of the mantissa and also if the ; mantissa is not normalized. ; ;******************************************************************************* .PSECT POWERS_OF_TEN, NOEXE, NOWRT, PIC, RD, SHR ; Macro CREATE_POWER_OF_TEN defines and initializes storage for a power of ten. .MACRO CREATE_POWER_OF_TEN VALUE .DOUBLE +1.0E'VALUE .ENDM ; Macro GEN_EXP_TABLE defines and initializes storage for a table of powers ; of ten. The numbers in the table range from 10**(-n) .. 10**(+n), with the ; exponent increasing in steps of 1. .MACRO GEN_EXP_TABLE EXPONENT VAL = -EXPONENT .REPEAT 2*EXPONENT + 1 CREATE_POWER_OF_TEN \VAL VAL = VAL + 1 .ENDR .ENDM ; Macro GEN_INVERSE_POWER_OF_TEN defines and initializes storage for the ; inverse of a power of ten; i.e., the inverse of 10**i is 10**(-i). .MACRO GEN_INVERSE_POWER_OF_TEN EXPONENT VAL = -EXPONENT CREATE_POWER_OF_TEN \VAL .ENDM EXPONENT_TABLE: GEN_EXP_TABLE \MAX_EXPONENT MANTISSA_SCALE_FACTOR: GEN_INVERSE_POWER_OF_TEN \DIGITS_IN_FRACTION ;******************************************************************************* ; ; CONV_STR_REAL - Start of Executable Code. ; ; REGISTER USAGE - ; ; R0: Function value returned. ; R0-R3: Used by CVTSP and CVTPL instructions for converting ; ASCII to packed and packed to longword (exponent ; conversion below follows copies). ; R2-R3: Exponent conversion; final REAL*8 result. ; R4-R5: Mantissa conversion. ; R6: Address of first character in input string. ; SP: Modified to allocate temporary string storage on the ; stack. ; ;******************************************************************************* .PSECT CONV_STR_REAL, EXE, NOWRT, SHR CONV_STR_REAL:: .WORD ^M<R2,R3,R4,R5,R6> CLRF @R4_NUMBER(AP) ; Zero result, in case of error. MOVAL CONV_COND_HAND, (FP) ; Establish condition handler. SUBL2 #STACK_STORAGE, SP ; Allocate temporary stack storage. MOVL ASCII_NUMBER(AP), R6 ; Address of ASCII_NUMBER descriptor. CMPW DSC$W_LENGTH(R6), #DIGITS_TOTAL BLSSU INVALID_STRING ; Enough characters in input string? MOVL DSC$A_POINTER(R6), R6 ; Extract string address from descriptor. ; Validate non-essential CMPB 1(R6), #^A'0' ; characters in the input BNEQ INVALID_STRING ; string. Return reserved CMPB 2(R6), #^A'.' ; operand error (like CVTxx BNEQ INVALID_STRING ; instructions) if invalid. CMPB <3+DIGITS_IN_FRACTION>(R6), #^A'E' BEQL CONVERT_MANTISSA INVALID_STRING: BRW INVALID_STRING_ERROR ; Invalid input string "long branch". CONVERT_MANTISSA: ; Construct ASCII string ; "SNNNNNNN" from input string ; "S0.NNNNNNNESMM". MOVQ 2(R6), (SP) ; ".NNNNNNN" to separate string. MOVB (R6), (SP) ; "S" to separate string. ; Convert ASCII string to ; packed string (BCD) to ; INTEGER*4 to REAL*8, ; giving the mantissa. CVTSP #DIGITS_IN_FRACTION, (SP), - #DIGITS_IN_FRACTION, PACKED_STRING(SP) CVTPL #DIGITS_IN_FRACTION, PACKED_STRING(SP), R4 CVTLD R4, R4 ; Mantissa * 10**DIGITS_IN_FRACTION. ; Construct ASCII string ; "SMM" from input string ; "S0.NNNNNNNESMM". MOVL <3+DIGITS_IN_FRACTION>(R6), (SP) ; "ESMM" to separate string. ; Convert ASCII string to ; packed string (BCD) to ; INTEGER*4, giving exponent. CVTSP #2, 1(SP), #2, PACKED_STRING(SP) CVTPL #2, PACKED_STRING(SP), R2 CMPL R2, #-MAX_EXPONENT ; Is the exponent too small? BLSS EXPONENT_TOO_SMALL_ERROR CMPL R2, - ; Is the exponent too large? #MAX_EXPONENT+DIGITS_IN_FRACTION BGTR EXPONENT_TOO_LARGE_ERROR ; Subtract mantissa scaling factor ; from exponent and compute index ; into exponent lookup table. ADDL2 #MAX_EXPONENT-DIGITS_IN_FRACTION, R2 BGEQ MULT_MANTISSA_EXPONENT ; Resulting exponent in table? ; Resultant exponent too small. ; Scale mantissa and use original ; exponent. MULD2 MANTISSA_SCALE_FACTOR, R4 ADDL2 #DIGITS_IN_FRACTION, R2 MULT_MANTISSA_EXPONENT: ; Index into the lookup table, ; obtain 10**exponent, and MULD2 EXPONENT_TABLE[R2], R4 ; multiply mantissa by exponent. CVTDF R4, @R4_NUMBER(AP) ; Round and return converted ; REAL*4 value. MOVL #SS$_NORMAL, R0 ; Return normal status. RET ; Error Returns. INVALID_STRING_ERROR: ; Invalid characters in input MOVL #SS$_ROPRAND, R0 ; string. Return reserved RET ; operand error status. EXPONENT_TOO_SMALL_ERROR: ; The exponent is too small. MOVL #SS$_FLTUND, R0 ; Return floating point RET ; underflow error status. EXPONENT_TOO_LARGE_ERROR: ; The exponent is too large. MOVL #SS$_FLTOVF, R0 ; Return floating point RET ; overflow error status. .END
INTEGER*4 FUNCTION CONV_COND_HAND (SIGARGS, MCHARGS) C******************************************************************************* C C CONV_COND_HAND C C This routine is a VAX/VMS condition handler designed for use with C number conversion functions (i.e., converting ASCII text to binary C number representations). The error condition causing the condition C handler to be invoked is tested to determine if it is one of several C arithmetic errors. If so, the condition handler unwinds the stack to C the caller of the routine which established the condition handler and C returns control to that routine; the error condition is the function C value returned. If the error condition is not an arithmetic-type error, C the condition handler resignals the condition to higher-level condition C handlers. For example, let routine A call number conversion function B: C C A : status = B (in_number, out_number) C C ! C ! C V C C B <-----> condition_handler C C B establishes this condition handler to handle errors. If the number is C successfully converted, B returns a function value of SS$_NORMAL. If an C error occurs, the condition handler is invoked, the stack is unwound to C A, and the error code is returned as the function value. Routine A is C not aware of the processing taking place in B or in the condition handler; C all A sees is the function value returned, either normal or error. C C C INPUTS - C C SIGARGS: INTEGER*4 array which defines the condition C and the environment in which it occurred. C C MCHARGS: INTEGER*4 array which contains the values C of R0 and R1 at the time of the condition. C C OUTPUTS - C C MCHARGS: Saved value of R0 replaced by error code if C arithmetic condition caused handler to be C invoked. VAX/VMS will return this new R0 as C a function value to the caller of the routine C that established the condition handler. C C CONV_COND_HAND: SS$_CONTINUE if arithmetic error occurred and C program is to continue. C SS$_RESIGNAL if other error occurred and is to C be processed by higher-level condition handlers. C C******************************************************************************* C... Function arguments. INTEGER*4 SIGARGS(*), MCHARGS (*) C... Parameters and external routines. INCLUDE 'SYS$LIBRARY:SIGDEF.FOR' INTEGER*4 LIB$MATCH_COND C... Local variables. INTEGER*4 CONDITION, INDEX C... Determine if the condition handler was invoked during a stack C unwind, or for an arithmetic error, or for some other condition. CONDITION = SIGARGS(2) ! Condition that invoked handler. INDEX = LIB$MATCH_COND (CONDITION, + SS$_UNWIND, ! Unwinding stack. + SS$_DECOVF, ! Decimal overflow. + SS$_FLTDIV, ! Floating divide. + SS$_FLTDIV_F, ! Floating divide fault. + SS$_FLTOVF, ! Floating overflow. + SS$_FLTOVF_F, ! Floating overflow fault. + SS$_FLTUND, ! Floating underflow. + SS$_FLTUND_F, ! Floating underflow fault. + SS$_INTDIV, ! Integer divide. + SS$_INTOVF, ! Integer overflow. + SS$_ROPRAND) ! Reserved operand. C... If the condition handler was invoked during a stack unwind, C then let the system continue unwinding. IF (INDEX .EQ. 1) THEN CONV_COND_HAND = SS$_CONTINUE C... For an arithmetic error, unwind the stack and return the C status code as the function value to the calling routine. ELSEIF (INDEX .GE. 2) THEN MCHARGS(4) = CONDITION CALL SYS$UNWIND ( , ) CONV_COND_HAND = SS$_CONTINUE C... For all other errors, resignal the condition, allowing it to C be handled by other condition handlers (i.e., the traceback C handler). ELSE CONV_COND_HAND = SS$_RESIGNAL ENDIF RETURN END