Pascal II: Operand Formats (1 of 2)

When you need to send data (especially complex data formats, such as strings) to an assembly routine from a Pascal host program, it is very useful to know the internal structure of Pascal variables. This two-part article describes a few of the more commonly used variable types; for a complete description of the more complex variables, including records and arrays, see pp. 227-228 of the Apple Pascal Operating System Reference Manual.
Machine language (assembly) routines are commonly used when either (a) speed is critical, or (b) the code must access other assembly routines (such as PROMs or I/O drivers) that can't be reassembled as part of the program. Also, bit manipulations such as right-shift are much easier to do in assembly than in Pascal.

In the UCSD Pascal system, it's fairly easy to create short assembly programs which can be linked into a Pascal host program. In some cases, it may be sufficient to merely call the assembly routine; most routines require that data be passed to them, though. Data is passed to or from routines by means of a "parameter", a temporary variable created by Pascal specifically for that purpose. The term "Var parameter" implies that the address of the actual variable is passed to the routine as a parameter instead of its value.

Certain types of variables may be passed by value, but any variable may be passed by reference (that is, by address) by simply declaring it to be a Var parameter. Pascal does not allow parameters of variable length (with the exception of certain sets and long integers) to be passed on the CPU stack, since doing so could end up filling the stack to capacity and thereby crashing the operating system. These parameters, therefore, are automatically treated as if they had been declared as Var parameters. A good explanation of the various methods of passing parameters may be found in Peter Grogono's book, "Programming in Pascal".

Before delving into the details, let's define some terms and conventions which we'll use later on:

Bit = a binary digit (0 or 1). A bit is the smallest unit of information which can be stored in a computer.
Nybble = 4 bits (half a byte). A hexadecimal digit is one nybble (pronounced "nibble").
Byte = 8 bits (2 nybbles). This is the unit of storage which the 6502 processor uses.
Word = 2 bytes (16 bits). A word is the unit of information which Pascal uses.
LSB = least significant bit
MSB = most significant bit
decimal      65535 <--------memory---------> 0
hexadecimal  $FFFF <--------memory---------> $0000   addresses
             MSB                             LSB

This diagram of memory structure is useful for understanding the format of variables: although we're used to writing numbers from left to right, Pascal reads data from memory FROM RIGHT TO LEFT, starting at the least significant byte.
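
As a rough illustration of the units defined above, the following Python sketch splits a 16-bit word into its two bytes, and a byte into its two nybbles. The function names split_word and split_byte are made up for this example; they are not part of Apple Pascal or of any library.

```python
def split_word(word):
    """Split a 16-bit word into (low_byte, high_byte)."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("word must fit in 16 bits")
    low_byte = word & 0xFF           # least significant byte, read first
    high_byte = (word >> 8) & 0xFF   # most significant byte
    return low_byte, high_byte


def split_byte(byte):
    """Split an 8-bit byte into (low_nybble, high_nybble)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must fit in 8 bits")
    return byte & 0x0F, (byte >> 4) & 0x0F


low, high = split_word(0x12AB)
print(f"low byte = ${low:02X}, high byte = ${high:02X}")
# prints: low byte = $AB, high byte = $12
```

Each nybble returned by split_byte corresponds to one hexadecimal digit, so $AB splits into $B (low) and $A (high).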


Integers:
Integers in UCSD Pascal are whole numbers between -32768 and +32767, inclusive. They are stored in one word (2 bytes). Negative integers are represented in "two's complement", which means that they appear to have positive values greater than 32767; the negative integer is arrived at by subtracting 2 ^ 16 (65536) from this positive value. Conversely, a 16-bit value greater than 32767 reads as the complementary negative number when treated as an integer (cf. Integer BASIC). The sign bit (MSB) is 0 if positive, 1 if negative.
<-----byte----->  <-----byte----->
15 14 . . . . 8   7 . . . . . . 0   <== 16 bits
Sign  Integer Value

Example: the number 3 is represented in binary as:

MSB 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 LSB

However, -3 is represented as:

MSB 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 LSB

which also reads as 65533 (or 65536-3)!

Integers may be passed by value or as Var parameters.
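
The two's complement arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the 16-bit encoding, not code taken from the Pascal system; the names to_raw16 and to_signed16 are invented for the example.

```python
def to_raw16(value):
    """Return the unsigned 16-bit pattern that stores a signed integer."""
    if not -32768 <= value <= 32767:
        raise ValueError("value is outside the 16-bit integer range")
    return value & 0xFFFF


def to_signed16(raw):
    """Interpret an unsigned 16-bit pattern as a signed integer."""
    raw &= 0xFFFF
    # If the sign bit (bit 15) is set, subtract 2^16 to get the negative value.
    return raw - 0x10000 if raw & 0x8000 else raw


print(to_raw16(-3))                  # 65533
print(format(to_raw16(-3), "016b"))  # 1111111111111101
print(to_signed16(65533))            # -3
```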


Reals:
Real numbers, in UCSD Pascal, are floating point numbers between +/-1.17550E-38 and +/-3.40282E+38, inclusive. Real numbers take up four bytes (2 words) of storage. Their binary representations are similar to the proposed IEEE standard for floating point numbers:
31    30 . . . . . 23   22 . . . . . . . . . 0   <== 32 bits
Sign  Exponent          Mantissa


"Mantissa" is the name given to the significant digits of a number when it is expressed in scientific (exponential) notation. The "exponent" indicates the power to which the base is raised; the mantissa is then multiplied by the result. The number 3 x 10^2, for instance, is defined as having a mantissa of 3 and an exponent of 2, in base 10 (decimal). In a Pascal real, the exponent is a power of 2 (2^n) rather than of 10.

The sign bit refers to the sign of the mantissa; it's 0 if positive, 1 if negative. The exponent is "offset" by 127; that is, a value of 127 in the exponent field corresponds to an exponent of 0. Similarly, if the field is 1, the exponent is -126, and if the field is 254, the exponent is +127. An exponent field of 0 indicates that the real number is 0.

The mantissa of the real number is stored in normalized format in bits 0-22. "Normalizing" a number means adjusting it so that the highest bit is significant (that is, set to 1). The exponent indicates how many times (and in which direction) the value was shifted during normalization.

Notice that the MSB of the mantissa of any non-zero number that has been normalized is always a one. Zero can be treated as a special case: the exponent is simply set to zero. So, to gain additional precision, the mantissa has an implied "1" that is not stored, resulting in a functional 24-bit mantissa, even though only 23 bits are actually used. This gives slightly more than a 6-decimal-place (single precision) accuracy.
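
The rules for the sign, exponent, and mantissa fields can be put together in a short Python sketch. It assumes the layout is exactly as described in this article (1 sign bit, 8 exponent bits offset by 127, 23 stored mantissa bits with an implied leading 1) and handles only exponent fields 0 through 254; the function name decode_real is hypothetical.

```python
def decode_real(bits):
    """Decode a 32-bit pattern (given as an unsigned integer) into a value."""
    sign = (bits >> 31) & 1            # bit 31
    exp_field = (bits >> 23) & 0xFF    # bits 30..23
    fraction = bits & 0x7FFFFF         # bits 22..0 (stored mantissa bits)

    if exp_field == 0:
        return 0.0                     # special case: the real number is 0

    mantissa = 1 + fraction / 2**23    # restore the implied leading 1
    value = mantissa * 2.0 ** (exp_field - 127)   # remove the offset of 127
    return -value if sign else value


print(decode_real(0x3F800000))  # 1.0
```

The pattern $3F800000 has a sign bit of 0, an exponent field of 127, and all mantissa bits clear, so it decodes to 1 x 2^0.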

To make this clearer, let's look at some examples:

Real number = 1

MSB 0 01111111 00000000000000000000000 LSB
Exponent = 127 (2^0) Mantissa = 1 (the implied 1 isn't stored)

Real number = -9.9
MSB 1 10000010 00111100110011001100110 LSB
Exponent = 130 (2^3) Mantissa = 99000015

In the second example, the real number (in binary) appears as 1001.1110011...

During normalization, the binary point is moved three places to the left (incrementing the exponent each time), and the most significant bit becomes implied. The sign bit is 1, indicating that the number is negative.
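
One way to check bit patterns like these is with Python's standard struct module, which packs a number as an IEEE single-precision value. This is a sketch under the assumption that the Pascal real layout matches IEEE single precision for these values (the article only says the formats are "similar"); real_fields is a name invented for this example, and the byte order used here is just for display, not a statement about how Apple Pascal orders the bytes in memory.

```python
import struct


def real_fields(x):
    """Return (sign, exponent, mantissa) bit strings for a 32-bit real."""
    # Pack as a big-endian single so the bits read MSB first, as in the diagrams.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    pattern = f"{bits:032b}"
    return pattern[0], pattern[1:9], pattern[9:]


print(real_fields(1.0))
# ('0', '01111111', '00000000000000000000000')
print(real_fields(-9.9))
# ('1', '10000010', '00111100110011001100110')
```

Both results agree with the sign, exponent, and mantissa fields shown in the two examples.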

Real numbers may be passed by value, or else they may be defined as Var parameters and then passed by address.
Published Date: Feb 18, 2012