
Chapter 10

Floating-Point Support


This chapter contains an overview of the IEEE 754 floating-point standard, Java virtual machine floating-point semantics, and the porting effort required to implement floating-point support on various processor architectures. It also details the implementation of strictfp arithmetic operations.

Version 1.0 of the CLDC Specification did not require floating-point arithmetic in compliant implementations. However, the CLDC Specification Version 1.1 does require floating-point, and this chapter describes the implications for porting the floating-point implementation for KVM in CLDC 1.1.

10.1 Introduction

The Java programming language and the Java virtual machine support two floating-point types, 32-bit float and 64-bit double. The numerical results for operations performed on values of these types are defined by the IEEE 754 standard for binary floating-point arithmetic (IEEE Std 754-1985). While many processor architectures also support IEEE 754, there can be complications mapping Java virtual machine floating-point operations to C code or to hardware instructions implementing those operations. Before describing those complications and their solutions, more background on IEEE 754 is necessary.

10.1.1 IEEE 754 floating-point

Floating-point numbers are a subset of the real numbers; the representable finite floating-point numbers have sign, exponent, and significand fields.1

The numerical value of a finite floating-point number is

(-1)^sign · 2^exponent · significand

The sign field is 0 or 1. The exponent field is an integer; the significand field is a binary number greater than or equal to zero and less than two. The IEEE 754 standard defines the ranges for the exponent and significand values for the float and double formats. The double format has more than twice the precision of float as well as a greater exponent range. To avoid multiple representations for the same numerical value, a floating-point number’s representation is normalized; that is, the exponent is adjusted to the least value so that the leading bit of the significand is 1 instead of 0. The significand is less than 1 only for subnormal values, which are values so small that an in-range exponent cannot be made small enough to normalize the value’s representation.

Since floating-point numbers have a fixed amount of precision, there must be a rounding policy to decide which floating-point number to return when storing the exact result requires more bits than the precision of the floating-point format. For example, multiplying two floating-point values can lead to the exact product having twice as many bits as either input. The IEEE 754 default rounding policy used in the Java virtual machine is to return the floating-point value closest to the exact numerical result. However, not all operations have clear finite results. For example, what is 1/0 or 0/0? For such situations, the IEEE 754 standard has the special values infinity and NaN (not a number). A signed infinity is returned when the exact result is too big to represent (overflow) or when a finite non-zero value is divided by zero. A NaN is returned for invalid operations, such as 0/0 or sqrt(-1). By adding infinities and NaN, IEEE 754 arithmetic forms a closed system. For every set of inputs, an IEEE 754 arithmetic operation returns an IEEE 754 value.
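As a small illustration (not part of the KVM sources), the following C fragment shows these special values arising from ordinary arithmetic; it assumes a C99 environment with IEEE 754 (Annex F) semantics:

#include <stdio.h>
#include <math.h>

int main(void)
{
    double one = 1.0, zero = 0.0;
    printf("%f\n", one / zero);    /* division by zero:  inf  */
    printf("%f\n", -one / zero);   /* division by zero: -inf  */
    printf("%f\n", zero / zero);   /* invalid operation: nan  */
    printf("%f\n", sqrt(-1.0));    /* invalid operation: nan  */
    printf("%f\n", 1e308 * 10.0);  /* overflow:          inf  */
    return 0;
}

Every result is itself an IEEE 754 value, so no operation escapes the closed system.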

For two IEEE 754 numbers to be equivalent, they must either be the same non-finite value (+infinity, -infinity, NaN) or if both values are finite, each field of the floating-point numbers must be the same.

10.1.2 Implementing Java virtual machine floating-point semantics

Many processor architectures natively support IEEE 754 arithmetic on the float and double formats. Therefore, there is often a straightforward mapping between Java virtual machine floating-point operations, C code implementing those operations, and floating-point instructions on the underlying processor. However, two complications arise in practice: some architectures provide only fused multiply-add (fused mac) instructions, and the x87 FPU's registers have a greater exponent range than the float and double formats.

Both fused mac and the extra range of the x87 registers necessitate extra care when implementing Java virtual machine semantics.

10.1.3 Java virtual machine floating-point semantics: strictfp

There are actually two flavors of floating-point semantics in the Java virtual machine: FP-strict semantics and default semantics. FP-strict semantics are used if a method or constructor has the ACC_STRICT bit set in the access_flags field of the method_info structure.3 In Java, this bit gets set if a class or a method is declared strictfp. All the floating-point operands and results in FP-strict methods and constructors are exactly 32-bit float or 64-bit double quantities.

In contrast, in default floating-point semantics, while floating-point variables must hold exactly float or double values, values on the operand stack are allowed, but not required, to have greater exponent range.

The Java programming language provides the strictfp modifier, to be applied to the declaration of a class, interface or method containing variables that might take a floating-point value. If the strictfp modifier is used, any compile-time expression involving the variables of the declared class, interface or method is said to be FP-strict. To be FP-strict means that all intermediate floating-point values must be elements of a float value set or a double value set, implying that the results of all FP-strict expressions must be those predicted by IEEE 754 arithmetic on operands represented using float (single-precision) and double (double-precision) formats. Within an expression that is not FP-strict, some leeway is granted for an implementation to use an extended exponent range to represent intermediate results. The net effect, roughly speaking, is that a calculation might produce “the correct answer” in situations where exclusive use of the float value set or double value set might result in overflow or underflow.

For more details, see The Java™ Virtual Machine Specification (Java Series), Second Edition by Tim Lindholm and Frank Yellin (Addison-Wesley, 1999) and The Java™ Language Specification by James Gosling, Bill Joy, and Guy L. Steele (Addison-Wesley, 1996).

10.1.4 Floating-point architectures

10.1.4.1 Fused Mac

In general, a fused mac cannot be used to implement chained multiply and add instructions in the Java virtual machine, since the rounding behavior will be different. This is true for both default and FP-strict semantics. However, even if an architecture only has fused mac instructions for floating-point, implementing the semantics of separate add and multiply is fairly direct. The result of (a + c) is the same as (a * 1.0 + c). The result of (a * b) is almost the same as (a * b + 0.0); it will be different if (a * b) results in a negative zero, because adding a positive zero would return a positive zero instead of the negative zero required for the logical product. This discrepancy is not allowed by Java virtual machine semantics. Assuming the “round to nearest” rounding mode is in effect, (a * b - 0.0) gives the same result as (a * b) even if (a * b) is zero. More generally, fused mac-based architectures usually have some special instruction idiom to avoid this discrepancy regardless of rounding mode. C compilers for fused mac platforms usually include a switch to disable the collapsing of chained multiplies and adds into fused macs.
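A sketch of this idiom in C (illustrative only; it uses C99's fma function, assumed to be correctly rounded, rather than a particular processor's fused mac instruction):

#include <math.h>

/* Separate add: a + c computed as a fused mac with a unit multiplier. */
double separate_add(double a, double c)
{
    return fma(a, 1.0, c);   /* a*1.0 + c rounds exactly like a + c */
}

/* Separate multiply: adding -0.0 instead of +0.0 preserves a negative
   zero product under round-to-nearest. */
double separate_mul(double a, double b)
{
    return fma(a, b, -0.0);  /* a*b - 0.0 equals a*b, even when zero */
}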

10.1.4.2 x87 FPU

The floating-point load and store instructions on the x87 support three floating-point formats: 32-bit float (8-bit exponent), 64-bit double (11-bit exponent), and 80-bit double extended (15-bit exponent). However, when values are loaded into the 80-bit registers, they always have 15-bit exponents, even when the FPU is set to round to float or double precision. When implementing Java virtual machine instructions, the x87 FPU should be set to round to float or double precision. However, especially in FP-strict methods, the effect of the additional exponent bits must be compensated for.

10.1.4.2.1 FP-strict

FP-strict instructions must generate the same results everywhere, including x87 FPUs. The extra exponent range complicates this since the overflow threshold (the point at which infinity is returned) and the underflow threshold (the point at which subnormal results are returned) differ with the larger exponent range. For example, if the extra exponent range were not an issue, the double computation d = a*b + c might get translated into a sequence of x87 instructions like

# Sample code  
fld a  # load a onto register stack 
fmul b # multiply a*b and put result on register stack 
fadd c # add c to product of a and b and put result on register stack 
fst d  # store a*b+c from register stack into d 

The problem with this code sequence is that the intermediate values a*b and (a*b) + c will not overflow or underflow the same way as pure double code, since the intermediate values are kept in registers with a larger exponent range. The first attempt at a solution stores each intermediate product to a double location in memory:

# Attempted Fix 1 
fld a  # load a onto register stack 
fmul b # multiply a*b and put result on register stack 
fst t1 # store a*b into a temp double location to restrict exponent 
fld t1 # reload a*b with restricted exponent 
fadd c # add c to product of a and b and put result on register stack 
fst d  # store a*b+c from register stack into d 

This first attempted fix does preserve the proper overflow behavior for a*b. However, the underflow behavior is slightly wrong. Performing the multiply and rounding, storing to restrict the exponent (thus rounding again), and then reloading the stored value can give a different subnormal number than if the product were rounded only once to the final precision and range. The compute-store-reload idiom works for addition and subtraction; however, multiplication and division both share this double-rounding-on-underflow hazard. Avoiding the hazard requires a few additional steps, and expressing the needed steps in a C program can be difficult.

If the operand values are float instead of double, and if the FPU’s rounding precision is set to double precision, and the loads and stores are of float values, the store-reload idiom works for the four basic float arithmetic operations (add, subtract, multiply and divide). In the case of multiply, a double precision product of float operands is exact, so double-rounding is avoided. In general, double has enough additional precision over float that these double-rounding problems are all avoided.
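A minimal C sketch of this idiom (not the KVM code), assuming an x87 target whose rounding precision has been set to double as described above:

/* The volatile store forces the product back to float range and
   precision; because the double-precision product of two floats is
   exact, the result is rounded only once. */
static float strict_fmul(float a, float b)
{
    volatile float r = a * b;
    return r;
}

The same store-reload shape works for float add, subtract, and divide under the stated assumption.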

To avoid double-rounding on underflow for double values, what would be a subnormal result in pure double must also be a subnormal in the register format with extended exponent range. This can be arranged by scaling one of the operands by a power of two.

# Attempted Fix 2 
fld a           # load a onto register stack 
fmul SCALE_DOWN # scale a down  
fmul b          # multiply a_scaled*b, put result on register stack 
                #  significand will have the right bits if a*b  
                #  should be subnormal 
fmul SCALE_UP   # rescale product to restore the proper exponent  
fst t1          # store a*b into a temporary double location to 
                #  restrict exponent 
fld t1          # reload a*b with restricted exponent 
fadd c          # add c to product of a and b  
                #  and put result on register stack 
fst d           # store a*b+c from register stack into d 

Multiplying by SCALE_DOWN and SCALE_UP ensures the right result when the product in pure double would be a subnormal. The store and reload to and from t1 is still needed to ensure an overflow to infinity occurs at the proper value.

The magnitude of the exponent of SCALE_DOWN and SCALE_UP is the difference between the maximum exponent of the register format and the maximum exponent of the double format:

SCALE_DOWN = 2^-(Emax register - Emax double) = 2^-(16383 - 1023) = 2^-15360

SCALE_UP = 2^(Emax register - Emax double) = 2^(16383 - 1023) = 2^15360

Unfortunately, these values are too large to represent as double values. However, they can be easily synthesized out of double values if the intermediate products are kept on the FPU stack with its large exponent range:

2^-15360 = (2^-960)^16 = ((((2^-960)^2)^2)^2)^2

2^15360 = (2^960)^16 = ((((2^960)^2)^2)^2)^2

2^-960 = 1.0261342003245941E-289 = longBitsToDouble(0x03f0000000000000)

2^960 = 9.745314011399999E288 = longBitsToDouble(0x7bf0000000000000)

As 80-bit values, the final bit patterns, from most to least significant bit, are logically:

2^15360 = 0x7bff 8000 0000 0000 0000

2^-15360 = 0x03ff 8000 0000 0000 0000

Adjusting by the scaling factors is also needed to implement divide. The product or quotient must first be scaled down. Scaling up first will not preserve the underflow threshold.

10.1.4.2.2 Generating FP-strict code in C

If a Java virtual machine on the x87 is generating assembly or machine code directly, creating the code necessary for FP-strict semantics is straightforward. However, coaxing the needed instructions from C source can be challenging: an optimizing compiler may keep intermediate values in extended-precision registers instead of storing and reloading them, reorder operations, or fold the constant scaling multiplications at compile time.

One approach to dealing with these issues is to generate the scaling factors by multiplying together sixteen copies of 2^±960 stored as a volatile variable. Declaring a variable volatile forces it to be reread every time it is used, foiling unwanted optimizations. However, this means that an FP-strict multiply or divide would require (32 + 2) multiplies in addition to the operation being implemented. If inline assembly cannot be used to implement the FP-strict multiply and divide operations, it may be faster to use an integer-based software implementation of those operations.
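The following sketch shows the shape of this approach. It is illustrative, not the KVM implementation: it assumes long double is the x87 80-bit extended format, the FPU's rounding precision has been set to 53 bits (double) as Section 10.1.4.2.1 requires, and the helper names are invented for the example.

static volatile double two_to_960  = 9.745314011399999E288;   /* 2^960  */
static volatile double two_to_m960 = 1.0261342003245941E-289; /* 2^-960 */

/* Multiply sixteen copies of the volatile base together.  Each read
   forces a reload, foiling constant folding; the result, 2^(+/-15360),
   is representable only in the 80-bit register format. */
static long double pow16(volatile double *base)
{
    long double s = *base;
    int i;
    for (i = 1; i < 16; i++)
        s *= *base;
    return s;
}

/* FP-strict DMUL following the scale-down / multiply / rescale /
   store-reload sequence of Attempted Fix 2. */
static double strict_dmul(double a, double b)
{
    long double t = (long double)a * pow16(&two_to_m960); /* scale a down (exact) */
    t *= b;                   /* product underflows exactly as pure double would */
    t *= pow16(&two_to_960);  /* restore the exponent (exact) */
    {
        volatile double r = (double)t;  /* store to restrict the exponent... */
        return r;                       /* ...and reload */
    }
}

The two pow16 calls plus the scaling multiplies account for the roughly (32 + 2) multiplies mentioned above, which is why an assembly or integer-based implementation may be preferable.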

10.1.4.2.3 Default floating-point

Compared to FP-strict code, generating code with default floating-point semantics is simple. For default code, the scaling factors are not required and the stores and reloads are only necessary for variables. In other words, the stores and reloads are not necessary for quantities that live on the Java virtual machine operand stack.

10.1.4.3 Other architectures

On architectures with only plain float and double arithmetic operations, mapping Java virtual machine semantics to equivalent C code is not complicated.

10.2 Floating-point support in the virtual machine

For CLDC 1.1 compliant implementations, floating-point functionality is enabled by default. It can be disabled by changing the IMPLEMENTS_FLOAT flag in main.h. Most of the virtual machine support for floating-point consists of changes to the Java bytecodes defined in bytecodes.c. The specific modifications are described in the sections below.

10.2.1 Floating-point bytecodes implementation

The file bytecodes.c is one of the major components that must be changed to support floating-point. This file contains the Java bytecodes executed by the KVM interpreter. Many of the modifications involve checking for NaNs. Among the bytecodes that require modifications are D2I, D2L, F2I, and F2L. The modifications and checks for NaNs are described in Section 10.4 “Porting.” The x86-specific changes are implemented in fp_bytecodes.c (located in directory kvm/VmExtra/src/fp). Specific details of the changes are also documented with comments in that file.

10.3 CLDC 1.1 floating-point libraries and trigonometric functions

This section describes the floating-point libraries and the trigonometric and other math functions that are now supported by KVM. The Java classes that are needed for floating-point support are described in the following table:

TABLE 10  –  Java classes needed for floating-point

File                   Description
Float.java             Supports single-precision floating-point arithmetic.
Double.java            Supports double-precision floating-point arithmetic.
Math.java              Additional trigonometric and other math functions.
FloatingDecimal.java   Used to convert floats and doubles to strings.

These files are not implementation-specific.

The table below lists the trigonometric and other math functions that are now implemented in the KVM for floating-point support. Listed with each function are the corresponding file(s) in which the function is implemented.

TABLE 11  –  Files implementing trigonometric and other math functions

Function   File(s)
sin        k_sin.c, s_sin.c
cos        k_cos.c, s_cos.c
tan        k_tan.c, s_tan.c
sqrt       e_sqrt.c, w_sqrt.c
ceil       s_ceil.c
floor      s_floor.c
abs        s_fabs.c

The implementation of the trigonometric functions is taken directly from the JDK 1.3.1 sources with no changes except to the function names. The trigonometric files are located in directory kvm/VmExtra/src/fp.


Note – You cannot use optimization when compiling the floating-point files. Doing so will cause incorrect results in the trigonometric functions. This means you cannot set FP_OPTIMIZATION_FLAG. Refer to Section 10.1.2 “Implementing Java virtual machine floating-point semantics” for further details.

10.4 Porting

The following sections summarize the porting effort required to implement floating-point support on various processor architectures. The biggest challenge is handling NaNs and infinity bounds checking. The key changes required on all platforms are in the conversion bytecodes D2I, D2L, F2I, and F2L. These bytecodes need additional checks, mandated by the Java™ Virtual Machine Specification, to detect NaNs and infinity bounds and to return the correct value in each of these cases.

The Java™ Virtual Machine Specification (Java Series), Second Edition by Tim Lindholm and Frank Yellin (Addison-Wesley, 1999) states that for each of these conversion bytecodes, if a NaN value is being converted, the result of the conversion is zero. If the value is too large to be represented in the target type (including positive infinity), the result is the maximum value of the target type; if it is too small (including negative infinity), the result is the minimum value. In all other cases, the value is converted from one type to the other using the IEEE 754 conversion rules. The values defined as NaN and infinity are described in the Java™ Virtual Machine Specification, §4.4.4 and §4.4.5.
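A minimal sketch of these checks for D2I (illustrative; the KVM's actual code lives in bytecodes.c and fp_bytecodes.c):

#include <stdint.h>
#include <math.h>

static int32_t d2i(double v)
{
    if (isnan(v))
        return 0;                 /* NaN converts to zero */
    if (v >= 2147483647.0)
        return INT32_MAX;         /* too large, including +infinity */
    if (v <= -2147483648.0)
        return INT32_MIN;         /* too small, including -infinity */
    return (int32_t)v;            /* in range: truncate toward zero */
}

D2L, F2I, and F2L follow the same pattern with the bounds of the corresponding target type.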

10.4.1 SPARC

The SPARC architecture is IEEE 754 compliant and has direct support for float and double operations. Therefore, implementing floating-point on KVM/SPARC only requires additional checks for NaN and infinity in the conversion bytecodes, D2I, D2L, F2I, and F2L.

10.4.2 ARM

The ARM CPU uses an IEEE 754 compliant software floating-point library. As on the SPARC architecture, the only required changes are additional checks in the floating-point conversion bytecodes D2I, D2L, F2I, and F2L.

10.4.3 x86

The traditional x87 FPU is fully IEEE 754 compliant. However, the IEEE 754 standard explicitly allows rounding to reduced precision with a greater exponent range, which does not always match the floating-point model used in the Java language and the JVM. Therefore, additional work is needed to implement floating-point. Additionally, the P4 processor contains the SSE2 instruction set extension, another IEEE 754 compliant implementation that is more amenable to Java's semantics.

To implement floating-point for the x86 platform, checks involving NaNs are needed for the following Java bytecodes: FCMPL, FCMPG, DCMPL, DCMPG, FREM, and DREM. These bytecodes need additional checks to behave as specified in the Java™ Virtual Machine Specification, which describes what each of these bytecodes should do or return when a NaN value is encountered.
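For example, the specification requires FCMPG to push 1 and FCMPL to push -1 when either operand is NaN. A sketch of that logic (illustrative, not the fp_bytecodes.c source):

#include <math.h>

/* nan_result is +1 for FCMPG and -1 for FCMPL. */
static int fcmp(float a, float b, int nan_result)
{
    if (isnan(a) || isnan(b))
        return nan_result;    /* NaN operands take the special path */
    if (a > b)  return 1;
    if (a == b) return 0;
    return -1;
}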

The file fp_bytecodes.c under kvm/VmExtra/src/fp contains the x86-specific implementation of the floating-point bytecodes. Each function in this file implements an algorithm for a specific floating-point bytecode that needs modification. Each function checks the value on the stack to see whether it is a NaN. If a NaN value is encountered, it is handled as a special case according to the Java™ Virtual Machine Specification. These functions are executed only if the variable PROCESSOR_ARCHITECTURE_X86 is set in the platform-specific header file machine_md.h.

10.4.3.1 strictfp implementation for x86

For the reasons mentioned in the sections above, the implementation of strictfp is quite a challenge on the x86 platform. The x86 is designed to operate on 80-bit double extended floating-point values rather than the 64-bit double and 32-bit float values used in the Java programming language. The x86 can be made to round to float or double precision. Unfortunately, this rounding does not exactly emulate the pure float and double arithmetic called for by Java, since an extended exponent range is available. The extended exponent range means the overflow and underflow thresholds are different than for pure float and double.

To implement strictfp, the bytecodes DMUL and DDIV must be changed. The problem is that when these operations involve subnormal numbers (very small IEEE 754 values with less precision than normal numbers), incorrect rounding can occur. (Refer to Section 10.1.4.2.1, “FP-strict.”) In addition, double rounding can occur if the obvious code generation algorithm is used. The solution is to implement the following algorithms for DMUL and DDIV:

Multiply (DMUL): scale one operand down by SCALE_DOWN, perform the multiplication, rescale the product by SCALE_UP, and then store and reload the result to restrict the exponent range, as shown in Section 10.1.4.2.1, “FP-strict.”
Divide (DDIV): scale the dividend down by SCALE_DOWN, perform the division, rescale the quotient by SCALE_UP, and then store and reload the result; a sketch follows below.

For strictfp floating-point on x86, the initial scaled quotient must be smaller than the actual quotient for the rounding to work properly. Thus, the algorithm scales the dividend down first, performs the division, and only then rescales the quotient; scaling up first would not preserve the underflow threshold.
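A sketch of FP-strict DDIV following this scale-down-first rule, reusing the hypothetical pow16 helper and scale variables from the sketch in Section 10.1.4.2.2 (again assuming an 80-bit long double and 53-bit rounding precision):

static double strict_ddiv(double a, double b)
{
    long double t = (long double)a * pow16(&two_to_m960); /* scale the dividend down first */
    t /= b;                   /* scaled quotient is smaller than the true quotient */
    t *= pow16(&two_to_960);  /* restore the exponent */
    {
        volatile double r = (double)t;  /* store-reload restricts the range */
        return r;
    }
}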

The bytecodes for FADD and FSUB did not need to be changed since if those operations have subnormal results, the results are exact (that is, no rounding occurs).

1 In other floating-point systems, the significand is called the mantissa.
2 In IEEE 754 floating-point, there is both -0.0 and +0.0. These values can be distinguished by division: 1.0/-0.0 is negative infinity while 1.0/+0.0 is positive infinity. If x is -0.0, x + 0.0 is +0.0; adding +0.0 to a number changes negative zero into positive zero while leaving other values unchanged.
3 The strictfp Java modifier and the ACC_STRICT modifier were added in Java 2. Java classes generated before that time will not have FP-strict semantics.

 

