
Chapter   11

Native Code


A Java virtual machine commonly needs access to various native functions in order to interact with the outside world. For instance, all the low-level graphics functions, file access functions, networking functions, or other similar routines that depend on the underlying operating system services typically need to be written in native code.

The way these native functions are made available to the Java virtual machine can vary from one virtual machine implementation to another. In order to minimize the work that is needed when porting the native functions, the Java Native Interface [1] (JNI) standard was created. The Java Native Interface generally serves two purposes: 1) JNI serves as a common interface for virtual machine implementers so that the same native functions will work unmodified with different virtual machines; 2) JNI provides Java-level APIs that make it possible for a Java programmer to dynamically load libraries and access native functions in those libraries.

Unfortunately, because of its general nature, JNI is rather expensive and introduces a significant memory and performance overhead to native function calls. Also, the ability to dynamically load and call arbitrary native functions from Java programs could pose security problems in the absence of the full Java 2 security model.

KVM does not support the Java Native Interface (JNI). Rather, KVM supports an interface called K Native Interface (KNI), which implements a logical subset of JNI that is significantly more efficient in terms of performance and memory consumption. In addition, KVM also supports an older interface that allows native functions to be added to the KVM in a VM-specific fashion. Information on KNI is provided below in Section 11.1 "Using the K Native Interface (KNI)." Information for writing native functions using the old-style native interface is provided in Section 11.2 "Implementing old-style native methods."

11.1 Using the K Native Interface (KNI)

Starting from KVM 1.0.4, KVM has a new interface for writing native functions. The high-level goal of this new interface, K Native Interface (KNI), is to allow native functions to be added to the KVM (and other small-footprint virtual machines) in a manner that is both highly efficient and fully independent of the internal structures of the virtual machine. The K Native Interface (KNI) Specification (Sun Microsystems, Inc., 2001), referred to below as the KNI Specification, defines a logical subset of the Java Native Interface (JNI) that is well-suited for low-power, memory-constrained devices. KNI follows the function naming conventions and other aspects of the JNI as far as this is possible and reasonable within the strict memory limits of CLDC target devices and in the absence of the full Java 2 security model. Since KNI is intended to be significantly more lightweight than JNI, some aspects of the interface, such as the parameter passing conventions, have been completely redesigned and are significantly different from JNI.

The K Native Interface is described in more detail in the KNI Specification. Please refer to this specification to learn about the K Native Interface and its usage.

Things to remember when getting started with KNI: One of the key goals of the KNI is to isolate the native function programmer from the implementation details of the virtual machine. Instead of writing native functions using KVM-specific functions and data structures (as required by the old-style native interface described in Section 11.2 "Implementing old-style native methods"), KNI allows native functions to be written using a set of functions that operate identically and efficiently across a wide variety of virtual machines. To ensure portability of native code, the native function programmer must not use any KVM-specific include files, functions, or data types. Rather, the programmer must include the file “kni.h” and use only the functions and data types defined in that file.

11.2 Implementing old-style native methods


Note – It is highly recommended that you use KNI for writing native functions for the KVM. The use of the old-style API is strongly discouraged for any purpose other than writing asynchronous native functions (Section 11.4 "Asynchronous native methods").

WARNING: You should not write old-style native methods unless you have thoroughly read through the implementation and understand its structures. Most of the material in this porting guide is moderately straightforward. The material in this subsection is not!

Old-style native methods must be written extremely carefully. Inattention to detail will cause fatal errors in the virtual machine.

11.2.1 Include files

Your code containing old-style native functions should begin with the line

    #include <global.h> 

which causes all include files that are part of KVM to be included. You might also need to #include additional files.

11.2.2 Accessing arguments from old-style native methods

When a native method is called, its arguments are on top of the Java stack. A static method’s arguments should be popped from the stack in the reverse order from which they were pushed. CODE EXAMPLE 3 shows an example of this coding style:

CODE EXAMPLE 3 Handling arguments of native static methods

Java code:

static native void 
drawRectangle(int x, int y, int width, int height); 

Native implementation:

static void Java_com_sun_kjava_Graphics_drawRectangle(void) { 
    /* Pop in reverse order: the last argument pushed is on top. */ 
    int height = popStack(); 
    int width =  popStack(); 
    int y =      popStack(); 
    int x =      popStack(); 
    windowSystemDrawRectangle(x, y, width, height); 
} 

An instance method (non-static method) must pop the this argument off the stack after it has popped the rest of the arguments.


Note – Failing to pop the this argument in a native instance method will almost surely cause a fatal error in the virtual machine.
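
The pop order for an instance method can be sketched in a self-contained form. This sketch does not use the real KVM headers: the small `stack`, `sp`, `pushStack`, and `popStack` definitions are stand-ins written for this example, and `Java_Example_moveTo`, `callMoveTo`, and the `last*` variables are hypothetical names.

```c
#include <assert.h>

/* Stand-in for the Java operand stack (NOT the real KVM definitions). */
typedef long cell;
static cell stack[16];
static cell *sp = stack;              /* points at the current top slot */

#define pushStack(x) (assert(sp < stack + 15), *++sp = (cell)(x))
#define popStack()   (assert(sp > stack), *sp--)

/* Values recorded by the hypothetical native method below. */
static cell lastThis, lastX, lastY;

/* Java: native void moveTo(int x, int y);   (instance method) */
static void Java_Example_moveTo(void) {
    cell y = popStack();              /* last argument pushed, first popped */
    cell x = popStack();
    cell self = popStack();           /* "this" is popped after the arguments */
    lastThis = self;
    lastX = x;
    lastY = y;
}

/* Simulates the interpreter: push "this", then the arguments in order. */
static int callMoveTo(cell self, cell x, cell y) {
    cell *before = sp;
    pushStack(self);
    pushStack(x);
    pushStack(y);
    Java_Example_moveTo();
    return sp == before;              /* 1 if the stack is balanced again */
}
```

If the line that pops `self` were omitted, `callMoveTo` would return 0, which is the stack imbalance the note above warns about.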

Table 12 shows the macros that should be used to pop arguments off the stack:

TABLE 12  –  Macros for popping arguments from the stack

C type                    Macro for popping
------------------------  ---------------------------
char, byte, int, long     popStack()
float                     popStackAsType(float)
long64, ulong64           popLong()
double                    popDouble()
pointerType               popStackAsType(pointerType)

11.2.3 Returning a result from an old-style native function

If a native method returns a result, it must push that result onto the stack. The native code should use the appropriate macro shown in Table 13 to push the result back onto the stack:

TABLE 13  –  Macros for pushing arguments onto the stack

C type                    Macro for pushing
------------------------  ----------------------------
char, byte, int, long     pushStack()
float                     pushStackAsType(float)
long64, ulong64           pushLong()
double                    pushDouble()
pointerType               pushStackAsType(pointerType)

11.2.4 Shortcuts

Some native code uses the macro topStack instead of popping the last argument off the stack. It then sets topStack to the value it wants to return.

This practice is not encouraged. It should only be used for “one-liners” that access the argument and return the value in a single statement. pushStack and popStack cannot be used in this case, since C would not guarantee their order of evaluation.

In general, it is safer to pop the value, perform the calculation, and push the value back onto the stack as three separate steps.
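
The two styles can be compared side by side in a small sketch. The stack and the three macros below are simplified stand-ins written for this example, not the KVM definitions, and `Java_Example_negate`, `Java_Example_negate_shortcut`, and `callNative` are hypothetical names.

```c
#include <assert.h>

/* Stand-in operand stack (NOT the real KVM definitions). */
typedef long cell;
static cell stack[16];
static cell *sp = stack;

#define pushStack(x) (assert(sp < stack + 15), *++sp = (cell)(x))
#define popStack()   (assert(sp > stack), *sp--)
#define topStack     (*sp)

/* Java: static native int negate(int value);  -- shortcut "one-liner" */
static void Java_Example_negate_shortcut(void) {
    topStack = -topStack;             /* read argument, store result in place */
}

/* The same method written as three separate steps (the safer style). */
static void Java_Example_negate(void) {
    cell value = popStack();          /* 1. pop the argument */
    cell result = -value;             /* 2. perform the calculation */
    pushStack(result);                /* 3. push the result */
}

/* Simulates a call: push the argument, run the native, pop the result. */
static cell callNative(void (*nativeFn)(void), cell argument) {
    pushStack(argument);
    nativeFn();
    return popStack();
}
```

Both versions leave exactly one value on the stack. The three-step version stays correct if the method later gains a second argument, whereas the shortcut would then silently overwrite the wrong slot.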

11.2.5 Callbacks

Native code cannot call back into Java code directly. Instead, KVM provides a mechanism by which native code can alter the interpreter state so that the interpreter begins executing a new piece of code. The mechanism can also designate a C function to be called once that code has finished executing.

11.2.6 Exception handling in old-style native code

If the native code needs to throw an error or exception, it should call the function

void raiseException(const char* exceptionClassName) 

where the exceptionClassName argument is the name of the exception class or error class.
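
A minimal sketch of a native method that raises an exception follows. Everything here is a stand-in for illustration: the stack macros are simplified, `raiseException` is a mock that only records the requested class name (how the real function transfers control is not covered here), the slash-separated class name format is an assumption, and `Java_Example_checkedDivide` and `callCheckedDivide` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for this sketch only (NOT the real KVM definitions). */
typedef long cell;
static cell stack[16];
static cell *sp = stack;
#define pushStack(x) (assert(sp < stack + 15), *++sp = (cell)(x))
#define popStack()   (assert(sp > stack), *sp--)

/* Mock: records the requested class instead of raising anything. */
static const char *pendingException = NULL;
static void raiseException(const char *exceptionClassName) {
    pendingException = exceptionClassName;
}

/* Java: static native int checkedDivide(int a, int b); */
static void Java_Example_checkedDivide(void) {
    cell b = popStack();
    cell a = popStack();
    if (b == 0) {
        /* Assumed name format: slash-separated, fully qualified. */
        raiseException("java/lang/ArithmeticException");
    } else {
        pushStack(a / b);             /* push a result only on success */
    }
}

/* Simulates a call; returns 1 if the native raised an exception. */
static int callCheckedDivide(cell a, cell b, cell *result) {
    pendingException = NULL;
    pushStack(a);
    pushStack(b);
    Java_Example_checkedDivide();
    if (pendingException != NULL) {
        return 1;
    }
    *result = popStack();
    return 0;
}
```

Note that all arguments are popped before the error check, so the stack is in a consistent state on both paths.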

11.2.7 Useful functions in old-style native code

Other useful functions that a native method might need to call are the following:

11.2.8 Garbage collection issues

The C stack is not scanned when the KVM performs a garbage collection. If your native code allocates new Java objects, you must take special precautions to prevent your new Java objects from being garbage collected inadvertently.

Since release 1.0.2, KVM has included a compacting garbage collector. Any time that your native code performs an allocation, objects in the Java heap can move. This includes any arguments passed to your native function and any previous heap allocations performed by your native code.


Note – We strongly recommend that you do not write native methods that perform allocation from the Java heap. You greatly increase the chances that your code will have hard-to-find and hard-to-reproduce bugs.


Note – If, for example, you need to create a structure, it is better to create that structure in Java code, and pass it as an argument to the native code.

If your code must perform allocation, it is important that you

The garbage collector can get erroneous results if an allocation occurs while an argument or return value is on the Java stack. The rest of this chapter describes how your code can interact correctly with the garbage collector.

11.2.8.1 Heap Space and Permanent Space

In order to simplify the garbage collector, the KVM’s memory is divided into two spaces: “permanent space” and “heap space”.

All objects created in permanent space are, well, permanent. These objects are

Among the objects that are allocated in permanent space are

These objects are never moved and never freed after they are created.

Structures that have a possibly limited lifetime are allocated in heap space. Among these are

These structures are liable to move any time an allocation occurs. Your code must follow the rules specified in the following subsections to ensure that it works correctly with the garbage collector.

11.2.8.2 Asserting no allocation

The KVM provides the two macros ASSERTING_NO_ALLOCATION and END_ASSERTING_NO_ALLOCATION, which are used as shown in CODE EXAMPLE 4:

CODE EXAMPLE 4 Forbidding garbage collection
ASSERTING_NO_ALLOCATION 
    non allocating code 
END_ASSERTING_NO_ALLOCATION; 

These macros are provided for use only in DEBUG mode to guarantee that no allocation is performed by the code between the ASSERTING... and the END_ASSERTING... macro.

If your code is compiled with INCLUDEDEBUGCODE set to a non-zero value, then any allocation inside the specified code causes a fatal error.

If you use the macros, make sure that the non-allocating code inside the macros does not perform a return. The macro END_ASSERTING_NO_ALLOCATION contains cleanup code that must be executed.

You are encouraged to use these macros to indicate safe regions of code in which heap-allocated objects will not move.
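
The structure of such a region, including the rule against returning from inside it, can be sketched as follows. The two macros and the allocator below are mocks written for this example (the real KVM definitions differ, and a real debug build reports a fatal error rather than counting violations); `mockMallocBytes` and `checksum` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Mock macros for this sketch only; the real KVM definitions differ. */
static int noAllocationDepth = 0;     /* > 0 while inside an asserted region */
static int allocationViolations = 0;  /* a real debug build aborts instead */

#define ASSERTING_NO_ALLOCATION      { noAllocationDepth++; {
#define END_ASSERTING_NO_ALLOCATION  } noAllocationDepth--; }

/* Mock allocator: notes any allocation made inside an asserted region. */
static char heap[256];
static size_t heapUsed = 0;
static void *mockMallocBytes(size_t size) {
    if (noAllocationDepth > 0) {
        allocationViolations++;
    }
    assert(heapUsed + size <= sizeof heap);
    heapUsed += size;
    return heap + heapUsed - size;
}

/* Sums the bytes of a heap object. The raw pointer stays valid because
 * nothing between the two macros allocates. The result is kept in a
 * variable and returned only after the closing macro has run. */
static int checksum(const char *data, size_t length) {
    int sum = 0;
    ASSERTING_NO_ALLOCATION
        size_t i;
        for (i = 0; i < length; i++) {
            sum += data[i];
        }
    END_ASSERTING_NO_ALLOCATION;
    return sum;
}
```

Placing `return sum;` inside the region would skip the cleanup in the closing macro and leave `noAllocationDepth` permanently non-zero in this mock.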

11.2.8.3 Handles in old-style native functions

To deal with the fact that heap-allocated objects in the KVM can move, the garbage collector makes use of temporary “handles.” A handle is an indirect pointer to an object. Rather than being the address of the object itself, a handle is the address of a memory location that contains the address of the object.

The memory location that contains the address of the object must not itself be in the Java heap. In general, it is the address of a variable (for global roots) or the address of a location on the C stack (for temporary roots).

All type names in the KVM that end with _HANDLE indicate handles. If a function has a handle as one of its arguments, the value passed for that argument must be an indirect pointer, and it must be registered with the garbage collector if the object could be in the Java heap.

CODE EXAMPLE 5 shows an example:

CODE EXAMPLE 5 Creating a handle
CLASS getClassX(CHAR_HANDLE name, int start, int length); 
 
/* Case 1, We are calling it with an argument that is known */ 
/*         not to be in the heap. */ 
const char *x = "java/lang/Object"; 
result = getClassX(&x, 0, strlen(x)); 
 
/* Case 2. We are calling it with a heap argument */ 
START_TEMPORARY_ROOTS 
    DECLARE_TEMPORARY_ROOT(char *, x, mallocBytes(100)); 
    sprintf(x, "java/lang/%s", arg); 
    result = getClassX(&x, 0, strlen(x)); 
END_TEMPORARY_ROOTS 
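
Why the extra level of indirection helps can be shown with a mock collector. The sketch below is not the KVM collector: `regionA`, `regionB`, `registeredRoot`, `mockCollect`, `lengthAfterCollection`, and `demo` are all invented for this example, and only a single root is supported.

```c
#include <assert.h>
#include <string.h>

/* Mock heap with two regions; "collection" copies the object from one to
 * the other, so its address always changes. Not the real KVM collector. */
static char regionA[32], regionB[32];

/* The single registered root for this sketch: a handle (char **). */
static char **registeredRoot = 0;

/* Mock collector: moves the object and updates the registered location. */
static void mockCollect(void) {
    char *oldAddr, *newAddr;
    if (registeredRoot == 0) {
        return;
    }
    oldAddr = *registeredRoot;
    newAddr = (oldAddr == regionA) ? regionB : regionA;
    memcpy(newAddr, oldAddr, sizeof regionA);
    memset(oldAddr, 0, sizeof regionA);   /* old copy is now garbage */
    *registeredRoot = newAddr;            /* the handle stays valid */
}

/* Takes a handle, so it may safely trigger a collection internally. */
static size_t lengthAfterCollection(char **nameHandle) {
    mockCollect();                        /* the object moves here */
    return strlen(*nameHandle);           /* re-read address via the handle */
}

static size_t demo(void) {
    char *name = regionA;                 /* direct pointer on the C stack */
    size_t n;
    strcpy(name, "java/lang/Object");
    registeredRoot = &name;               /* register &name as a root */
    n = lengthAfterCollection(&name);
    registeredRoot = 0;                   /* unregister before returning */
    return n;
}
```

Had `lengthAfterCollection` received a plain `char *` copied before the move, it would have read the zeroed old region instead of the string.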

11.2.8.4 Temporary Roots

The most common way to register a root is to use START_TEMPORARY_ROOTS and END_TEMPORARY_ROOTS to delimit a region of code. Within this region of code, the macro

    DECLARE_TEMPORARY_ROOT(type, variable, value) 

creates a local variable of the specified type with the specified initial value. The value must either be a pointer to an object in the heap, or it must be a value that is clearly not in the heap (such as NULL, a pointer to permanent space, or the like). [2] The value &variable is registered with the garbage collector as a temporary root.

You are allowed to change the value of variable, provided that any new value is always either a pointer to an object in the heap, or a value that is clearly not in the heap.

The garbage collector ensures that, whenever a garbage collection occurs, the value of the variable is updated if the object it points to has moved. In addition, &variable is a handle, and can be passed as an argument to any function that expects a handle.

Your code must not return from within the region delimited by these macros. The END_TEMPORARY_ROOTS macro contains cleanup code that must be executed.

CODE EXAMPLE 6 below shows some sample code for a native method that takes a String and two integers as arguments, and which must allocate a temporary buffer.

CODE EXAMPLE 6 Temporary roots
START_TEMPORARY_ROOTS 
    int y = popStack(); 
    int x = popStack(); 
    DECLARE_TEMPORARY_ROOT(STRING_INSTANCE, string,  
                              popStackAsType(STRING_INSTANCE)); 
    DECLARE_TEMPORARY_ROOT(char*, buffer, mallocBytes(100)); 
    /* code that might perform allocation */ 
END_TEMPORARY_ROOTS 

If the code clearly cannot perform any allocation, then you could instead have written

    char* buffer = mallocBytes(100); 

Less commonly used is the macro

    DECLARE_TEMPORARY_ROOT_FROM_BASE(type, var, value, base) 

In this case base must be a pointer to an object in the heap, and value must be a pointer into the middle of the object. The variable var is assigned the value value. The garbage collector will treat base as a root. If base is moved by the garbage collector, the value of var will be adjusted appropriately.
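
The adjustment described above amounts to preserving the offset between the derived pointer and the base. The following sketch illustrates that arithmetic with a mock "move"; `struct derivedRoot`, `moveObject`, and `demoInteriorPointer` are hypothetical and do not reflect the KVM's internal representation of roots.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One mock root: a base pointer plus a derived pointer into the same
 * object. Hypothetical structure, not the KVM's internal representation. */
struct derivedRoot {
    char **base;      /* address of the variable holding the object start */
    char **derived;   /* address of the variable pointing into the object */
};

/* Mock "move": copies the object and fixes both pointers, preserving the
 * offset of the derived pointer from the base. */
static void moveObject(struct derivedRoot *root, char *newAddr, size_t size) {
    ptrdiff_t offset = *root->derived - *root->base;
    memcpy(newAddr, *root->base, size);
    *root->base = newAddr;
    *root->derived = newAddr + offset;
}

static char oldSpace[16], newSpace[16];

/* Returns the character seen through the interior pointer after a move. */
static char demoInteriorPointer(void) {
    char *object = oldSpace;
    char *cursor;
    struct derivedRoot root;
    strcpy(object, "abcdef");
    cursor = object + 3;                 /* points at 'd', mid-object */
    root.base = &object;
    root.derived = &cursor;
    moveObject(&root, newSpace, sizeof oldSpace);
    memset(oldSpace, 0, sizeof oldSpace);
    return *cursor;                      /* same offset, now in newSpace */
}
```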

11.2.8.5 Global roots

If your code initializes a C variable to point to an object in the Java heap, you can use the code shown in CODE EXAMPLE 7. There is currently no function for removing a variable from the set of global roots.

CODE EXAMPLE 7 Creating a global root
variable = <value>; 
makeGlobalRoot((cell **)&variable); 

This code ensures that the garbage collector knows that the specified variable contains a value that must be protected from garbage collection. If the garbage collector moves the object, the variable is updated to point to the new value.

11.2.8.6 Debugging your native code

A special garbage collector is provided to help you debug your native code and to ensure that it does not have any garbage collection problems. You access this garbage collector by replacing the file collector.c with collectorDebug.c. In addition, you should set the compiler flags INCLUDEDEBUGCODE and EXCESSIVE_GARBAGE_COLLECTION to 1.

This replaces the compact-in-place garbage collector with a 10-space Cheney-style [3] garbage collection algorithm. A garbage collection occurs on every allocation, and also on some operations that might have allocated but did not. Every object moves on every garbage collection. In addition, this code makes use of memory protection so that any attempt to read or write through a bad pointer generates a memory fault.

This code uses the following implementation-dependent functions:

    void* allocateVirtualMemory_md(long size); 
    void  freeVirtualMemory_md(void *address, long size); 
    void  protectVirtualMemory_md(void *address, long size,  
                                  int protection); 

Implementations of these three functions for Windows and for Unix are provided. On any other target platform, you must implement these functions yourself.

11.2.8.7 Two-space Cheney garbage collector

The file collectorDebug.c (see §11.2.8.6) also includes an implementation of a two-space non-debugging Cheney garbage collector. You get this implementation by setting the compiler flag CHENEY_TWO_SPACE to a non-zero value.

The Cheney collector is smaller and faster than the standard garbage collector. However, it uses twice as much heap space. If your implementation has a lot of available memory but needs a faster garbage collector, you might consider using this garbage collector.

This collector is not supported by Sun, and is provided as is.

11.2.9 Initialization and reinitialization of global variables

Generally, the C language guarantees that all global and static variables are initialized to 0 (zero).

The current implementation is designed to work within an embedded environment. For example, on the PalmOS, the user can start the virtual machine, exit a program, and then restart the virtual machine with a different set of arguments. There is no re-initialization of global or static variables between the two runs.

In general, your code cannot assume the initial value of any variable. You have several options for determining when it is necessary to perform one-time only initialization.
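
One possible approach, sketched here under the assumption that your port can call a function of your own at the beginning of every VM run, is to assign every piece of native state explicitly instead of relying on C's zero-initialization. All names in this sketch (`initializeExampleNatives`, `exampleOpen`, `simulateRun`, and the two variables) are hypothetical.

```c
#include <assert.h>

/* Hypothetical module state; the names are invented for this sketch. */
static int openHandleCount;      /* no initializer is relied upon */
static int lastErrorCode;

/* Call once at the beginning of every VM run. Every variable is assigned
 * explicitly, so values left over from a previous run cannot leak in. */
static void initializeExampleNatives(void) {
    openHandleCount = 0;
    lastErrorCode = 0;
}

static void exampleOpen(void) {
    openHandleCount++;
}

/* Simulates one VM run that opens `opens` handles; returns the count. */
static int simulateRun(int opens) {
    int i;
    initializeExampleNatives();
    for (i = 0; i < opens; i++) {
        exampleOpen();
    }
    return openHandleCount;
}
```

Running `simulateRun(3)` followed by `simulateRun(2)` yields 3 and then 2. Without the explicit reset, the second run would start from the stale count of the first. A static "already initialized" flag would not work for the same reason: it would still be set when the VM is restarted.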

11.3 Native code lookup tables

Regardless of whether you use the KNI (Section 11.1 "Using the K Native Interface (KNI)") or the old-style native method implementation technology (Section 11.2 "Implementing old-style native methods"), as part of the build process you must create the lookup tables that map methods to the corresponding native implementation.

The JavaCodeCompact (JCC) generates these tables automatically. You should use this utility to generate the lookup tables whether or not you are using the other features of JavaCodeCompact.

JavaCodeCompact is more fully described in Chapter 14. The specific details for creating the file containing the lookup tables can be found in Section 14.5 "Executing JavaCodeCompact."

The name of the C function that implements a native method must be the same name that JNI [4] would assign to the native method.
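
For simple cases, the JNI name is the prefix Java_, followed by the fully qualified class name and the method name, with package separators replaced by underscores and literal underscores escaped as _1. The sketch below implements only that subset; JNI's other escapes (Unicode characters, `;`, `[`, and the signature suffix used for overloaded native methods) are deliberately omitted, and the helper names `appendMangled` and `jniFunctionName` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Appends the mangled form of `name` to the string in `out` (capacity
 * `cap`). Handles only '.', '/' and '_'.
 * Returns 0 on success, -1 if the buffer is too small. */
static int appendMangled(char *out, size_t cap, const char *name) {
    size_t len = strlen(out);
    for (; *name != '\0'; name++) {
        if (len + 3 > cap) {
            return -1;                    /* need room for "_1" and the NUL */
        }
        if (*name == '.' || *name == '/') {
            out[len++] = '_';
        } else if (*name == '_') {
            out[len++] = '_';
            out[len++] = '1';
        } else {
            out[len++] = *name;
        }
        out[len] = '\0';
    }
    return 0;
}

/* Builds "Java_" + mangled class name + "_" + mangled method name. */
static int jniFunctionName(char *out, size_t cap,
                           const char *className, const char *methodName) {
    if (cap < 6) {
        return -1;
    }
    strcpy(out, "Java_");
    if (appendMangled(out, cap, className) != 0) return -1;
    if (appendMangled(out, cap, "/") != 0) return -1;   /* separator '_' */
    return appendMangled(out, cap, methodName);
}
```

For the class `com.sun.kjava.Graphics` and the method `drawRectangle`, this produces the function name used in CODE EXAMPLE 3.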

11.4 Asynchronous native methods


Note – The KNI implementation does not allow asynchronous native methods to be written using the KNI API. In order to write asynchronous native methods for the KVM, you must use the old-style native function interface as described in this section.

From the operating system viewpoint, KVM is just one process (C program) with only one native thread of execution. The multithreading capabilities of KVM have been implemented entirely in software without utilizing the possible multitasking capabilities of the underlying operating system. This approach not only makes the virtual machine highly portable and independent of the operating system, but also greatly simplifies the virtual machine design and improves the readability of the codebase, as the virtual machine designer does not have to worry about mutual exclusion and other problems typically associated with multithreaded software.

However, an unfortunate side effect of the approach described above is that by default, all native methods in KVM are “blocking.” This means that when a native function is called from the virtual machine, all the threads in the VM stop executing until the native method completes execution.

As a general guideline, all the native functions called from KVM should be written so that they complete their execution as soon as possible. However, in many environments this is not desirable or fully possible. For this reason, KVM includes an implementation of “asynchronous native methods” described below.

11.4.1 Design of asynchronous methods

The standard implementation of KVM runs as a single “task” from the operating system’s point of view. If a native method performs an operation that can block, the entire KVM blocks.

Asynchronous native methods are intended to solve this problem. When such a native method is called, the operation is performed “off-line” in an implementation-dependent manner. Other Java threads can continue running normally. When the native call finishes, the Java thread that originally called the native method continues.

To use asynchronous native methods, you must include

    #define ASYNCHRONOUS_NATIVE_FUNCTIONS 1 

in your machine-dependent include file.

Asynchronous native methods cannot be defined in the same file as normal native methods. In addition to their normal includes, they must also add the include file async.h.

Asynchronous methods should always have the following form:

ASYNC_FUNCTION_START(functionname) 
    code 
ASYNC_FUNCTION_END 

Your code must never use pushStack(), popStack(), topStack, or any macro or function that references the stack pointer, the frame pointer, or the current thread. Instead, you must use the alternative macros shown in Table 14.

TABLE 14  –  Macros used in asynchronous methods

Native function macro     Asynchronous native function macro
------------------------  ----------------------------------
popStack                  ASYNC_popStack
pushStack                 ASYNC_pushStack
popLong                   ASYNC_popLong
pushLong                  ASYNC_pushLong
popStackAsType            ASYNC_popStackAsType
pushStackAsType           ASYNC_pushStackAsType
raiseException            ASYNC_raiseException
topStack                  do not use this macro

In addition, your code must not perform a “return.” It must complete through the end, since ASYNC_FUNCTION_END may generate some necessary cleanup code.

All the macros in Table 14 have been designed so that if the symbol ASYNCHRONOUS_NATIVE_FUNCTIONS is 0, the asynchronous method compiles into a normal native method.
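
The general technique of letting one source file compile either way can be sketched with conditional macro definitions. The definitions below are assumptions made for illustration and are not copied from async.h: the mock `THREAD` type, the stack macros, and the `CALL_NATIVE` helper are all hypothetical, and only two of the macros from Table 14 are modeled.

```c
#include <assert.h>

#ifndef ASYNCHRONOUS_NATIVE_FUNCTIONS
#define ASYNCHRONOUS_NATIVE_FUNCTIONS 0   /* set to 1 for the other branch */
#endif

/* Stand-in operand stack and thread type (NOT the real KVM definitions). */
typedef long cell;
typedef struct mockThread { cell stack[16]; cell *sp; } *THREAD;

static struct mockThread mainThread;
static THREAD CurrentThread = &mainThread;

#define popStack()   (*CurrentThread->sp--)
#define pushStack(x) (*++CurrentThread->sp = (cell)(x))

#if ASYNCHRONOUS_NATIVE_FUNCTIONS
/* Asynchronous build: the function receives its thread explicitly and the
 * ASYNC_ macros never consult CurrentThread. */
#define ASYNC_FUNCTION_START(name) static void name(THREAD thisThread) {
#define ASYNC_FUNCTION_END         }
#define ASYNC_popStack()           (*thisThread->sp--)
#define ASYNC_pushStack(x)         (*++thisThread->sp = (cell)(x))
#define CALL_NATIVE(name)          name(CurrentThread)
#else
/* Synchronous build: the same source becomes an ordinary native method. */
#define ASYNC_FUNCTION_START(name) static void name(void) {
#define ASYNC_FUNCTION_END         }
#define ASYNC_popStack()           popStack()
#define ASYNC_pushStack(x)         pushStack(x)
#define CALL_NATIVE(name)          name()
#endif

ASYNC_FUNCTION_START(Example_addOne)
    cell value = ASYNC_popStack();
    ASYNC_pushStack(value + 1);
ASYNC_FUNCTION_END

/* Simulates a call from the interpreter in either configuration. */
static cell callAddOne(cell argument) {
    cell result;
    CurrentThread->sp = CurrentThread->stack;
    pushStack(argument);
    CALL_NATIVE(Example_addOne);
    result = popStack();
    assert(CurrentThread->sp == CurrentThread->stack);  /* balanced */
    return result;
}
```

The body of `Example_addOne` is identical in both configurations, which is why it must restrict itself to the ASYNC_ macros and must not touch the stack pointer or the current thread directly.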

It is also important to note that unlike regular native methods, asynchronous native methods cannot allocate any memory from the Java heap. Because of this limitation, extra caution is often necessary when writing asynchronous native methods, since many internal routines in KVM may indirectly allocate memory from the Java heap.


Note – IMPORTANT: We repeat that asynchronous native methods must not allocate memory from the Java heap. Make sure that you read the paragraph above.

If you use asynchronous native methods, you must define the following machine-specific functions.

Enter or exit a critical section. The operating system must guarantee that at most one operating system task is allowed to be inside the critical section at a time.

11.4.2 Implementation of asynchronous methods

We envision two possible implementations of asynchronous methods.

In the current reference implementation, the function CallAsyncNativeFunction_md spawns off a separate operating system task which performs the indicated function. For example, in a Posix implementation one could use pthread_create.

CODE EXAMPLE 8 below shows one possible implementation of a method

    int readBytes(byte[] dst, int offset, int length)

using this style of asynchronous native methods. This particular example assumes that the garbage collector does not move objects.

CODE EXAMPLE 8 Asynchronous implementation of ReadBytes
ASYNC_FUNCTION_START(ReadBytes) 
    long   length = ASYNC_popStack(); 
    long   offset = ASYNC_popStack(); 
    BYTEARRAY dst = ASYNC_popStackAsType(BYTEARRAY); 
    INSTANCE instance = ASYNC_popStackAsType(INSTANCE);/* this*/ 
    long fd = getFD(instance); 
    ASYNC_enableGarbageCollection(); 
    length = read(fd, dst->bdata + offset, length); 
    ASYNC_disableGarbageCollection(); 
    ASYNC_pushStack((length == 0) ? -1 : length); 
ASYNC_FUNCTION_END 

In an alternative implementation (CODE EXAMPLE 9), CallAsyncNativeFunction_md simply calls the function f directly. It assumes that the function f starts an operation, but does not wait for its completion. The operating system is required to provide some sort of interrupt or callback to indicate when the operation is complete.

The second implementation is far more operating system-dependent. It might be impossible to write native methods that can work both synchronously and asynchronously, depending on the value of a flag.

CODE EXAMPLE 9 Alternative asynchronous implementation of ReadBytes
/* Forward declaration of the completion callback */ 
static void ReadBytesDone(void *parm, int length); 
 
static void ReadBytes(THREAD thisThread) 
{ 
    long   length = ASYNC_popStack(); 
    long   offset = ASYNC_popStack(); 
    BYTEARRAY dst = ASYNC_popStackAsType(BYTEARRAY); 
    INSTANCE instance = ASYNC_popStackAsType(INSTANCE); /* this */ 
    long fd = getFD(instance); 
    /* Call OS to perform I/O. Perform callback when done. */ 
    /* thisThread is passed along so that the callback can resume it. */ 
    AsyncRead(fd, dst->bdata + offset, length, ReadBytesDone, thisThread); 
} 
 
/* Callback function when I/O is finished */ 
static void ReadBytesDone(void *parm, int length) 
{ 
    THREAD thisThread = (THREAD)parm; 
    ASYNC_pushStack((length == 0) ? -1 : length); 
    ASYNC_RESUME_THREAD(); 
} 

Refer to Section 12.1.4 "Asynchronous notification" for further information on writing asynchronous code.

[1] The Java Native Interface: Programmer’s Guide and Specification (Java Series) by Sheng Liang (Addison Wesley, 1999).
[2] The main purpose of this limitation is that the variable should not have a random integer as its value, and that the variable must be initialized.
[3] C.J. Cheney. A non-recursive list compacting algorithm. Communications of the ACM, 13(11):677-8, November 1970.
[4] See The Java Native Interface: Programmer’s Guide and Specification (Java Series) by Sheng Liang (Addison Wesley, 1999), for complete information on the JNI naming scheme. This information is available online at http://java.sun.com/docs/books/jni/index.html.

 


KVM Porting Guide, CLDC 1.1