A Java virtual machine commonly needs access to various native functions in order to interact with the outside world. For instance, all the low-level graphics functions, file access functions, networking functions, or other similar routines that depend on the underlying operating system services typically need to be written in native code.
The way these native functions are made available to the Java virtual machine can vary from one virtual machine implementation to another. In order to minimize the work that is needed when porting the native functions, the Java Native Interface (JNI) standard was created. The Java Native Interface generally serves two purposes: 1) JNI serves as a common interface for virtual machine implementers so that the same native functions will work unmodified with different virtual machines; 2) JNI provides Java-level APIs that make it possible for a Java programmer to dynamically load libraries and access native functions in those libraries.
Unfortunately, because of its general nature, JNI is rather expensive and introduces a significant memory and performance overhead to native function calls. Also, the ability to dynamically load and call arbitrary native functions from Java programs could pose security problems in the absence of the full Java 2 security model.
KVM does not support the Java Native Interface (JNI). Rather, KVM supports an interface called K Native Interface (KNI), which implements a logical subset of JNI that is significantly more efficient in terms of performance and memory consumption. In addition, KVM also supports an older interface that allows native functions to be added to the KVM in a VM-specific fashion. Information on KNI is provided below in Section 11.1 “Using the K Native Interface (KNI).” Information for writing native functions using the old-style native interface is provided in Section 11.2 “Implementing old-style native methods.”
Starting from KVM 1.0.4, KVM has a new interface for writing native functions. The high-level goal of this new interface, K Native Interface (KNI), is to allow native functions to be added to the KVM (and other small-footprint virtual machines) in a manner that is both highly efficient and fully independent of the internal structures of the virtual machine. The K Native Interface (KNI) Specification, (Sun Microsystems, Inc., 2001) (KNI Specification) defines a logical subset of the Java Native Interface (JNI) that is well-suited for low-power, memory-constrained devices. KNI follows the function naming conventions and other aspects of the JNI as far as this is possible and reasonable within the strict memory limits of CLDC target devices and in the absence of the full Java 2 security model. Since KNI is intended to be significantly more lightweight than JNI, some aspects of the interface, such as the parameter passing conventions, have been completely redesigned and are significantly different from JNI.
The K Native Interface is described in more detail in the KNI Specification. Please refer to this specification to learn about the K Native Interface and its usage.
Things to remember when getting started with KNI: One of the key goals of the KNI is to isolate the native function programmer from the implementation details of the virtual machine. Instead of writing native functions using KVM-specific functions and data structures (as required by the old-style native interface described in Section 11.2 “Implementing old-style native methods”), KNI allows native functions to be written using a set of functions that operate identically and efficiently across a wide variety of virtual machines. To ensure portability of native code, the native function programmer must not use any KVM-specific include files or KVM-specific functions or data types. Rather, the programmer must include the file “kni.h” and use only the functions and data types defined in that file.
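As a rough illustration of this style, the sketch below shows a native function that reads its parameters by index and returns an int, using only names that a KNI programmer would take from kni.h. So that the fragment compiles on its own, it defines small stand-ins for those names; the real definitions come from kni.h and may differ, and the parameter array fakeParams and the class example.Adder are purely hypothetical.

```c
#include <assert.h>

/* ---- Stand-ins so this sketch compiles by itself. ---------------
 * In a real port, remove this section and #include <kni.h> instead;
 * the definitions below are simplified assumptions, not KVM code.   */
typedef int jint;
#define KNIEXPORT
#define KNI_RETURNTYPE_INT jint
jint fakeParams[4];                   /* hypothetical; slot 0 unused  */
#define KNI_GetParameterAsInt(i) (fakeParams[(i)])
#define KNI_ReturnInt(v) return (v)
/* ---- End of stand-ins. ------------------------------------------ */

/* Native implementation of a hypothetical Java method
 *     static native int add(int a, int b);
 * declared in class example.Adder.
 * Parameters are read by index; the first parameter has index 1.    */
KNIEXPORT KNI_RETURNTYPE_INT Java_example_Adder_add(void) {
    jint a = KNI_GetParameterAsInt(1);
    jint b = KNI_GetParameterAsInt(2);
    KNI_ReturnInt(a + b);
}
```

Note that the function body never touches the Java stack or any VM data structure directly, which is the isolation the paragraph above asks for.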
WARNING: You should not write old-style native methods unless you have thoroughly read through the implementation and understand its structures. Most of the material in this porting guide is moderately straightforward. The material in this subsection is not!
Old-style native methods must be written extremely carefully. Inattention to detail will cause fatal errors in the virtual machine.
Your code containing old-style native functions should begin with the line
which causes all include files that are part of KVM to be included. You might also need to #include additional files.
When a native method is called, its arguments are on top of the Java stack. A static method’s arguments should be popped from the stack in the reverse order from which they were pushed. CODE EXAMPLE 3 shows an example of this coding style:
Java code:
Native implementation:
    static void Java_com_sun_kjava_Graphics_drawRectangle()
    {
        int height = popStack();
        int width = popStack();
        int y = popStack();
        int x = popStack();
        windowSystemDrawRectangle(x, y, width, height);
    }
An instance method (non-static method) must pop the this argument off the stack after it has popped the rest of the arguments.
Failing to pop the this argument in a native instance method will almost surely cause a fatal error in the virtual machine.
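The popping order can be tried out without a VM. The following is a minimal sketch in which a small array stands in for the Java stack; the names stackCells, sp, lastRect and lastThis are hypothetical, and the popStack and pushStack functions here are simplified stand-ins for the KVM macros of the same name.

```c
#include <assert.h>

/* Hypothetical stand-in for the Java operand stack. */
long stackCells[16];
int  sp = 0;                              /* index of next free cell */

void pushStack(long v) { assert(sp < 16); stackCells[sp++] = v; }
long popStack(void)    { assert(sp > 0);  return stackCells[--sp]; }

long lastRect[4];   /* records the values the native code received */
long lastThis;      /* records the `this` reference, if any        */

/* Static method drawRectangle(x, y, width, height): the caller pushed
 * x first and height last, so height comes off the stack first.     */
void drawRectangleStatic(void) {
    long height = popStack();
    long width  = popStack();
    long y      = popStack();
    long x      = popStack();
    lastRect[0] = x; lastRect[1] = y;
    lastRect[2] = width; lastRect[3] = height;
}

/* Instance-method variant: `this` was pushed before the arguments,
 * so it is popped last, and it is popped even though it is unused.  */
void drawRectangleInstance(void) {
    long height = popStack();
    long width  = popStack();
    long y      = popStack();
    long x      = popStack();
    lastThis    = popStack();
    lastRect[0] = x; lastRect[1] = y;
    lastRect[2] = width; lastRect[3] = height;
}
```

In both variants the stack is back at its original depth when the function finishes, which is the property the interpreter relies on.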
Table 12 shows the macros that should be used to pop arguments off the stack:
If a native method returns a result, it must push that result onto the stack. The native code should use the appropriate macro shown in Table 13 to push the result back onto the stack:
Some native code uses the macro topStack instead of popping the last argument off the stack. It then sets topStack to the value it wants to return.
This practice is not encouraged. It should only be used for “one-liners” that access the argument and return the value in a single statement. pushStack and popStack cannot be used in this case, since C would not guarantee their order of evaluation.
In general, it is safer to pop the value, perform the calculation, and push the value back onto the stack as three separate steps.
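The difference between the two styles can be sketched as follows. This is a minimal illustration with a small array standing in for the Java stack; stackCells, sp, nativeAbs and nativeNegate are hypothetical names, and the pushStack, popStack and topStack definitions are simplified stand-ins rather than the KVM macros themselves.

```c
#include <assert.h>

/* Hypothetical stand-in for the Java operand stack. */
long stackCells[16];
int  sp = 0;                              /* index of next free cell */

void pushStack(long v) { assert(sp < 16); stackCells[sp++] = v; }
long popStack(void)    { assert(sp > 0);  return stackCells[--sp]; }
#define topStack (stackCells[sp - 1])

/* Preferred: pop, compute, and push as three separate statements. */
void nativeAbs(void) {
    long value  = popStack();
    long result = (value < 0) ? -value : value;
    pushStack(result);
}

/* Tolerated only for one-liners: overwrite the top cell in place. */
void nativeNegate(void) {
    topStack = -topStack;
}

/* Not safe: pushStack(popStack() - popStack());
 * C does not specify which popStack() call runs first, so the
 * operands could be subtracted in either order. If both operations
 * are macros that modify the stack pointer, a combined expression
 * such as pushStack(-popStack()) may be even less well defined.     */
```

The three-statement form costs two local variables but leaves no room for the compiler to reorder the stack accesses.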
Native code cannot call back into Java code. KVM provides a mechanism by which native code can alter the interpreter state to begin executing a new piece of code. Upon finishing executing that code, the mechanism can indicate a new C function which should be called.
If the native code needs to throw an error or exception, it should call the function
where the exceptionClassName argument is the exception class or error class.
Other useful functions that a native method might need to call are the following:
void fatalError(const char* errorMessage);
The code calls this function to indicate that a serious error has occurred. The errorMessage argument is a brief explanation of the problem. This function does not return.
CLASS getClass(const char *name);
This function returns the class whose name is the indicated argument. You might want to coerce the return result to be an INSTANCE_CLASS or an ARRAY_CLASS.
STRING_INSTANCE instantiateString(const char* string, int length);
This function converts the given C string into a Java string.
char *getStringContents(STRING_INSTANCE string);
The instance argument must be a Java string. It is converted into a null-terminated C string, and returned as the result.
INSTANCE instantiate(INSTANCE_CLASS class);
Creates a new Java instance of the specified class.
ARRAY instantiateArray(ARRAY_CLASS arrayClass, long length);
Creates a Java array of the specified type and length.
SHORTARRAY createCharArray(const char* utf8stringArg,
                           int utf8length,
                           int* unicodelengthP,
                           bool_t is_permanent);
Creates a Java character array from the C string passed as an argument.
char* mallocBytes(long sizeInBytes);
Allocates a memory block in the garbage-collected heap that is big enough to hold sizeInBytes number of bytes. You can create a temporary root (Section 11.2.8 “Garbage collection issues”) to prevent the memory block from being garbage-collected.
The C stack is not scanned when the KVM performs a garbage collection. If your native code allocates new Java objects, you must take special precautions to prevent your new Java objects from being garbage collected inadvertently.
Since release 1.0.2, KVM includes a compacting garbage collector. Any time that your native code performs an allocation, objects in the Java heap can move. This includes any arguments passed to your native function and any previous heap allocations performed by your native code.
If your code must perform allocation, it is important that you
The garbage collector can get erroneous results if an allocation occurs while an argument or return value is on the Java stack. The rest of this chapter describes how your code can interact correctly with the garbage collector.
In order to simplify the garbage collector, the KVM’s memory is divided into two spaces: “permanent space” and “heap space”.
All objects created in permanent space are, well, permanent. These objects are
Among the objects that are allocated in permanent space are
java.lang.String (but not all strings).
These objects are never moved and never freed after they are created.
Structures that have a possibly limited lifetime are allocated in heap space. Among these are
Strings),
These structures are liable to move any time an allocation occurs. Your code must follow the rules specified in the following subsections to ensure that it lives happily with the garbage collector.
The KVM provides the two macros ASSERTING_NO_ALLOCATION and END_ASSERTING_NO_ALLOCATION, which are used as shown in CODE EXAMPLE 4:
These macros are provided for use only in DEBUG mode to guarantee that no allocation is performed by the code between the ASSERTING... and the END_ASSERTING... macro.
If your code is compiled with INCLUDEDEBUGCODE set to a non-zero value, then any allocation inside the specified code causes a fatal error.
If you use the macros, make sure that the non-allocating code inside the macros does not perform a return. The macro END_ASSERTING_NO_ALLOCATION contains cleanup code that must be executed.
You are encouraged to use these macros to indicate safe regions of code in which heap-allocated objects will not move.
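To make the rule about return concrete, here is a toy model of how such a macro pair could work. It is only a sketch under stated assumptions: the macro bodies below are illustrative and are not the definitions that ship with KVM, and noAllocationDepth, violations, allocateChecked and sumBytes are hypothetical names. A real debug build reports a fatal error where this sketch merely counts a violation.

```c
#include <assert.h>
#include <stdlib.h>

int noAllocationDepth = 0;  /* > 0 while inside an asserting region   */
int violations = 0;         /* a real VM would raise a fatal error    */

/* Illustrative definitions only. The opening macro starts a block and
 * the closing macro ends it, so a `return` between them would skip
 * the decrement and leave the counter permanently raised.            */
#define ASSERTING_NO_ALLOCATION     { noAllocationDepth++;
#define END_ASSERTING_NO_ALLOCATION   noAllocationDepth--; }

/* Hypothetical allocator that checks the region counter. */
void *allocateChecked(size_t n) {
    if (noAllocationDepth > 0) {
        violations++;
    }
    return malloc(n);
}

/* A region that reads heap data and must not allocate. */
int sumBytes(const unsigned char *p, int n) {
    int i, sum = 0;
    ASSERTING_NO_ALLOCATION
        for (i = 0; i < n; i++) {
            sum += p[i];
        }
    END_ASSERTING_NO_ALLOCATION
    return sum;     /* returned after the region, never inside it */
}
```

Computing the result into a local variable and returning it after the closing macro, as sumBytes does, keeps the cleanup code on every path.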
To deal with the fact that heap-allocated objects in the KVM can move, the garbage collector makes use of temporary “handles.” A handle is an indirect pointer to an object. Rather than being the address of the object itself, a handle is the address of a memory location that contains the address of the object.
The memory location that contains the address of the object must not itself be in the Java heap. In general, it is the address of a variable (for global roots) or the address of a location on the C stack (for temporary roots).
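The indirection can be illustrated with a toy model in which two fixed buffers play the role of an object's location before and after a move. Everything in this sketch is hypothetical (fromSpace, toSpace, registeredRoot, moveObject, lengthAfterMove); it only shows why a function that receives the address of a pointer still finds the object after the collector has relocated it.

```c
#include <assert.h>
#include <string.h>

/* Two buffers stand in for the old and new locations of one object. */
char fromSpace[32];
char toSpace[32];

/* The single handle known to this toy collector: the address of a
 * variable that holds the address of the object.                    */
char **registeredRoot = 0;

/* Toy "collection": copy the object, then update the registered root. */
void moveObject(void) {
    memcpy(toSpace, fromSpace, sizeof fromSpace);
    memset(fromSpace, '?', sizeof fromSpace);   /* old copy is garbage */
    if (registeredRoot != 0 && *registeredRoot == fromSpace) {
        *registeredRoot = toSpace;
    }
}

/* Takes a handle rather than a direct pointer, so it still reaches
 * the object after the move.                                        */
size_t lengthAfterMove(char **handle) {
    moveObject();               /* stands in for any allocation */
    return strlen(*handle);
}
```

Had lengthAfterMove received a plain char * instead, it would have been left holding the stale address in fromSpace.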
All type names in the KVM that end with _HANDLE indicate handles. If a function has a handle as one of its arguments, that argument must be an indirect pointer, and it must be registered with the garbage collector if the object could be in the Java heap.
CODE EXAMPLE 5 shows an example:
    CLASS getClassX(CHAR_HANDLE name, int start, int length);

    /* Case 1. We are calling it with an argument that is known */
    /* not to be in the heap. */
    const char *x = "java/lang/Object";
    result = getClassX(&x, 0, strlen(x));

    /* Case 2. We are calling it with a heap argument */
    START_TEMPORARY_ROOTS
        DECLARE_TEMPORARY_ROOT(char *, x, mallocBytes(100));
        sprintf(x, "java/lang/%s", arg);
        result = getClassX(&x, 0, strlen(x));
    END_TEMPORARY_ROOTS
The most common way to register temporary roots is to use START_TEMPORARY_ROOTS and END_TEMPORARY_ROOTS to delimit a region of code. Within this region of code, the macro DECLARE_TEMPORARY_ROOT(type, variable, value) creates a local variable of the specified type with the specified initial value. The value must either be a pointer to an object in the heap, or it must be a value that is clearly not in the heap (such as NULL, a pointer to permanent space, or the like). The value &variable is registered with the garbage collector as a temporary root.
You are allowed to change the value of variable, provided that any new value is always either a pointer to an object in the heap, or a value that is clearly not in the heap.
The garbage collector ensures that, whenever a garbage collection occurs, the value of the variable is updated if the object it points to has moved. In addition, &variable is a handle, and can be passed as an argument to any function that expects a handle.
Your code must not perform a return inside this region. The END_TEMPORARY_ROOTS macro contains cleanup code that must be executed.
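The scoping behavior described above can be modeled in a few lines. The macro bodies in this sketch are toy reimplementations written for illustration and are not KVM's actual definitions; rootTable, rootCount, relocate, oldCopy, newCopy and demo are hypothetical names. The model registers the address of each declared variable, lets a simulated collector rewrite it, and discards the registrations at the end of the region.

```c
#include <assert.h>
#include <string.h>

#define MAX_ROOTS 8
void *rootTable[MAX_ROOTS];     /* addresses of the rooted variables */
int   rootCount = 0;

/* Toy definitions, for illustration only. */
#define START_TEMPORARY_ROOTS   { int savedRootCount_ = rootCount;
#define END_TEMPORARY_ROOTS       rootCount = savedRootCount_; }
#define DECLARE_TEMPORARY_ROOT(type, var, value)                  \
    type var = (value);                                           \
    assert(rootCount < MAX_ROOTS);                                \
    rootTable[rootCount++] = (void *)&var

/* Toy collector step: every root that points at `from` is redirected
 * to `to`. memcpy is used so the slot can hold any object pointer.  */
void relocate(void *from, void *to) {
    int i;
    for (i = 0; i < rootCount; i++) {
        void *current;
        memcpy(&current, rootTable[i], sizeof current);
        if (current == from) {
            memcpy(rootTable[i], &to, sizeof to);
        }
    }
}

char oldCopy[16] = "hello";
char newCopy[16];

int demo(void) {
    int ok;
    START_TEMPORARY_ROOTS
        DECLARE_TEMPORARY_ROOT(char *, text, oldCopy);
        memcpy(newCopy, oldCopy, sizeof oldCopy);
        relocate(oldCopy, newCopy);     /* simulated collection     */
        ok = (text == newCopy);         /* the root was updated     */
    END_TEMPORARY_ROOTS
    return ok;      /* return only after END_TEMPORARY_ROOTS */
}
```

In this model a return inside the region would leave rootCount pointing at the address of a local variable that no longer exists, which hints at why the cleanup code must always run.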
CODE EXAMPLE 6 below shows some sample code for a native method that takes a String and two integers as arguments, and which must allocate a temporary buffer.
    START_TEMPORARY_ROOTS
        int y = popStack();
        int x = popStack();
        DECLARE_TEMPORARY_ROOT(STRING_INSTANCE, string,
                               popStackAsType(STRING_INSTANCE));
        DECLARE_TEMPORARY_ROOT(char*, buffer, mallocBytes(100));
        /* code that might perform allocation */
    END_TEMPORARY_ROOTS
If the code clearly cannot perform any allocation, then you could instead have written
Less commonly used is the macro
In this case base must be a pointer to an object in the heap, and value must be a pointer into the middle of the object. The variable var is assigned the value value. The garbage collector will treat base as a root. If base is moved by the garbage collector, the value of var will be adjusted appropriately.
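The adjustment of a pointer into the middle of an object amounts to preserving its offset from the base. A minimal sketch, assuming a hypothetical helper moveObject that plays the role of the collector for a single object:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy "move": copy the object to its new home, then fix up one
 * interior pointer so that it keeps the same offset from the base.  */
void moveObject(char *oldBase, char *newBase, size_t size,
                char **interior) {
    ptrdiff_t offset = *interior - oldBase;  /* taken before the move */
    memcpy(newBase, oldBase, size);
    *interior = newBase + offset;
}
```

The offset has to be computed from the old base before the object moves, which is why the base itself must be known to the collector as a root.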
If your code initializes a C variable to point to an object in the Java heap, you can use the code shown in CODE EXAMPLE 7. There is currently no function for removing a variable from the set of global roots.
This code ensures that the garbage collector knows that the specified variable contains a value that must be protected from garbage collection. If the garbage collector moves the object, the variable is updated to point to the new value.
A special garbage collector is provided to help you debug your native code and to ensure that it does not have any garbage collection problems. You access this garbage collector by replacing the file collector.c with collectorDebug.c. In addition, you should set the compiler flags INCLUDEDEBUGCODE and EXCESSIVE_GARBAGE_COLLECTION to 1.
This replaces the compact-in-place garbage collector with a 10-space Cheney-style garbage collection algorithm. A garbage collection occurs on every allocation, and also on some operations that might have allocated but didn’t. Every object moves on every garbage collection. In addition, this code makes use of memory protection so that any attempt to read or write a bad pointer generates a memory fault.
This code uses the following implementation-dependent functions:
    void* allocateVirtualMemory_md(long size);
    void freeVirtualMemory_md(void *address, long size);
    void protectVirtualMemory_md(void *address, long size, int protection);
Implementations of these three functions for Windows and for Unix are provided. You must implement these functions on your target platform.
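For a POSIX-like target, these three functions could plausibly be built on mmap, munmap and mprotect. The following is a sketch under stated assumptions and is not the implementation shipped with KVM. In particular, the protection codes VM_ACCESS_NONE, VM_ACCESS_READ and VM_ACCESS_READWRITE are hypothetical; the values that the debugging collector actually passes in the protection argument are defined by the VM sources.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1       /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical protection codes, for illustration only. */
enum { VM_ACCESS_NONE = 0, VM_ACCESS_READ = 1, VM_ACCESS_READWRITE = 2 };

/* Reserve a readable and writable region; returns NULL on failure. */
void *allocateVirtualMemory_md(long size) {
    void *p = mmap(NULL, (size_t)size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}

/* Release a region obtained from allocateVirtualMemory_md. */
void freeVirtualMemory_md(void *address, long size) {
    munmap(address, (size_t)size);
}

/* Change the access rights of a page-aligned region. */
void protectVirtualMemory_md(void *address, long size, int protection) {
    int prot = PROT_NONE;
    if (protection == VM_ACCESS_READ) {
        prot = PROT_READ;
    } else if (protection == VM_ACCESS_READWRITE) {
        prot = PROT_READ | PROT_WRITE;
    }
    mprotect(address, (size_t)size, prot);
}
```

With a region set to VM_ACCESS_NONE, any stale pointer into it faults immediately, which is the behavior the debugging collector depends on.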
The file collectorDebug.c (see §11.2.8.6) also includes an implementation of a two-space non-debugging Cheney garbage collector. You get this implementation by setting the compiler flag CHENEY_TWO_SPACE to a non-zero value.
The Cheney collector is smaller and faster than the standard garbage collector. However, it uses twice as much heap space. If your implementation has a lot of available memory, but needs a faster garbage collector, you might consider using this garbage collector.
This collector is not supported by Sun, and is provided as is.
Generally, the C language guarantees that all global and static variables are initialized to 0 (zero).
The current implementation is designed to work within an embedded environment. For example, on the PalmOS, the user can start the virtual machine, exit a program, and then restart the virtual machine with a different set of arguments. There is no re-initialization of global or static variables between the two runs.
In general, your code cannot assume the initial value of any variable. You have several options for determining when it is necessary to perform one-time only initialization.
Use InitializeNativeCode() to either initialize your variables, or to set a flag indicating that initialization needs to be performed.
If the variable is registered as a global root (see makeGlobalRoot() above), its value is guaranteed to be 0 the next time that the virtual machine is run.
Regardless of whether you use the KNI (Section 11.1 “Using the K Native Interface (KNI)”) or the old-style native method implementation technology (Section 11.2 “Implementing old-style native methods”), as part of the build process you must create the lookup tables that map methods to the corresponding native implementation.
The JavaCodeCompact (JCC) generates these tables automatically. You should use this utility to generate the lookup tables whether or not you are using the other features of JavaCodeCompact.
JavaCodeCompact is more fully described in Chapter 14. The specific details for creating the file containing the lookup tables can be found in Section 14.5 “Executing JavaCodeCompact.”
The name of the C function that implements a native method must be the same name that JNI would assign to the native method.
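For the common case, the JNI name is the prefix Java_, the fully qualified class name with its separators replaced by underscores, another underscore, and the method name, with any underscore in the original names escaped as _1. The helper below is a simplified sketch of that rule; jniShortName is a hypothetical function, and the escapes that JNI defines for non-ASCII characters and for overloaded methods are deliberately left out.

```c
#include <assert.h>
#include <string.h>

/* Builds the short JNI-style C name for a native method, e.g.
 * ("com.sun.kjava.Graphics", "drawRectangle") gives
 * "Java_com_sun_kjava_Graphics_drawRectangle".
 * Simplified: handles '.', '/' and '_' only.
 * Returns 0 on success, -1 if `out` is too small.                   */
int jniShortName(const char *className, const char *methodName,
                 char *out, size_t outSize) {
    const char *parts[2];
    size_t used;
    int k;

    parts[0] = className;
    parts[1] = methodName;
    if (outSize < 6) {
        return -1;
    }
    strcpy(out, "Java_");
    used = 5;
    for (k = 0; k < 2; k++) {
        const char *s;
        for (s = parts[k]; *s != '\0'; s++) {
            if (used + 3 > outSize) {     /* room for "_1" plus NUL */
                return -1;
            }
            if (*s == '.' || *s == '/') {
                out[used++] = '_';        /* package separator      */
            } else if (*s == '_') {
                out[used++] = '_';        /* escaped underscore     */
                out[used++] = '1';
            } else {
                out[used++] = *s;
            }
        }
        if (k == 0) {                     /* class/method separator */
            if (used + 2 > outSize) {
                return -1;
            }
            out[used++] = '_';
        }
    }
    out[used] = '\0';
    return 0;
}
```

Applied to the class com.sun.kjava.Graphics and the method drawRectangle, the helper produces the function name used in CODE EXAMPLE 3.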
From the operating system viewpoint, KVM is just one process (C program) with only one native thread of execution. The multithreading capabilities of KVM have been implemented entirely in software without utilizing the possible multitasking capabilities of the underlying operating system. This approach not only makes the virtual machine highly portable and independent of the operating system, but also greatly simplifies the virtual machine design and improves the readability of the codebase, as the virtual machine designer does not have to worry about mutual exclusion and other problems typically associated with multithreaded software.
However, an unfortunate side effect of the approach described above is that by default, all native methods in KVM are “blocking.” This means that when a native function is called from the virtual machine, all the threads in the VM stop executing until the native method completes execution.
As a general guideline, all the native functions called from KVM should be written so that they complete their execution as soon as possible. However, in many environments this is not desirable or fully possible. For this reason, KVM includes an implementation of “asynchronous native methods” described below.
The standard implementation of KVM runs as a single “task” from the operating system’s point of view. If a native method performs an operation that can block, the entire KVM blocks.
Asynchronous native methods are intended to solve this problem. When such a native method is called, the operation is performed “off-line” in an implementation-dependent manner. Other Java threads can continue running normally. When the native call finishes, the Java thread that originally called the native method continues.
To use asynchronous native methods, you must include
in your machine-dependent include file.
Asynchronous native methods cannot be defined in the same file as normal native methods. In addition to their normal includes, they must also add the include file async.h.
Asynchronous methods should always have the following form:
Your code must never use pushStack(), popStack(), topStack, or any macro or function that references the stack pointer, the frame pointer, or the current thread. Instead, you must use the alternative macros shown in Table 14.
In addition, your code must not perform a “return.” It must complete through the end, since ASYNC_FUNCTION_END may generate some necessary cleanup code.
All the macros in Table 14 have been designed so that if the symbol ASYNCHRONOUS_NATIVE_FUNCTIONS is 0, the asynchronous method compiles into a normal native method.
It is also important to note that unlike regular native methods, asynchronous native methods cannot allocate any memory from the Java heap. Because of this limitation, extra caution is often necessary when writing asynchronous native methods, since many internal routines in KVM may indirectly allocate memory from the Java heap.
If you use asynchronous native methods, you must define the following machine-specific functions.
void Yield_md()
Pause this operating system task momentarily and allow other tasks to run.
void CallAsyncNativeFunction_md(ASYNCIOCB *iocb, void(*afp)(ASYNCIOCB *))
Call an asynchronous native function. This function is called by the ASYNC_FUNCTION_START macro to start a new asynchronous function. It takes two parameters: a data structure that the garbage collector uses to keep the object pointers used by the native code up to date, and a function to call. This function will typically start a new native thread and have that thread call the supplied function with the ASYNCIOCB as its parameter.
enterSystemCriticalSection()
exitSystemCriticalSection()
Enter or exit a critical section. The operating system must guarantee that at most one operating system task is allowed to be inside the critical section at a time.
We envision two possible implementations of asynchronous methods.
In the current reference implementation, the function CallAsyncNativeFunction_md spawns off a separate operating system task which performs the indicated function. For example, in a Posix implementation one could use pthread_create.
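A thread-per-call version of the machine-specific functions might look roughly like the following on a Posix system. This is a sketch under stated assumptions, not the reference implementation: the ASYNCIOCB structure here is a hypothetical stand-in with made-up fields (the real one is defined by the VM), and AsyncCall, asyncTrampoline, doubleIt and waitUntilDone are names invented for the example. The critical-section functions are modeled with a single mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for KVM's ASYNCIOCB. */
typedef struct {
    int input;
    int result;
    int done;
} ASYNCIOCB;

pthread_mutex_t criticalSection = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  finished        = PTHREAD_COND_INITIALIZER;

void enterSystemCriticalSection(void) { pthread_mutex_lock(&criticalSection); }
void exitSystemCriticalSection(void)  { pthread_mutex_unlock(&criticalSection); }

/* Bundles the two arguments so they fit pthread_create's one pointer. */
typedef struct {
    ASYNCIOCB *iocb;
    void (*afp)(ASYNCIOCB *);
} AsyncCall;

static void *asyncTrampoline(void *arg) {
    AsyncCall call = *(AsyncCall *)arg;
    free(arg);
    call.afp(call.iocb);
    return NULL;
}

/* Run afp(iocb) on a new detached thread; if no thread can be
 * created, fall back to an ordinary blocking call.                  */
void CallAsyncNativeFunction_md(ASYNCIOCB *iocb, void (*afp)(ASYNCIOCB *)) {
    pthread_t tid;
    AsyncCall *call = malloc(sizeof *call);
    assert(call != NULL);
    call->iocb = iocb;
    call->afp  = afp;
    if (pthread_create(&tid, NULL, asyncTrampoline, call) == 0) {
        pthread_detach(tid);
    } else {
        free(call);
        afp(iocb);
    }
}

/* Hypothetical asynchronous body: doubles the input value. */
void doubleIt(ASYNCIOCB *iocb) {
    enterSystemCriticalSection();
    iocb->result = iocb->input * 2;
    iocb->done = 1;
    pthread_cond_signal(&finished);
    exitSystemCriticalSection();
}

/* Helper for trying the sketch out: wait for the body to finish. */
void waitUntilDone(ASYNCIOCB *iocb) {
    enterSystemCriticalSection();
    while (!iocb->done) {
        pthread_cond_wait(&finished, &criticalSection);
    }
    exitSystemCriticalSection();
}
```

In a real VM the interpreter would not wait as waitUntilDone does; it would keep scheduling other Java threads and resume the calling thread once the native function signals completion.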
CODE EXAMPLE 8 below shows one possible implementation of a method int readBytes(byte[] dst, int offset, int length) using this style of asynchronous native methods. This particular example assumes that the garbage collector does not move objects.
    ASYNC_FUNCTION_START(ReadBytes)
        long length = ASYNC_popStack();
        long offset = ASYNC_popStack();
        BYTEARRAY dst = ASYNC_popStackAsType(BYTEARRAY);
        INSTANCE instance = ASYNC_popStackAsType(INSTANCE); /* this */
        long fd = getFD(instance);

        ASYNC_enableGarbageCollection();
        length = read(fd, dst->bdata + offset, length);
        ASYNC_disableGarbageCollection();

        ASYNC_pushStack((length == 0) ? -1 : length);
    ASYNC_FUNCTION_END
In an alternative implementation (CODE EXAMPLE 9), CallAsyncNativeFunction_md simply calls the function f directly. It assumes that the function f starts an operation, but does not wait for its completion. The operating system is required to provide some sort of interrupt or callback to indicate when the operation is complete.
The second implementation is far more operating system-dependent. It might be impossible to write native methods that can work both synchronously and asynchronously, depending on the value of a flag.
    static void ReadBytesDone(void *parm, int length);

    static void ReadBytes(THREAD thisThread) {
        long length = ASYNC_popStack();
        long offset = ASYNC_popStack();
        BYTEARRAY dst = ASYNC_popStackAsType(BYTEARRAY);
        INSTANCE instance = ASYNC_popStackAsType(INSTANCE);
        long fd = getFD(instance);

        /* Call OS to perform I/O. Perform callback when done. */
        /* thisThread is the function parameter; it is handed to the */
        /* callback unchanged. */
        AsyncRead(fd, dst->bdata + offset, length, ReadBytesDone, thisThread);
    }

    /* Callback function when I/O is finished */
    static void ReadBytesDone(void *parm, int length) {
        THREAD thisThread = (THREAD)parm;
        ASYNC_pushStack((length == 0) ? -1 : length);
        ASYNC_RESUME_THREAD();
    }
Refer to Section 12.1.4 “Asynchronous notification” for further information on writing asynchronous code.
KVM Porting Guide, CLDC 1.1
Copyright © 2003 Sun Microsystems, Inc. All rights reserved.