| <!DOCTYPE article PUBLIC "-//Davenport//DTD DocBook V3.0//EN"> |
| <article> |
| <artheader> |
| <title>The Cygnus Native Interface for C++/Java Integration</title> |
| <subtitle>Writing native Java methods in natural C++</subtitle> |
| <authorgroup> |
| <corpauthor>Cygnus Solutions</corpauthor> |
| </authorgroup> |
| <date>March, 2000</date> |
| </artheader> |
| |
| <abstract><para> |
| This documents CNI, the Cygnus Native Interface, |
| which is a convenient way to write Java native methods using C++. |
| This is a more efficient, more convenient, but less portable |
| alternative to the standard JNI (Java Native Interface).</para> |
| </abstract> |
| |
| <sect1><title>Basic Concepts</title> |
| <para> |
| In terms of language features, Java is mostly a subset |
| of C++. Java has a few important extensions, plus a powerful standard |
| class library, but on the whole that does not change the basic similarity. |
| Java is a hybrid object-oriented language, with a few native types, |
| in addition to class types. It is class-based, where a class may have |
| static as well as per-object fields, and static as well as instance methods. |
| Non-static methods may be virtual, and may be overloaded. Overloading is |
| resolved at compile time by matching the actual argument types against |
| the parameter types. Virtual methods are implemented using indirect calls |
| through a dispatch table (virtual function table). Objects are |
| allocated on the heap, and initialized using a constructor method. |
| Classes are organized in a package hierarchy. |
| </para> |
| <para> |
| All of the listed attributes are also true of C++, though C++ has |
| extra features (for example in C++ objects may be allocated not just |
| on the heap, but also statically or in a local stack frame). Because |
| <acronym>gcj</acronym> uses the same compiler technology as |
| <acronym>g++</acronym> (the GNU C++ compiler), it is possible |
| to make the intersection of the two languages use the same |
| <acronym>ABI</acronym> (object representation and calling conventions). |
| The key idea in <acronym>CNI</acronym> is that Java objects are C++ objects, |
| and all Java classes are C++ classes (but not the other way around). |
| So the most important task in integrating Java and C++ is to |
| remove gratuitous incompatibilities. |
| </para> |
| <para> |
| You write CNI code as a regular C++ source file. (You do have to use |
| a Java/CNI-aware C++ compiler, specifically a recent version of G++.)</para> |
| <para> |
| You start with: |
| <programlisting> |
| #include <gcj/cni.h> |
| </programlisting></para> |
| |
| <para> |
| You then include header files for the various Java classes you need |
| to use: |
| <programlisting> |
| #include <java/lang/Character.h> |
| #include <java/util/Date.h> |
| #include <java/lang/IndexOutOfBoundsException.h> |
| </programlisting></para> |
| |
| <para> |
| In general, <acronym>CNI</acronym> functions and macros start with the |
| `<literal>Jv</literal>' prefix, for example the function |
| `<literal>JvNewObjectArray</literal>'. This convention is used to |
| avoid conflicts with other libraries. |
| Internal functions in <acronym>CNI</acronym> start with the prefix |
| `<literal>_Jv_</literal>'. You should not call these; |
| if you find a need to, let us know and we will try to come up with an |
| alternate solution. (This manual lists <literal>_Jv_AllocBytes</literal> |
| as an example; <acronym>CNI</acronym> should instead provide |
| a <literal>JvAllocBytes</literal> function.)</para> |
| <para> |
| The header files for Java classes are generated automatically by <command>gcjh</command>. |
| </para> |
| </sect1> |
| |
| <sect1><title>Packages</title> |
| <para> |
| The only global names in Java are class names, and packages. |
| A <firstterm>package</firstterm> can contain zero or more classes, and |
| also zero or more sub-packages. |
| Every class belongs to either an unnamed package or a package that |
| has a hierarchical and globally unique name. |
| </para> |
| <para> |
| A Java package is mapped to a C++ <firstterm>namespace</firstterm>. |
| The Java class <literal>java.lang.String</literal> |
| is in the package <literal>java.lang</literal>, which is a sub-package |
| of <literal>java</literal>. The C++ equivalent is the |
| class <literal>java::lang::String</literal>, |
| which is in the namespace <literal>java::lang</literal>, |
| which is in the namespace <literal>java</literal>. |
| </para> |
| <para> |
| Here is how you could express this: |
| <programlisting> |
| // Declare the class(es), possibly in a header file: |
| namespace java { |
| namespace lang { |
| class Object; |
| class String; |
| ... |
| } |
| } |
| |
| class java::lang::String : public java::lang::Object |
| { |
| ... |
| }; |
| </programlisting> |
| </para> |
| <para> |
| The <literal>gcjh</literal> tool automatically generates the |
| necessary namespace declarations.</para> |
| |
| <sect2><title>Nested classes as a substitute for namespaces</title> |
| <para> |
| G++ gained complete namespace support only recently, |
| and <literal>libgcj</literal> was changed to use namespaces |
| at the end of February 1999. Earlier releases used |
| nested classes, which are the C++ equivalent of Java inner classes. |
| They provide similar (though less convenient) functionality. |
| The old syntax is: |
| <programlisting> |
| class java { |
| class lang { |
| class Object; |
| class String; |
| }; |
| }; |
| </programlisting> |
| The obvious difference is the use of <literal>class</literal> instead |
| of <literal>namespace</literal>. The more important difference is |
| that all the members of a nested class have to be declared inside |
| the parent class definition, while a namespace can be defined in |
| multiple places in the source. Namespaces are therefore more convenient, |
| since they correspond more closely to how Java packages are defined. |
| The differences are confined to the declarations; the syntax for |
| using a nested class is the same as with namespaces: |
| <programlisting> |
| class java::lang::String : public java::lang::Object |
| { ... }; |
| </programlisting> |
| Note that the generated code (including name mangling) |
| using nested classes is the same as that using namespaces.</para> |
| </sect2> |
| |
| <sect2><title>Leaving out package names</title> |
| <para> |
| Always typing the fully-qualified class name is verbose, |
| and it makes it harder to move a class to a different package. |
| The Java <literal>package</literal> declaration specifies that the |
| following class declarations are in the named package, without having |
| to explicitly name the full package qualifiers. |
| The <literal>package</literal> declaration can be followed by zero or |
| more <literal>import</literal> declarations, each of which allows either |
| a single class or all the classes in a package to be named by a simple |
| identifier. C++ provides something similar |
| with the <literal>using</literal> declaration and directive. |
| </para> |
| <para> |
| A Java single-type-import declaration: |
| <programlisting> |
| import <replaceable>PackageName</replaceable>.<replaceable>TypeName</replaceable>; |
| </programlisting> |
| allows using <replaceable>TypeName</replaceable> as a shorthand for |
| <literal><replaceable>PackageName</replaceable>.<replaceable>TypeName</replaceable></literal>. |
| The C++ (more-or-less) equivalent is a <literal>using</literal>-declaration: |
| <programlisting> |
| using <replaceable>PackageName</replaceable>::<replaceable>TypeName</replaceable>; |
| </programlisting> |
| </para> |
| <para> |
| A Java import-on-demand declaration: |
| <programlisting> |
| import <replaceable>PackageName</replaceable>.*; |
| </programlisting> |
| allows using any <replaceable>TypeName</replaceable> in the package as a shorthand for |
| <literal><replaceable>PackageName</replaceable>.<replaceable>TypeName</replaceable></literal>. |
| The C++ (more-or-less) equivalent is a <literal>using</literal>-directive: |
| <programlisting> |
| using namespace <replaceable>PackageName</replaceable>; |
| </programlisting> |
| </para> |
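| <para> |
| The two forms can be compared side by side in a small, self-contained sketch. |
| The classes below are hypothetical stand-ins declared only for illustration; |
| they are not the real <literal>gcjh</literal>-generated headers, and the |
| function <literal>demo</literal> is likewise an assumption of this example. |
| </para> |

```cpp
// Hypothetical stand-ins for gcjh-generated declarations.
// These are NOT the real libgcj classes.
namespace java {
  namespace lang {
    class Object { public: virtual ~Object() {} };
    class String : public Object {
    public:
      int length() const { return 5; }
    };
  }
  namespace util {
    class Date : public java::lang::Object {
    public:
      long getTime() const { return 0; }
    };
  }
}

// Roughly "import java.lang.String;": names exactly one class.
using java::lang::String;

// Roughly "import java.util.*;": makes every name in the namespace visible.
using namespace java::util;

int demo()
{
  String s;   // resolves to java::lang::String
  Date d;     // resolves to java::util::Date
  return s.length() + static_cast<int>(d.getTime());
}
```

| <para> |
| A <literal>using</literal>-declaration is usually the safer choice in a header, |
| since it introduces a single name rather than a whole namespace. |
| </para> |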
| </sect2> |
| </sect1> |
| |
| <sect1><title>Primitive types</title> |
| <para> |
| Java provides eight <quote>primitive</quote> types: |
| <literal>byte</literal>, <literal>short</literal>, <literal>int</literal>, |
| <literal>long</literal>, <literal>float</literal>, <literal>double</literal>, |
| <literal>char</literal>, and <literal>boolean</literal>. |
| These are the same as the following C++ <literal>typedef</literal>s |
| (which are defined by <literal>gcj/cni.h</literal>): |
| <literal>jbyte</literal>, <literal>jshort</literal>, <literal>jint</literal>, |
| <literal>jlong</literal>, <literal>jfloat</literal>, |
| <literal>jdouble</literal>, |
| <literal>jchar</literal>, and <literal>jboolean</literal>. |
| You should use the C++ type names |
| (<ForeignPhrase><Abbrev>e.g.</Abbrev></ForeignPhrase> <literal>jint</literal>), |
| and not the Java type names |
| (<ForeignPhrase><Abbrev>e.g.</Abbrev></ForeignPhrase> <literal>int</literal>), |
| even if they are <quote>the same</quote>. |
| This is because there is no guarantee that the C++ type |
| <literal>int</literal> is a 32-bit type, but <literal>jint</literal> |
| <emphasis>is</emphasis> guaranteed to be a 32-bit type. |
| |
| <informaltable frame="all" colsep="1" rowsep="0"> |
| <tgroup cols="3"> |
| <thead> |
| <row> |
| <entry>Java type</entry> |
| <entry>C/C++ typename</entry> |
| <entry>Description</entry> |
| </row> |
| </thead> |
| <tbody> |
| <row> |
| <entry>byte</entry> |
| <entry>jbyte</entry> |
| <entry>8-bit signed integer</entry> |
| </row> |
| <row> |
| <entry>short</entry> |
| <entry>jshort</entry> |
| <entry>16-bit signed integer</entry> |
| </row> |
| <row> |
| <entry>int</entry> |
| <entry>jint</entry> |
| <entry>32-bit signed integer</entry> |
| </row> |
| <row> |
| <entry>long</entry> |
| <entry>jlong</entry> |
| <entry>64-bit signed integer</entry> |
| </row> |
| <row> |
| <entry>float</entry> |
| <entry>jfloat</entry> |
| <entry>32-bit IEEE floating-point number</entry> |
| </row> |
| <row> |
| <entry>double</entry> |
| <entry>jdouble</entry> |
| <entry>64-bit IEEE floating-point number</entry> |
| </row> |
| <row> |
| <entry>char</entry> |
| <entry>jchar</entry> |
| <entry>16-bit Unicode character</entry> |
| </row> |
| <row> |
| <entry>boolean</entry> |
| <entry>jboolean</entry> |
| <entry>logical (Boolean) values</entry> |
| </row> |
| <row> |
| <entry>void</entry> |
| <entry>void</entry> |
| <entry>no value</entry> |
| </row> |
| </tbody></tgroup> |
| </informaltable> |
| </para> |
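| <para> |
| The reason for preferring fixed-width names can be sketched without any |
| <acronym>CNI</acronym> headers. The <literal>my_j*</literal> typedefs below are |
| hypothetical stand-ins built on <literal>&lt;cstdint&gt;</literal>; the real |
| definitions come from <literal>gcj/cni.h</literal> and may be written differently. |
| The helper <literal>widen_product</literal> is also an assumption of this example. |
| </para> |

```cpp
#include <cstdint>

// Hypothetical stand-ins for the CNI typedefs; the real ones are
// provided by gcj/cni.h.
typedef std::int8_t   my_jbyte;
typedef std::int16_t  my_jshort;
typedef std::int32_t  my_jint;
typedef std::int64_t  my_jlong;
typedef std::uint16_t my_jchar;   // Java char is an unsigned 16-bit value

// The widths in the table above are checked at compile time.
static_assert(sizeof(my_jbyte)  == 1, "byte must be 8 bits");
static_assert(sizeof(my_jshort) == 2, "short must be 16 bits");
static_assert(sizeof(my_jint)   == 4, "int must be 32 bits");
static_assert(sizeof(my_jlong)  == 8, "long must be 64 bits");
static_assert(sizeof(my_jchar)  == 2, "char must be 16 bits");

// Multiply two 32-bit values into a 64-bit result, as the Java
// expression (long) a * b would.
my_jlong widen_product(my_jint a, my_jint b)
{
  return static_cast<my_jlong>(a) * b;
}
```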
| |
| <para> |
| <funcsynopsis> |
| <funcdef><function>JvPrimClass</function></funcdef> |
| <paramdef><parameter>primtype</parameter></paramdef> |
| </funcsynopsis> |
| This is a macro whose argument should be the name of a primitive |
| type, <ForeignPhrase><Abbrev>e.g.</Abbrev></ForeignPhrase> |
| <literal>byte</literal>. |
| The macro expands to a pointer to the <literal>Class</literal> object |
| corresponding to the primitive type. |
| <ForeignPhrase><Abbrev>E.g.</Abbrev></ForeignPhrase>, |
| <literal>JvPrimClass(void)</literal> |
| has the same value as the Java expression |
| <literal>Void.TYPE</literal> (or <literal>void.class</literal>). |
| </para> |
| |
| </sect1> |
| |
| <sect1><title>Objects and Classes</title> |
| <sect2><title>Classes</title> |
| <para> |
| All Java classes are derived from <literal>java.lang.Object</literal>. |
| C++ does not have a unique <quote>root</quote> class, but we use |
| a C++ <literal>java::lang::Object</literal> as the C++ version |
| of the <literal>java.lang.Object</literal> Java class. All |
| other Java classes are mapped into corresponding C++ classes |
| derived from <literal>java::lang::Object</literal>.</para> |
| <para> |
| Interface inheritance (the <quote><literal>implements</literal></quote> |
| keyword) is currently not reflected in the C++ mapping.</para> |
| </sect2> |
| <sect2><title>Object references</title> |
| <para> |
| We implement a Java object reference as a pointer to the start |
| of the referenced object. It maps to a C++ pointer. |
| (We cannot use C++ references for Java references, since |
| once a C++ reference has been initialized, you cannot change it to |
| point to another object.) |
| The <literal>null</literal> Java reference maps to the <literal>NULL</literal> |
| C++ pointer. |
| </para> |
| <para> |
| Note that in some Java implementations an object reference is implemented as |
| a pointer to a two-word <quote>handle</quote>. One word of the handle |
| points to the fields of the object, while the other points |
| to a method table. Gcj does not use this extra indirection. |
| </para> |
| </sect2> |
| <sect2><title>Object fields</title> |
| <para> |
| Each object contains an object header, followed by the instance |
| fields of the class, in order. The object header consists of |
| a single pointer to a dispatch or virtual function table. |
| (There may be extra fields <quote>in front of</quote> the object, |
| for example for |
| memory management, but this is invisible to the application, and |
| the reference to the object points to the dispatch table pointer.) |
| </para> |
| <para> |
| The fields are laid out in the same order, alignment, and size |
| as in C++. Specifically, 8-bit and 16-bit native types |
| (<literal>byte</literal>, <literal>short</literal>, <literal>char</literal>, |
| and <literal>boolean</literal>) are <emphasis>not</emphasis> |
| widened to 32 bits. |
| Note that the Java VM does extend 8-bit and 16-bit types to 32 bits |
| when they are on the VM stack or in temporary registers.</para> |
| <para> |
| If you include the <literal>gcjh</literal>-generated header for a |
| class, you can access fields of Java classes in the <quote>natural</quote> |
| way. Given the following Java class: |
| <programlisting> |
| public class Int |
| { |
| public int i; |
| public Int (int i) { this.i = i; } |
| public static Int zero = new Int(0); |
| } |
| </programlisting> |
| you can write: |
| <programlisting> |
| #include <gcj/cni.h> |
| #include <Int.h> |
| Int* |
| mult (Int *p, jint k) |
| { |
| if (k == 0) |
| return Int::zero; // static member access. |
| return new Int(p->i * k); |
| } |
| </programlisting> |
| </para> |
| <para> |
| <acronym>CNI</acronym> does not strictly enforce the Java access |
| specifiers, because Java permissions cannot be directly mapped |
| into C++ permissions. Private Java fields and methods are mapped |
| to private C++ fields and methods, but all other fields and methods |
| are mapped to public fields and methods. |
| </para> |
| </sect2> |
| </sect1> |
| |
| <sect1><title>Arrays</title> |
| <para> |
| While in many ways Java is similar to C and C++, |
| it is quite different in its treatment of arrays. |
| C arrays are based on the idea of pointer arithmetic, |
| which would be incompatible with Java's security requirements. |
| Java arrays are true objects (array types inherit from |
| <literal>java.lang.Object</literal>). An array-valued variable |
| is one that contains a reference (pointer) to an array object. |
| </para> |
| <para> |
| Referencing a Java array in C++ code is done using the |
| <literal>JArray</literal> template, which is defined as follows: |
| <programlisting> |
| class __JArray : public java::lang::Object |
| { |
| public: |
| int length; |
| }; |
| |
| template<class T> |
| class JArray : public __JArray |
| { |
| T data[0]; |
| public: |
| T& operator[](jint i) { return data[i]; } |
| }; |
| </programlisting></para> |
| <para> |
| <funcsynopsis> |
| <funcdef>template<class T> T *<function>elements</function></funcdef> |
| <paramdef>JArray<T> &<parameter>array</parameter></paramdef> |
| </funcsynopsis> |
| This template function can be used to get a pointer to the |
| elements of the <parameter>array</parameter>. |
| For instance, you can fetch a pointer |
| to the integers that make up an <literal>int[]</literal> like so: |
| <programlisting> |
| extern jintArray foo; |
| jint *intp = elements (foo); |
| </programlisting> |
| The name of this function may change in the future.</para> |
| <para> |
| There are a number of typedefs which correspond to typedefs from JNI. |
| Each is the type of an array holding objects of the appropriate type: |
| <programlisting> |
| typedef __JArray *jarray; |
| typedef JArray<jobject> *jobjectArray; |
| typedef JArray<jboolean> *jbooleanArray; |
| typedef JArray<jbyte> *jbyteArray; |
| typedef JArray<jchar> *jcharArray; |
| typedef JArray<jshort> *jshortArray; |
| typedef JArray<jint> *jintArray; |
| typedef JArray<jlong> *jlongArray; |
| typedef JArray<jfloat> *jfloatArray; |
| typedef JArray<jdouble> *jdoubleArray; |
| </programlisting> |
| </para> |
| <para> |
| You can create an array of objects using this function: |
| <funcsynopsis> |
| <funcdef>jobjectArray <function>JvNewObjectArray</function></funcdef> |
| <paramdef>jint <parameter>length</parameter></paramdef> |
| <paramdef>jclass <parameter>klass</parameter></paramdef> |
| <paramdef>jobject <parameter>init</parameter></paramdef> |
| </funcsynopsis> |
| Here <parameter>klass</parameter> is the type of elements of the array; |
| <parameter>init</parameter> is the initial |
| value to be put into every slot in the array. |
| </para> |
| <para> |
| For each primitive type there is a function which can be used |
| to create a new array holding that type. The name of the function |
| is of the form |
| `<literal>JvNew<<replaceable>Type</replaceable>>Array</literal>', |
| where `<<replaceable>Type</replaceable>>' is the name of |
| the primitive type, with its initial letter in upper-case. For |
| instance, `<literal>JvNewBooleanArray</literal>' can be used to create |
| a new array of booleans. |
| Each such function follows this example: |
| <funcsynopsis> |
| <funcdef>jbooleanArray <function>JvNewBooleanArray</function></funcdef> |
| <paramdef>jint <parameter>length</parameter></paramdef> |
| </funcsynopsis> |
| </para> |
| <para> |
| <funcsynopsis> |
| <funcdef>jsize <function>JvGetArrayLength</function></funcdef> |
| <paramdef>jarray <parameter>array</parameter></paramdef> |
| </funcsynopsis> |
| Returns the length of <parameter>array</parameter>.</para> |
| </sect1> |
| |
| <sect1><title>Methods</title> |
| |
| <para> |
| Java methods are mapped directly into C++ methods. |
| The header files generated by <literal>gcjh</literal> |
| include the appropriate method definitions. |
| Basically, the generated methods have the same names and |
| <quote>corresponding</quote> types as the Java methods, |
| and are called in the natural manner.</para> |
| |
| <sect2><title>Overloading</title> |
| <para> |
| Both Java and C++ provide method overloading, where multiple |
| methods in a class have the same name, and the correct one is chosen |
| (at compile time) depending on the argument types. |
| The rules for choosing the correct method are (as expected) more complicated |
| in C++ than in Java, but given a set of overloaded methods |
| generated by <literal>gcjh</literal> the C++ compiler will choose |
| the expected one.</para> |
| <para> |
| Common assemblers and linkers are not aware of C++ overloading, |
| so the standard implementation strategy is to encode the |
| parameter types of a method into its assembly-level name. |
| This encoding is called <firstterm>mangling</firstterm>, |
| and the encoded name is the <firstterm>mangled name</firstterm>. |
| The same mechanism is used to implement Java overloading. |
| For C++/Java interoperability, it is important that both the Java |
| and C++ compilers use the <emphasis>same</emphasis> encoding scheme. |
| </para> |
| </sect2> |
| |
| <sect2><title>Static methods</title> |
| <para> |
| Static Java methods are invoked in <acronym>CNI</acronym> using the standard |
| C++ syntax, using the `<literal>::</literal>' operator rather |
| than the `<literal>.</literal>' operator. For example: |
| </para> |
| <programlisting> |
| jint i = java::lang::Math::round((jfloat) 2.3); |
| </programlisting> |
| <para> |
| A static native method is defined using standard C++ method |
| definition syntax. For example: |
| <programlisting> |
| #include <java/lang/Integer.h> |
| java::lang::Integer* |
| java::lang::Integer::getInteger(jstring str) |
| { |
| ... |
| } |
| </programlisting> |
| </para> |
| </sect2> |
| |
| <sect2><title>Object Constructors</title> |
| <para> |
| Constructors are called implicitly as part of object allocation |
| using the <literal>new</literal> operator. For example: |
| <programlisting> |
| java::lang::Integer *x = new java::lang::Integer(234); |
| </programlisting> |
| </para> |
| <para> |
| Java does not allow a constructor to be a native method. |
| Instead, you can define a private native method and |
| have the constructor call it. |
| </para> |
| </sect2> |
| |
| <sect2><title>Instance methods</title> |
| <para> |
| Virtual method dispatch is handled essentially the same way |
| in C++ and Java: by an indirect call through a function pointer |
| stored in a per-class virtual function table. |
| C++ is more complicated because it has to support |
| multiple inheritance, but this does not affect Java classes. |
| However, G++ has historically used a different calling convention |
| that is not compatible with the one used by <acronym>gcj</acronym>. |
| During 1999, G++ will switch to a new ABI that is compatible with |
| <acronym>gcj</acronym>. Some platforms (including Linux) have already |
| changed. On other platforms, you will have to pass |
| the <literal>-fvtable-thunks</literal> flag to g++ when |
| compiling <acronym>CNI</acronym> code. Note that you must also compile |
| your C++ source code with <literal>-fno-rtti</literal>. |
| </para> |
| <para> |
| Calling a Java instance method in <acronym>CNI</acronym> is done |
| using the standard C++ syntax. For example: |
| <programlisting> |
| java::lang::Number *x; |
| if (x->doubleValue() > 0.0) ... |
| </programlisting> |
| </para> |
| <para> |
| Defining a Java native instance method is also done the natural way: |
| <programlisting> |
| #include <java/lang/Integer.h> |
| jdouble |
| java::lang::Integer::doubleValue() |
| { |
| return (jdouble) value; |
| } |
| </programlisting> |
| </para> |
| </sect2> |
| |
| <sect2><title>Interface method calls</title> |
| <para> |
| In Java you can call a method using an interface reference. |
| This is not yet supported in <acronym>CNI</acronym>.</para> |
| </sect2> |
| </sect1> |
| |
| <sect1><title>Object allocation</title> |
| |
| <para> |
| New Java objects are allocated using a |
| <firstterm>class-instance-creation-expression</firstterm>: |
| <programlisting> |
| new <replaceable>Type</replaceable> ( <replaceable>arguments</replaceable> ) |
| </programlisting> |
| The same syntax is used in C++. The main difference is that |
| C++ objects have to be explicitly deleted; in Java they are |
| automatically deleted by the garbage collector. |
| Using <acronym>CNI</acronym>, you can allocate a new object |
| using standard C++ syntax. The C++ compiler is smart enough to |
| realize the class is a Java class, and hence it needs to allocate |
| memory from the garbage collector. If you have overloaded |
| constructors, the compiler will choose the correct one |
| using standard C++ overload resolution rules. For example: |
| <programlisting> |
| java::util::Hashtable *ht = new java::util::Hashtable(120); |
| </programlisting> |
| </para> |
| <para> |
| <funcsynopsis> |
| <funcdef>void *<function>_Jv_AllocBytes</function></funcdef> |
| <paramdef>jsize <parameter>size</parameter></paramdef> |
| </funcsynopsis> |
| Allocate <parameter>size</parameter> bytes. This memory is not |
| scanned by the garbage collector. However, it will be freed by |
| the GC if no references to it are discovered. |
| </para> |
| </sect1> |
| |
| <sect1><title>Interfaces</title> |
| <para> |
| A Java class can <firstterm>implement</firstterm> zero or more |
| <firstterm>interfaces</firstterm>, in addition to inheriting from |
| a single base class. |
| An interface is a collection of constants and method specifications; |
| it is similar to the <firstterm>signatures</firstterm> available |
| as a G++ extension. An interface provides a subset of the |
| functionality of C++ abstract virtual base classes, but they |
| are currently implemented differently. |
| <acronym>CNI</acronym> does not currently provide any support for interfaces, |
| or for calling methods through an interface pointer. |
| This is partly because we are planning to re-do how |
| interfaces are implemented in <acronym>gcj</acronym>. |
| </para> |
| </sect1> |
| |
| <sect1><title>Strings</title> |
| <para> |
| <acronym>CNI</acronym> provides a number of utility functions for |
| working with Java <literal>String</literal> objects. |
| The names and interfaces are analogous to those of <acronym>JNI</acronym>. |
| </para> |
| |
| <para> |
| <funcsynopsis> |
| <funcdef>jstring <function>JvNewString</function></funcdef> |
| <paramdef>const jchar *<parameter>chars</parameter></paramdef> |
| <paramdef>jsize <parameter>len</parameter></paramdef> |
| </funcsynopsis> |
| Creates a new Java String object, where |
| <parameter>chars</parameter> are the contents, and |
| <parameter>len</parameter> is the number of characters. |
| </para> |
| |
| <para> |
| <funcsynopsis> |
| <funcdef>jstring <function>JvNewStringLatin1</function></funcdef> |
| <paramdef>const char *<parameter>bytes</parameter></paramdef> |
| <paramdef>jsize <parameter>len</parameter></paramdef> |
| </funcsynopsis> |
| Creates a new Java String object, where <parameter>bytes</parameter> |
| are the Latin-1 encoded |
| characters, and <parameter>len</parameter> is the length of |
| <parameter>bytes</parameter>, in bytes. |
| </para> |
| |
| <para> |
| <funcsynopsis> |
| <funcdef>jstring <function>JvNewStringLatin1</function></funcdef> |
| <paramdef>const char *<parameter>bytes</parameter></paramdef> |
| </funcsynopsis> |
| Like the first JvNewStringLatin1, but computes <parameter>len</parameter> |
| using <literal>strlen</literal>. |
| </para> |
| |
| <para> |
| <funcsynopsis> |
| <funcdef>jstring <function>JvNewStringUTF</function></funcdef> |
| <paramdef>const char *<parameter>bytes</parameter></paramdef> |
| </funcsynopsis> |
| Creates a new Java String object, where <parameter>bytes</parameter> are |
| the UTF-8 encoded characters of the string, terminated by a null byte. |
| </para> |
| |
| <para> |
| <funcsynopsis> |
| <funcdef>jchar *<function>JvGetStringChars</function></funcdef> |
| <paramdef>jstring <parameter>str</parameter></paramdef> |
| </funcsynopsis> |
| Returns a pointer to the array of characters which make up a string. |
| </para> |
| |
| <para> |
| <funcsynopsis> |
| <funcdef> int <function>JvGetStringUTFLength</function></funcdef> |
| <paramdef>jstring <parameter>str</parameter></paramdef> |
| </funcsynopsis> |
| Returns the number of bytes required to encode the contents |
| of <parameter>str</parameter> as UTF-8. |
| </para> |
| |
| <para> |
| <funcsynopsis> |
| <funcdef> jsize <function>JvGetStringUTFRegion</function></funcdef> |
| <paramdef>jstring <parameter>str</parameter></paramdef> |
| <paramdef>jsize <parameter>start</parameter></paramdef> |
| <paramdef>jsize <parameter>len</parameter></paramdef> |
| <paramdef>char *<parameter>buf</parameter></paramdef> |
| </funcsynopsis> |
| This puts the UTF-8 encoding of a region of the |
| string <parameter>str</parameter> into |
| the buffer <parameter>buf</parameter>. |
| The region of the string to fetch is specified by |
| <parameter>start</parameter> and <parameter>len</parameter>. |
| It is assumed that <parameter>buf</parameter> is big enough |
| to hold the result. Note |
| that <parameter>buf</parameter> is <emphasis>not</emphasis> null-terminated. |
| </para> |
| </sect1> |
| |
| <sect1><title>Class Initialization</title> |
| <para> |
| Java requires that each class be automatically initialized at the time |
| of the first active use. Initializing a class involves |
| initializing the static fields, running code in class initializer |
| methods, and initializing base classes. There may also be |
| some implementation-specific actions, such as allocating |
| <classname>String</classname> objects corresponding to string literals in |
| the code.</para> |
| <para> |
| The Gcj compiler inserts calls to <literal>JvInitClass</literal> (actually |
| <literal>_Jv_InitClass</literal>) at appropriate places to ensure that a |
| class is initialized when required. The C++ compiler does not |
| insert these calls automatically; it is the programmer's |
| responsibility to make sure classes are initialized. However, |
| this is fairly painless because of the conventions assumed by the Java |
| system.</para> |
| <para> |
| First, <literal>libgcj</literal> will make sure a class is initialized |
| before an instance of that class is created. This is one |
| of the responsibilities of the <literal>new</literal> operation. This is |
| taken care of both in Java code, and in C++ code. (When the G++ |
| compiler sees a <literal>new</literal> of a Java class, it will call |
| a routine in <literal>libgcj</literal> to allocate the object, and that |
| routine will take care of initializing the class.) It follows that you can |
| access an instance field, or call an instance (non-static) |
| method and be safe in the knowledge that the class and all |
| of its base classes have been initialized.</para> |
| <para> |
| Invoking a static method is also safe. This is because the |
| Java compiler adds code to the start of a static method to make sure |
| the class is initialized. However, the C++ compiler does not |
| add this extra code. Hence, if you write a native static method |
| using CNI, you are responsible for calling <literal>JvInitClass</literal> |
| before doing anything else in the method (unless you are sure |
| it is safe to leave it out).</para> |
| <para> |
| Accessing a static field also requires the class of the |
| field to be initialized. The Java compiler will generate code |
| to call <literal>_Jv_InitClass</literal> before getting or setting the field. |
| However, the C++ compiler will not generate this extra code, |
| so it is your responsibility to make sure the class is |
| initialized before you access a static field.</para> |
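| <para> |
| The rule for native static methods can be illustrated with a self-contained |
| sketch. Everything in it is hypothetical: <literal>Counter</literal> stands in |
| for a Java class, and <literal>initClass</literal> merely plays the role that |
| <literal>JvInitClass</literal> plays in real <acronym>CNI</acronym> code; |
| it is not the <literal>libgcj</literal> implementation. |
| </para> |

```cpp
// Hypothetical stand-in for a Java class with a static field and a
// "native" static method. Not real CNI code.
struct Counter {
  static bool initialized;
  static int start;          // plays the role of a Java static field

  // Plays the role of JvInitClass: runs the static initializer once.
  static void initClass()
  {
    if (!initialized) {
      start = 100;           // what the Java static initializer would do
      initialized = true;
    }
  }

  static int next();         // the method implemented "natively" below
};

bool Counter::initialized = false;
int Counter::start = 0;

int Counter::next()
{
  initClass();               // first statement, before touching any static
  return ++start;
}
```

| <para> |
| Without the call at the top of <literal>next</literal>, the first caller in this |
| sketch would see the uninitialized value of <literal>start</literal>. |
| </para> |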
| </sect1> |
| <sect1><title>Exception Handling</title> |
| <para> |
| While C++ and Java share a common exception handling framework, |
| things are not yet perfectly integrated. The main issue is that the |
| <quote>run-time type information</quote> facilities of the two |
| languages are not integrated.</para> |
| <para> |
| Still, things work fairly well. You can throw a Java exception from |
| C++ using the ordinary <literal>throw</literal> construct, and this |
| exception can be caught by Java code. Similarly, you can catch an |
| exception thrown from Java using the C++ <literal>catch</literal> |
| construct.</para> |
| <para> |
| Note that currently you cannot mix C++ catches and Java catches in |
| a single C++ translation unit. We do intend to fix this eventually. |
| </para> |
| <para> |
| Here is an example: |
| <programlisting> |
| if (i >= count) |
| throw new java::lang::IndexOutOfBoundsException(); |
| </programlisting> |
| </para> |
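| <para> |
| Because Java objects are referenced through pointers, the exception in the |
| example above is thrown as a pointer and would be caught as one. The following |
| self-contained sketch shows that pattern in plain C++. The classes |
| <literal>Throwable</literal> and <literal>IndexOutOfBounds</literal> and both |
| functions are hypothetical stand-ins, not <literal>libgcj</literal> classes. |
| </para> |

```cpp
// Hypothetical stand-ins for java::lang::Throwable and a subclass.
struct Throwable {
  virtual ~Throwable() {}
};
struct IndexOutOfBounds : Throwable {
};

// Throws a pointer to a heap-allocated exception object, in the
// same style as the CNI example.
int checked_get(const int *data, int count, int i)
{
  if (i < 0 || i >= count)
    throw new IndexOutOfBounds();
  return data[i];
}

// Catches by pointer to the base class; returns -1 on a bad index.
int get_or_default(const int *data, int count, int i)
{
  try {
    return checked_get(data, count, i);
  } catch (Throwable *e) {
    // Plain C++ has no garbage collector, so this sketch deletes the
    // object. A real Java exception object would be left to the GC.
    delete e;
    return -1;
  }
}
```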
| </sect1> |
| |
| <sect1><title>Synchronization</title> |
| <para> |
| Each Java object has an implicit monitor. |
| The Java VM uses the instruction <literal>monitorenter</literal> to acquire |
| and lock a monitor, and <literal>monitorexit</literal> to release it. |
| The JNI has corresponding methods <literal>MonitorEnter</literal> |
| and <literal>MonitorExit</literal>. The corresponding CNI macros |
| are <literal>JvMonitorEnter</literal> and <literal>JvMonitorExit</literal>. |
| </para> |
| <para> |
| The Java source language does not provide direct access to these primitives. |
| Instead, there is a <literal>synchronized</literal> statement that does an |
| implicit <literal>monitorenter</literal> before entry to the block, |
| and does a <literal>monitorexit</literal> on exit from the block. |
| Note that the lock has to be released even if the block is abnormally |
| terminated by an exception, which means there is an implicit |
| <literal>try</literal>-<literal>finally</literal>. |
| </para> |
| <para> |
| From C++, it makes sense to use a destructor to release a lock. |
| CNI defines the following utility class. |
| <programlisting> |
| class JvSynchronize { |
| jobject obj; |
| public: |
| JvSynchronize(jobject o) { obj = o; JvMonitorEnter(o); } |
| ~JvSynchronize() { JvMonitorExit(obj); } |
| }; |
| </programlisting> |
| The equivalent of Java's: |
| <programlisting> |
| synchronized (OBJ) { CODE; } |
| </programlisting> |
| can be simply expressed: |
| <programlisting> |
| { JvSynchronize dummy(OBJ); CODE; } |
| </programlisting> |
| </para> |
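| <para> |
| The following self-contained sketch shows why the destructor-based |
| approach gives the same guarantee as Java's implicit |
| <literal>try</literal>-<literal>finally</literal>. It assumes |
| stand-in definitions: <literal>Object</literal>, |
| <literal>jobject</literal>, <literal>JvMonitorEnter</literal> and |
| <literal>JvMonitorExit</literal> are defined locally and only count |
| the lock depth rather than performing real locking, and |
| <literal>update</literal> and |
| <literal>monitorReleasedAfterThrow</literal> are hypothetical.</para> |

```cpp
#include <cassert>
#include <stdexcept>

// Stand-ins for illustration only. They record the lock depth and do no
// real locking; real CNI code gets these names from <gcj/cni.h>.
struct Object { int lockDepth = 0; int value = 0; };
typedef Object *jobject;
static void JvMonitorEnter(jobject o) { ++o->lockDepth; }
static void JvMonitorExit(jobject o) { --o->lockDepth; }

class JvSynchronize {
  jobject obj;
public:
  explicit JvSynchronize(jobject o) : obj(o) { JvMonitorEnter(o); }
  ~JvSynchronize() { JvMonitorExit(obj); }
};

// Hypothetical function: updates the object while holding its monitor.
static void update(jobject o, int v) {
  JvSynchronize guard(o);
  if (v < 0)
    throw std::runtime_error("negative value");  // guard still unlocks
  o->value = v;
}

// Returns true if the monitor is free again after update() has thrown.
static bool monitorReleasedAfterThrow() {
  Object o;
  try {
    update(&o, -1);
  } catch (const std::runtime_error &) {
    // The guard's destructor ran during stack unwinding.
  }
  return o.lockDepth == 0 && o.value == 0;
}
```

| <para> |
| The monitor is released on both the normal and the exceptional path |
| without any explicit cleanup code in <literal>update</literal>.</para> |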
| <para> |
| Java also has methods with the <literal>synchronized</literal> attribute. |
| This is equivalent to wrapping the entire method body in a |
| <literal>synchronized</literal> statement. |
| (Alternatively, an implementation could require the caller to do |
| the synchronization. This is not practical for a compiler, because |
| each virtual method call would have to test at run-time if |
| synchronization is needed.) Since in <literal>gcj</literal> |
| the <literal>synchronized</literal> attribute is handled by the |
| method implementation, it is up to the programmer |
| of a synchronized native method to handle the synchronization |
| (in the C++ implementation of the method). |
| In other words, you need to manually add <literal>JvSynchronize</literal> |
| in a <literal>native synchronized</literal> method.</para> |
| </sect1> |
| |
| <sect1><title>Reflection</title> |
| <para>The types <literal>jfieldID</literal> and <literal>jmethodID</literal> |
| are as in JNI.</para> |
| <para> |
| The functions <literal>JvFromReflectedField</literal>, |
| <literal>JvFromReflectedMethod</literal>, |
| <literal>JvToReflectedField</literal>, and |
| <literal>JvToReflectedMethod</literal> (as in Java 2 JNI) |
| will be added shortly, as will other functions corresponding to JNI.</para> |
| </sect1> |
| |
| <sect1><title>Using gcjh</title> |
| <para> |
| The <command>gcjh</command> program is used to generate C++ header files from |
| Java class files. By default, <command>gcjh</command> generates |
| a relatively straightforward C++ header file. However, there |
| are a few caveats to its use, and a few options which can be |
| used to change how it operates: |
| </para> |
| <variablelist> |
| <varlistentry> |
| <term><literal>--classpath</literal> <replaceable>path</replaceable></term> |
| <term><literal>--CLASSPATH</literal> <replaceable>path</replaceable></term> |
| <term><literal>-I</literal> <replaceable>dir</replaceable></term> |
| <listitem><para> |
| These options can be used to set the class path for gcjh. |
| Gcjh searches the class path the same way the compiler does; |
| these options have their familiar meanings.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><literal>-d <replaceable>directory</replaceable></literal></term> |
| <listitem><para> |
| Puts the generated <literal>.h</literal> files |
| beneath <replaceable>directory</replaceable>.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><literal>-o <replaceable>file</replaceable></literal></term> |
| <listitem><para> |
| Sets the name of the <literal>.h</literal> file to be generated. |
| By default the <literal>.h</literal> file is named after the class. |
| This option only really makes sense if just a single class file |
| is specified.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><literal>--verbose</literal></term> |
| <listitem><para> |
| gcjh will print information to stderr as it works.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><literal>-M</literal></term> |
| <term><literal>-MM</literal></term> |
| <term><literal>-MD</literal></term> |
| <term><literal>-MMD</literal></term> |
| <listitem><para> |
| These options can be used to generate dependency information |
| for the generated header file. They work the same way as the |
| corresponding compiler options.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><literal>-prepend <replaceable>text</replaceable></literal></term> |
| <listitem><para> |
| This causes the <replaceable>text</replaceable> to be put into the generated |
| header just after class declarations (but before declaration |
| of the current class). This option should be used with caution.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><literal>-friend <replaceable>text</replaceable></literal></term> |
| <listitem><para> |
| This causes the <replaceable>text</replaceable> to be put into the class |
| declaration after a <literal>friend</literal> keyword. |
| This can be used to declare some |
| other class or function to be a friend of this class. |
| This option should be used with caution.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><literal>-add <replaceable>text</replaceable></literal></term> |
| <listitem><para> |
| The <replaceable>text</replaceable> is inserted into the class declaration. |
| This option should be used with caution.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><literal>-append <replaceable>text</replaceable></literal></term> |
| <listitem><para> |
| The <replaceable>text</replaceable> is inserted into the header file |
| after the class declaration. One use for this is to generate |
| inline functions. This option should be used with caution.</para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
| <para> |
| All other arguments not beginning with a <literal>-</literal> are treated |
| as the names of classes for which headers should be generated.</para> |
| <para> |
| gcjh will generate all the required namespace declarations and |
| <literal>#include</literal> directives for the header file. |
| In some situations, gcjh will generate simple inline member |
| functions. Note that, while gcjh puts <literal>#pragma |
| interface</literal> in the generated header file, you should |
| <emphasis>not</emphasis> put <literal>#pragma implementation</literal> |
| into your C++ source file. If you do, duplicate definitions of |
| inline functions will sometimes be created, leading to link-time |
| errors. |
| </para> |
| <para> |
| There are a few cases where gcjh will fail to work properly:</para> |
| <para> |
| gcjh assumes that all the methods and fields of a class have ASCII |
| names. The C++ compiler cannot correctly handle non-ASCII |
| identifiers. gcjh does not currently diagnose this problem.</para> |
| <para> |
| gcjh also cannot fully handle classes where a field and a method have |
| the same name. If the field is static, an error will result. |
| Otherwise, the field will be renamed in the generated header; |
| <literal>__</literal> will be appended to the field name.</para> |
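| <para> |
| As an illustration of the renaming rule, consider a hypothetical Java |
| class <literal>Stack</literal> with an instance field |
| <literal>size</literal> and a method <literal>size()</literal>. |
| The sketch below shows roughly what shape the C++ declaration would |
| need; it is hand-written, not actual gcjh output. The |
| <literal>jint</literal> typedef, the constructor and the |
| <literal>push</literal> method are assumptions added so that the |
| example is self-contained.</para> |

```cpp
#include <cassert>

typedef int jint;  // stand-in; CNI itself defines jint in <gcj/cni.h>

// Hypothetical Java source:
//   public class Stack {
//     private int size;
//     public int size() { return size; }
//   }
// C++ does not allow a data member and a member function of the same class
// to share a name, so the field appears with "__" appended.
class Stack {
public:
  Stack() : size__(0) {}      // added for the demo only
  jint size();
  void push() { ++size__; }   // hypothetical extra method, demo only
private:
  jint size__;                // Java field "size", renamed
};

jint Stack::size() { return size__; }
```

| <para> |
| Native code written against such a header has to refer to the field |
| by its renamed form.</para> |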
| <para> |
| Eventually we hope to change the C++ compiler so that these |
| restrictions can be lifted.</para> |
| </sect1> |
| |
| </article> |