| <section xmlns="http://docbook.org/ns/docbook" version="5.0" |
| xml:id="appendix.porting.internals" xreflabel="Portin Internals"> |
| <?dbhtml filename="internals.html"?> |
| |
| <info><title>Porting to New Hardware or Operating Systems</title> |
| <keywordset> |
| <keyword>ISO C++</keyword> |
| <keyword>internals</keyword> |
| </keywordset> |
| </info> |
| |
| |
| |
| <para> |
| </para> |
| |
| |
| <para>This document explains how to port libstdc++ (the GNU C++ library) to |
| a new target. |
| </para> |
| |
| <para>In order to make the GNU C++ library (libstdc++) work with a new |
| target, you must edit some configuration files and provide some new |
| header files. Unless this is done, libstdc++ will use generic |
| settings which may not be correct for your target; even if they are |
| correct, they will likely be inefficient. |
| </para> |
| |
| <para>Before you get started, make sure that you have a working C library on |
| your target. The C library need not precisely comply with any |
| particular standard, but should generally conform to the requirements |
| imposed by the ANSI/ISO standard. |
| </para> |
| |
| <para>In addition, you should try to verify that the C++ compiler generally |
| works. It is difficult to test the C++ compiler without a working |
| library, but you should at least try some minimal test cases. |
| </para> |
| |
| <para>(Note that what we think of as a "target," the library refers to as |
| a "host." The comment at the top of <code>configure.ac</code> explains why.) |
| </para> |
| |
| |
| <section xml:id="internals.os"><info><title>Operating System</title></info> |
| |
| |
| <para>If you are porting to a new operating system (as opposed to a new chip |
| using an existing operating system), you will need to create a new |
| directory in the <code>config/os</code> hierarchy. For example, the IRIX |
| configuration files are all in <code>config/os/irix</code>. There is no set |
| way to organize the OS configuration directory. For example, |
| <code>config/os/solaris/solaris-2.6</code> and |
| <code>config/os/solaris/solaris-2.7</code> are used as configuration |
| directories for these two versions of Solaris. On the other hand, both |
| Solaris 2.7 and Solaris 2.8 use the <code>config/os/solaris/solaris-2.7</code> |
| directory. The important information is that there needs to be a |
| directory under <code>config/os</code> to store the files for your operating |
| system. |
| </para> |
| |
| <para>You might have to change the <code>configure.host</code> file to ensure that |
| your new directory is activated. Look for the switch statement that sets |
| <code>os_include_dir</code>, and add a pattern to handle your operating system |
| if the default will not suffice. The switch statement switches on only |
| the OS portion of the standard target triplet; e.g., the <code>solaris2.8</code> |
| in <code>sparc-sun-solaris2.8</code>. If the new directory is named after the |
| OS portion of the triplet (the default), then nothing needs to be changed. |
| </para> |
| |
| <para>The first file to create in this directory, should be called |
| <code>os_defines.h</code>. This file contains basic macro definitions |
| that are required to allow the C++ library to work with your C library. |
| </para> |
| |
| <para>Several libstdc++ source files unconditionally define the macro |
| <code>_POSIX_SOURCE</code>. On many systems, defining this macro causes |
| large portions of the C library header files to be eliminated |
| at preprocessing time. Therefore, you may have to <code>#undef</code> this |
| macro, or define other macros (like <code>_LARGEFILE_SOURCE</code> or |
| <code>__EXTENSIONS__</code>). You won't know what macros to define or |
| undefine at this point; you'll have to try compiling the library and |
| seeing what goes wrong. If you see errors about calling functions |
| that have not been declared, look in your C library headers to see if |
| the functions are declared there, and then figure out what macros you |
| need to define. You will need to add them to the |
| <code>CPLUSPLUS_CPP_SPEC</code> macro in the GCC configuration file for your |
| target. It will not work to simply define these macros in |
| <code>os_defines.h</code>. |
| </para> |
| |
| <para>At this time, there are a few libstdc++-specific macros which may be |
| defined: |
| </para> |
| |
| <para><code>_GLIBCXX_USE_C99_CHECK</code> may be defined to 1 to check C99 |
| function declarations (which are not covered by specialization below) |
| found in system headers against versions found in the library headers |
| derived from the standard. |
| </para> |
| |
| <para><code>_GLIBCXX_USE_C99_DYNAMIC</code> may be defined to an expression that |
| yields 0 if and only if the system headers are exposing proper support |
| for C99 functions (which are not covered by specialization below). If |
| defined, it must be 0 while bootstrapping the compiler/rebuilding the |
| library. |
| </para> |
| |
| <para><code>_GLIBCXX_USE_C99_LONG_LONG_CHECK</code> may be defined to 1 to check |
| the set of C99 long long function declarations found in system headers |
| against versions found in the library headers derived from the |
| standard. |
| |
| </para> |
| <para><code>_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC</code> may be defined to an |
| expression that yields 0 if and only if the system headers are |
| exposing proper support for the set of C99 long long functions. If |
| defined, it must be 0 while bootstrapping the compiler/rebuilding the |
| library. |
| </para> |
| <para><code>_GLIBCXX_USE_C99_FP_MACROS_DYNAMIC</code> may be defined to an |
| expression that yields 0 if and only if the system headers |
| are exposing proper support for the related set of macros. If defined, |
| it must be 0 while bootstrapping the compiler/rebuilding the library. |
| </para> |
| <para><code>_GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_CHECK</code> may be defined |
| to 1 to check the related set of function declarations found in system |
| headers against versions found in the library headers derived from |
| the standard. |
| </para> |
| <para><code>_GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_DYNAMIC</code> may be defined |
| to an expression that yields 0 if and only if the system headers |
| are exposing proper support for the related set of functions. If defined, |
| it must be 0 while bootstrapping the compiler/rebuilding the library. |
| </para> |
| <para><code>_GLIBCXX_NO_OBSOLETE_ISINF_ISNAN_DYNAMIC</code> may be defined |
| to an expression that yields 0 if and only if the system headers |
| are exposing non-standard <code>isinf(double)</code> and |
| <code>isnan(double)</code> functions in the global namespace. Those functions |
| should be detected automatically by the <code>configure</code> script when |
| libstdc++ is built but if their presence depends on compilation flags or |
| other macros the static configuration can be overridden. |
| </para> |
| <para>Finally, you should bracket the entire file in an include-guard, like |
| this: |
| </para> |
| |
| <programlisting> |
| |
| #ifndef _GLIBCXX_OS_DEFINES |
| #define _GLIBCXX_OS_DEFINES |
| ... |
| #endif |
| </programlisting> |
| |
| <para>We recommend copying an existing <code>os_defines.h</code> to use as a |
| starting point. |
| </para> |
| </section> |
| |
| |
| <section xml:id="internals.cpu"><info><title>CPU</title></info> |
| |
| |
| <para>If you are porting to a new chip (as opposed to a new operating system |
| running on an existing chip), you will need to create a new directory in the |
| <code>config/cpu</code> hierarchy. Much like the <link linkend="internals.os">Operating system</link> setup, |
| there are no strict rules on how to organize the CPU configuration |
| directory, but careful naming choices will allow the configury to find your |
| setup files without explicit help. |
| </para> |
| |
| <para>We recommend that for a target triplet <code><CPU>-<vendor>-<OS></code>, you |
| name your configuration directory <code>config/cpu/<CPU></code>. If you do this, |
| the configury will find the directory by itself. Otherwise you will need to |
| edit the <code>configure.host</code> file and, in the switch statement that sets |
| <code>cpu_include_dir</code>, add a pattern to handle your chip. |
| </para> |
| |
| <para>Note that some chip families share a single configuration directory, for |
| example, <code>alpha</code>, <code>alphaev5</code>, and <code>alphaev6</code> all use the |
| <code>config/cpu/alpha</code> directory, and there is an entry in the |
| <code>configure.host</code> switch statement to handle this. |
| </para> |
| |
| <para>The <code>cpu_include_dir</code> sets default locations for the files controlling |
| <link linkend="internals.thread_safety">Thread safety</link> and <link linkend="internals.numeric_limits">Numeric limits</link>, if the defaults are not |
| appropriate for your chip. |
| </para> |
| |
| </section> |
| |
| |
| <section xml:id="internals.char_types"><info><title>Character Types</title></info> |
| |
| |
| <para>The library requires that you provide three header files to implement |
| character classification, analogous to that provided by the C libraries |
| <code><ctype.h></code> header. You can model these on the files provided in |
| <code>config/os/generic</code>. However, these files will almost |
| certainly need some modification. |
| </para> |
| |
| <para>The first file to write is <code>ctype_base.h</code>. This file provides |
| some very basic information about character classification. The libstdc++ |
| library assumes that your C library implements <code><ctype.h></code> by using |
| a table (indexed by character code) containing integers, where each of |
| these integers is a bit-mask indicating whether the character is |
| upper-case, lower-case, alphabetic, etc. The <code>ctype_base.h</code> |
| file gives the type of the integer, and the values of the various bit |
| masks. You will have to peer at your own <code><ctype.h></code> to figure out |
| how to define the values required by this file. |
| </para> |
| |
| <para>The <code>ctype_base.h</code> header file does not need include guards. |
| It should contain a single <code>struct</code> definition called |
| <code>ctype_base</code>. This <code>struct</code> should contain two type |
| declarations, and one enumeration declaration, like this example, taken |
| from the IRIX configuration: |
| </para> |
| |
| <programlisting> |
| struct ctype_base |
| { |
| typedef unsigned int mask; |
| typedef int* __to_type; |
| |
| enum |
| { |
| space = _ISspace, |
| print = _ISprint, |
| cntrl = _IScntrl, |
| upper = _ISupper, |
| lower = _ISlower, |
| alpha = _ISalpha, |
| digit = _ISdigit, |
| punct = _ISpunct, |
| xdigit = _ISxdigit, |
| alnum = _ISalnum, |
| graph = _ISgraph |
| }; |
| }; |
| </programlisting> |
| |
| <para>The <code>mask</code> type is the type of the elements in the table. If your |
| C library uses a table to map lower-case numbers to upper-case numbers, |
| and vice versa, you should define <code>__to_type</code> to be the type of the |
| elements in that table. If you don't mind taking a minor performance |
| penalty, or if your library doesn't implement <code>toupper</code> and |
| <code>tolower</code> in this way, you can pick any pointer-to-integer type, |
| but you must still define the type. |
| </para> |
| |
| <para>The enumeration should give definitions for all the values in the above |
| example, using the values from your native <code><ctype.h></code>. They can |
| be given symbolically (as above), or numerically, if you prefer. You do |
| not have to include <code><ctype.h></code> in this header; it will always be |
| included before <code>ctype_base.h</code> is included. |
| </para> |
| |
| <para>The next file to write is <code>ctype_configure_char.cc</code>. |
| The first function that must be written is the <code>ctype<char>::ctype</code> constructor. Here is the IRIX example: |
| </para> |
| |
| <programlisting> |
| ctype<char>::ctype(const mask* __table = 0, bool __del = false, |
| size_t __refs = 0) |
| : _Ctype_nois<char>(__refs), _M_del(__table != 0 && __del), |
| _M_toupper(NULL), |
| _M_tolower(NULL), |
| _M_ctable(NULL), |
| _M_table(!__table |
| ? (const mask*) (__libc_attr._ctype_tbl->_class + 1) |
| : __table) |
| { } |
| </programlisting> |
| |
| <para>There are two parts of this that you might choose to alter. The first, |
| and most important, is the line involving <code>__libc_attr</code>. That is |
| IRIX system-dependent code that gets the base of the table mapping |
| character codes to attributes. You need to substitute code that obtains |
| the address of this table on your system. If you want to use your |
| operating system's tables to map upper-case letters to lower-case, and |
| vice versa, you should initialize <code>_M_toupper</code> and |
| <code>_M_tolower</code> with those tables, in similar fashion. |
| </para> |
| |
| <para>Now, you have to write two functions to convert from upper-case to |
| lower-case, and vice versa. Here are the IRIX versions: |
| </para> |
| |
| <programlisting> |
| char |
| ctype<char>::do_toupper(char __c) const |
| { return _toupper(__c); } |
| |
| char |
| ctype<char>::do_tolower(char __c) const |
| { return _tolower(__c); } |
| </programlisting> |
| |
| <para>Your C library provides equivalents to IRIX's <code>_toupper</code> and |
| <code>_tolower</code>. If you initialized <code>_M_toupper</code> and |
| <code>_M_tolower</code> above, then you could use those tables instead. |
| </para> |
| |
| <para>Finally, you have to provide two utility functions that convert strings |
| of characters. The versions provided here will always work - but you |
| could use specialized routines for greater performance if you have |
| machinery to do that on your system: |
| </para> |
| |
| <programlisting> |
| const char* |
| ctype<char>::do_toupper(char* __low, const char* __high) const |
| { |
| while (__low < __high) |
| { |
| *__low = do_toupper(*__low); |
| ++__low; |
| } |
| return __high; |
| } |
| |
| const char* |
| ctype<char>::do_tolower(char* __low, const char* __high) const |
| { |
| while (__low < __high) |
| { |
| *__low = do_tolower(*__low); |
| ++__low; |
| } |
| return __high; |
| } |
| </programlisting> |
| |
| <para>You must also provide the <code>ctype_inline.h</code> file, which |
| contains a few more functions. On most systems, you can just copy |
| <code>config/os/generic/ctype_inline.h</code> and use it on your system. |
| </para> |
| |
| <para>In detail, the functions provided test characters for particular |
| properties; they are analogous to the functions like <code>isalpha</code> and |
| <code>islower</code> provided by the C library. |
| </para> |
| |
| <para>The first function is implemented like this on IRIX: |
| </para> |
| |
| <programlisting> |
| bool |
| ctype<char>:: |
| is(mask __m, char __c) const throw() |
| { return (_M_table)[(unsigned char)(__c)] & __m; } |
| </programlisting> |
| |
| <para>The <code>_M_table</code> is the table passed in above, in the constructor. |
| This is the table that contains the bitmasks for each character. The |
| implementation here should work on all systems. |
| </para> |
| |
| <para>The next function is: |
| </para> |
| |
| <programlisting> |
| const char* |
| ctype<char>:: |
| is(const char* __low, const char* __high, mask* __vec) const throw() |
| { |
| while (__low < __high) |
| *__vec++ = (_M_table)[(unsigned char)(*__low++)]; |
| return __high; |
| } |
| </programlisting> |
| |
| <para>This function is similar; it copies the masks for all the characters |
| from <code>__low</code> up until <code>__high</code> into the vector given by |
| <code>__vec</code>. |
| </para> |
| |
| <para>The last two functions again are entirely generic: |
| </para> |
| |
| <programlisting> |
| const char* |
| ctype<char>:: |
| scan_is(mask __m, const char* __low, const char* __high) const throw() |
| { |
| while (__low < __high && !this->is(__m, *__low)) |
| ++__low; |
| return __low; |
| } |
| |
| const char* |
| ctype<char>:: |
| scan_not(mask __m, const char* __low, const char* __high) const throw() |
| { |
| while (__low < __high && this->is(__m, *__low)) |
| ++__low; |
| return __low; |
| } |
| </programlisting> |
| |
| </section> |
| |
| |
| <section xml:id="internals.thread_safety"><info><title>Thread Safety</title></info> |
| |
| |
| <para>The C++ library string functionality requires a couple of atomic |
| operations to provide thread-safety. If you don't take any special |
| action, the library will use stub versions of these functions that are |
| not thread-safe. They will work fine, unless your applications are |
| multi-threaded. |
| </para> |
| |
| <para>If you want to provide custom, safe, versions of these functions, there |
| are two distinct approaches. One is to provide a version for your CPU, |
| using assembly language constructs. The other is to use the |
| thread-safety primitives in your operating system. In either case, you |
| make a file called <code>atomicity.h</code>, and the variable |
| <code>ATOMICITYH</code> must point to this file. |
| </para> |
| |
| <para>If you are using the assembly-language approach, put this code in |
| <code>config/cpu/<chip>/atomicity.h</code>, where chip is the name of |
| your processor (see <link linkend="internals.cpu">CPU</link>). No additional changes are necessary to |
| locate the file in this case; <code>ATOMICITYH</code> will be set by default. |
| </para> |
| |
| <para>If you are using the operating system thread-safety primitives approach, |
| you can also put this code in the same CPU directory, in which case no more |
| work is needed to locate the file. For examples of this approach, |
| see the <code>atomicity.h</code> file for IRIX or IA64. |
| </para> |
| |
| <para>Alternatively, if the primitives are more closely related to the OS |
| than they are to the CPU, you can put the <code>atomicity.h</code> file in |
| the <link linkend="internals.os">Operating system</link> directory instead. In this case, you must |
| edit <code>configure.host</code>, and in the switch statement that handles |
| operating systems, override the <code>ATOMICITYH</code> variable to point to |
| the appropriate <code>os_include_dir</code>. For examples of this approach, |
| see the <code>atomicity.h</code> file for AIX. |
| </para> |
| |
| <para>With those bits out of the way, you have to actually write |
| <code>atomicity.h</code> itself. This file should be wrapped in an |
| include guard named <code>_GLIBCXX_ATOMICITY_H</code>. It should define one |
| type, and two functions. |
| </para> |
| |
| <para>The type is <code>_Atomic_word</code>. Here is the version used on IRIX: |
| </para> |
| |
| <programlisting> |
| typedef long _Atomic_word; |
| </programlisting> |
| |
| <para>This type must be a signed integral type supporting atomic operations. |
| If you're using the OS approach, use the same type used by your system's |
| primitives. Otherwise, use the type for which your CPU provides atomic |
| primitives. |
| </para> |
| |
| <para>Then, you must provide two functions. The bodies of these functions |
| must be equivalent to those provided here, but using atomic operations: |
| </para> |
| |
| <programlisting> |
| static inline _Atomic_word |
| __attribute__ ((__unused__)) |
| __exchange_and_add (_Atomic_word* __mem, int __val) |
| { |
| _Atomic_word __result = *__mem; |
| *__mem += __val; |
| return __result; |
| } |
| |
| static inline void |
| __attribute__ ((__unused__)) |
| __atomic_add (_Atomic_word* __mem, int __val) |
| { |
| *__mem += __val; |
| } |
| </programlisting> |
| |
| </section> |
| |
| |
| <section xml:id="internals.numeric_limits"><info><title>Numeric Limits</title></info> |
| |
| |
| <para>The C++ library requires information about the fundamental data types, |
| such as the minimum and maximum representable values of each type. |
| You can define each of these values individually, but it is usually |
| easiest just to indicate how many bits are used in each of the data |
| types and let the library do the rest. For information about the |
| macros to define, see the top of <code>include/bits/std_limits.h</code>. |
| </para> |
| |
| <para>If you need to define any macros, you can do so in <code>os_defines.h</code>. |
| However, if all operating systems for your CPU are likely to use the |
| same values, you can provide a CPU-specific file instead so that you |
| do not have to provide the same definitions for each operating system. |
| To take that approach, create a new file called <code>cpu_limits.h</code> in |
| your CPU configuration directory (see <link linkend="internals.cpu">CPU</link>). |
| </para> |
| |
| </section> |
| |
| |
| <section xml:id="internals.libtool"><info><title>Libtool</title></info> |
| |
| |
| <para>The C++ library is compiled, archived and linked with libtool. |
| Explaining the full workings of libtool is beyond the scope of this |
| document, but there are a few, particular bits that are necessary for |
| porting. |
| </para> |
| |
| <para>Some parts of the libstdc++ library are compiled with the libtool |
| <code>--tags CXX</code> option (the C++ definitions for libtool). Therefore, |
| <code>ltcf-cxx.sh</code> in the top-level directory needs to have the correct |
| logic to compile and archive objects equivalent to the C version of libtool, |
| <code>ltcf-c.sh</code>. Some libtool targets have definitions for C but not |
| for C++, or C++ definitions which have not been kept up to date. |
| </para> |
| |
| <para>The C++ run-time library contains initialization code that needs to be |
| run as the library is loaded. Often, that requires linking in special |
| object files when the C++ library is built as a shared library, or |
| taking other system-specific actions. |
| </para> |
| |
| <para>The libstdc++ library is linked with the C version of libtool, even |
| though it is a C++ library. Therefore, the C version of libtool needs to |
| ensure that the run-time library initializers are run. The usual way to |
| do this is to build the library using <code>gcc -shared</code>. |
| </para> |
| |
| <para>If you need to change how the library is linked, look at |
| <code>ltcf-c.sh</code> in the top-level directory. Find the switch statement |
| that sets <code>archive_cmds</code>. Here, adjust the setting for your |
| operating system. |
| </para> |
| |
| |
| </section> |
| |
| </section> |