| ARM / Thumb Interworking |
| ======================== |
| |
| The Cygnus GNU Pro Toolkit for the ARM7T processor supports function |
| calls between code compiled for the ARM instruction set and code |
| compiled for the Thumb instruction set and vice versa. This document |
| describes how that interworking support operates and explains the |
| command line switches that should be used in order to produce working |
| programs. |
| |
| Note: The Cygnus GNU Pro Toolkit does not support switching between |
| compiling for the ARM instruction set and the Thumb instruction set |
| on anything other than a per-file basis. There are in fact two |
| completely separate compilers, one that produces ARM assembler |
| instructions and one that produces Thumb assembler instructions. The |
| two compilers share the same assembler, linker and so on. |
| |
| |
| 1. Explicit interworking support for C and C++ files |
| ==================================================== |
| |
| By default, if a file is compiled without any special command line |
| switches, the code produced will not support interworking. |
| Provided that a program is made up entirely from object files and |
| libraries produced in this way and which contain either exclusively |
| ARM instructions or exclusively Thumb instructions then this will not |
| matter and a working executable will be created. If an attempt is |
| made to link together mixed ARM and Thumb object files and libraries, |
| then warning messages will be produced by the linker and a non-working |
| executable will be created. |
| |
| In order to produce code which does support interworking it should be |
| compiled with the |
| |
| -mthumb-interwork |
| |
| command line option. Provided that a program is made up entirely from |
| object files and libraries built with this command line switch a |
| working executable will be produced, even if both ARM and Thumb |
| instructions are used by the various components of the program. (No |
| warning messages will be produced by the linker either). |
| |
| Note that specifying -mthumb-interwork does result in slightly larger, |
| slower code being produced. This is why interworking support must be |
| specifically enabled by a switch. |
| |
| |
| 2. Explicit interworking support for assembler files |
| ==================================================== |
| |
| If assembler files are to be included into an interworking program |
| then the following rules must be obeyed: |
| |
| * Any externally visible functions must return by using the BX |
| instruction. |
| |
| * Normal function calls can just use the BL instruction. The |
| linker will automatically insert code to switch between ARM |
| and Thumb modes as necessary. |
| |
| * Calls via function pointers should use the BX instruction if |
| the call is made in ARM mode: |
| |
| .code 32 |
| mov lr, pc |
| bx rX |
| |
| This code sequence will not work in Thumb mode however, since |
| the mov instruction will not set the bottom bit of the lr |
| register. A branch-and-link to the _call_via_rX |
| functions should be used instead: |
| |
| .code 16 |
| bl _call_via_rX |
| |
| where rX is replaced by the name of the register containing |
| the function address. |
| |
| * All externally visible functions which should be entered in |
| Thumb mode must have the .thumb_func pseudo op specified just |
| before their entry point, e.g.: |
| |
| .code 16 |
| .global function |
| .thumb_func |
| function: |
| ...start of function.... |
| |
| * All assembler files must be assembled with the switch |
| -mthumb-interwork specified on the command line. (If the file |
| is assembled by calling gcc it will automatically pass on the |
| -mthumb-interwork switch to the assembler, provided that it |
| was specified on the gcc command line in the first place.) |
| |
| |
| 3. Support for old, non-interworking aware code |
| =============================================== |
| |
| If it is necessary to link together code produced by an older, |
| non-interworking aware compiler, or code produced by the new compiler |
| but without the -mthumb-interwork command line switch specified, then |
| there are two command line switches that can be used to support this. |
| |
| The switch |
| |
| -mcaller-super-interworking |
| |
| will allow calls via function pointers in Thumb mode to work, |
| regardless of whether the function pointer points to old, |
| non-interworking aware code or not. Specifying this switch does |
| produce slightly slower code however. |
| |
| Note: There is no switch to allow calls via function pointers in ARM |
| mode to be handled specially. Calls via function pointers from |
| interworking aware ARM code to non-interworking aware ARM code work |
| without any special considerations by the compiler. Calls via |
| function pointers from interworking aware ARM code to non-interworking |
| aware Thumb code however will not work. (Actually under some |
| circumstances they may work, but there are no guarantees). This is |
| because only the new compiler is able to produce Thumb code, and this |
| compiler already has a command line switch to produce interworking |
| aware code. |
| |
| |
| The switch |
| |
| -mcallee-super-interworking |
| |
| will allow non-interworking aware ARM or Thumb code to call Thumb |
| functions, either directly or via function pointers. Specifying this |
| switch does produce slightly larger, slower code however. |
| |
| Note: There is no switch to allow non-interworking aware ARM or Thumb |
| code to call ARM functions. There is no need for any special handling |
| of calls from non-interworking aware ARM code to interworking aware |
| ARM functions; they just work normally. Calls from non-interworking |
| aware Thumb functions to ARM code however, will not work. There is no |
| option to support this, since it is always possible to recompile the |
| Thumb code to be interworking aware. |
| |
| As an alternative to the command line switch |
| -mcallee-super-interworking, which affects all externally visible |
| functions in a file, it is possible to specify an attribute or |
| declspec for individual functions, indicating that that particular |
| function should support being called by non-interworking aware code. |
| The function should be defined like this: |
| |
| int __attribute__((interfacearm)) function (void) |
| { |
| ... body of function ... |
| } |
| |
| or |
| |
| int __declspec(interfacearm) function (void) |
| { |
| ... body of function ... |
| } |
| |
| |
| |
| 4. Interworking support in dlltool |
| ================================== |
| |
| Currently there is no interworking support in dlltool. This may be a |
| future enhancement. |
| |
| |
| |
| 5. How interworking support works |
| ================================= |
| |
| Switching between the ARM and Thumb instruction sets is accomplished |
| via the BX instruction which takes as an argument a register name. |
| Control is transferred to the address held in this register (with the |
| bottom bit masked out), and if the bottom bit is set, then Thumb |
| instruction processing is enabled, otherwise ARM instruction |
| processing is enabled. |
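| |
| As a rough illustration, the effect of BX can be modelled in a few |
| lines of C. This is only a sketch: the cpu_state type and the bx() |
| helper are hypothetical names invented for this example and are not |
| part of the toolkit. |
| |
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the state that BX changes. */
typedef struct {
    uint32_t pc;     /* address at which execution continues */
    int      thumb;  /* 1 = Thumb instructions, 0 = ARM instructions */
} cpu_state;

/* Model of "bx rX" for a register file holding r0..r15. */
static cpu_state bx(const uint32_t regs[16], unsigned rx)
{
    cpu_state next;

    assert(rx < 16);
    next.thumb = (int)(regs[rx] & 1u);      /* bottom bit selects the mode */
    next.pc    = regs[rx] & ~UINT32_C(1);   /* bottom bit is masked out */
    return next;
}
```
| |
| With r3 holding 0x8001 the model continues at 0x8000 in Thumb mode; |
| with r3 holding 0x8000 it continues at the same address in ARM mode. |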
| |
| When the -mthumb-interwork command line switch is specified, gcc |
| arranges for all functions to return to their caller by using the BX |
| instruction. Thus provided that the return address has the bottom bit |
| correctly initialised to indicate the instruction set of the caller, |
| correct operation will ensue. |
| |
| When a function is called explicitly (rather than via a function |
| pointer), the compiler generates a BL instruction to do this. The |
| Thumb version of the BL instruction has the special property of |
| setting the bottom bit of the LR register after it has stored the |
| return address into it, so that a future BX instruction will correctly |
| return to the instruction after the BL instruction, in Thumb mode. |
| |
| The BL instruction does not change modes itself however, so if an ARM |
| function is calling a Thumb function, or vice versa, it is necessary |
| to generate some extra instructions to handle this. This is done in |
| the linker when it is storing the address of the referenced function |
| into the BL instruction. If the BL instruction is an ARM style BL |
| instruction, but the referenced function is a Thumb function, then the |
| linker automatically generates a calling stub that converts from ARM |
| mode to Thumb mode, puts the address of this stub into the BL |
| instruction, and puts the address of the referenced function into the |
| stub. Similarly if the BL instruction is a Thumb BL instruction, and |
| the referenced function is an ARM function, the linker generates a |
| stub which converts from Thumb to ARM mode, puts the address of this |
| stub into the BL instruction, and the address of the referenced |
| function into the stub. |
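| |
| The decision described above can be sketched in C as follows. This is |
| a simplified model, not the linker's actual code: the names isa, |
| bl_fixup and resolve_bl() are hypothetical, and stub_addr stands for |
| wherever the linker would choose to place the calling stub. |
| |
```c
#include <assert.h>
#include <stdint.h>

typedef enum { ISA_ARM, ISA_THUMB } isa;

typedef struct {
    uint32_t bl_target;  /* address written into the BL instruction */
    int      uses_stub;  /* 1 if a mode-switching stub was needed */
} bl_fixup;

/* Decide where a BL should point. stub_addr is the (hypothetical)
   address at which a calling stub for this callee would be placed. */
static bl_fixup resolve_bl(isa caller, isa callee,
                           uint32_t callee_addr, uint32_t stub_addr)
{
    bl_fixup fix;

    assert((callee_addr & 1u) == 0);  /* code is at least halfword aligned */
    fix.uses_stub = (caller != callee);
    fix.bl_target = fix.uses_stub ? stub_addr : callee_addr;
    return fix;
}
```
| |
| Only the two mixed cases (ARM calling Thumb, Thumb calling ARM) are |
| redirected through a stub; same-mode calls keep the callee's address. |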
| |
| This is why it is necessary to mark Thumb functions with the |
| .thumb_func pseudo op when creating assembler files. This pseudo op |
| allows the assembler to distinguish between ARM functions and Thumb |
| functions. (The Thumb version of GCC automatically generates these |
| pseudo ops for any Thumb functions that it generates). |
| |
| Calls via function pointers work differently. Whenever the address of |
| a function is taken, the linker examines the type of the function |
| being referenced. If the function is a Thumb function, then it sets |
| the bottom bit of the address. Technically this makes the address |
| incorrect, since it is now one byte into the start of the function, |
| but this is never a problem because: |
| |
| a. with interworking enabled all calls via function pointer |
| are done using the BX instruction and this ignores the |
| bottom bit when computing where to go to. |
| |
| b. the linker will always set the bottom bit when the address |
| of the function is taken, so it is never possible to take |
| the address of the function in two different places and |
| then compare them and find that they are not equal. |
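| |
| Points a. and b. can be checked with a small C model. The helper |
| names address_of() and bx_destination() are hypothetical and exist |
| only for this sketch. |
| |
```c
#include <assert.h>
#include <stdint.h>

/* Value produced when the address of a function is taken.
   entry is the real address of the function's first instruction. */
static uint32_t address_of(uint32_t entry, int is_thumb)
{
    assert((entry & 1u) == 0);
    return is_thumb ? (entry | 1u) : entry;
}

/* Address that a BX through the pointer actually branches to. */
static uint32_t bx_destination(uint32_t pointer)
{
    return pointer & ~UINT32_C(1);
}
```
| |
| For a Thumb function at 0x8000 every pointer to it has the value |
| 0x8001, so two such pointers always compare equal, and a BX through |
| either of them still arrives at 0x8000. |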
| |
| As already mentioned any call via a function pointer will use the BX |
| instruction (provided that interworking is enabled). The only problem |
| with this is computing the return address for the return from the |
| called function. For ARM code this can easily be done by the code |
| sequence: |
| |
| mov lr, pc |
| bx rX |
| |
| (where rX is the name of the register containing the function |
| pointer). This code does not work for the Thumb instruction set, |
| since the MOV instruction will not set the bottom bit of the LR |
| register, so that when the called function returns, it will return in |
| ARM mode not Thumb mode. Instead the compiler generates this |
| sequence: |
| |
| bl _call_via_rX |
| |
| (again where rX is the name of the register containing the function |
| pointer). The special _call_via_rX functions look like this: |
| |
| .thumb_func |
| _call_via_r0: |
| bx r0 |
| nop |
| |
| The BL instruction ensures that the correct return address is stored |
| in the LR register and then the BX instruction jumps to the address |
| stored in the function pointer, switching modes if necessary. |
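| |
| The difference between the two sequences comes down to the value left |
| in LR. The C sketch below models it. It relies on one piece of |
| architectural background that is not stated in this document: reading |
| PC yields the address of the current instruction plus 8 in ARM mode |
| and plus 4 in Thumb mode. All function names are hypothetical. |
| |
```c
#include <assert.h>
#include <stdint.h>

/* Value left in LR by "mov lr, pc" located at insn_addr.
   Bit 0 of the result is clear in both modes. */
static uint32_t lr_after_mov_lr_pc(uint32_t insn_addr, int thumb)
{
    assert((insn_addr & (thumb ? 1u : 3u)) == 0);
    return insn_addr + (thumb ? 4u : 8u);
}

/* Value left in LR by a Thumb BL (two halfwords) located at insn_addr. */
static uint32_t lr_after_thumb_bl(uint32_t insn_addr)
{
    assert((insn_addr & 1u) == 0);
    return (insn_addr + 4u) | 1u;
}

/* 1 if "bx lr" in the callee resumes in the caller's instruction set. */
static int returns_in_caller_mode(uint32_t lr, int caller_thumb)
{
    return (int)(lr & 1u) == caller_thumb;
}
```
| |
| In both modes "mov lr, pc" yields the address of the instruction that |
| follows the BX, but only in ARM mode does the clear bottom bit match |
| the caller. The Thumb BL produces the same address with the bottom |
| bit set, which is what a Thumb caller needs. |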
| |
| |
| 6. How caller-super-interworking support works |
| ============================================== |
| |
| When the -mcaller-super-interworking command line switch is specified |
| it changes the code produced by the Thumb compiler so that all calls |
| via function pointers (including virtual function calls) now go via a |
| different stub function. The code to call via a function pointer now |
| looks like this: |
| |
| bl _interwork_call_via_r0 |
| |
| Note: The compiler does not insist that r0 be used to hold the |
| function address. Any register will do, and there is a suite of stub |
| functions, one for each possible register. The stub functions look |
| like this: |
| |
| .code 16 |
| .thumb_func |
| _interwork_call_via_r0: |
| bx pc |
| nop |
| |
| .code 32 |
| tst r0, #1 |
| stmeqdb r13!, {lr} |
| adreq lr, _arm_return |
| bx r0 |
| |
| The stub first switches to ARM mode, since it is a lot easier to |
| perform the necessary operations using ARM instructions. It then |
| tests the bottom bit of the register containing the address of the |
| function to be called. If this bottom bit is set then the function |
| being called uses Thumb instructions and the BX instruction to come |
| will switch back into Thumb mode before calling this function. (Note |
| that it does not matter how this called function chooses to return to |
| its caller, since both the caller and callee are Thumb functions, |
| and no mode switching is necessary). If the function being called is an |
| ARM mode function however, the stub pushes the return address (with |
| its bottom bit set) onto the stack, replaces the return address with |
| the address of a piece of code called '_arm_return' and then |
| performs a BX instruction to call the function. |
| |
| The '_arm_return' code looks like this: |
| |
| .code 32 |
| _arm_return: |
| ldmia r13!, {r12} |
| bx r12 |
| .code 16 |
| |
| |
| It simply retrieves the return address from the stack, and then |
| performs a BX operation to return to the caller and switch back into |
| Thumb mode. |
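| |
| The ARM part of the stub and the '_arm_return' code can be modelled |
| in C to make the data flow explicit. This is a sketch under stated |
| assumptions: call_state, branch, interwork_call_via() and |
| arm_return() are hypothetical names, and the stack is reduced to a |
| small array with sp as an index rather than a real address. |
| |
```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

typedef struct {
    uint32_t lr;                  /* link register */
    uint32_t stack[STACK_WORDS];  /* toy stack, grows towards index 0 */
    int      sp;                  /* index of the current top of stack */
} call_state;

typedef struct {
    uint32_t pc;     /* branch destination */
    int      thumb;  /* 1 = continue in Thumb mode */
} branch;

/* Model of the ARM-mode part of _interwork_call_via_rX. */
static branch interwork_call_via(call_state *s, uint32_t fn,
                                 uint32_t arm_return_addr)
{
    branch b;

    if ((fn & 1u) == 0) {              /* tst r0, #1: EQ means ARM callee */
        assert(s->sp > 0);
        s->stack[--s->sp] = s->lr;     /* stmeqdb r13!, {lr} */
        s->lr = arm_return_addr;       /* adreq lr, _arm_return */
    }
    b.thumb = (int)(fn & 1u);          /* bx r0 */
    b.pc    = fn & ~UINT32_C(1);
    return b;
}

/* Model of _arm_return: pop the saved return address and BX to it. */
static branch arm_return(call_state *s)
{
    branch b;
    uint32_t r12;

    assert(s->sp < STACK_WORDS);
    r12     = s->stack[s->sp++];       /* ldmia r13!, {r12} */
    b.pc    = r12 & ~UINT32_C(1);      /* bx r12 */
    b.thumb = (int)(r12 & 1u);
    return b;
}
```
| |
| For an ARM callee the original Thumb return address travels via the |
| stack and is restored by arm_return(); for a Thumb callee neither LR |
| nor the stack is touched. |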
| |
| |
| 7. How callee-super-interworking support works |
| ============================================== |
| |
| When -mcallee-super-interworking is specified on the command line the |
| Thumb compiler behaves as if every externally visible function that it |
| compiles has had the (interfacearm) attribute specified for it. What |
| this attribute does is to put a special, ARM mode header onto the |
| function which forces a switch into Thumb mode: |
| |
| without __attribute__((interfacearm)): |
| |
| .code 16 |
| .thumb_func |
| function: |
| ... start of function ... |
| |
| with __attribute__((interfacearm)): |
| |
| .code 32 |
| function: |
| orr r12, pc, #1 |
| bx r12 |
| |
| .code 16 |
| .thumb_func |
| .real_start_of_function: |
| |
| ... start of function ... |
| |
| Note that since the function now expects to be entered in ARM mode, it |
| no longer has the .thumb_func pseudo op specified for its name. |
| Instead the pseudo op is attached to a new label .real_start_of_<name> |
| (where <name> is the name of the function) which indicates the start |
| of the Thumb code. This does have the interesting side effect that |
| if this function is now called from a Thumb mode piece of code |
| outside of the current file, the linker will generate a calling stub |
| to switch from Thumb mode into ARM mode, and then this is immediately |
| overridden by the function's header which switches back into Thumb |
| mode. |
| |
| In addition the (interfacearm) attribute also forces the function to |
| return by using the BX instruction, even if it has not been compiled with |
| the -mthumb-interwork command line flag, so that the correct mode will |
| be restored upon exit from the function. |
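| |
| It may help to see why the two-instruction header lands exactly on |
| the Thumb code. The sketch below assumes one architectural fact not |
| stated in this document: in ARM mode PC reads as the address of the |
| current instruction plus 8, which is the address just past the two |
| four-byte header instructions. The function name is hypothetical. |
| |
```c
#include <assert.h>
#include <stdint.h>

/* Value placed in r12 by "orr r12, pc, #1" when that instruction is
   the first instruction of a header located at header_addr. */
static uint32_t header_branch_value(uint32_t header_addr)
{
    uint32_t pc_as_read;

    assert((header_addr & 3u) == 0);   /* ARM code is word aligned */
    pc_as_read = header_addr + 8u;     /* PC reads as instruction + 8 */
    return pc_as_read | 1u;            /* bottom bit set: enter Thumb mode */
}
```
| |
| For a header at 0x14 the value is 0x1d, so the following "bx r12" |
| continues at 0x1c in Thumb mode, immediately after the header. |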
| |
| |
| 8. Some examples |
| ================ |
| |
| Given this test file: |
| |
| int func (void) { return 1; } |
| |
| int call (int (* ptr)(void)) { return ptr (); } |
| |
| The following varying pieces of assembler are produced depending upon |
| the command line options used: |
| |
| no options: |
| |
| @ Generated by gcc cygnus-2.91.07 980205 (gcc-2.8.0 release) for ARM/pe |
| .code 16 |
| .text |
| .globl _func |
| .thumb_func |
| _func: |
| mov r0, #1 |
| bx lr |
| |
| .globl _call |
| .thumb_func |
| _call: |
| push {lr} |
| bl __call_via_r0 |
| pop {pc} |
| |
| Note how the two functions have different exit sequences. In |
| particular call() uses pop {pc} to return. This would not work if the |
| caller was in ARM mode. |
| |
| If -mthumb-interwork is specified on the command line: |
| |
| @ Generated by gcc cygnus-2.91.07 980205 (gcc-2.8.0 release) for ARM/pe |
| .code 16 |
| .text |
| .globl _func |
| .thumb_func |
| _func: |
| mov r0, #1 |
| bx lr |
| |
| .globl _call |
| .thumb_func |
| _call: |
| push {lr} |
| bl __call_via_r0 |
| pop {r1} |
| bx r1 |
| |
| This time both functions return by using the BX instruction. This |
| means that call() is now two bytes longer and several cycles slower |
| than the version that is not interworking enabled. |
| |
| If -mcaller-super-interworking is specified: |
| |
| @ Generated by gcc cygnus-2.91.07 980205 (gcc-2.8.0 release) for ARM/pe |
| .code 16 |
| .text |
| .globl _func |
| .thumb_func |
| _func: |
| mov r0, #1 |
| bx lr |
| |
| .globl _call |
| .thumb_func |
| _call: |
| push {lr} |
| bl __interwork_call_via_r0 |
| pop {pc} |
| |
| This is very similar to the first (non-interworking) version, except that a |
| different stub is used to call via the function pointer. Note that |
| the assembly code for call() is not interworking aware, and so should |
| not be called from ARM code. |
| |
| If -mcallee-super-interworking is specified: |
| |
| @ Generated by gcc cygnus-2.91.07 980205 (gcc-2.8.0 release) for ARM/pe |
| .code 16 |
| .text |
| .globl _func |
| .code 32 |
| _func: |
| orr r12, pc, #1 |
| bx r12 |
| .code 16 |
| .globl .real_start_of_func |
| .thumb_func |
| .real_start_of_func: |
| mov r0, #1 |
| bx lr |
| |
| .globl _call |
| .code 32 |
| _call: |
| orr r12, pc, #1 |
| bx r12 |
| .code 16 |
| .globl .real_start_of_call |
| .thumb_func |
| .real_start_of_call: |
| push {lr} |
| bl __call_via_r0 |
| pop {r1} |
| bx r1 |
| |
| Now both functions have an ARM coded prologue, and both functions |
| return by using the BX instruction. These functions are interworking |
| aware therefore and can safely be called from ARM code. The code for |
| the call() function is now 10 bytes longer than the original, non |
| interworking aware version (18 bytes instead of 8), so it has more |
| than doubled in size. |
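| |
| The size figures quoted for call() can be checked by counting |
| instruction bytes: ARM instructions occupy 4 bytes, Thumb |
| instructions 2 bytes, and the Thumb BL occupies two halfwords. The |
| helper names in this sketch are hypothetical. |
| |
```c
#include <assert.h>

enum { ARM_INSN = 4, THUMB_INSN = 2, THUMB_BL = 4 };  /* sizes in bytes */

/* push {lr}; bl __call_via_r0; pop {pc} */
static int size_no_options(void)
{
    return THUMB_INSN + THUMB_BL + THUMB_INSN;
}

/* push {lr}; bl __call_via_r0; pop {r1}; bx r1 */
static int size_thumb_interwork(void)
{
    return THUMB_INSN + THUMB_BL + THUMB_INSN + THUMB_INSN;
}

/* orr; bx (ARM header) followed by the interworking Thumb body */
static int size_callee_super(void)
{
    return 2 * ARM_INSN + size_thumb_interwork();
}

/* Growth of "after" relative to "before", as a whole percentage. */
static int growth_percent(int before, int after)
{
    assert(before > 0);
    return (after - before) * 100 / before;
}
```
| |
| This gives 8 bytes with no options, 10 bytes with -mthumb-interwork |
| and 18 bytes with -mcallee-super-interworking. The last version is |
| therefore 225% of the original size, which is a growth of 125%. |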
| |
| If the source code is slightly altered so that only the call function |
| has an (interfacearm) attribute: |
| |
| int func (void) { return 1; } |
| int call () __attribute__((interfacearm)); |
| int call (int (* ptr)(void)) { return ptr (); } |
| int main (void) { return printf ("result: %d\n", call (func)); } |
| |
| then this code is produced (with no command line switches): |
| |
| @ Generated by gcc cygnus-2.91.07 980205 (gcc-2.8.0 release) for ARM/pe |
| .code 16 |
| .text |
| .globl _func |
| .thumb_func |
| _func: |
| mov r0, #1 |
| bx lr |
| |
| .globl _call |
| .code 32 |
| _call: |
| orr r12, pc, #1 |
| bx r12 |
| .code 16 |
| .globl .real_start_of_call |
| .thumb_func |
| .real_start_of_call: |
| push {lr} |
| bl __call_via_r0 |
| pop {r1} |
| bx r1 |
| |
| .globl _main |
| .thumb_func |
| _main: |
| push {r4, lr} |
| bl ___gccmain |
| ldr r4, .L4 |
| ldr r0, .L4+4 |
| bl _call |
| add r1, r0, #0 |
| add r0, r4, #0 |
| bl _printf |
| pop {r4, pc} |
| .L4: |
| .word .LC0 |
| .word _func |
| |
| .section .rdata |
| .LC0: |
| .ascii "result: %d\n\000" |
| |
| So now only call() can be called by non-interworking aware ARM code. |
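| When this program is assembled, the assembler detects the fact that |
| main() is calling call() in Thumb mode, and so automatically adjusts |
| the BL instruction to point to the real start of call(). The listing |
| below appears to come from a slightly different version of the test |
| program (it contains an extra call to _doit and also uses r5), but |
| the adjusted BL at address 38 is clearly visible: |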
| |
| Disassembly of section .text: |
| |
| 00000028 <_main>: |
| 28: b530 b530 push {r4, r5, lr} |
| 2a: fffef7ff f7ff bl 2a <_main+0x2> |
| 2e: 4d06 4d06 ldr r5, [pc, #24] (48 <.L7>) |
| 30: ffe8f7ff f7ff bl 4 <_doit> |
| 34: 1c04 1c04 add r4, r0, #0 |
| 36: 4805 4805 ldr r0, [pc, #20] (4c <.L7+0x4>) |
| 38: fff0f7ff f7ff bl 1c <.real_start_of_call> |
| 3c: 1824 1824 add r4, r4, r0 |
| 3e: 1c28 1c28 add r0, r5, #0 |
| 40: 1c21 1c21 add r1, r4, #0 |
| 42: fffef7ff f7ff bl 42 <_main+0x1a> |
| 46: bd30 bd30 pop {r4, r5, pc} |
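| |
| The branch targets in this listing can be recomputed from the raw |
| encodings. The C sketch below decodes the original two-halfword Thumb |
| BL. It assumes that each BL line shows the halfword f7ff first in |
| memory, followed by the halfword that forms the upper half of the |
| eight-digit column (for example fff0 at address 38). The function |
| name is hypothetical. |
| |
```c
#include <assert.h>
#include <stdint.h>

/* Decode the destination of a Thumb BL located at insn_addr.
   hw1 is the first halfword in memory, hw2 the second. */
static uint32_t thumb_bl_target(uint32_t insn_addr, uint16_t hw1, uint16_t hw2)
{
    int32_t offset;

    assert((hw1 & 0xF800u) == 0xF000u);  /* 11110 + offset bits 22..12 */
    assert((hw2 & 0xF800u) == 0xF800u);  /* 11111 + offset bits 11..1 */

    offset = (int32_t)(((uint32_t)(hw1 & 0x07FFu) << 12) |
                       ((uint32_t)(hw2 & 0x07FFu) << 1));
    if (offset & (1 << 22))              /* sign-extend the 23-bit offset */
        offset -= (1 << 23);

    /* The offset is relative to the BL's address plus 4. */
    return insn_addr + 4u + (uint32_t)offset;
}
```
| |
| Decoding f7ff/fff0 at address 38 gives 1c, the address of |
| .real_start_of_call, and f7ff/ffe8 at address 30 gives 4, matching |
| the targets printed by the disassembler. |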
| |