| .. Copyright (C) 2015-2021 Free Software Foundation, Inc. |
| Originally contributed by David Malcolm <dmalcolm@redhat.com> |
| |
| This is free software: you can redistribute it and/or modify it |
| under the terms of the GNU General Public License as published by |
| the Free Software Foundation, either version 3 of the License, or |
| (at your option) any later version. |
| |
| This program is distributed in the hope that it will be useful, but |
| WITHOUT ANY WARRANTY; without even the implied warranty of |
| MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU |
| General Public License for more details. |
| |
| You should have received a copy of the GNU General Public License |
| along with this program. If not, see |
| <http://www.gnu.org/licenses/>. |
| |
| Tutorial part 5: Implementing an Ahead-of-Time compiler |
| ------------------------------------------------------- |
| |
| If you have a pre-existing language frontend that's compatible with |
| libgccjit's license, it's possible to hook it up to libgccjit as a |
| backend. In the previous example we showed |
| how to do that for in-memory JIT-compilation, but libgccjit can also |
| compile code directly to a file, allowing you to implement a more |
| traditional ahead-of-time compiler ("JIT" is something of a misnomer |
| for this use-case). |
| |
| The essential difference is to compile the context using |
| :c:func:`gcc_jit_context_compile_to_file` rather than |
| :c:func:`gcc_jit_context_compile`. |
| |
| The "brainf" language |
| ********************* |
| |
| In this example we use libgccjit to construct an ahead-of-time compiler |
| for an esoteric programming language that we shall refer to as "brainf". |
| |
| brainf scripts operate on an array of bytes, with a notional data pointer |
| within the array. |
| |
| brainf is hard for humans to read, but it's trivial to write a parser for |
| it, as there is no lexing: the input is just a stream of bytes. The operations are: |
| |
| ====================== ============================= |
| Character Meaning |
| ====================== ============================= |
| ``>`` ``idx += 1`` |
| ``<`` ``idx -= 1`` |
| ``+`` ``data[idx] += 1`` |
| ``-`` ``data[idx] -= 1`` |
| ``.`` ``output (data[idx])`` |
| ``,`` ``data[idx] = input ()`` |
| ``[`` loop until ``data[idx] == 0`` |
| ``]`` end of loop |
| Anything else ignored |
| ====================== ============================= |
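| |
| The table above can be read as the specification of a tiny interpreter. |
| The following is a minimal sketch in plain C, not part of the tutorial's |
| sources. The function name ``bf_interpret``, the 30000-cell array size, |
| the bounds checks, and the choice to store 0 once the input is exhausted |
| are all assumptions made for illustration. |
| |

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed array size; hypothetical, chosen for this sketch.  */
#define BF_NUM_CELLS 30000

/* Interpret PROG, reading input bytes from the NUL-terminated string
   INPUT (may be NULL) and writing output bytes to OUT, which has room
   for OUT_CAP bytes including a terminating NUL.
   Return the number of bytes written, or -1 on error (unbalanced
   brackets, idx leaving the array, or OUT being too small).  */
long
bf_interpret (const char *prog, const char *input, char *out, size_t out_cap)
{
  static unsigned char data[BF_NUM_CELLS];
  size_t idx = 0;
  size_t out_len = 0;
  size_t len;

  assert (prog != NULL && out != NULL);
  if (out_cap == 0)
    return -1;
  len = strlen (prog);
  memset (data, 0, sizeof (data));

  for (size_t pc = 0; pc < len; pc++)
    {
      switch (prog[pc])
        {
        case '>':
          if (++idx >= BF_NUM_CELLS)
            return -1;
          break;
        case '<':
          if (idx == 0)
            return -1;
          idx--;
          break;
        case '+':
          data[idx]++;
          break;
        case '-':
          data[idx]--;
          break;
        case '.':
          if (out_len + 1 >= out_cap)
            return -1;
          out[out_len++] = (char) data[idx];
          break;
        case ',':
          /* Assumption: an exhausted input stores 0.  */
          data[idx] = (input && *input) ? (unsigned char) *input++ : 0;
          break;
        case '[':
          if (data[idx] == 0)
            {
              /* Skip forward to the matching ']'.  */
              int depth = 1;
              while (depth > 0)
                {
                  if (++pc >= len)
                    return -1;
                  if (prog[pc] == '[')
                    depth++;
                  else if (prog[pc] == ']')
                    depth--;
                }
            }
          break;
        case ']':
          if (data[idx] != 0)
            {
              /* Jump back to the matching '['; the loop body follows it.  */
              int depth = 1;
              while (depth > 0)
                {
                  if (pc == 0)
                    return -1;
                  pc--;
                  if (prog[pc] == ']')
                    depth++;
                  else if (prog[pc] == '[')
                    depth--;
                }
            }
          break;
        default:
          /* Anything else is ignored.  */
          break;
        }
    }
  out[out_len] = '\0';
  return (long) out_len;
}
```

| |
| For instance, ``bf_interpret (",.", "x", buf, sizeof buf)`` copies one |
| input byte to the output, leaving ``"x"`` in ``buf`` and returning 1. |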
| |
| Unlike the previous example, we'll implement an ahead-of-time compiler, |
| which reads ``.bf`` scripts and outputs executables (though it would |
| be trivial to have it run them JIT-compiled in-process). |
| |
| Here's what a simple ``.bf`` script looks like: |
| |
| .. literalinclude:: ../examples/emit-alphabet.bf |
| :lines: 1- |
| |
| .. note:: |
| |
| This example makes use of whitespace and comments for legibility, but |
| could have been written as:: |
| |
| ++++++++++++++++++++++++++ |
| >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++< |
| [>.+<-] |
| |
| It's not a particularly useful language, except for providing |
| compiler-writers with a test case that's easy to parse. The point |
| is that you can use :c:func:`gcc_jit_context_compile_to_file` |
| to use libgccjit as a backend for a pre-existing language frontend |
| (provided that the pre-existing frontend is compatible with libgccjit's |
| license). |
| |
| Converting a brainf script to libgccjit IR |
| ****************************************** |
| |
| As before, we write simple code to populate a :c:type:`gcc_jit_context *`. |
| |
| .. literalinclude:: ../examples/tut05-bf.c |
| :start-after: #define MAX_OPEN_PARENS 16 |
| :end-before: /* Entrypoint to the compiler. */ |
| :language: c |
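| |
| The ``MAX_OPEN_PARENS`` limit suggests that open loops are tracked on a |
| small fixed-size stack. The sketch below shows that bookkeeping in |
| isolation, using plain C and character offsets rather than libgccjit |
| blocks. The names ``bf_match_brackets`` and ``enum bf_match_status`` are |
| hypothetical, and the code is an illustration of the technique rather |
| than an excerpt from ``tut05-bf.c``. |
| |

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OPEN_PARENS 16

enum bf_match_status
{
  BF_MATCH_OK = 0,
  BF_MATCH_TOO_DEEP,          /* more than MAX_OPEN_PARENS nested '['  */
  BF_MATCH_UNEXPECTED_CLOSE,  /* ']' with no open '['  */
  BF_MATCH_UNCLOSED_OPEN      /* '[' never closed  */
};

/* For each offset I in PROG, set MATCH[I] to the offset of the matching
   bracket, or to I itself for any non-bracket character.
   MATCH must have room for strlen (PROG) entries.  */
enum bf_match_status
bf_match_brackets (const char *prog, size_t *match)
{
  size_t open_stack[MAX_OPEN_PARENS];
  int num_open_parens = 0;
  size_t len;

  assert (prog != NULL && match != NULL);
  len = strlen (prog);

  for (size_t i = 0; i < len; i++)
    {
      match[i] = i;
      if (prog[i] == '[')
        {
          /* Push the offset of the loop start.  */
          if (num_open_parens == MAX_OPEN_PARENS)
            return BF_MATCH_TOO_DEEP;
          open_stack[num_open_parens++] = i;
        }
      else if (prog[i] == ']')
        {
          /* Pop the innermost open loop and link the pair.  */
          size_t open;
          if (num_open_parens == 0)
            return BF_MATCH_UNEXPECTED_CLOSE;
          open = open_stack[--num_open_parens];
          match[open] = i;
          match[i] = open;
        }
    }
  return num_open_parens == 0 ? BF_MATCH_OK : BF_MATCH_UNCLOSED_OPEN;
}
```

| |
| A compiler built on libgccjit would push and pop the blocks that make up |
| each loop instead of offsets, but the stack discipline is the same. |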
| |
| Compiling a context to a file |
| ***************************** |
| |
| Unlike the previous tutorial, this time we'll compile the context |
| directly to an executable, using :c:func:`gcc_jit_context_compile_to_file`: |
| |
| .. code-block:: c |
| |
| gcc_jit_context_compile_to_file (ctxt, |
| GCC_JIT_OUTPUT_KIND_EXECUTABLE, |
| output_file); |
| |
| Here's the top level of the compiler, which is what actually calls into |
| :c:func:`gcc_jit_context_compile_to_file`: |
| |
| .. literalinclude:: ../examples/tut05-bf.c |
| :start-after: /* Entrypoint to the compiler. */ |
| :end-before: /* Use the built compiler to compile the example to an executable: |
| :language: c |
| |
| Note that once the context is populated, you could instead compile it |
| to memory using :c:func:`gcc_jit_context_compile` and run it in-process, |
| as in the previous tutorial. |
| |
| To create an executable, we need to export a ``main`` function. Here's |
| how to create one from the JIT API: |
| |
| .. literalinclude:: ../examples/tut05-bf.c |
| :start-after: #include "libgccjit.h" |
| :end-before: #define MAX_OPEN_PARENS 16 |
| :language: c |
| |
| .. note:: |
| |
| The above implementation ignores ``argc`` and ``argv``, but you could |
| make use of them by exposing ``param_argc`` and ``param_argv`` to the |
| caller. |
| |
| Upon compiling this C code, we obtain a bf-to-machine-code compiler; |
| let's call it ``bfc``: |
| |
| .. code-block:: console |
| |
| $ gcc \ |
| tut05-bf.c \ |
| -o bfc \ |
| -lgccjit |
| |
| We can now use ``bfc`` to compile ``.bf`` files into machine-code executables: |
| |
| .. code-block:: console |
| |
| $ ./bfc \ |
| emit-alphabet.bf \ |
| a.out |
| |
| which we can run directly: |
| |
| .. code-block:: console |
| |
| $ ./a.out |
| ABCDEFGHIJKLMNOPQRSTUVWXYZ |
| |
| Success! |
| |
| We can also inspect the generated executable using standard tools: |
| |
| .. code-block:: console |
| |
| $ objdump -d a.out |less |
| |
| which shows that libgccjit has managed to optimize the function |
| somewhat (for example, the runs of 26 and 65 increment operations |
| have become integer constants 0x1a and 0x41): |
| |
| .. code-block:: console |
| |
| 0000000000400620 <main>: |
| 400620: 80 3d 39 0a 20 00 00 cmpb $0x0,0x200a39(%rip) # 601060 <data |
| 400627: 74 07 je 400630 <main |
| 400629: eb fe jmp 400629 <main+0x9> |
| 40062b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) |
| 400630: 48 83 ec 08 sub $0x8,%rsp |
| 400634: 0f b6 05 26 0a 20 00 movzbl 0x200a26(%rip),%eax # 601061 <data_cells+0x1> |
| 40063b: c6 05 1e 0a 20 00 1a movb $0x1a,0x200a1e(%rip) # 601060 <data_cells> |
| 400642: 8d 78 41 lea 0x41(%rax),%edi |
| 400645: 40 88 3d 15 0a 20 00 mov %dil,0x200a15(%rip) # 601061 <data_cells+0x1> |
| 40064c: 0f 1f 40 00 nopl 0x0(%rax) |
| 400650: 40 0f b6 ff movzbl %dil,%edi |
| 400654: e8 87 fe ff ff callq 4004e0 <putchar@plt> |
| 400659: 0f b6 05 01 0a 20 00 movzbl 0x200a01(%rip),%eax # 601061 <data_cells+0x1> |
| 400660: 80 2d f9 09 20 00 01 subb $0x1,0x2009f9(%rip) # 601060 <data_cells> |
| 400667: 8d 78 01 lea 0x1(%rax),%edi |
| 40066a: 40 88 3d f0 09 20 00 mov %dil,0x2009f0(%rip) # 601061 <data_cells+0x1> |
| 400671: 75 dd jne 400650 <main+0x30> |
| 400673: 31 c0 xor %eax,%eax |
| 400675: 48 83 c4 08 add $0x8,%rsp |
| 400679: c3 retq |
| 40067a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) |
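| |
| In the listing above, the optimizer did the folding of the increment |
| runs. The same arithmetic can be illustrated on the frontend side: the |
| sketch below collapses consecutive ``+``/``-`` and ``>``/``<`` commands |
| into a single operation with a net amount. The names ``bf_folded_op`` and |
| ``bf_fold_runs`` are hypothetical, and this is an illustration only, not |
| something libgccjit requires a frontend to do. |
| |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct bf_folded_op
{
  char kind;   /* '+' = cell delta, '>' = pointer move, else the command  */
  int amount;  /* net delta or move; 1 for all other commands  */
};

/* Fold runs in PROG into OPS, which has room for MAX_OPS entries.
   Return the number of entries written, or SIZE_MAX if OPS is too small.
   Ignored characters (comments, whitespace) do not break a run.  */
size_t
bf_fold_runs (const char *prog, struct bf_folded_op *ops, size_t max_ops)
{
  size_t num_ops = 0;

  assert (prog != NULL && ops != NULL);

  for (const char *p = prog; *p != '\0'; p++)
    {
      char kind;
      int amount;

      switch (*p)
        {
        case '+': kind = '+'; amount = 1; break;
        case '-': kind = '+'; amount = -1; break;
        case '>': kind = '>'; amount = 1; break;
        case '<': kind = '>'; amount = -1; break;
        case '.': case ',': case '[': case ']':
          kind = *p; amount = 1; break;
        default:
          continue;  /* anything else is ignored  */
        }

      /* Extend the previous op when it is the same foldable kind.  */
      if ((kind == '+' || kind == '>')
          && num_ops > 0 && ops[num_ops - 1].kind == kind)
        {
          ops[num_ops - 1].amount += amount;
          continue;
        }

      if (num_ops == max_ops)
        return SIZE_MAX;
      ops[num_ops].kind = kind;
      ops[num_ops].amount = amount;
      num_ops++;
    }
  return num_ops;
}
```

| |
| Applied to the alphabet script, the first and third folded operations |
| carry the amounts 26 and 65, matching the constants ``0x1a`` and ``0x41`` |
| in the disassembly. A run that cancels out, such as ``+-``, is kept as an |
| operation with amount 0 in this sketch. |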
| |
| We also set up debugging information (via |
| :c:func:`gcc_jit_context_new_location` and |
| :c:macro:`GCC_JIT_BOOL_OPTION_DEBUGINFO`), so it's possible to use ``gdb`` |
| to single-step through the generated binary and inspect the internal |
| state ``idx`` and ``data_cells``: |
| |
| .. code-block:: console |
| |
| (gdb) break main |
| Breakpoint 1 at 0x400790 |
| (gdb) run |
| Starting program: a.out |
| |
| Breakpoint 1, 0x0000000000400790 in main (argc=1, argv=0x7fffffffe448) |
| (gdb) stepi |
| 0x0000000000400797 in main (argc=1, argv=0x7fffffffe448) |
| (gdb) stepi |
| 0x00000000004007a0 in main (argc=1, argv=0x7fffffffe448) |
| (gdb) stepi |
| 9 >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++< |
| (gdb) list |
| 4 |
| 5 cell 0 = 26 |
| 6 ++++++++++++++++++++++++++ |
| 7 |
| 8 cell 1 = 65 |
| 9 >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++< |
| 10 |
| 11 while cell#0 != 0 |
| 12 [ |
| 13 > |
| (gdb) n |
| 6 ++++++++++++++++++++++++++ |
| (gdb) n |
| 9 >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++< |
| (gdb) p idx |
| $1 = 1 |
| (gdb) p data_cells |
| $2 = "\032", '\000' <repeats 29998 times> |
| (gdb) p data_cells[0] |
| $3 = 26 '\032' |
| (gdb) p data_cells[1] |
| $4 = 0 '\000' |
| (gdb) list |
| 4 |
| 5 cell 0 = 26 |
| 6 ++++++++++++++++++++++++++ |
| 7 |
| 8 cell 1 = 65 |
| 9 >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++< |
| 10 |
| 11 while cell#0 != 0 |
| 12 [ |
| 13 > |
| |
| |
| Other forms of ahead-of-time compilation |
| **************************************** |
| |
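| The above demonstrates compiling a :c:type:`gcc_jit_context *` directly |
| to an executable. It's also possible to compile it to an assembler file |
| (``GCC_JIT_OUTPUT_KIND_ASSEMBLER``), an object file |
| (``GCC_JIT_OUTPUT_KIND_OBJECT_FILE``), or a dynamic library |
| (``GCC_JIT_OUTPUT_KIND_DYNAMIC_LIBRARY``). See the documentation of |
| :c:func:`gcc_jit_context_compile_to_file` for more information. |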