| This file describes the implementation notes of the GNU C Compiler for |
| the National Semiconductor 32032 chip (and 32000 family). |
| |
| The 32032 machine description and configuration files for this compiler |
| are, for NS32000 family machines, primarily machine independent. |
| However, since this release still depends on vendor-supplied |
| assemblers and linkers, the compiler must obey the existing |
| conventions of the actual machine to which this compiler is targeted. |
| In this case, that machine is a Sequent Balance 8000 running DYNIX 2.1. |
| |
| The assembler for DYNIX 2.1 (and DYNIX 3.0, alas) does not cope with |
| the full generality of the addressing mode REGISTER RELATIVE. |
| Specifically, it generates incorrect code for operands of the |
| following form: |
| |
| sym(rn) |
| |
| where `rn' is one of the general registers. Correct code is generated |
| for operands of the form |
| |
| sym(pn) |
| |
| where `pn' is one of the special processor registers (sb, fp, or sp). |
| |
| An equivalent operand can be generated by the form |
| |
| sym[rn:b] |
| |
| although this addressing mode is about twice as slow on the 32032. |
| |
| The more efficient register-relative mode is selected by defining the |
| constant SEQUENT_ADDRESS_BUG to 0. It is currently defined to be 1. |
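| |
| As a rough sketch of how such a switch might steer operand output, the |
| C fragment below picks between the two operand forms shown above. The |
| helper name format_sym_reg and its signature are hypothetical and are |
| not taken from the compiler sources; only the macro name and its |
| current value of 1 come from this file. |

```c
#include <stdio.h>

/* Assumption: mirror the setting described in this file (1 = workaround on). */
#ifndef SEQUENT_ADDRESS_BUG
#define SEQUENT_ADDRESS_BUG 1
#endif

/* Hypothetical helper: format a "symbol plus general register" operand
   into buf and return the number of characters snprintf reports. */
static int format_sym_reg(char *buf, size_t size, const char *sym, int regno)
{
#if SEQUENT_ADDRESS_BUG
    /* Byte-scaled index form, which the DYNIX assembler gets right. */
    return snprintf(buf, size, "%s[r%d:b]", sym, regno);
#else
    /* Register-relative form, which is faster on the 32032. */
    return snprintf(buf, size, "%s(r%d)", sym, regno);
#endif
}
```

| With the macro left at 1, a call for symbol "_x" and register 3 would |
| produce the text "_x[r3:b]". |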
| |
| Another bug in the assembler makes it impossible to compute with |
| explicit addresses. In order to compute with a symbolic address, it |
| is necessary to load that address into a register using the "addr" |
| instruction. For example, it is not possible to say |
| |
| cmpd _p,@_x |
| |
| Rather one must say |
| |
| addr _x,rn |
| cmpd _p,rn |
| |
| |
| The ns32032 chip has a number of known bugs. Any attempt to make the |
| compiler unaware of these deficiencies will surely bring disaster. |
| The current list of known bugs is as follows (list provided by Richard |
| Stallman): |
| |
| 1) instructions with two overlapping operands in memory |
| (unlikely in C code, perhaps impossible). |
| |
| 2) floating point conversion instructions with constant |
| operands (these may never happen, but I'm not certain). |
| |
| 3) operands crossing a page boundary. These can be prevented |
| by setting the flag in tm.h that requires strict alignment. |
| |
| 4) Scaled indexing in an insn following an insn that has a read-write |
| operand in memory. This can be prevented by placing a no-op in |
| between. I, Michael Tiemann, do not understand what exactly is meant |
| by `read-write operand in memory'. If this is referring to the special |
| TOS mode, for example "addd 5,tos" then one need not fear, since this |
| will never be generated. However, if this includes "addd 5,-4(fp)" |
| then there is room for disaster. The Sequent compiler does not insert |
| a no-op for code involving the latter, and I have been informed that |
| Sequent is aware of this list of bugs, so I must assume that it is not |
| a problem. |
| |
| 5) The 32032 cannot shift by 32 bits. It shifts modulo the word size |
| of the operand. Therefore, for 32-bit operations, 32-bit shifts are |
| interpreted as zero bit shifts. 32-bit shifts have been removed from |
| the compiler, but future hackers must be careful not to reintroduce |
| them. |
| |
| 6) The ns32032 is a very slow chip; however, some instructions are |
| still very much slower than one might expect. For example, it is |
| almost always faster to double a quantity by adding it to itself than |
| by shifting it by one, even if that quantity is deep in memory. The |
| MOVM instruction has a 20-cycle setup time, after which it moves data |
| at about the speed that normal moves would. It is also faster to use |
| address generation instructions than shift instructions for left |
| shifts less than 4. I do not claim that I generate optimal code for all |
| given patterns, but where I did escape from National's "clean |
| architecture", I did so because the timing specification from the data |
| book says that I will win if I do. I suppose this is called the |
| "performance gap". |
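| |
| The shift hazard in item 5 can be modeled in portable C. This is only |
| a sketch of the behavior as described above, not code from the |
| compiler; both function names are made up for illustration. |

```c
#include <stdint.h>

/* Model of item 5: the shift count is taken modulo the 32-bit operand
   size, so a count of 32 behaves like a count of 0. */
static uint32_t shl_like_32032(uint32_t x, unsigned count)
{
    return x << (count % 32u);
}

/* The arithmetically expected result: a count of 32 or more shifts
   every bit out, leaving zero. */
static uint32_t shl_full(uint32_t x, unsigned count)
{
    return count >= 32u ? 0u : x << count;
}
```

| The two functions agree for counts 0 through 31 and differ at 32, |
| which is why a 32-bit shift must not reach the instruction unguarded. |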
| |
| |
| Signed bitfield extraction has not been implemented. It is not |
| provided by the NS32032, and while it is most certainly possible to do |
| better than the standard shift-left/shift-right sequence, it is also |
| quite hairy. Also, since signed bitfields do not yet exist in C, this |
| omission seems relatively harmless. |
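| |
| For reference, the standard shift-left/shift-right idea can be written |
| in C roughly as follows. This is a sketch, not the sequence the |
| compiler emits: the function name, the bit numbering (bit 0 is the |
| least significant) and the argument limits are all assumptions, and |
| the final sign extension is spelled out arithmetically so that the |
| result does not depend on how a given C compiler shifts negative |
| values. |

```c
#include <stdint.h>

/* Extract `width` bits starting at bit `pos` and sign-extend them.
   Assumes 1 <= width <= 32 and pos + width <= 32, so every shift
   count below stays in the range 0..31. */
static int32_t extract_signed(uint32_t word, unsigned pos, unsigned width)
{
    /* Shift left so the field's top bit lands in bit 31. */
    uint32_t top = word << (32u - pos - width);
    /* Shift right (zero-filling) to bring the field down to bit 0. */
    uint32_t field = top >> (32u - width);
    /* Sign-extend: flip the field's sign bit, then subtract its weight. */
    uint32_t sign = (uint32_t)1 << (width - 1u);
    return (int32_t)((int64_t)(field ^ sign) - (int64_t)sign);
}
```

| For example, a 4-bit field holding binary 1111 comes back as -1, and |
| one holding 0111 comes back as 7. |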
| |
| |
| Zero extractions could be better implemented if it were possible in |
| GCC to provide sized zero extractions: i.e. a byte zero extraction |
| would be allowed to yield a byte result. The current implementation |
| of GCC manifests 68000-ist thinking, where bitfields are extracted |
| into a register, and automatically sign/zero extended to fill the |
| register. See comments in ns32k.md around the "extzv" insn for more |
| details. |
| |
| |
| It should be noted that while the NS32000 family was designed to |
| provide odd-aligned addressing capability for multi-byte data (also |
| provided by the 68020, but not by the 68000 or 68010), many machines |
| do not opt to take advantage of this. For example, on the Sequent, |
| although there is no advantage to long-word aligning word data, shorts |
| must be int-aligned in structs. This is an example of another |
| machine-specific machine dependency. |
| |
| |
| Because the ns32032 has a coherent byte-order/bit-order |
| architecture, many instructions which would be different for |
| 68000-style machines fold into the same instruction for the 32032. |
| The classic case is push effective address, where it does not matter |
| whether one is pushing a long, word, or byte address. They all will |
| push the same address. |
| |
| |
| The macro FUNCTION_VALUE_REGNO_P is probably not sufficient; what is |
| needed is FUNCTION_VALUE_P, which also takes a MODE parameter. In |
| this way it will be possible to determine more exactly whether a |
| register is really a function value register, or just one that happens |
| to look right. |
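| |
| The difference can be sketched in C as below. Everything here is |
| hypothetical: the mode names, the register numbers, and the assumption |
| that integer values come back in r0 and floating values in f0 are |
| stand-ins chosen for illustration, and neither function is the |
| compiler's actual macro. |

```c
/* Hypothetical stand-ins; the real compiler has its own machine modes
   and register numbering. */
enum sketch_mode { SK_SImode, SK_SFmode, SK_DFmode };
enum { SK_R0 = 0, SK_F0 = 8 };  /* assumed numbers for r0 and f0 */

/* Register-only test, in the spirit of FUNCTION_VALUE_REGNO_P: it
   accepts either candidate register whatever the value's mode is. */
static int sketch_value_regno_p(int regno)
{
    return regno == SK_R0 || regno == SK_F0;
}

/* Mode-aware test, in the spirit of the proposed FUNCTION_VALUE_P:
   it accepts only the register that matches the value's mode. */
static int sketch_value_p(int regno, enum sketch_mode mode)
{
    int is_float = (mode == SK_SFmode || mode == SK_DFmode);
    return regno == (is_float ? SK_F0 : SK_R0);
}
```

| Under these assumptions the register-only test accepts f0 even for an |
| integer result, while the mode-aware test rejects it. |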