| Copyright (C) 2000 Free Software Foundation, Inc. |
| |
| This file is intended to contain a few notes about writing C code |
| within GCC so that it compiles without error on the full range of |
| compilers GCC needs to be able to compile on. |
| |
| The problem is that many ISO-standard constructs are not accepted by |
| either old or buggy compilers, and we keep getting bitten by them. |
| This knowledge until know has been sparsely spread around, so I |
| thought I'd collect it in one useful place. Please add and correct |
| any problems as you come across them. |
| |
| I'm going to start from a base of the ISO C89 standard, since that is |
| probably what most people code to naturally. Obviously using |
| constructs introduced after that is not a good idea. |
| |
| The first section of this file deals strictly with portability issues, |
| the second with common coding pitfalls. |
| |
| |
| Portability Issues |
| ================== |
| |
| Unary + |
| ------- |
| |
| K+R C compilers and preprocessors have no notion of unary '+'. Thus |
| the following code snippet contains 2 portability problems. |
| |
| int x = +2; /* int x = 2; */ |
| #if +1 /* #if 1 */ |
| #endif |
| |
| |
| Pointers to void |
| ---------------- |
| |
| K+R C compilers did not have a void pointer, and used char * as the |
| pointer to anything. The macro PTR is defined as either void * or |
| char * depending on whether you have a standards compliant compiler or |
| a K+R one. Thus |
| |
| free ((void *) h->value.expansion); |
| |
| should be written |
| |
| free ((PTR) h->value.expansion); |
| |
| Further, an initial investigation indicates that pointers to functions |
| returning void are okay. Thus the example given by "Calling functions |
| through pointers to functions" below appears not to cause a problem. |
| |
| |
| String literals |
| --------------- |
| |
| Some SGI compilers choke on the parentheses in:- |
| |
| const char string[] = ("A string"); |
| |
| This is unfortunate since this is what the GNU gettext macro N_ |
| produces. You need to find a different way to code it. |
| |
| K+R C did not allow concatenation of string literals like |
| |
| "This is a " "single string literal". |
| |
| Moreover, some compilers like MSVC++ have fairly low limits on the |
| maximum length of a string literal; 509 is the lowest we've come |
| across. You may need to break up a long printf statement into many |
| smaller ones. |
| |
| |
| Empty macro arguments |
| --------------------- |
| |
| ISO C (6.8.3 in the 1990 standard) specifies the following: |
| |
| If (before argument substitution) any argument consists of no |
| preprocessing tokens, the behavior is undefined. |
| |
| This was relaxed by ISO C99, but some older compilers emit an error, |
| so code like |
| |
| #define foo(x, y) x y |
| foo (bar, ) |
| |
| needs to be coded in some other way. |
| |
| |
| signed keyword |
| -------------- |
| |
| The signed keyword did not exist in K+R compilers; it was introduced |
| in ISO C89, so you cannot use it. In both K+R and standard C, |
| unqualified char and bitfields may be signed or unsigned. There is no |
| way to portably declare signed chars or signed bitfields. |
| |
| All other arithmetic types are signed unless you use the 'unsigned' |
| qualifier. For instance, it is safe to write |
| |
| short paramc; |
| |
| instead of |
| |
| signed short paramc; |
| |
| If you have an algorithm that depends on signed char or signed |
| bitfields, you must find another way to write it before it can be |
| integrated into GCC. |
| |
| |
| Function prototypes |
| ------------------- |
| |
| You need to provide a function prototype for every function before you |
| use it, and functions must be defined K+R style. The function |
| prototype should use the PARAMS macro, which takes a single argument. |
| Therefore the parameter list must be enclosed in parentheses. For |
| example, |
| |
| int myfunc PARAMS ((double, int *)); |
| |
| int |
| myfunc (var1, var2) |
| double var1; |
| int *var2; |
| { |
| ... |
| } |
| |
| This implies that if the function takes no arguments, it should be |
| declared and defined as follows: |
| |
| int myfunc PARAMS ((void)); |
| |
| int |
| myfunc () |
| { |
| ... |
| } |
| |
| You also need to use PARAMS when referring to function protypes in |
| other circumstances, for example see "Calling functions through |
| pointers to functions" below. |
| |
| Variable-argument functions are best described by example:- |
| |
| void cpp_ice PARAMS ((cpp_reader *, const char *msgid, ...)); |
| |
| void |
| cpp_ice VPARAMS ((cpp_reader *pfile, const char *msgid, ...)) |
| { |
| VA_OPEN (ap, msgid); |
| VA_FIXEDARG (ap, cpp_reader *, pfile); |
| VA_FIXEDARG (ap, const char *, msgid); |
| |
| ... |
| VA_CLOSE (ap); |
| } |
| |
| See ansidecl.h for the definitions of the above macros and more. |
| |
| One aspect of using K+R style function declarations, is you cannot |
| have arguments whose types are char, short, or float, since without |
| prototypes (ie, K+R rules), these types are promoted to int, int, and |
| double respectively. |
| |
| Calling functions through pointers to functions |
| ----------------------------------------------- |
| |
| K+R C compilers require parentheses around the dereferenced function |
| pointer expression in the call, whereas ISO C relaxes the syntax. For |
| example |
| |
| typedef void (* cl_directive_handler) PARAMS ((cpp_reader *, const char *)); |
| *p->handler (pfile, p->arg); |
| |
| needs to become |
| |
| (*p->handler) (pfile, p->arg); |
| |
| |
| Macros |
| ------ |
| |
| The rules under K+R C and ISO C for achieving stringification and |
| token pasting are quite different. Therefore some macros have been |
| defined which will get it right depending upon the compiler. |
| |
| CONCAT2(a,b) CONCAT3(a,b,c) and CONCAT4(a,b,c,d) |
| |
| will paste the tokens passed as arguments. You must not leave any |
| space around the commas. Also, |
| |
| STRINGX(x) |
| |
| will stringify an argument; to get the same result on K+R and ISO |
| compilers x should not have spaces around it. |
| |
| |
| Passing structures by value |
| --------------------------- |
| |
| Avoid passing structures by value, either to or from functions. It |
| seems some K+R compilers handle this differently or not at all. |
| |
| |
| Enums |
| ----- |
| |
| In K+R C, you have to cast enum types to use them as integers, and |
| some compilers in particular give lots of warnings for using an enum |
| as an array index. |
| |
| |
| Bitfields |
| --------- |
| |
| See also "signed keyword" above. In K+R C only unsigned int bitfields |
| were defined (i.e. unsigned char, unsigned short, unsigned long. |
| Using plain int/short/long was not allowed). |
| |
| |
| free and realloc |
| ---------------- |
| |
| Some implementations crash upon attempts to free or realloc the null |
| pointer. Thus if mem might be null, you need to write |
| |
| if (mem) |
| free (mem); |
| |
| |
| Reserved Keywords |
| ----------------- |
| |
| K+R C has "entry" as a reserved keyword, so you should not use it for |
| your variable names. |
| |
| |
| Type promotions |
| --------------- |
| |
| K+R used unsigned-preserving rules for arithmetic expresssions, while |
| ISO uses value-preserving. This means an unsigned char compared to an |
| int is done as an unsigned comparison in K+R (since unsigned char |
| promotes to unsigned) while it is signed in ISO (since all of the |
| values in unsigned char fit in an int, it promotes to int). |
| |
| Trigraphs |
| --------- |
| |
| You weren't going to use them anyway, but trigraphs were not defined |
| in K+R C, and some otherwise ISO C compliant compilers do not accept |
| them. |
| |
| |
| Suffixes on Integer Constants |
| ----------------------------- |
| |
| K+R C did not accept a 'u' suffix on integer constants. If you want |
| to declare a constant to be be unsigned, you must use an explicit |
| cast. |
| |
| You should never use a 'l' suffix on integer constants ('L' is fine), |
| since it can easily be confused with the number '1'. |
| |
| |
| Common Coding Pitfalls |
| ====================== |
| |
| errno |
| ----- |
| |
| errno might be declared as a macro. |
| |
| |
| Implicit int |
| ------------ |
| |
| In C, the 'int' keyword can often be omitted from type declarations. |
| For instance, you can write |
| |
| unsigned variable; |
| |
| as shorthand for |
| |
| unsigned int variable; |
| |
| There are several places where this can cause trouble. First, suppose |
| 'variable' is a long; then you might think |
| |
| (unsigned) variable |
| |
| would convert it to unsigned long. It does not. It converts to |
| unsigned int. This mostly causes problems on 64-bit platforms, where |
| long and int are not the same size. |
| |
| Second, if you write a function definition with no return type at |
| all: |
| |
| operate (a, b) |
| int a, b; |
| { |
| ... |
| } |
| |
| that function is expected to return int, *not* void. GCC will warn |
| about this. K+R C has no problem with 'void' as a return type, so you |
| need not worry about that. |
| |
| Implicit function declarations always have return type int. So if you |
| correct the above definition to |
| |
| void |
| operate (a, b) |
| int a, b; |
| ... |
| |
| but operate() is called above its definition, you will get an error |
| about a "type mismatch with previous implicit declaration". The cure |
| is to prototype all functions at the top of the file, or in an |
| appropriate header. |
| |
| Char vs unsigned char vs int |
| ---------------------------- |
| |
| In C, unqualified 'char' may be either signed or unsigned; it is the |
| implementation's choice. When you are processing 7-bit ASCII, it does |
| not matter. But when your program must handle arbitrary binary data, |
| or fully 8-bit character sets, you have a problem. The most obvious |
| issue is if you have a look-up table indexed by characters. |
| |
| For instance, the character '\341' in ISO Latin 1 is SMALL LETTER A |
| WITH ACUTE ACCENT. In the proper locale, isalpha('\341') will be |
| true. But if you read '\341' from a file and store it in a plain |
| char, isalpha(c) may look up character 225, or it may look up |
| character -31. And the ctype table has no entry at offset -31, so |
| your program will crash. (If you're lucky.) |
| |
| It is wise to use unsigned char everywhere you possibly can. This |
| avoids all these problems. Unfortunately, the routines in <string.h> |
| take plain char arguments, so you have to remember to cast them back |
| and forth - or avoid the use of strxxx() functions, which is probably |
| a good idea anyway. |
| |
| Another common mistake is to use either char or unsigned char to |
| receive the result of getc() or related stdio functions. They may |
| return EOF, which is outside the range of values representable by |
| char. If you use char, some legal character value may be confused |
| with EOF, such as '\377' (SMALL LETTER Y WITH UMLAUT, in Latin-1). |
| The correct choice is int. |
| |
| A more subtle version of the same mistake might look like this: |
| |
| unsigned char pushback[NPUSHBACK]; |
| int pbidx; |
| #define unget(c) (assert(pbidx < NPUSHBACK), pushback[pbidx++] = (c)) |
| #define get(c) (pbidx ? pushback[--pbidx] : getchar()) |
| ... |
| unget(EOF); |
| |
| which will mysteriously turn a pushed-back EOF into a SMALL LETTER Y |
| WITH UMLAUT. |
| |
| |
| Other common pitfalls |
| --------------------- |
| |
| o Expecting 'plain' char to be either sign or unsigned extending |
| |
| o Shifting an item by a negative amount or by greater than or equal to |
| the number of bits in a type (expecting shifts by 32 to be sensible |
| has caused quite a number of bugs at least in the early days). |
| |
| o Expecting ints shifted right to be sign extended. |
| |
| o Modifying the same value twice within one sequence point. |
| |
| o Host vs. target floating point representation, including emitting NaNs |
| and Infinities in a form that the assembler handles. |
| |
| o qsort being an unstable sort function (unstable in the sense that |
| multiple items that sort the same may be sorted in different orders |
| by different qsort functions). |
| |
| o Passing incorrect types to fprintf and friends. |
| |
| o Adding a function declaration for a module declared in another file to |
| a .c file instead of to a .h file. |