. \"define f2c % "\f(CWf2c\fP" % . \"define F2c % "\f(CWF2c\fP" % .de Bp .ft R .sp .5 .in \w'\(bu\ 'u .ti 0 \(bu\ \c .. .EQ define dollar % "\f(CW$\fP" % delim $$ define f2c % "f\|2c" % define F2c % "F\^2c" % define libF77 % "libF77" % define libI77 % "libI77" % define LibF77 % "LibF77" % define LibI77 % "LibI77" % .EN .TL A Fortran to C Converter .AU S. I. Feldman .AI Bellcore Morristown, NJ 07960 .AU D. M. Gay .AI .MH .AU M. W. Maimone .AI Carnegie-Mellon University Pittsburgh, PA 15213 .AU N. L. Schryer .AI .MH .AB We describe $f2c$, a program that translates Fortran 77 into C or C++. $F2c$ lets one portably mix C and Fortran and makes a large body of well-tested Fortran source code available to C environments. .AE .SH 1. INTRODUCTION .PP Automatic conversion of Fortran 77 .[ [ ANSI FORTRAN 1978 .]] to C .[ [ Kernighan Ritchie 1978 .] .[ Kernighan Ritchie 1988 .]] is desirable for several reasons. Sometimes it is useful to run a well-tested Fortran program on a machine that has a C compiler but no Fortran compiler. At other times, it is convenient to mix C and Fortran. Some things are impossible to express in Fortran 77 or are harder to express in Fortran than in C (e.g. storage management, some character operations, arrays of functions, heterogeneous data structures, and calls that depend on the operating system), and some programmers simply prefer C to Fortran. There is a large body of well tested Fortran source code for carrying out a wide variety of useful calculations, and it is sometimes desirable to exploit some of this Fortran source in a C environment. Many vendors provide some way of mixing C and Fortran, but the details vary from system to system. Automatic Fortran to C conversion lets one create a .I portable C program that exploits Fortran source code. .PP A side benefit of automatic Fortran 77 to C conversion is that it allows such tools as .I cyntax (1) and .I lint (1) \ .[[ v101 .]] to provide Fortran 77 programs with some of the consistency and portability checks that the Pfort Verifier .[ [ Ryder 1974 .]] provided to Fortran 66 programs. The consistency checks detect errors in calling sequences and are thus a boon to debugging. .PP This paper describes $f2c$, a Fortran 77 to C converter based on Feldman's original $f77$ compiler .[ [ Feldman Weinberger Portable Fortran .]]. We have used $f2c$ to convert various large programs and subroutine libraries to C automatically (i.e., with no manual intervention); these include the \s-2PORT3\s+2 subroutine library (\s-2PORT1\s+2 is described in .[ [ Fox Hall Schryer Algorithm 1978 .] .[ Fox Hall Schryer port 1978 .]]), MINOS .[ [ Murtagh Saunders 1987 .]], and Schryer's floating-point test .[ [ Schryer floating .]]. The floating-point test is of particular interest, as it relies heavily on correct evaluation of parenthesized expressions and is bit-level self-testing. .PP As a debugging aid, we sought bit-level compatibility between objects compiled from the C produced by $f2c$ and objects produced by our local $f77$ compiler. That is, on the VAX where we developed $f2c$, we sought to make it impossible to tell by running a Fortran program whether some of its modules had been compiled by $f2c$ or all had been compiled by $f77$. This meant that $f2c$ should follow the same calling conventions as $f77$ .[ [ Feldman Weinberger Portable Fortran .]] and should use $f77$'s support libraries, $libF77$ and $libI77$. .PP Although we have tried to make $f2c$'s output reasonably readable, our goal of strict compatibility with $f77$ implies some nasty looking conversions. Input/output statements, in particular, generally get expanded into a series of calls on routines in $libI77$, $f77$'s I/O library. Thus the C output of $f2c$ would probably be something of a nightmare to maintain as C; it would be much more sensible to maintain the original Fortran, translating it anew each time it changed. Some commercial vendors, e.g., those listed in Appendix A, seek to perform translations yielding C that one might reasonably maintain directly; these translations generally require some manual intervention. .PP The rest of this paper is organized as follows. Section 2 describes the interlanguage conventions used by $f2c$ (and $f77$). \(sc3 summarizes some extensions to Fortran 77 that $f2c$ recognizes. . \"The extensions to Fortran 77 that $f2c$ recognizes are summarized in \(sc3. Example invocations of $f2c$ appear in \(sc4. \(sc5 illustrates various details of $f2c$'s translations, and \(sc6 considers portability issues. \(sc7 discusses the generation and use of .I prototypes , which can be used both by C++ and ANSI C compilers and by $f2c$ to check consistency of calling sequences. \(sc8 describes our experience with an experimental $f2c$ service provided by $netlib$ .[ [ Dongarra Grosse 1987 .]], and \(sc9 considers possible extensions. Appendix A lists some vendors who offer conversion of Fortran to C that one might maintain as C. Finally, Appendix B contains a $man$ page telling how to use $f2c$. .SH 2. INTERLANGUAGE CONVENTIONS .PP Much of the material in this section is taken from .[ [ Feldman Weinberger Portable Fortran .]]. .SH Names .PP An $f2c$ extension inspired by Fortran 8x .[ [ Fort8x .]] is that long names are allowed ($f2c$ truncates names that are longer than 50 characters), and names may contain underscores. To avoid conflict with the names of library routines and with names that $f2c$ generates, Fortran names may have one or two underscores appended. Fortran names are forced to lower case (unless the .CW \%-U option described in Appendix B is in effect); external names, i.e., the names of Fortran procedures and common blocks, have a single underscore appended if they do not contain any underscores and have a pair of underscores appended if they do contain underscores. Thus Fortran subroutines named .CW ABC , .CW A_B_C , and .CW A_B_C_ result in C functions named .CW abc_ , .CW a_b_c_\|\^_ , and .CW a_b_c_\|\^_\|\^_ . .SH Types .PP The table below shows corresponding Fortran and C declarations; the C declarations use types defined in .CW f2c.h , a header file upon which $f2c$'s translations rely. The table also shows the C types defined in the standard version of .CW f2c.h . .KS .TS center box; c c c l l l. Fortran C standard \f(CWf2c.h\fP .sp .5 integer\(**2 x shortint x; short int x; integer x integer x; long int x; logical x long int x; long int x; real x real x; float x; double precision x doublereal x; double x; complex x complex x; struct { float r, i; } x; double complex x doublecomplex x; struct { double r, i; } x; character\(**6 x char x[6]; char x[6]; .TE .KE By the rules of Fortran, .CW integer, .CW logical, and .CW real data occupy the same amount of memory, and .CW "double precision" and .CW complex occupy twice this amount; $f2c$ assumes that the types in the C column above are chosen (in .CW f2c.h ) so that these assumptions are valid. The translations of the Fortran .CW equivalence and .CW data statements depend on these assumptions. On some machines, one must modify .CW f2c.h to make these assumptions hold. See \(sc6 for examples and further discussion. .SH Return Values .PP A function of type .CW integer , .CW logical , or .CW "double precision" must be declared as a C function that returns the corresponding type. If the .CW \%-R option is in effect (see Appendix B), the same is true of a function of type .CW real ; otherwise, a .CW real function must be declared as a C function that returns .CW doublereal ; this hack facilitates our VAX regression testing, as it duplicates the behavior of our local Fortran compiler ($f77$). A .CW complex or .CW "double complex" function is equivalent to a C routine with an additional initial argument that points to the place where the return value is to be stored. Thus, .P1 complex function f( . . . ) .P2 is equivalent to .P1 void f_(temp, . . .) complex \(**temp; . . . .P2 A character-valued function is equivalent to a C routine with two extra initial arguments: a data address and a length. Thus, .P1 character\(**15 function g( . . . ) .P2 is equivalent to .P1 g_(result, length, . . .) char \(**result; ftnlen length; . . . .P2 and could be invoked in C by .P1 char chars[15]; . . . g_(chars, 15L, . . . ); .P2 Subroutines are invoked as if they were .CW int -valued functions whose value specifies which alternate return to use. Alternate return arguments (statement labels) are not passed to the function, but are used to do an indexed branch in the calling procedure. (If the subroutine has no entry points with alternate return arguments, the returned value is undefined.) The statement .P1 call nret(\(**1, \(**2, \(**3) .P2 is treated exactly as if it were the Fortran computed .CW goto .P1 goto (1, 2, 3), nret( ) .P2 .SH Argument Lists .PP All Fortran arguments are passed by address. In addition, for every non-function argument that is of type character, an argument giving the length of the value is passed. (The string lengths are .CW ftnlen values, i.e., .CW "long int" quantities passed by value). In summary, the order of arguments is: extra arguments for complex and character functions, an address for each datum or function, and a .CW ftnlen for each character argument (other than character-valued functions). Thus, the call in .P1 external f character\(**7 s integer b(3) . . . call sam(f, b(2), s) .P2 is equivalent to that in .P1 int f(); char s[7]; long int b[3]; . . . sam_(f, &b[1], s, 7L); .P2 Note that the first element of a C array always has subscript zero, but Fortran arrays begin at 1 by default. Because Fortran arrays are stored in column-major order, whereas C arrays are stored in row-major order, $f2c$ translates multi-dimensional Fortran arrays into one-dimensional C arrays and issues appropriate subscripting expressions. .SH 3. EXTENSIONS TO FORTRAN 77 .PP Since it is derived from $f77$, $f2c$ supports all of the $f77$ extensions described in .[ [ Feldman Weinberger Portable Fortran .]]. $F2c$'s extensions include the following. .Bp Type .CW "double complex" (alias .CW "complex*16" ) is a double-precision version of .CW complex . Specific intrinsic functions for .CW "double complex" have names that start with .CW z rather than .CW c . .Bp The ``types'' that may appear in an .CW implicit statement include .CW undefined , which implies that variables whose names begin with the associated letters must be explicitly declared in a type statement. $F2c$ also recognizes the Fortran 8x statement .P1 implicit none .P2 as equivalent to .P1 implicit undefined(a-z) .P2 The command-line option .CW \%-u has the effect of inserting .P1 implicit none .P2 at the beginning of each Fortran procedure. .Bp Procedures may call themselves recursively, i.e., may call themselves either directly or indirectly through a chain of other calls. .Bp The keywords .CW static and .CW automatic act as ``types'' in type and implicit statements; they specify storage classes. There is exactly one copy of each .CW static variable, and such variables retain their values between invocations of the procedure in which they appear. On the other hand, each invocation of a procedure gets new copies of the procedure's .CW automatic variables. .CW Automatic variables may not appear in .CW equivalence , .CW data , .CW namelist , or .CW save statements. The command-line option .CW \%-a changes the default storage class from .CW static to .CW automatic (for all variables except those that appear in .CW common , .CW data , .CW equivalence , .CW namelist , or .CW save statements). .Bp A tab in the first 6 columns signifies that the current line is a free-format line, which may extend beyond column 72. An ampersand .CW & in column 1 indicates that the current line is a free-format continuation line. Lines that have neither an ampersand in column 1 nor a tab in the first 6 columns are treated as Fortran 77 fixed-format lines: if shorter than 72 characters, they are padded on the right with blanks until they are 72 characters long; if longer than 72 characters, the characters beyond column 72 are discarded. After taking continuations into account, statements may be up to 1320 characters long; this is the only constraint on the length of free-format lines. (This limit is implied by the Fortran 77 standard, which allows at most 19 continuation lines; $1320 ~=~ (1^+^19) ~times~ 66$.) .Bp Aside from quoted strings, $f2c$ ignores case (unless the .CW \%-U option is in effect). .Bp The statement .P1 include stuff .P2 is replaced by the contents of the file .CW stuff. .CW Include s may be nested to a reasonable depth, currently ten. The command-line option .CW \%-!I disables .CW include s; this option is used by the $netlib$ $f2c$ service described in \(sc8 (for which .CW include obviously makes no sense). .Bp $F77$ allows binary, octal, and hexadecimal constants to appear in .CW data statements; $f2c$ goes somewhat further, allowing such constants to appear anywhere; they are treated just like a decimal integer constant having the equivalent value. Binary, octal, and hexadecimal constants may assume one of two forms: a letter followed by a quoted string of digits, or a decimal base, followed by a sharp sign .CW # , followed by a string of digits (not quoted). The letter is .CW b or .CW B for binary constants, .CW o or .CW O for octal constants, and .CW x , .CW X , .CW z , or .CW Z for hexadecimal constants. Thus, for example, .CW z'a7' , .CW 16#a7 , .CW o'247' , .CW 8#247 , .CW b'10100111' and .CW 2#10100111 are all treated just like the integer .CW 167 . .Bp For compatibility with C, quoted strings may contain the following escapes: .TS center box; aFCW a a aFCW a. \e0 null \ \en newline \e\e \e \ \er carriage return \eb backspace \ \et tab \ef form feed \ \ev vertical tab .T& aFCW a s s s. \e' apostrophe (does not terminate a string) \e" quotation mark (does not terminate a string) \e\fIx\fP \fIx\fR, where \fIx\fR is any other character .TE The .CW \%-!bs option tells $f2c$ not to recognize these escapes. Quoted strings may be delimited either by double quotes (\ \f(CW"\fR\ ) or by single quotes (\ \f(CW\(fm\fR\ ); if a string starts with one kind of quote, the other kind may be embedded in the string without being repeated or quoted by a backslash escape. Where possible, translated strings are null-terminated.