[TUHS] Question about early C behavior.

Paul Winalski paul.winalski at gmail.com
Sat Jan 11 06:24:30 AEST 2020


On 1/10/20, Dan Cross <crossd at gmail.com> wrote:
>
> Given the definition `int x;` (without an initializer) in a source file the
> corresponding object contains `x` in a "common" section. What this means is
> that, at link time, if some object file explicitly allocates an 'x' (e.g.,
> by specifying an initializer, so that 'x' appears in the data section for
> that object file), use that; otherwise, allocate space for it at link time,
> possibly in the BSS. If several source files contain such a declaration,
> the linker allocates exactly one 'x' (or whatever identifier) as
> appropriate. We've verified that this behavior was present as early as 6th
> edition.

I think the situation you describe (common sections) is how this is
done in ELF.  a.out and COFF, as used on Unix, don't have common
sections.  Instead 'int x;' (without an initializer) becomes symbol
'x' in the object file's symbol table, with both the "external" and
"undefined" attribute bits set, and with the symbol's value being the
size of 'x' (typically 4 bytes, in your example).  It is the non-zero
symbol value that distinguishes common symbols from ordinary external
references, e.g., 'extern int x;' (without an initializer).
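
As a rough illustration of that test, here is a minimal C sketch.  The
struct layout, the flag names and their values, and the classify()
helper are all hypothetical, invented for this example; they are not
taken from any real a.out or COFF header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical attribute bits (illustrative values only). */
enum { SYM_EXTERNAL = 0x1, SYM_UNDEFINED = 0x2 };

/* Hypothetical symbol-table entry. */
struct symbol {
    const char   *name;
    unsigned      flags;
    unsigned long value;  /* address if defined; size if common */
};

enum sym_kind { SYM_KIND_DEFINITION, SYM_KIND_REFERENCE, SYM_KIND_COMMON };

/* Classify one symbol-table entry using the rule described above:
   external + undefined + non-zero value means "common". */
enum sym_kind classify(const struct symbol *s)
{
    assert(s != NULL);
    if (!(s->flags & SYM_UNDEFINED))
        return SYM_KIND_DEFINITION;   /* e.g. 'int x = 1;' */
    if ((s->flags & SYM_EXTERNAL) && s->value != 0)
        return SYM_KIND_COMMON;       /* e.g. 'int x;'; value is the size */
    return SYM_KIND_REFERENCE;        /* e.g. 'extern int x;' */
}
```

Under these assumptions, an entry for 'int x;' with value 4 classifies
as common, while the same entry with value 0 is an ordinary external
reference.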

At link time, common symbols are handled differently from ordinary
external references:

[1] When the linker is searching libraries, an ordinary external
reference to 'x' will cause the linker to load an object that contains
an external definition for 'x'.  Common symbols do not trigger the
loading of an object from a library.

[2] After the linker has processed all of the files and libraries on
the command line, if there is an external definition for 'x', all
common symbol references to 'x' are treated as ordinary external
references to 'x' and resolved against the definition.  If no external
definition is found, the linker allocates 'x' in BSS, using the
maximum allocation size seen in any common symbol references to 'x'.
All common symbol references and ordinary external references to 'x'
are resolved to the newly-allocated space.
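
The two rules above can be sketched in C as follows.  This is only an
illustration under stated assumptions: the types, field names, and
both functions are hypothetical, and the sketch models the decisions
for a single symbol name rather than any actual linker's code.

```c
#include <assert.h>
#include <stddef.h>

enum sym_kind { DEFINITION, REFERENCE, COMMON };

/* One mention of the symbol in some object file (hypothetical). */
struct sym_ref {
    enum sym_kind kind;
    unsigned long size;   /* meaningful only for COMMON */
};

/* Outcome of resolving every mention of one symbol name. */
struct resolution {
    int           uses_definition; /* 1 if an external definition exists */
    unsigned long bss_size;        /* bytes to allocate in BSS, else 0 */
    int           unresolved;      /* 1 if nothing defines or allocates it */
};

/* Rule [1]: only an ordinary external reference pulls an object
   out of a library; a common symbol does not. */
int triggers_library_load(enum sym_kind kind)
{
    return kind == REFERENCE;
}

/* Rule [2]: prefer an external definition; otherwise allocate the
   largest common size seen in BSS. */
struct resolution resolve_symbol(const struct sym_ref *refs, size_t n)
{
    struct resolution r = { 0, 0, 0 };
    unsigned long max_common = 0;
    int saw_common = 0;
    size_t i;

    assert(refs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (refs[i].kind == DEFINITION) {
            r.uses_definition = 1;
        } else if (refs[i].kind == COMMON) {
            saw_common = 1;
            if (refs[i].size > max_common)
                max_common = refs[i].size;
        }
    }
    if (r.uses_definition)
        return r;                 /* commons become plain references */
    if (saw_common)
        r.bss_size = max_common;  /* linker allocates space in BSS */
    else
        r.unresolved = 1;         /* only plain references remain */
    return r;
}
```

For example, with common references of sizes 4 and 8 and no external
definition, resolve_symbol() in this sketch reports a BSS allocation
of 8 bytes; adding a single definition makes it report that the
definition is used and nothing is allocated.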

> The question is, what is the origin of this concept and nomenclature?
> FORTRAN, of course, has "common blocks": was that an inspiration for the
> name? Where did the idea for the implicit behavior come from (FORTRAN
> common blocks are explicit)?

Yes, the concept, nomenclature, and semantics come from FORTRAN, and
they were included in a.out and COFF to support FORTRAN and other
languages (such as PL/I) that have COMMON block-type semantics.  I
don't know why 'int x;' (without an initializer) in C was implemented
as a common symbol.  I suspect it was done to allow C and FORTRAN
object modules linked together in the same executable to share
external data.

-Paul W.