Tuesday, November 19, 2019

[Linker Script Tutorial] $7 - SECTIONS Command

The SECTIONS command tells the linker how to map input sections into output sections, and how to place the output sections in memory.
The format of the SECTIONS command is:
SECTIONS
{
  sections-command
  sections-command
  …
}
Each sections-command may be one of the following:
  • an ENTRY command (see Entry command)
  • a symbol assignment (see Assignments)
  • an output section description
  • an overlay description
The ENTRY command and symbol assignments are permitted inside the SECTIONS command for convenience in using the location counter in those commands. This can also make the linker script easier to understand because you can use those commands at meaningful points in the layout of the output file.
Output section descriptions and overlay descriptions are described below.
If you do not use a SECTIONS command in your linker script, the linker will place each input section into an identically named output section in the order that the sections are first encountered in the input files. If all input sections are present in the first file, for example, the order of sections in the output file will match the order in the first input file. The first section will be at address zero.
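As a rough illustration of the format above, here is a minimal sketch of a SECTIONS command that mixes the four kinds of sections-command except overlays. The entry symbol _start, the marker symbol __data_start, and both addresses are assumptions chosen only for the example.

```
SECTIONS
{
  ENTRY(_start)                 /* ENTRY command; _start is an assumed symbol */
  . = 0x10000;                  /* symbol assignment to the location counter */
  .text : { *(.text) }          /* output section description */
  . = 0x8000000;
  __data_start = .;             /* hypothetical marker symbol */
  .data : { *(.data) }
  .bss  : { *(.bss) }
}
```

Placing the assignments between the output section descriptions shows the point made above: each command sits at the spot in the layout that it affects.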

1 Output Section Description

The full description of an output section looks like this:
section [address] [(type)] :
  [AT(lma)]
  [ALIGN(section_align) | ALIGN_WITH_INPUT]
  [SUBALIGN(subsection_align)]
  [constraint]
  {
    output-section-command
    output-section-command
    …
  } [>region] [AT>lma_region] [:phdr :phdr …] [=fillexp] [,]
Most output sections do not use most of the optional section attributes.
The whitespace around section is required, so that the section name is unambiguous. The colon and the curly braces are also required. The comma at the end may be required if a fillexp is used and the next sections-command looks like a continuation of the expression. The line breaks and other white space are optional.
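A short sketch may help show how few of the optional attributes a typical script uses. The region names rom and ram, their origins, and their lengths are hypothetical; only the syntax is taken from the description above.

```
MEMORY
{
  rom (rx)  : ORIGIN = 0x08000000, LENGTH = 256K   /* assumed region */
  ram (rwx) : ORIGIN = 0x20000000, LENGTH = 64K    /* assumed region */
}

SECTIONS
{
  /* Only the name, colon, braces and >region are used here. */
  .text : { *(.text) } >rom

  /* ALIGN(section_align), >region and AT>lma_region together. */
  .data : ALIGN(8) { *(.data) } >ram AT>rom
}
```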
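Each output-section-command may be one of the following:
  • a symbol assignment (see Assignments)
  • an input section description (see Input Section Description)
  • data values to include directly (see Output Section Data)
  • a special output section keyword (see Output Section Keywords)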

2 Output Section Name

The name of the output section is section. The name must meet the constraints of your output format. In formats which only support a limited number of sections, such as a.out, the name must be one of the names supported by the format (a.out, for example, allows only ‘.text’, ‘.data’ or ‘.bss’). If the output format supports any number of sections, but with numbers and not names (as is the case for Oasys), the name should be supplied as a quoted numeric string. A section name may consist of any sequence of characters, but a name which contains any unusual characters such as commas must be quoted.

The output section name ‘/DISCARD/’ is special; see Output Section Discarding.

3 Output Section Address

The address is an expression for the VMA (the virtual memory address) of the output section. This address is optional, but if it is provided then the output address will be set exactly as specified.
If the output address is not specified then one will be chosen for the section, based on the heuristic below. This address will be adjusted to fit the alignment requirement of the output section. The alignment requirement is the strictest alignment of any input section contained within the output section.
The output section address heuristic is as follows:
  • If an output memory region is set for the section then it is added to this region and its address will be the next free address in that region.
  • If the MEMORY command has been used to create a list of memory regions then the first region which has attributes compatible with the section is selected to contain it. The section’s output address will be the next free address in that region (see MEMORY).
  • If no memory regions were specified, or none match the section, then the output address will be based on the current value of the location counter.
For example:
.text . : { *(.text) }
and
.text : { *(.text) }
are subtly different. The first will set the address of the ‘.text’ output section to the current value of the location counter. The second will set it to the current value of the location counter aligned to the strictest alignment of any of the ‘.text’ input sections.
The address may be an arbitrary expression (see Expressions). For example, if you want to align the section on a 0x10 byte boundary, so that the lowest four bits of the section address are zero, you could do something like this:
.text ALIGN(0x10) : { *(.text) }
This works because ALIGN returns the current location counter aligned upward to the specified value.

Specifying an address for a section will change the value of the location counter, provided that the section is non-empty. (Empty sections are ignored.)
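The three ways of giving (or not giving) an address can be put side by side in one sketch. The address 0x10000 and the choice of sections are assumptions for illustration only.

```
SECTIONS
{
  /* Explicit VMA: the section starts exactly at 0x10000. */
  .text 0x10000 : { *(.text) }

  /* Expression: the location counter rounded up to a 16-byte boundary. */
  .rodata ALIGN(0x10) : { *(.rodata) }

  /* No address: the location counter, aligned to the strictest
     alignment of any '.data' input section. */
  .data : { *(.data) }
}
```

Because no MEMORY command is used in this sketch, the last case falls through to the third rule of the heuristic and is placed from the location counter.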


4 Input Section Description

The most common output section command is an input section description.
The input section description is the most basic linker script operation. You use output sections to tell the linker how to lay out your program in memory. You use input section descriptions to tell the linker how to map the input files into your memory layout.
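A small sketch of what input section descriptions can look like inside output sections. The file name startup.o is hypothetical, and the wildcard patterns are only one common way of grouping sections.

```
SECTIONS
{
  .text :
  {
    startup.o(.text)        /* '.text' from one named file (assumed name) */
    *(.text .text.*)        /* then '.text' and '.text.*' from all other files */
  }
  .rodata : { *(.rodata .rodata.*) }
}
```

An input section is placed by the first description that matches it, so the '.text' section of startup.o is not picked up a second time by the wildcard line.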




5 Output Section Data

You can include explicit bytes of data in an output section by using BYTE, SHORT, LONG, QUAD, or SQUAD as an output section command. Each keyword is followed by an expression in parentheses providing the value to store (see Expressions). The value of the expression is stored at the current value of the location counter.
The BYTE, SHORT, LONG, and QUAD commands store one, two, four, and eight bytes (respectively). After storing the bytes, the location counter is incremented by the number of bytes stored.
For example, this will store the byte 1 followed by the four byte value of the symbol ‘addr’:
BYTE(1)
LONG(addr)
When using a 64 bit host or target, QUAD and SQUAD are the same; they both store an 8 byte, or 64 bit, value. When both host and target are 32 bits, an expression is computed as 32 bits. In this case QUAD stores a 32 bit value zero extended to 64 bits, and SQUAD stores a 32 bit value sign extended to 64 bits.
If the object file format of the output file has an explicit endianness, which is the normal case, the value will be stored in that endianness. When the object file format does not have an explicit endianness, as is true of, for example, S-records, the value will be stored in the endianness of the first input object file.
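To make the sizes concrete, here is a sketch of a hypothetical '.header' section built only from data commands. The section name, the symbols header_start and header_size, and the stored values are all assumptions.

```
SECTIONS
{
  .header :
  {
    header_start = .;
    BYTE(0x01)                    /* 1 byte,  location counter += 1 */
    SHORT(0x0203)                 /* 2 bytes, location counter += 2 */
    LONG(0x04050607)              /* 4 bytes, location counter += 4 */
    QUAD(0x08090a0b0c0d0e0f)      /* 8 bytes, location counter += 8 */
    header_size = . - header_start;   /* 1 + 2 + 4 + 8 = 15 */
  }
}
```

The data commands do not insert alignment padding of their own, which is why the sizes simply add up.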
Note—these commands only work inside a section description and not between them, so the following will produce an error from the linker:
SECTIONS { .text : { *(.text) } LONG(1) .data : { *(.data) } } 
whereas this will work:
SECTIONS { .text : { *(.text) ; LONG(1) } .data : { *(.data) } } 
You may use the FILL command to set the fill pattern for the current section. It is followed by an expression in parentheses. Any otherwise unspecified regions of memory within the section (for example, gaps left due to the required alignment of input sections) are filled with the value of the expression, repeated as necessary. A FILL statement covers memory locations after the point at which it occurs in the section definition; by including more than one FILL statement, you can have different fill patterns in different parts of an output section.
This example shows how to fill unspecified regions of memory with the value ‘0x90’:
FILL(0x90909090)

The FILL command is similar to the ‘=fillexp’ output section attribute, but it only affects the part of the section following the FILL command, rather than the entire section. If both are used, the FILL command takes precedence. See Output Section Fill, for details on the fill expression.
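The following sketch shows two FILL statements in one output section, so that gaps in different parts of the section get different patterns. The input section name '.text.cold' and the 16-byte alignment are assumptions for the example.

```
SECTIONS
{
  .text :
  {
    FILL(0x90909090)      /* gaps from here on are filled with 0x90 bytes */
    *(.text)
    . = ALIGN(16);        /* any padding created here uses 0x90909090 */

    FILL(0x00000000)      /* gaps after this point are filled with zeros */
    *(.text.cold)
    . = ALIGN(16);
  }
}
```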


6 Output Section Keywords

There are a couple of keywords which can appear as output section commands.
CREATE_OBJECT_SYMBOLS
The command tells the linker to create a symbol for each input file. The name of each symbol will be the name of the corresponding input file. The section of each symbol will be the output section in which the CREATE_OBJECT_SYMBOLS command appears.
This is conventional for the a.out object file format. It is not normally used for any other object file format.
CONSTRUCTORS
When linking using the a.out object file format, the linker uses an unusual set construct to support C++ global constructors and destructors. When linking object file formats which do not support arbitrary sections, such as ECOFF and XCOFF, the linker will automatically recognize C++ global constructors and destructors by name. For these object file formats, the CONSTRUCTORS command tells the linker to place constructor information in the output section where the CONSTRUCTORS command appears. The CONSTRUCTORS command is ignored for other object file formats.
The symbol __CTOR_LIST__ marks the start of the global constructors, and the symbol __CTOR_END__ marks the end. Similarly, __DTOR_LIST__ and __DTOR_END__ mark the start and end of the global destructors. The first word in the list is the number of entries, followed by the address of each constructor or destructor, followed by a zero word. The compiler must arrange to actually run the code. For these object file formats GNU C++ normally calls constructors from a subroutine __main; a call to __main is automatically inserted into the startup code for main. GNU C++ normally runs destructors either by using atexit, or directly from the function exit.
For object file formats such as COFF or ELF which support arbitrary section names, GNU C++ will normally arrange to put the addresses of global constructors and destructors into the .ctors and .dtors sections. Placing the following sequence into your linker script will build the sort of table which the GNU C++ runtime code expects to see.
      __CTOR_LIST__ = .;
      LONG((__CTOR_END__ - __CTOR_LIST__) / 4 - 2)
      *(.ctors)
      LONG(0)
      __CTOR_END__ = .;
      __DTOR_LIST__ = .;
      LONG((__DTOR_END__ - __DTOR_LIST__) / 4 - 2)
      *(.dtors)
      LONG(0)
      __DTOR_END__ = .;
If you are using the GNU C++ support for initialization priority, which provides some control over the order in which global constructors are run, you must sort the constructors at link time to ensure that they are executed in the correct order. When using the CONSTRUCTORS command, use ‘SORT_BY_NAME(CONSTRUCTORS)’ instead. When using the .ctors and .dtors sections, use ‘*(SORT_BY_NAME(.ctors))’ and ‘*(SORT_BY_NAME(.dtors))’ instead of just ‘*(.ctors)’ and ‘*(.dtors)’.
Normally the compiler and linker will handle these issues automatically, and you will not need to concern yourself with them. However, you may need to consider this if you are using C++ and writing your own linker scripts.
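Putting the pieces above together, a sorted constructor table might be sketched as follows. This assumes a target with 4-byte addresses (hence the division by 4) and an output section named '.ctors'; neither is required by the text above.

```
SECTIONS
{
  .ctors :
  {
    __CTOR_LIST__ = .;
    /* Total words = 1 count word + N entries + 1 zero word,
       so N = (size in bytes / 4) - 2. */
    LONG((__CTOR_END__ - __CTOR_LIST__) / 4 - 2)
    *(SORT_BY_NAME(.ctors))     /* sorted for initialization priority */
    LONG(0)
    __CTOR_END__ = .;
  }
}
```

The destructor table follows the same pattern with __DTOR_LIST__, __DTOR_END__ and '*(SORT_BY_NAME(.dtors))'.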

7 Output Section Discarding

The linker will not normally create output sections with no contents. This is for convenience when referring to input sections that may or may not be present in any of the input files. For example:
.foo : { *(.foo) }
will only create a ‘.foo’ section in the output file if there is a ‘.foo’ section in at least one input file, and if the input sections are not all empty. Other link script directives that allocate space in an output section will also create the output section. So too will assignments to dot even if the assignment does not create space, except for ‘. = 0’, ‘. = . + 0’, ‘. = sym’, ‘. = . + sym’ and ‘. = ALIGN (. != 0, expr, 1)’ when ‘sym’ is an absolute symbol of value 0 defined in the script. This allows you to force output of an empty section with ‘. = .’.
The linker will ignore address assignments (see Output Section Address) on discarded output sections, except when the linker script defines symbols in the output section. In that case the linker will obey the address assignments, possibly advancing dot even though the section is discarded.

The special output section name ‘/DISCARD/’ may be used to discard input sections. Any input sections which are assigned to an output section named ‘/DISCARD/’ are not included in the output file.
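Both behaviours described in this section can be seen in one short sketch. The section names '.comment', '.note.*' and '.marker' are assumptions picked for illustration.

```
SECTIONS
{
  .text : { *(.text) }

  /* Created only if some input file has a non-empty '.foo' section. */
  .foo : { *(.foo) }

  /* Forced into the output even though it allocates no space. */
  .marker : { . = .; }

  /* Input sections matched here are left out of the output file. */
  /DISCARD/ : { *(.comment) *(.note.*) }
}
```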



8 Output Section Attributes

We showed above that the full description of an output section looked like this:
section [address] [(type)] :
  [AT(lma)]
  [ALIGN(section_align) | ALIGN_WITH_INPUT]
  [SUBALIGN(subsection_align)]
  [constraint]
  {
    output-section-command
    output-section-command
    …
  } [>region] [AT>lma_region] [:phdr :phdr …] [=fillexp]
We’ve already described section, address, and output-section-command. In this section we will describe the remaining section attributes.







