annobin - Man Page

Annobin

Synopsis

Description

Binary Annotation is a method for recording information about an application inside the application itself.  It is an implementation of the Watermark specification defined here: <https://fedoraproject.org/wiki/Toolchain/Watermark>

Although mainly focused on recording security information, the system can be used to record any kind of data, even data not related to the application.  One of the main goals of the system, however, is the ability to specify the address range over which a given piece of information is valid.  For example, it is possible to specify that all of a program was compiled with the -O2 option except for one special function which was compiled with -O0 instead.

The range information is useful because it allows third parties to examine the binary and find out if its construction was consistent, that is, that there are no gaps in the recorded information and no special cases where a required feature was not active.
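As a rough illustration of the gap check described above, the following Python sketch reports the parts of an address span that no recorded range covers.  The function name find_gaps and the treatment of ranges as half-open intervals are assumptions made for this example; it is not how annocheck or any other tool is implemented.

```python
def find_gaps(ranges, start, end):
    """Return the parts of [start, end) not covered by any (lo, hi) range.

    Hypothetical helper: each range is treated as a half-open interval.
    """
    gaps = []
    cursor = start
    for lo, hi in sorted(ranges):
        if lo > cursor:
            # Nothing recorded between the last covered address and lo.
            gaps.append((cursor, min(lo, end)))
        cursor = max(cursor, hi)
        if cursor >= end:
            break
    if cursor < end:
        gaps.append((cursor, end))
    return gaps


# Two adjacent ranges leave no gap; two separated ranges leave two.
covered = find_gaps([(0x8A0, 0x8C0), (0x8C0, 0x8C6)], 0x8A0, 0x8C6)
holes = find_gaps([(0x100, 0x120), (0x140, 0x160)], 0x100, 0x180)
```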

The system works by adding special sections to the application containing individual pieces of information along with an address range for which the information is valid.  (Some effort has gone into storing this information in a reasonably compact format.)

The information is generated by a plugin that is attached to the compiler.  The plugin extracts information from the internals of the compiler and records it in the object file(s) being produced.

Note - the plugin method is just one way of generating the information.  Any interested party can create and add information to the object file, providing that they follow the Watermark specification.

The information can be extracted from files via the use of tools like readelf and objdump.  The annobin package itself includes a program called annocheck which can also examine this information.  Details on this program can be found elsewhere in this documentation.

Experience has shown, however, that storing the range information along with the data does tend to significantly increase the size of programs.  So the system also provides an alternative implementation which uses a more compact format, at the cost of dropping the range data.

Normally the option to enable the recording of binary annotation notes is enabled automatically by the build system, so no user intervention is required.  On Fedora and RHEL based systems this is handled by the redhat-rpm-config package.

Currently the binary annotations are generated by a plugin to the compiler (GCC, clang or llvm).  This does mean that files that are not compiled by any of these compilers will not gain any annotations, although there is an optional assembler switch to add some basic notes if none are present in the input files.

If the build system being used does not automatically enable the annobin plugin then it can be specifically added to the compiler command line by adding the -fplugin=annobin (for gcc) or -fplugin=annobin-for-clang (for clang) or -fplugin=annobin-for-llvm (for LLVM) option.  It may also be necessary to tell the compiler where to find the plugin by adding the -iplugindir= option, although this should only be necessary if the plugin is installed in an unusual place.

To disable the recording of binary annotations, use the -fplugin-arg-annobin-disable option (for gcc) or the -Xclang -plugin-arg-annobin-disable option (for clang or llvm).  Note - these options must be placed after the -fplugin=annobin option.
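The ordering rule above can be sketched in Python as a small helper that assembles the extra compiler arguments.  The helper name plugin_args and the compiler keys are hypothetical; the option spellings are the ones given in this page, and the helper only builds an argument list without running any compiler.

```python
# Plugin selection flags as listed in this page.
PLUGIN_FLAG = {
    "gcc": "-fplugin=annobin",
    "clang": "-fplugin=annobin-for-clang",
    "llvm": "-fplugin=annobin-for-llvm",
}


def plugin_args(compiler, disable=False):
    """Return extra arguments for the given compiler (hypothetical helper)."""
    args = [PLUGIN_FLAG[compiler]]
    if disable:
        # The disable option must come after the plugin itself is named.
        if compiler == "gcc":
            args.append("-fplugin-arg-annobin-disable")
        else:
            args += ["-Xclang", "-plugin-arg-annobin-disable"]
    return args


gcc_args = plugin_args("gcc", disable=True)
clang_args = plugin_args("clang", disable=True)
```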

On Fedora and RHEL systems the plugin can be disabled entirely for all compilations in a package by adding %undefine _annotated_build to the spec file.

The information is stored in a binary in either the ELF Note format inside a special section called .gnu.build.attributes, or else as ordinary strings inside a section called .annobin.notes.

The readelf program from the binutils package can extract and display these notes.  (Adding the --wide option is also helpful).

If the information is held in the ELF note format then readelf's --notes option will display them.  Here is an example of the output:

        Displaying notes found in: .gnu.build.attributes
          Owner                        Data size        Description
          GA$<version>3p3              0x00000010       OPEN        Applies to region from 0x8a0 to 0x8c6 (hello.c)
          GA$<tool>gcc 7.2.1 20170915  0x00000000       OPEN        Applies to region from 0x8a0 to 0x8c6
          GA*GOW:0x452b                0x00000000       OPEN        Applies to region from 0x8a0 to 0x8c6
          GA*<stack prot>strong        0x00000000       OPEN        Applies to region from 0x8a0 to 0x8c6
          GA*GOW:0x412b                0x00000010       func        Applies to region from 0x8c0 to 0x8c6 (baz)

This shows various different pieces of information, including the fact that the notes were produced using version 3 of the specification, and version 3 of the plugin.  The binary was built by gcc version 7.2.1 and the -fstack-protector-strong option was enabled on the command line.  The program was compiled with -O2 enabled except the baz() function which was compiled with -O0 instead.
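For scripted inspection, output of this shape can be split into fields.  The Python sketch below is only an illustration: the regular expression is fitted to the sample output shown above, the names NOTE_LINE and parse_note_lines are hypothetical, and the layout printed by other readelf versions may differ.

```python
import re

# Fitted to the sample above: owner, 8-digit data size, note kind, region.
NOTE_LINE = re.compile(
    r"^\s*(?P<owner>GA.*?)\s+(?P<size>0x[0-9a-fA-F]{8})\s+(?P<kind>\S+)\s+"
    r"Applies to region from (?P<start>0x[0-9a-fA-F]+) "
    r"to (?P<end>0x[0-9a-fA-F]+)(?: \((?P<name>[^)]*)\))?\s*$"
)


def parse_note_lines(text):
    """Return one dict per note line; lines that do not match are skipped."""
    notes = []
    for line in text.splitlines():
        m = NOTE_LINE.match(line)
        if m:
            notes.append({
                "owner": m.group("owner"),
                "kind": m.group("kind"),
                "start": int(m.group("start"), 16),
                "end": int(m.group("end"), 16),
                "name": m.group("name"),  # None when no name is printed
            })
    return notes


SAMPLE = """\
  Owner                        Data size        Description
  GA$<version>3p3              0x00000010       OPEN        Applies to region from 0x8a0 to 0x8c6 (hello.c)
  GA$<tool>gcc 7.2.1 20170915  0x00000000       OPEN        Applies to region from 0x8a0 to 0x8c6
  GA*GOW:0x412b                0x00000010       func        Applies to region from 0x8c0 to 0x8c6 (baz)
"""
notes = parse_note_lines(SAMPLE)
```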

The most complicated part of the notes is the owner field.  This is used to encode the type of note as well as its value and possibly extra data as well.  The format of the field is explained in detail in the Watermark specification, but it basically consists of the letters G and A followed by an encoding character (one of *$!+) and then a type character and finally the value.
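The simplified layout described above can be sketched in Python as follows.  This is an illustration only: the function name split_owner is hypothetical, the meanings attached to the encoding characters are assumed to follow the Watermark specification, and the sample bytes (including the type character value 1) are made up for the example.  The specification remains the authority on the real format.

```python
# Assumed meanings of the encoding characters; check the specification.
ENCODINGS = {"*": "numeric", "$": "string", "+": "bool true", "!": "bool false"}


def split_owner(raw: bytes):
    """Split a raw owner field into (encoding, type character, value)."""
    if len(raw) < 4 or raw[:2] != b"GA":
        raise ValueError("owner field does not start with GA")
    encoding = chr(raw[2])
    if encoding not in ENCODINGS:
        raise ValueError(f"unknown encoding character {encoding!r}")
    type_char = raw[3]   # a single byte, returned as an integer
    value = raw[4:]      # left undecoded; depends on the encoding
    return encoding, type_char, value


# Hypothetical sample: string encoding, type character 1, value "3p3".
parts = split_owner(b"GA$\x013p3\x00")
```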

The notes are always four byte aligned, even on 64-bit systems.  This does mean that consumers of the notes may have to read 8-byte wide values from 4-byte aligned addresses, and that producers of the notes may have to generate unaligned relocs when creating them.
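A consumer written in Python can read such a value without any alignment concern, because the struct module does not require aligned offsets when a byte-order prefix is given.  The sketch below assumes little-endian data by default and uses the hypothetical helper name read_u64_at; the real byte order depends on the ELF file being examined.

```python
import struct


def read_u64_at(buf: bytes, offset: int, little_endian: bool = True) -> int:
    """Read an 8-byte unsigned value from any offset, aligned or not."""
    fmt = "<Q" if little_endian else ">Q"
    return struct.unpack_from(fmt, buf, offset)[0]


# An 8-byte address stored at offset 4, which is 4-byte but not 8-byte aligned.
buf = bytes(4) + (0x8A0).to_bytes(8, "little")
address = read_u64_at(buf, 4)
```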

If the information is held as strings then readelf's -p.annobin.notes option will display them.  Here is an example of the output:

        String dump of section '.annobin.notes':
          [     0]  AV:4.p.1200
          [     c]  RV:running gcc 12.2.1 20221121
          [    2b]  BV:annobin gcc 12.2.1 20221121
          [    4a]  PN:annobin
          [    55]  GW:0x290540
          [    61]  SP:3
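The string form is easy to split into offset, key and value.  The Python sketch below is fitted to the sample dump shown above; the names STRING_LINE and parse_string_notes are hypothetical, and no meaning is assigned to the individual keys.

```python
import re

# Matches lines such as "  [    2b]  BV:annobin gcc 12.2.1 20221121".
STRING_LINE = re.compile(r"^\s*\[\s*([0-9a-fA-F]+)\]\s+(\w+):(.*)$")


def parse_string_notes(text):
    """Return a list of (offset, key, value) tuples from a readelf -p dump."""
    entries = []
    for line in text.splitlines():
        m = STRING_LINE.match(line)
        if m:
            offset, key, value = m.groups()
            entries.append((int(offset, 16), key, value))
    return entries


SAMPLE = """\
String dump of section '.annobin.notes':
  [     0]  AV:4.p.1200
  [     c]  RV:running gcc 12.2.1 20221121
  [    61]  SP:3
"""
entries = parse_string_notes(SAMPLE)
```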

Options

The plugin accepts a small selection of command line arguments, all accessed by passing -fplugin-arg-annobin-<option> (for gcc) or -Xclang -plugin-arg-annobin-<option> (for clang or llvm) on the command line.  These options must be placed on the command line after the plugin itself is mentioned.  Note - not all versions of the plugin accept all of these options.

In addition it is possible to pass options via the ANNOBIN environment variable.  Multiple arguments must be separated by commas, and arguments that need a value must use an equals sign rather than a space or colon.
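The format of the environment variable can be sketched in Python as below.  The helper name annobin_env_value is hypothetical, and the sketch assumes that options are written in the variable by their bare names (for example verbose rather than -fplugin-arg-annobin-verbose), which this page does not state explicitly.

```python
def annobin_env_value(options):
    """Join options into a comma separated string.

    `options` maps an option name to its value, or to None for options
    that take no value.  Values are attached with an equals sign.
    """
    parts = []
    for name, value in options.items():
        parts.append(name if value is None else f"{name}={value}")
    return ",".join(parts)


value = annobin_env_value({"verbose": None, "stack-threshold": 2048})
# The result could then be placed in the environment of a build, e.g.
# env = dict(os.environ, ANNOBIN=value)
```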

The supported options are:

disable
enable

Either disable or enable the plugin.  The default is for the plugin to be enabled.

help

Display a list of supported options on the standard output.  This is in addition to whatever else the plugin has been instructed to do.

version

Display the version of the plugin on the standard output.  This is in addition to whatever else the plugin has been instructed to do.

verbose

Report the actions that the plugin is taking.  If invoked for a second time on the command line the plugin will be very verbose.

function-verbose

Report the generation of function specific notes.  This indicates that the named function was compiled with different options from those that were globally enabled.

stack-size-notes
no-stack-size-notes

Do, or do not, record information about the stack requirements of functions in the executable.  This feature is disabled by default as these notes can take up a lot of extra room if the executable contains a lot of functions.

stack-threshold=N

If stack size requirements are being recorded then this option sets the minimum value to record.  Functions which require less than N bytes of static stack space will not have their requirements recorded.  If not set, then N defaults to 1024.
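The threshold rule can be illustrated with a short Python sketch.  The function name and the sample function sizes are made up for the example; only the comparison and the default of 1024 come from the description above.

```python
def functions_to_record(stack_sizes, threshold=1024):
    """Keep only functions whose static stack use is at least `threshold`."""
    return {name: size for name, size in stack_sizes.items() if size >= threshold}


# Hypothetical sizes in bytes: only the last two reach the default threshold.
recorded = functions_to_record({"small": 64, "edge": 1024, "big": 8192})
```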

global-file-syms
no-global-file-syms

If enabled, the global-file-syms option will create globally visible, unique symbols to mark the start and end of the compiled code.  This can be desirable if a program consists of multiple source files with the same name, or if it links to a library that was built with source files of the same name as the program itself.  The disadvantage of this feature, however, is that the unique names are based upon the time of the build, so repeated builds of the same source will contain different symbol names.  This breaks the functionality of the build-id system which is meant to identify similar builds created at different times.  This feature is disabled by default, and if enabled can be disabled again via the no-global-file-syms option.

attach
no-attach

When gcc compiles code with the -ffunction-sections option active it will place each function into its own section.  When the annobin attach option is active the plugin will attempt to attach the function section to a group containing the notes and relocations for the function.  In that way, if the linker decides to discard the function, it will know that it should discard the notes and relocations as well.

The default is attach, but this can be disabled via the no-attach option.  Note however that if both attach and link-order are disabled then note generation for function sections will not work properly.

link-order
no-link-order

As an alternative to using section groups and a special assembler directive the plugin can use a feature of the ELF SHF_LINK_ORDER flag which tells the linker that it should discard a section if the section it is linked to is also being discarded.  This behaviour is enabled by the link-order option.

rename

Adds an extra prefix to the symbol names generated by the annobin plugin.  This allows the plugin to be run twice on the same executable, which can be useful for debugging and build testing.

active-checks
no-active-checks

The annobin plugin will normally generate warning messages if it detects that certain preprocessor command line options are missing or misspelt.  The active-checks option changes the warnings into errors, just as if -Werror had been specified.  The no-active-checks option disables the messages entirely.

Currently the plugin checks for these issues:

Missing FORTIFY_SOURCE

This warning is generated when neither -D_FORTIFY_SOURCE=2 nor -D_FORTIFY_SOURCE=3 has been provided on the command line and the -flto option has been enabled.

Normally this problem would be detected by the annocheck tool, but LTO compilation hides preprocessor options, so information about them cannot be passed on by the plugin.  This is why the plugin will generate a warning message when the _FORTIFY_SOURCE option is missing and LTO is enabled.

-D_FORTIFY_SOURCE typo

The plugin will warn if the -D_FORTIFY_SOURCE option is spelt as either -DFORTIFY_SOURCE or -D__FORTIFY_SOURCE.

-D_GLIBCXX_ASSERTIONS typo

The plugin will warn if the -D_GLIBCXX_ASSERTIONS option is spelt as either -DGLIBCXX_ASSERTIONS or -D__GLIBCXX_ASSERTIONS.

Note - in the future the annobin plugin might be extended to produce warning messages for other missing command line options.

Note - as a workaround for certain tests generated by the autoconf tool, the warning message will not be produced if the input source filename starts with the prefix conftest. (including the trailing dot).  In these cases autoconf is usually checking to see if a warning will be produced for some other reason, and so the annobin warning would get in the way.  If the active-checks option has been enabled however, an error message will still be generated.
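The checks listed above can be modelled outside the compiler with a short Python sketch.  It is an illustration of the rules as described in this page, not the plugin's own logic; the function name check_args and the wording of the messages are hypothetical.

```python
# Misspellings named in this page, mapped to the intended option.
TYPOS = {
    "-DFORTIFY_SOURCE": "-D_FORTIFY_SOURCE",
    "-D__FORTIFY_SOURCE": "-D_FORTIFY_SOURCE",
    "-DGLIBCXX_ASSERTIONS": "-D_GLIBCXX_ASSERTIONS",
    "-D__GLIBCXX_ASSERTIONS": "-D_GLIBCXX_ASSERTIONS",
}


def check_args(argv, source_name="", active_checks=False):
    """Return a list of issue messages for a compiler argument list."""
    # autoconf workaround: stay quiet unless active-checks is enabled.
    if source_name.startswith("conftest.") and not active_checks:
        return []
    issues = []
    for arg in argv:
        macro = arg.split("=", 1)[0]
        if macro in TYPOS:
            issues.append(f"{arg}: did you mean {TYPOS[macro]}?")
    lto = any(a == "-flto" or a.startswith("-flto=") for a in argv)
    fortified = any(
        a in ("-D_FORTIFY_SOURCE=2", "-D_FORTIFY_SOURCE=3") for a in argv
    )
    if lto and not fortified:
        issues.append("-D_FORTIFY_SOURCE=2 or =3 is missing while -flto is enabled")
    return issues


typo_issues = check_args(["-O2", "-DFORTIFY_SOURCE=2"])
lto_issues = check_args(["-flto", "-O2"])
```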

dynamic-notes
no-dynamic-notes
static-notes
no-static-notes

These options are deprecated.

ppc64-nops
no-ppc64-nops

This option either enables or disables the insertion of NOP instructions in some of the code sections of PowerPC64 binaries.  This is necessary to avoid problems with the elflint program, which will complain about binaries built without this option enabled.  The option is enabled by default, but since it does increase the size of compiled programs by a small amount, the no-ppc64-nops option is provided in order to turn it off.

note-format=note|string

This option chooses the format used to store the information generated by the plugin.  The possibilities are:

note

Store the information as ELF format notes in the .gnu.build.attributes section.

string

Store the information as mergeable strings in the .annobin.notes section.

The default is note.
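Which readelf invocation displays the information depends on the format chosen here.  The Python sketch below selects the arguments described earlier in this page; the helper name readelf_args is hypothetical, and the helper only builds an argument list without running readelf.

```python
def readelf_args(path, note_format="note"):
    """Return a readelf command line suited to the given note format."""
    if note_format == "note":
        # ELF notes in .gnu.build.attributes; --wide keeps lines unwrapped.
        return ["readelf", "--notes", "--wide", path]
    if note_format == "string":
        # Plain strings in the .annobin.notes section.
        return ["readelf", "-p.annobin.notes", path]
    raise ValueError(f"unknown note format: {note_format!r}")


note_cmd = readelf_args("hello.o")
string_cmd = readelf_args("hello.o", note_format="string")
```

The resulting list could be passed to subprocess.run on a system where readelf is installed.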

Info

2024-01-02 annobin-1 RPM Development Tools