gotcha - Man Page

Gotcha Documentation

Overview

Gotcha is an API that provides function wrapping, interposing a wrapper function between a function and its callsites. Many tools rely on function wrapping as a fundamental building block. For example, a performance analysis tool that wants to measure the time an application spends in I/O might put wrappers around "read()" and "write()" that start and stop stopwatches.

Tools traditionally implemented function wrapping with the LD_PRELOAD environment variable on glibc-based systems. This environment variable allows the tool to inject a tool library into the target application. Any functions that the tool library exports, such as a "read()" or "write()" function, intercept calls to the matching function names from the original application. While powerful, the LD_PRELOAD approach has several limitations, notably around deciding what to wrap at runtime and around multiple tools wrapping the same function.

Gotcha addresses these limitations by providing an API for function wrapping. Tool libraries make wrapping requests to Gotcha that say, for example, “wrap all calls to the read() function with my tool_read() function, and give me a function pointer to the original read().” Gotcha’s API allows tool wrapping decisions to be made at runtime, and it handles cases of multiple tools wrapping the same function. It does not, however, provide any new mechanisms for injecting the tool library into an application. Gotcha-based tools should be added to the application at link-time or injected with LD_PRELOAD.

Gotcha works by rewriting the Global Offset Table (GOT) that links inter-library callsites and variable references to their targets. Because of this, Gotcha cannot wrap intra-library calls (such as a call to a static function in C) or calls in statically-linked binaries. Binary rewriting technology such as DyninstAPI is more appropriate for these use cases.
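The effect of table rewriting can be pictured with a toy model. This is only a sketch: a single function pointer stands in for a GOT slot, and every name in it (toy_got_twice, toy_wrap, and so on) is hypothetical. It shows why a call routed through the table can be redirected while a directly bound call cannot.

```c
/* Toy model only: one function pointer stands in for a GOT slot. */
typedef int (*unary_fn)(int);

/* Stands in for a function exported by another library. */
static int real_twice(int x) { return 2 * x; }

/* Wrapper a tool would install; it forwards to the original and marks the result. */
static int wrapped_twice(int x) { return real_twice(x) + 1000; }

/* The table slot that inter-library style calls go through. */
static unary_fn toy_got_twice = real_twice;

/* Inter-library style call: indirect through the table, so it can be redirected. */
static int call_via_table(int x) { return toy_got_twice(x); }

/* Intra-library style call: bound directly to the function, so the table is never consulted. */
static int call_direct(int x) { return real_twice(x); }

/* Rewrite the slot to point at the wrapper. */
static void toy_wrap(void) { toy_got_twice = wrapped_twice; }
```

After toy_wrap() runs, call_via_table reaches the wrapper while call_direct still reaches real_twice, which mirrors the intra-library limitation described above.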

Definitions

This section defines some terms used throughout the document.

GOT

The Global Offset Table, or GOT, is a section of a program's memory (in executables and shared libraries) that enables code compiled as an ELF file to run correctly regardless of the memory address at which the program's code or data is loaded at runtime. More details can be read at GOT Documentation.

ELF

The Executable and Linkable Format (ELF, formerly named Extensible Linking Format) is a common standard file format for executable files, object code, shared libraries, and core dumps.

LD_PRELOAD

LD_PRELOAD is a powerful and advanced feature in the Linux dynamic linker that allows users to preload shared object files into the address space of a process. Read more at LD_PRELOAD Documentation.

ABI-compatibility

An application binary interface (ABI) is an interface between two binary program modules. An ABI defines how data structures or computational routines are accessed in machine code, which is a low-level, hardware-dependent format.

Limitations

General Limitations

Operating system support

Gotcha relies on ELF, the file format used for .o object files, binaries, shared libraries, and core dumps on Linux. We therefore currently support only Linux.

Intra- and Inter-library calls

Gotcha works by rewriting the Global Offset Table (GOT) that links inter-library callsites and variable references to their targets. Because of this, Gotcha cannot wrap intra-library calls (such as a call to a static function in C) or calls in statically-linked binaries. Binary rewriting technology such as DyninstAPI is more appropriate for these use cases. Additionally, the function pointer wrapping feature of GOTCHA applies only to function pointers created after the functions are wrapped. Function pointers created before wrapping are not wrapped by GOTCHA.
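The function pointer caveat can be sketched with a small model. The assumption here is a single slot that is read whenever a function's address is taken; all names are hypothetical and this is not how a tool would call Gotcha.

```c
typedef const char *(*name_fn)(void);

static const char *original(void) { return "original"; }
static const char *wrapper(void) { return "wrapper"; }

/* Toy stand-in for the table entry consulted when a function's address is taken. */
static name_fn slot = original;

/* Taking a pointer copies whatever the slot holds at that moment. */
static name_fn take_pointer(void) { return slot; }

/* Installing the wrapper changes the slot, not the copies made earlier. */
static void toy_install_wrapper(void) { slot = wrapper; }
```

A pointer obtained from take_pointer() before toy_install_wrapper() keeps calling original, while one obtained afterwards calls wrapper.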

Build Gotcha

This section describes how to build GOTCHA, and what configure time options are available.

There are two build options: Spack and CMake.

----

Build GOTCHA with Spack

One may install GOTCHA with Spack. If you already have Spack, make sure you have the latest release. If you use a clone of the Spack develop branch, be sure to pull the latest changes.

Install Spack

$ git clone https://github.com/spack/spack
$ # create a packages.yaml specific to your machine
$ . spack/share/spack/setup-env.sh

Use Spack's shell support to add Spack to your PATH and enable use of the spack command.

Build and Install GOTCHA

$ spack install gotcha
$ spack load gotcha

If the most recent changes on the development branch ('develop') of GOTCHA are desired, run spack install gotcha@develop instead.

ATTENTION:

The initial install could take a while as Spack will install build dependencies (autoconf, automake, m4, libtool, and pkg-config) as well as any dependencies of dependencies (cmake, perl, etc.) if you don't already have these dependencies installed through Spack or haven't told Spack where they are locally installed on your system (i.e., through a custom packages.yaml). Run spack spec -I gotcha before installing to see what Spack is going to do.

----

Build GOTCHA with CMake

Download the latest GOTCHA release from the Releases page or clone the develop branch ('develop') from the GOTCHA repository https://github.com/LLNL/GOTCHA.

cmake . -B build -DCMAKE_INSTALL_PREFIX=<where you want to install GOTCHA>
cmake --build build
cmake --install build

----

Gotcha API

This section describes how to use the GOTCHA API in an application.

----

Include the GOTCHA Header

In C or C++ applications, include gotcha.h.

#include <gotcha.h>

Define your Gotcha wrappee

A Gotcha wrappee handle enables the tool to call the function it wrapped using GOTCHA.

static gotcha_wrappee_handle_t wrappee_fputs_handle;

Define your function wrapper

The function wrapper is the function that GOTCHA calls in place of the wrapped function from a shared library.

static int fputs_wrapper(const char *str, FILE *f) {
  // insert clever tool logic here
  typeof(&fputs_wrapper) wrappee_fputs = gotcha_get_wrappee(wrappee_fputs_handle); // get my wrappee from Gotcha
  return wrappee_fputs(str, f); //wrappee_fputs was directed to the original fputs by GOTCHA
}

Define GOTCHA bindings

GOTCHA works on bindings. Each binding is a triplet containing a function name, a wrapper function, and a wrappee handle.

struct gotcha_binding_t wrap_actions [] = {
  { "fputs", fputs_wrapper, &wrappee_fputs_handle },
};

Wrap the binding calls

To initiate Gotcha with the bindings defined in the last step, tools call the gotcha_wrap function. This function should be called before the tool expects any interception. Popular places for the call are a GNU constructor or the start of the main function. The function returns a gotcha_error_t, which can report an error when, for example, a requested function is not found in the process.

gotcha_error_t result = gotcha_wrap(wrap_actions,
            sizeof(wrap_actions)/sizeof(struct gotcha_binding_t), // number of bindings
            "my_tool_name");

Multiple gotcha_wrap Caveat

Tools may bind different sets of functions under different tool names through multiple gotcha_wrap calls. Within GOTCHA, the tool name is the unit used to layer, or prioritize, wrappers that bind the same symbol name. For instance, if multiple tools bind fputs, GOTCHA layers them so that they call one after the other, with the lowest level being the original function. Tools can set priorities at runtime to determine which wrapper goes first or second. If a tool uses multiple bindings under different tool names, it has to set a priority for each tool name it defines.

ATTENTION:

The gotcha_wrap function modifies the gotcha_binding_t wrap_actions[] provided by the user. GOTCHA does not create a copy of the bindings, so it is the responsibility of the user to maintain this array.

Set priority of tool binding

To set the priority of a tool within GOTCHA, tools can use the gotcha_set_priority function. The priority is an integer, and lower values are placed closer to the innermost call. The lowest layer is the system call, followed by the GOTCHA layer and finally the other tools ordered by priority. The call normally succeeds. If it returns GOTCHA_INTERNAL, there was a problem allocating memory for the tool. If multiple tools have the same priority, they are wrapped in FIFO order, with the first tool being the innermost wrapper. Without a call to this API, each tool has the default priority of -1.

gotcha_error_t gotcha_set_priority(const char* tool_name,
                                   int priority);
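The ordering rule described above can be modeled in a few lines. This is a sketch of the rule as stated in this section, not Gotcha's internal data structure. The toy_tool type and order_inner_to_outer function are hypothetical.

```c
#include <stddef.h>

/* Hypothetical model of the ordering rule; not Gotcha's internal representation. */
typedef struct {
  const char *name;
  int priority; /* -1 is the documented default */
} toy_tool;

/* Stable insertion sort by ascending priority. Index 0 ends up innermost
 * (closest to the wrapped function). Tools with equal priority keep their
 * registration order, so the tool registered first stays further in. */
static void order_inner_to_outer(toy_tool *tools, size_t count) {
  for (size_t i = 1; i < count; i++) {
    toy_tool current = tools[i];
    size_t j = i;
    while (j > 0 && tools[j - 1].priority > current.priority) {
      tools[j] = tools[j - 1];
      j--;
    }
    tools[j] = current;
  }
}
```

For tools registered in the order tracer (5), logger (-1), profiler (5), checker (1), the model yields logger, checker, tracer, profiler from innermost to outermost.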

Get priority of tool binding

This API gets the priority of the tool, which is either the default or the value assigned by the tool.

gotcha_error_t gotcha_get_priority(const char* tool_name,
                                   int *priority);

Get the wrapped function from GOTCHA stack

This API returns the wrapped function to call, based on the tool's handle. The handle is used to locate the next element of the wrapper stack, and a pointer to that function is returned.

void* gotcha_get_wrappee(gotcha_wrappee_handle_t handle);
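The idea of a wrapper stack can be sketched without Gotcha itself. In the toy model below, each handle records the function one layer further in, and each wrapper asks its own handle where to forward the call. The types and functions (toy_handle, toy_get_wrappee, toy_build_stack) are hypothetical stand-ins, not the library's implementation.

```c
typedef int (*int_fn)(int);

/* Toy handle: remembers which function sits one layer further in. */
typedef struct {
  int_fn next_inner;
} toy_handle;

static int_fn toy_get_wrappee(const toy_handle *handle) {
  return handle->next_inner;
}

/* Stands in for the wrapped library function at the bottom of the stack. */
static int base_function(int x) { return x; }

static toy_handle inner_tool_handle;
static toy_handle outer_tool_handle;

/* Inner tool: forwards the call, then adds 1 to the result. */
static int inner_tool_wrapper(int x) {
  return toy_get_wrappee(&inner_tool_handle)(x) + 1;
}

/* Outer tool: forwards the call, then multiplies the result by 10. */
static int outer_tool_wrapper(int x) {
  return toy_get_wrappee(&outer_tool_handle)(x) * 10;
}

/* Build the stack base <- inner tool <- outer tool and return the outermost entry point. */
static int_fn toy_build_stack(void) {
  inner_tool_handle.next_inner = base_function;
  outer_tool_handle.next_inner = inner_tool_wrapper;
  return outer_tool_wrapper;
}
```

Calling the returned entry point with 2 passes through the outer wrapper, the inner wrapper, and the base function in turn, giving (2 + 1) * 10 = 30.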

Filter libraries

Within GOTCHA, every bound symbol is updated in the GOT of each shared library loaded within the tool. In some cases, tools might not want to update these symbols in some libraries. For these cases, GOTCHA has a series of filter functions that help tools define which libraries should be updated. CAUTION: calls from libraries that are filtered out are not intercepted by GOTCHA wrappers, and the tool needs to handle that.

Filter by Name

This API restricts GOTCHA to the libraries whose names match the filter specified by the user. The match may be partial: it is a substring match, as performed by the strstr function in C.

void gotcha_filter_libraries_by_name(const char* nameFilter);
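A minimal sketch of the substring match described above, using strstr directly. The helper name name_matches and the example paths are hypothetical.

```c
#include <string.h>

/* Returns nonzero when name_filter occurs anywhere inside library_path.
 * The helper name is hypothetical; it only mirrors the strstr-based match. */
static int name_matches(const char *library_path, const char *name_filter) {
  return strstr(library_path, name_filter) != NULL;
}
```

With this rule, a filter of "libmpi" matches "/usr/lib/libmpi.so.12" but not "/usr/lib/libc.so.6". An empty filter string matches every path, because strstr finds the empty string in any string.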

Filter if Last

This API restricts GOTCHA to only the last library defined in the linker of the tool.

void gotcha_only_filter_last();

Filter by user defined function

This API allows users to define a function that selects the libraries to intercept. The function takes a struct link_map* as input and returns true (nonzero) if the library should be wrapped by GOTCHA. TIP: the library name can be accessed as map->l_name.

void gotcha_set_library_filter_func(int(*new_func)(struct link_map*));
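A filter function of the required shape might look like the sketch below. It assumes glibc's link.h for struct link_map. The function name only_libc_filter and the "libc.so" substring are chosen purely for illustration.

```c
#include <link.h>   /* struct link_map (glibc) */
#include <string.h>

/* Example filter: select only libraries whose path contains "libc.so".
 * The function name and the substring are assumptions for illustration. */
static int only_libc_filter(struct link_map *map) {
  if (map == NULL || map->l_name == NULL) {
    return 0; /* nothing to match against */
  }
  return strstr(map->l_name, "libc.so") != NULL;
}

/* A tool would register the filter with:
 *   gotcha_set_library_filter_func(only_libc_filter);
 */
```

The NULL checks are defensive additions in this sketch and are not something the API description above requires.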

Restore default filter of GOTCHA

The default filter of GOTCHA selects all loaded libraries. This function restores that default filter.

void gotcha_restore_library_filter_func();

Example Programs

This example shows how to use gotcha to wrap the open and fopen libc calls. This example is self-contained, though in typical gotcha workflows the gotcha calls would be in a separate library from the application.

The example logs the parameters and return result of every open and fopen call to stderr.

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#include "gotcha/gotcha.h"

typedef int (*open_fptr)(const char *pathname, int flags, mode_t mode);
typedef FILE* (*fopen_fptr)(const char *pathname, const char *mode);

static gotcha_wrappee_handle_t open_handle;
static gotcha_wrappee_handle_t fopen_handle;

static int open_wrapper(const char *pathname, int flags, mode_t mode) {
  open_fptr open_wrappee = (open_fptr) gotcha_get_wrappee(open_handle);
  int result = open_wrappee(pathname, flags, mode);
  fprintf(stderr, "open(%s, %d, %u) = %d\n",
          pathname, flags, (unsigned int) mode, result);
  return result;
}

static FILE *fopen_wrapper(const char *path, const char *mode) {
  fopen_fptr fopen_wrappee = (fopen_fptr) gotcha_get_wrappee(fopen_handle);
  FILE *result = fopen_wrappee(path, mode);
  fprintf(stderr, "fopen(%s, %s) = %p\n",
          path, mode, result);
  return result;
}

static gotcha_binding_t bindings[] = {
  { "open", open_wrapper, &open_handle },
  { "fopen", fopen_wrapper, &fopen_handle }
};

int main(int argc, char *argv[]) {
  gotcha_wrap(bindings, 2, "demotool");

  open("/dev/null", O_RDONLY);
  open("/dev/null", O_WRONLY | O_CREAT | O_EXCL);
  fopen("/dev/random", "r");
  fopen("/does/not/exist", "w");

  return 0;
}

The fundamental data structure in the Gotcha API is the gotcha_binding_t table, declared in the example as the bindings array. This table states that calls to open should be rerouted to call open_wrapper, and similarly for fopen and fopen_wrapper. The original open and fopen functions are still accessible via the handles open_handle and fopen_handle.

The binding table is passed to Gotcha by the gotcha_wrap call at the start of main, which specifies that there are two entries in the table and that these are part of the “demotool” tool. The open_handle and fopen_handle variables are updated by this call to gotcha_wrap and can now be used to look up function pointers to the original open and fopen calls.

The subsequent callsites to open and fopen in main are redirected to call open_wrapper and fopen_wrapper, respectively. Each wrapper looks up the original function by passing its handle (open_handle or fopen_handle) to the gotcha_get_wrappee API call.

The wrappers then call the underlying open and fopen functions through the returned pointers. They print the parameters and results of these calls with fprintf and return.

Note that this example skips proper error handling for brevity. The call to gotcha_wrap could have failed to find instances of fopen and open in the process, which would have led to an error return. The fprintf calls in the wrappers can also overwrite the value of errno set by the underlying open and fopen calls.
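One way to avoid the errno problem noted above is to save errno right after the wrapped call and restore it after logging. The sketch below uses a hypothetical failing_call in place of a real wrappee so that it stands alone.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for a wrapped call that fails and sets errno. */
static int failing_call(void) {
  errno = ENOENT;
  return -1;
}

/* Log the result without disturbing errno for the caller. */
static int logging_wrapper(void) {
  int result = failing_call();
  int saved_errno = errno; /* capture before any other library call */
  fprintf(stderr, "failing_call() = %d\n", result);
  errno = saved_errno;     /* restore what the wrapped call reported */
  return result;
}
```

The same pattern would apply inside open_wrapper and fopen_wrapper, with the save placed immediately after the call through the wrappee pointer.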

Style Guides

Coding Conventions

GOTCHA follows the Google coding style. Please run git clang-format --diff HEAD~1 -q to check your patch for style problems before submitting it for review.

Styling Code

The clang-format tool can be used to apply much of the required code styling used in the project.

To print the styled version of the source file foo.c (add the -i option to rewrite the file in place):

clang-format --style=Google --Werror foo.c

The .clang-format file specifies the options used for this project. For a full list of available clang-format options, see https://clang.llvm.org/docs/ClangFormat.html.

Verifying Style Checks

To check that uncommitted changes meet the coding style, use the following command:

git clang-format --diff HEAD~1 -q

TIP:

This command will only check specific changes and additions to files that are already tracked by git. Run the command git add -N [<untracked_file>...] first in order to style check new files as well.

----

Commit Message Format

Commit messages for new changes must meet the following guidelines:

  • In 50 characters or less, provide a summary of the change as the first line in the commit message.
  • A body which provides a description of the change. If necessary, please summarize important information such as why the proposed approach was chosen or a brief description of the bug you are resolving. Each line of the body must be 72 characters or less.

An example commit message for new changes is provided below.

Capitalized, short (50 chars or less) summary

More detailed explanatory text, if necessary.  Wrap it to about 72
characters or so.  In some contexts, the first line is treated as the
subject of an email and the rest of the text as the body.  The blank
line separating the summary from the body is critical (unless you omit
the body entirely); tools like rebase can get confused if you run the
two together.

Write your commit message in the imperative: "Fix bug" and not "Fixed bug"
or "Fixes bug."  This convention matches up with commit messages generated
by commands like git merge and git revert.

Further paragraphs come after blank lines.

- Bullet points are okay

- Typically a hyphen or asterisk is used for the bullet, followed by a
  single space, with blank lines in between, but conventions vary here

- Use a hanging indent

Testing Guide

We can never have enough testing. Any additional tests you can write are always greatly appreciated.

Unit Tests

Tests for new core features within GOTCHA should be implemented in test/unit/gotcha_unit_tests.c using the check framework, as described at https://libcheck.github.io/check.

Create a new test

We can create a new test using START_TEST and END_TEST macros.

START_TEST(sample_test){
}
END_TEST

Create a new suite

These new tests can be added to a new suite with code similar to the following. To add a test to an existing suite, use the tcase_add_test API to add the test function to one of the suite's test cases.

Suite* gotcha_sample_suite(){
  Suite* s = suite_create("Sample");
  TCase* sample_case = configured_case_create("Basic tests");
  tcase_add_test(sample_case, sample_test);
  suite_add_tcase(s, sample_case);
  return s;
}

Adding suite to runner

Within the main function of test/unit/gotcha_unit_tests.c, the gotcha_sample_suite can be added as follows.

Suite* sample_suite = gotcha_sample_suite();
SRunner* sample_runner = srunner_create(sample_suite);
srunner_run_all(sample_runner, CK_NORMAL);           // run through the runner, not the suite
num_fails += srunner_ntests_failed(sample_runner);   // count failures reported by the runner

Testing tool specific usage of GOTCHA

We should use custom test cases when testing the GOTCHA API for specific features such as filtering libraries, main function bindings, etc. These test cases can be stored within the test folder. Look at existing examples such as test/stack and test/dlopen to understand how to implement these tests.

Once you add a self-contained test case within test, add it to test/CMakeLists.txt.


Author

Hariharan Devarajan, David Poliakoff, Matt Legendre

Info

Jul 18, 2024 1.0 Gotcha