Alternatives To Valgrind

Name

valgrind - a suite of tools for debugging and profiling programs

Synopsis

valgrind [valgrind-options] [your-program] [your-program-options]

Description

Valgrind is a flexible program for debugging and profiling Linux executables. It consists of a core, which provides a synthetic CPU in software, and a series of debugging and profiling tools. The architecture is modular, so that new tools can be created easily and without disturbing the existing structure.

Some of the options described below work with all Valgrind tools, and some only work with a few or one. The section MEMCHECK OPTIONS and those below it describe tool-specific options.

This manual page covers only basic usage and options. For more comprehensive information, please see the HTML documentation on your system: $INSTALL/share/doc/valgrind/html/index.html, or online: http://www.valgrind.org/docs/manual/index.html.

Tool Selection Options

The single most important option.

--tool=<toolname> [default: memcheck]

--version
Show the version number of the Valgrind core. Tools can have their own version numbers. There is a scheme in place to ensure that tools only execute when the core version is one they are known to work with. This was done to minimise the chances of strange problems arising from tool-vs-core version incompatibilities.
-q, --quiet
Run silently, and only print error messages. Useful if you are running regression tests or have some other automated test machinery.
-v, --verbose
Be more verbose. Gives extra information on various aspects of your program, such as: the shared objects loaded, the suppressions used, the progress of the instrumentation and execution engines, and warnings about unusual behaviour. Repeating the option increases the verbosity level.
--trace-children=<yes|no> [default: no]
When enabled, Valgrind will trace into sub-processes initiated via the exec system call. This is necessary for multi-process programs.

Note that Valgrind does trace into the child of a fork (it would be difficult not to, since fork makes an identical copy of a process), so this option is arguably badly named. However, most children of fork calls immediately call exec anyway.

--trace-children-skip=patt1,patt2,...
This option only has an effect when --trace-children=yes is specified. It allows for some children to be skipped. The option takes a comma-separated list of patterns for the names of child executables that Valgrind should not trace into. Patterns may include the metacharacters ? and *, which have the usual meaning.

This can be useful for pruning uninteresting branches from a tree of processes being run on Valgrind. But you should be careful when using it. When Valgrind skips tracing into an executable, it doesn't just skip tracing that executable, it also skips tracing any of that executable's child processes. In other words, the flag doesn't merely cause tracing to stop at the specified executables: it skips tracing of entire process subtrees rooted at any of the specified executables.
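
The subtree-pruning behaviour described above can be sketched in a few lines of Python. This is only an illustration, not Valgrind's implementation: the process tree and the function name traced_processes are hypothetical, and Python's fnmatchcase is used as a stand-in for Valgrind's pattern matcher (it also understands [...] sets, which the option is not documented to support).

```python
from fnmatch import fnmatchcase

def traced_processes(tree, root, skip_patterns):
    """Return the executables that would be traced, starting at root.

    tree maps an executable name to the executables it launches.
    A child matching any skip pattern is dropped together with its
    whole subtree, mirroring the rule described in the text.
    """
    traced = [root]
    for child in tree.get(root, []):
        if any(fnmatchcase(child, pat) for pat in skip_patterns):
            continue  # skip this child and everything below it
        traced.extend(traced_processes(tree, child, skip_patterns))
    return traced

# Hypothetical process tree: make launches sh and gcc; sh launches a helper.
tree = {
    "/usr/bin/make": ["/bin/sh", "/usr/bin/gcc"],
    "/bin/sh": ["/opt/tools/helper"],
    "/usr/bin/gcc": ["/usr/lib/gcc/cc1"],
}
result = traced_processes(tree, "/usr/bin/make", ["*/sh"])
```

Because /bin/sh matches the pattern */sh, the helper it launches is never traced either, even though the helper's own name matches no pattern.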

--trace-children-skip-by-arg=patt1,patt2,...
This is the same as --trace-children-skip, with one difference: the decision as to whether to trace into a child process is made by examining the arguments to the child process, rather than the name of its executable.
--child-silent-after-fork=<yes|no> [default: no]
When enabled, Valgrind will not show any debugging or logging output for the child process resulting from a fork call. This can make the output less confusing (although more misleading) when dealing with processes that create children. It is particularly useful in conjunction with --trace-children=. Use of this option is also strongly recommended if you are requesting XML output (--xml=yes), since otherwise the XML from child and parent may become mixed up, which usually makes it useless.
--vgdb=<no|yes|full> [default: yes]
Valgrind will provide 'gdbserver' functionality when --vgdb=yes or --vgdb=full is specified. This allows an external GNU GDB debugger to control and debug your program when it runs on Valgrind. --vgdb=full incurs significant performance overheads, but provides more precise breakpoints and watchpoints. See ??? for a detailed description.

If the embedded gdbserver is enabled but no gdb is currently being used, the ??? command line utility can send 'monitor commands' to Valgrind from a shell. The Valgrind core provides a set of ???. A tool can optionally provide tool-specific monitor commands, which are documented in the tool-specific chapter.

--vgdb-error=<number> [default: 999999999]
Use this option when the Valgrind gdbserver is enabled with --vgdb=yes or --vgdb=full. Tools that report errors will wait for 'number' errors to be reported before freezing the program and waiting for you to connect with GDB. It follows that a value of zero will cause the gdbserver to be started before your program is executed. This is typically used to insert GDB breakpoints before execution, and also works with tools that do not report errors, such as Massif.
--track-fds=<yes|no> [default: no]
When enabled, Valgrind will print out a list of open file descriptors on exit. Along with each file descriptor is printed a stack backtrace of where the file was opened and any details relating to the file descriptor such as the file name or socket details.
--time-stamp=<yes|no> [default: no]
When enabled, each message is preceded with an indication of the elapsed wallclock time since startup, expressed as days, hours, minutes, seconds and milliseconds.
--log-fd=<number> [default: 2, stderr]
Specifies that Valgrind should send all of its messages to the specified file descriptor. The default, 2, is the standard error channel (stderr). Note that this may interfere with the client's own use of stderr, as Valgrind's output will be interleaved with any output that the client sends to stderr.
--log-file=<filename>
Specifies that Valgrind should send all of its messages to the specified file. If the file name is empty, it causes an abort. There are three special format specifiers that can be used in the file name.

%p is replaced with the current process ID. This is very useful for programs that invoke multiple processes. WARNING: If you use --trace-children=yes and your program invokes multiple processes OR your program forks without calling exec afterwards, and you don't use this specifier (or the %q specifier below), the Valgrind output from all those processes will go into one file, possibly jumbled up, and possibly incomplete.

%q{FOO} is replaced with the contents of the environment variable FOO. If the {FOO} part is malformed, it causes an abort. This specifier is rarely needed, but very useful in certain circumstances (e.g. when running MPI programs). The idea is that you specify a variable which will be set differently for each process in the job, for example BPROC_RANK or whatever is applicable in your MPI setup. If the named environment variable is not set, it causes an abort. Note that in some shells, the { and } characters may need to be escaped with a backslash.

%% is replaced with %.

If a % is followed by any other character, it causes an abort.
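
The specifier rules above can be modelled as a small expansion function. The sketch below is an approximation written for illustration, not Valgrind's own code: the function name expand_log_file is hypothetical, and a Python ValueError stands in for the abort that Valgrind performs.

```python
import os

def expand_log_file(template, pid=None, env=None):
    """Expand %p, %q{VAR} and %% in a --log-file style template."""
    if template == "":
        raise ValueError("empty file name")
    pid = os.getpid() if pid is None else pid
    env = os.environ if env is None else env
    out, i = [], 0
    while i < len(template):
        ch = template[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        nxt = template[i + 1:i + 2]  # empty string if % is the last character
        if nxt == "p":
            out.append(str(pid))
            i += 2
        elif nxt == "%":
            out.append("%")
            i += 2
        elif nxt == "q":
            end = template.find("}", i + 2)
            if template[i + 2:i + 3] != "{" or end == -1:
                raise ValueError("malformed %q{...} specifier")
            name = template[i + 3:end]
            if name not in env:
                raise ValueError("environment variable %s is not set" % name)
            out.append(env[name])
            i = end + 1
        else:
            raise ValueError("unsupported character after %")
    return "".join(out)

# Example with a fixed PID and a hypothetical per-process variable RANK.
example = expand_log_file("vg.%p.%q{RANK}.100%%.log", pid=4242, env={"RANK": "3"})
```

With these inputs, example is the string vg.4242.3.100%.log, so each process in a job writes to its own file.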

--log-socket=<ip-address:port-number>
Specifies that Valgrind should send all of its messages to the specified port at the specified IP address. The port may be omitted, in which case port 1500 is used. If a connection cannot be made to the specified socket, Valgrind falls back to writing output to the standard error (stderr). This option is intended to be used in conjunction with the valgrind-listener program. For further details, see the commentary in the manual.

Error-related Options

These options are used by all tools that can report errors, e.g. Memcheck, but not Cachegrind.

--xml=<yes|no> [default: no]

When enabled, the important parts of the output (e.g. tool error messages) will be in XML format rather than plain text. Furthermore, the XML output will be sent to a different output channel than the plain text output. Therefore, you also must use one of --xml-fd, --xml-file or --xml-socket to specify where the XML is to be sent.

Less important messages will still be printed in plain text, but because the XML output and plain text output are sent to different output channels (the destination of the plain text output is still controlled by --log-fd, --log-file and --log-socket) this should not cause problems.

This option is aimed at making life easier for tools that consume Valgrind's output as input, such as GUI front ends. Currently this option works with Memcheck, Helgrind, DRD and SGcheck. The output format is specified in the file docs/internals/xml-output-protocol4.txt in the source tree for Valgrind 3.5.0 or later.

The recommended options for a GUI to pass, when requesting XML output, are: --xml=yes to enable XML output, --xml-file to send the XML output to a (presumably GUI-selected) file, --log-file to send the plain text output to a second GUI-selected file, --child-silent-after-fork=yes, and -q to restrict the plain text output to critical error messages created by Valgrind itself. For example, failure to read a specified suppressions file counts as a critical error message. In this way, for a successful run the text output file will be empty. But if it isn't empty, then it will contain important information which the GUI user should be made aware of.
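
As a rough illustration of the recommendation above, a front end written in Python might assemble its command line as follows. The helper name gui_valgrind_argv and the file paths are hypothetical; only the options themselves come from the text.

```python
def gui_valgrind_argv(program_argv, xml_path, log_path):
    """Build a Valgrind command line using the options recommended for GUIs."""
    return [
        "valgrind",
        "--xml=yes",                      # enable XML output
        "--xml-file=" + xml_path,         # XML goes to a GUI-selected file
        "--log-file=" + log_path,         # plain text goes to a second file
        "--child-silent-after-fork=yes",  # keep child output out of the XML
        "-q",                             # plain text limited to critical messages
    ] + list(program_argv)

# Hypothetical program and output locations.
argv = gui_valgrind_argv(["./myprog", "--fast"], "/tmp/run.xml", "/tmp/run.txt")
```

After the run, the front end can treat a non-empty plain text file as a sign that Valgrind itself reported something the user needs to see.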

--xml-fd=<number> [default: -1, disabled]
Specifies that Valgrind should send its XML output to the specified file descriptor. It must be used in conjunction with --xml=yes.
--xml-file=<filename>
Specifies that Valgrind should send its XML output to the specified file. It must be used in conjunction with --xml=yes. Any %p or %q sequences appearing in the filename are expanded in exactly the same way as they are for --log-file. See the description of --log-file for details.
--xml-socket=<ip-address:port-number>
Specifies that Valgrind should send its XML output to the specified port at the specified IP address. It must be used in conjunction with --xml=yes. The form of the argument is the same as that used by --log-socket. See the description of --log-socket for further details.
--xml-user-comment=<string>
Embeds an extra user comment string at the start of the XML output. Only works when --xml=yes is specified; ignored otherwise.
--demangle=<yes|no> [default: yes]
Enable/disable automatic demangling (decoding) of C++ names. Enabled by default. When enabled, Valgrind will attempt to translate encoded C++ names back to something approaching the original. The demangler handles symbols mangled by g++ versions 2.X, 3.X and 4.X.

An important fact about demangling is that function names mentioned in suppressions files should be in their mangled form. Valgrind does not demangle function names when searching for applicable suppressions, because to do otherwise would make suppression file contents dependent on the state of Valgrind's demangling machinery, and also slow down suppression matching.

--num-callers=<number> [default: 12]
Specifies the maximum number of entries shown in stack traces that identify program locations. Note that errors are commoned up using only the top four function locations (the place in the current function, and that of its three immediate callers). So this doesn't affect the total number of errors reported.

The maximum value for this is 500. Note that higher settings will make Valgrind run a bit more slowly and take a bit more memory, but can be useful when working with programs with deeply-nested call chains.
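
To see why --num-callers does not change the number of distinct errors, consider the following sketch of commoning up by the top four locations. It is an illustration only: the function name common_up, the error kinds and the stack contents are made up, and including the error kind in the grouping key is an assumption rather than something the text states.

```python
def common_up(errors, depth=4):
    """Group (kind, stack) pairs by the top `depth` stack entries.

    Each stack is a list of locations, innermost first.
    Returns a dict mapping the grouping key to an occurrence count.
    """
    groups = {}
    for kind, stack in errors:
        key = (kind, tuple(stack[:depth]))
        groups[key] = groups.get(key, 0) + 1
    return groups

# Hypothetical errors: the first two differ only below the fourth frame.
errors = [
    ("InvalidRead", ["memcpy", "copy_row", "render", "draw", "main"]),
    ("InvalidRead", ["memcpy", "copy_row", "render", "draw", "event_loop", "main"]),
    ("InvalidRead", ["memcpy", "copy_col", "render", "draw", "main"]),
]
groups = common_up(errors)
```

Here the first two errors fall into one group because their top four entries are identical, while the third forms its own group. Showing more or fewer frames in each trace leaves that grouping unchanged.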

--error-limit=<yes|no> [default: yes]
When enabled, Valgrind stops reporting errors after 10,000,000 in total, or 1,000 different ones, have been seen. This is to stop the error tracking machinery from becoming a huge performance overhead in programs with many errors.
--error-exitcode=<number> [default: 0]
Specifies an alternative exit code to return if Valgrind reported any errors in the run. When set to the default value (zero), the return value from Valgrind will always be the return value of the process being simulated. When set to a nonzero value, that value is returned instead, if Valgrind detects any errors. This is useful for using Valgrind as part of an automated test suite, since it makes it easy to detect test cases for which Valgrind has reported errors, just by inspecting return codes.
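
The rule for the final exit status can be written down directly. The function below is a sketch of the behaviour as described in this section, with a hypothetical name and arguments; it does not run Valgrind.

```python
def valgrind_exit_status(program_status, errors_reported, error_exitcode=0):
    """Return the status Valgrind would exit with, per the description above."""
    if error_exitcode != 0 and errors_reported > 0:
        return error_exitcode  # errors seen and an alternative code was requested
    return program_status      # otherwise pass through the program's own status

# A test that exits 0 but triggers three errors, run with --error-exitcode=42.
status = valgrind_exit_status(program_status=0, errors_reported=3, error_exitcode=42)
```

A test harness can then pick a value such as 42 that the program under test never returns itself, and treat that status as "Valgrind found errors".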
--show-below-main=<yes|no> [default: no]
By default, stack traces for errors do not show any functions that appear beneath main because most of the time it's uninteresting C library stuff and/or gobbledygook. Alternatively, if main is not present in the stack trace, stack traces will not show any functions below main-like functions such as glibc's __libc_start_main. Furthermore, if main-like functions are present in the trace, they are normalised as (below main), in order to make the output more deterministic.

If this option is enabled, all stack trace entries will be shown and main-like functions will not be normalised.

--fullpath-after=<string> [default: don't show source paths]
By default Valgrind only shows the filenames in stack traces, but not full paths to source files. When using Valgrind in large projects where the sources reside in multiple different directories, this can be inconvenient. --fullpath-after provides a flexible solution to this problem. When this option is present, the path to each source file is shown, with the following all-important caveat: if string is found in the path, then the path up to and including string is omitted, else the path is shown unmodified. Note that string is not required to be a prefix of the path.

For example, consider a file named /home/janedoe/blah/src/foo/bar/xyzzy.c. Specifying --fullpath-after=/home/janedoe/blah/src/ will cause Valgrind to show the name as foo/bar/xyzzy.c.

Because the string is not required to be a prefix, --fullpath-after=src/ will produce the same output. This is useful when the path contains arbitrary machine-generated characters. For example, the path /my/build/dir/C32A1B47/blah/src/foo/xyzzy can be pruned to foo/xyzzy using --fullpath-after=/blah/src/.

If you simply want to see the full path, just specify an empty string: --fullpath-after=. This isn't a special case, merely a logical consequence of the above rules.

Finally, you can use --fullpath-after multiple times. Any appearance of it causes Valgrind to switch to producing full paths and applying the above filtering rule. Each produced path is compared against all the --fullpath-after-specified strings, in the order specified. The first string to match causes the path to be truncated as described above. If none match, the full path is shown. This facilitates chopping off prefixes when the sources are drawn from a number of unrelated directories.
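
The filtering rule can be sketched as a short Python function. This is an illustration of the rule as described, not Valgrind's code: the name shown_path is hypothetical, and cutting at the first occurrence of the string within the path is an assumption.

```python
def shown_path(path, fullpath_after=None):
    """Return the source path as it would appear in a stack trace.

    fullpath_after is the list of --fullpath-after strings in the order
    given, or None/empty when the option is not used at all.
    """
    if not fullpath_after:
        return path.rsplit("/", 1)[-1]    # default: file name only
    for s in fullpath_after:
        idx = path.find(s)                # s need not be a prefix
        if idx != -1:
            return path[idx + len(s):]    # drop everything up to and including s
    return path                           # no string matched: full path

src = "/home/janedoe/blah/src/foo/bar/xyzzy.c"
by_prefix = shown_path(src, ["/home/janedoe/blah/src/"])
by_substring = shown_path(src, ["src/"])
full = shown_path(src, [""])
pruned = shown_path("/my/build/dir/C32A1B47/blah/src/foo/xyzzy", ["/blah/src/"])
```

The empty string is found at offset zero of every path, so nothing is removed; this is the "logical consequence" mentioned above rather than a special case in the code.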

--suppressions=<filename> [default: $PREFIX/lib/valgrind/default.supp]
Specifies an extra file from which to read descriptions of errors to suppress. You may use up to 100 extra suppression files.
--gen-suppressions=<yes|no|all> [default: no]
When set to yes, Valgrind will pause after every error shown and print the line:
The prompt's behaviour is the same as for the --db-attach option (see below).

If you choose to, Valgrind will print out a suppression for this error. You can then cut and paste it into a suppression file if you don't want to hear about the error in the future.

When set to all, Valgrind will print a suppression for every reported error, without querying the user.

This option is particularly useful with C++ programs, as it prints out the suppressions with mangled names, as required.

Note that the suppressions printed are as specific as possible. You may want to common up similar ones, by adding wildcards to function names, and by using frame-level wildcards. The wildcarding facilities are powerful yet flexible, and with a bit of careful editing, you may be able to suppress a whole family of related errors with only a few suppressions.

Sometimes two different errors are suppressed by the same suppression, in which case Valgrind will output the suppression more than once, but you only need to have one copy in your suppression file (but having more than one won't cause problems). Also, the suppression name is given as <insert a suppression name here>; the name doesn't really matter, it's only used with the -v option which prints out all used suppression records.

--db-attach=<yes|no> [default: no]
When enabled, Valgrind will pause after every error shown and print the line:
Pressing Ret, or N Ret or n Ret, causes Valgrind not to start a debugger for this error.

Pressing Y Ret or y Ret causes Valgrind to start a debugger for the program at this point. When you have finished with the debugger, quit from it, and the program will continue. Trying to continue from inside the debugger doesn't work.

Note: if you use GDB, more powerful debugging support is provided by the --vgdb=yes or full value. This activates Valgrind's internal gdbserver, which provides more-or-less full GDB-style control of the application: insertion of breakpoints, continuing from inside GDB, inferior function calls, and much more.

C Ret or c Ret causes Valgrind not to start a debugger, and not to ask again.

--db-command=<command> [default: gdb -nw %f %p]
Specify the debugger to use with the --db-attach command. The default debugger is GDB. This option is a template that is expanded by Valgrind at runtime. %f is replaced with the executable's file name and %p is replaced by the process ID of the executable.

This specifies how Valgrind will invoke the debugger. By default it will use whatever GDB is detected at build time, which is usually /usr/bin/gdb. Using this command, you can specify some alternative command to invoke the debugger you want to use.

The command string given can include one or more instances of the %p and %f expansions. Each instance of %p expands to the PID of the process to be debugged and each instance of %f expands to the path to the executable for the process to be debugged.

Since <command> is likely to contain spaces, you will need to put this entire option in quotes to ensure it is correctly handled by the shell.
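
The template expansion and the quoting advice can both be illustrated in Python. The sketch below is not how Valgrind performs the expansion internally; the function name expand_db_command and the example executable path are hypothetical. A single regular-expression pass is used so that text substituted for %f is never rescanned for %p.

```python
import re
import shlex

def expand_db_command(template, exe_path, pid):
    """Replace every %f with the executable path and every %p with the PID."""
    values = {"%f": exe_path, "%p": str(pid)}
    return re.sub(r"%[fp]", lambda m: values[m.group(0)], template)

template = "gdb -nw %f %p"                     # the documented default
command = expand_db_command(template, "/usr/bin/myprog", 1234)

# Quoting the whole option so that a POSIX shell passes it as one argument.
quoted_option = shlex.quote("--db-command=" + template)
```

With these inputs, command is gdb -nw /usr/bin/myprog 1234, and quoted_option is the option wrapped in single quotes, ready to paste into a shell command line.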

--input-fd=<number> [default: 0, stdin]
When using --db-attach=yes or --gen-suppressions=yes, Valgrind will stop so as to read keyboard input from you when each error occurs. By default it reads from the standard input (stdin), which is problematic for programs which close stdin. This option allows you to specify an alternative file descriptor from which to read input.
--dsymutil=<no|yes> [default: no]
This option is only relevant when running Valgrind on Mac OS X.

Mac OS X uses a deferred debug information (debuginfo) linking scheme. When object files containing debuginfo are linked into a .dylib or an executable, the debuginfo is not copied into the final file. Instead, the debuginfo must be linked manually by running dsymutil, a system-provided utility, on the executable or .dylib. The resulting combined debuginfo is placed in a directory alongside the executable or .dylib, but with the extension .dSYM.

With --dsymutil=no, Valgrind will detect cases where the .dSYM directory is either missing, or is present but does not appear to match the associated executable or .dylib, most likely because it is out of date. In these cases, Valgrind will print a warning message but take no further action.

With --dsymutil=yes, Valgrind will, in such cases, automatically run dsymutil as necessary to bring the debuginfo up to date. For all practical purposes, if you always use --dsymutil=yes, then there is never any need to run dsymutil manually or as part of your application's build system, since Valgrind will run it as necessary.

Valgrind will not attempt to run dsymutil on any executable or library in /usr/, /bin/, /sbin/, /opt/, /sw/, /System/, /Library/ or /Applications/ since dsymutil will always fail in such situations. It fails both because the debuginfo for such pre-installed system components is not available anywhere, and also because it would require write privileges in those directories.

Be careful when using --dsymutil=yes, since it will cause pre-existing .dSYM directories to be silently deleted and re-created. Also note that dsymutil is quite slow, sometimes excessively so.

--max-stackframe=<number> [default: 2000000]
The maximum size of a stack frame. If the stack pointer moves by more than this amount then Valgrind will assume that the program is switching to a different stack.

You may need to use this option if your program has large stack-allocated arrays. Valgrind keeps track of your program's stack pointer. If it changes by more than the threshold amount, Valgrind assumes your program is switching to a different stack, and Memcheck behaves differently than it would for a stack pointer change smaller than the threshold. Usually this heuristic works well. However, if your program allocates large structures on the stack, this heuristic will be fooled, and Memcheck will subsequently report large numbers of invalid stack accesses. This option allows you to change the threshold to a different value.

You should only consider use of this option if Valgrind's debug output directs you to do so. In that case it will tell you the new threshold you should specify.

In general, allocating large structures on the stack is a bad idea, because you can easily run out of stack space, especially on systems with limited memory or which expect to support large numbers of threads each with a small stack, and also because the error checking performed by Memcheck is more effective for heap-allocated data than for stack-allocated data. If you have to use this option, you may wish to consider rewriting your code to allocate on the heap rather than on the stack.
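
The stack-switch heuristic amounts to a threshold test on how far the stack pointer moved. The following Python sketch shows the idea with made-up addresses; the function name is_stack_switch is hypothetical and the real check inside Valgrind may differ in detail.

```python
DEFAULT_MAX_STACKFRAME = 2000000  # documented default for --max-stackframe

def is_stack_switch(old_sp, new_sp, max_stackframe=DEFAULT_MAX_STACKFRAME):
    """Guess whether a stack pointer change means a switch to another stack."""
    return abs(new_sp - old_sp) > max_stackframe

# Hypothetical case: a function reserves a 4 MB local array in one step.
old_sp = 0x7FFF00000000
new_sp = old_sp - 4 * 1024 * 1024
fooled = is_stack_switch(old_sp, new_sp)                      # default threshold
fixed = is_stack_switch(old_sp, new_sp, max_stackframe=5000000)
```

With the default threshold the 4 MB frame is mistaken for a stack switch, which is the situation that leads to spurious invalid-access reports. Raising the threshold above the frame size removes the false positive.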

--main-stacksize=<number> [default: use current 'ulimit' value]
Specifies the size of the main thread's stack.

To simplify its memory management, Valgrind reserves all required space for the main thread's stack at startup. That means it needs to know the required stack size at startup.

By default, Valgrind uses the current 'ulimit' value for the stack size, or 16 MB, whichever is lower. In many cases this gives a stack size in the range 8 to 16 MB, which almost never overflows for most applications.
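
The default described above is simply the smaller of two values. In the sketch below, the function name default_main_stacksize is hypothetical, and treating an unlimited ulimit (passed as None) as "use 16 MB" is an assumption that follows from taking the lower value.

```python
SIXTEEN_MB = 16 * 1024 * 1024

def default_main_stacksize(ulimit_bytes):
    """Return the default main-thread stack size in bytes.

    ulimit_bytes is the current stack ulimit, or None if it is unlimited.
    """
    if ulimit_bytes is None:
        return SIXTEEN_MB  # assumption: unlimited means the 16 MB cap applies
    return min(ulimit_bytes, SIXTEEN_MB)

with_8mb_limit = default_main_stacksize(8 * 1024 * 1024)
with_64mb_limit = default_main_stacksize(64 * 1024 * 1024)
```

An 8 MB ulimit is used as-is, while a 64 MB ulimit is capped at 16 MB; a program that really needs more has to ask for it with --main-stacksize.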

If you need a larger total stack size, use --main-stacksize to specify it. Only set it as high as you need, since reserving far more space than you need (that is, hundreds of megabytes more than you need) constrains Valgrind's memory allocators and may reduce the total amount of memory that Valgrind can use. This is only really of significance on 32-bit machines.

On Linux, you may request a stack of size up to 2GB. Valgrind will stop with a diagnostic message if the stack cannot be allocated.

--main-stacksize only affects the stack size for the program's initial thread. It has no bearing on the size of thread stacks, as Valgrind does not allocate those.

You may need to use both --main-stacksize and --max-stackframe together. It is important to understand that --main-stacksize sets the maximum total stack size, whilst --max-stackframe specifies the largest size of any one stack frame. You will have to work out the --main-stacksize value for yourself (usually, if your application segfaults). But Valgrind will tell you the needed --max-stackframe size, if necessary.

As discussed further in the description of --max-stackframe, a requirement for a large stack is a sign of potential portability problems. You are best advised to place all large data in heap-allocated memory.

malloc()-related Options

For tools that use their own version of malloc (e.g. Memcheck, Massif, Helgrind, DRD), the following options apply.

--alignment=<number> [default: 8 or 16, depending on the platform]

By default Valgrind's malloc, realloc, etc, return a block whose starting address is 8-byte aligned or 16-byte aligned (the value depends on the platform and matches the platform default). This option allows you to specify a different alignment. The supplied value must be greater than or equal to the default, less than or equal to 4096, and must be a power of two.
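
The three constraints on the alignment value are easy to check mechanically. The helper below is a sketch with a hypothetical name; the platform default is passed in as a parameter because it is 8 on some platforms and 16 on others.

```python
def is_valid_alignment(value, platform_default=16):
    """Check a candidate --alignment value against the documented constraints."""
    is_power_of_two = value > 0 and (value & (value - 1)) == 0
    return platform_default <= value <= 4096 and is_power_of_two

checks = {n: is_valid_alignment(n) for n in (8, 16, 24, 64, 4096, 8192)}
```

On a platform whose default is 16, the values 16, 64 and 4096 pass; 8 is below the default, 24 is not a power of two, and 8192 exceeds the upper bound.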
--redzone-size=<number> [default: depends on the tool]
Valgrind's malloc, realloc, etc, add padding blocks before and after each heap block allocated by the program being run. Such padding blocks are called redzones. The default value for the redzone size depends on the tool. For example, Memcheck adds and protects a minimum of 16 bytes before and after each block allocated by the client. This allows it to detect block underruns or overruns of up to 16 bytes.

Increasing the redzone size makes it possible to detect overruns of larger distances, but increases the amount of memory used by Valgrind. Decreasing the redzone size will reduce the memory needed by Valgrind but also reduces the chances of detecting over/underruns, so is not recommended.

Uncommon Options

These options apply to all tools, as they affect certain obscure workings of the Valgrind core. Most people won't need to use them.

--smc-check=<none|stack|all|all-non-file> [default: stack]

This option controls Valgrind's detection of self-modifying code. If no checking is done, if a program executes some code, then overwrites it with new code, and executes the new code, Valgrind will continue to execute the translations it made for the old code. This will likely lead to incorrect behaviour and/or crashes.

Valgrind has four levels of self-modifying code detection: no detection, detect self-modifying code on the stack (which is used by GCC to implement nested functions), detect self-modifying code everywhere, and detect self-modifying code everywhere except in file-backed mappings. Note that the default option will catch the vast majority of cases. The main case it will not catch is programs such as JIT compilers that dynamically generate code and subsequently overwrite part or all of it. Running with all will slow Valgrind down noticeably. Running with none will rarely speed things up, since very little code gets put on the stack for most programs. The VALGRIND_DISCARD_TRANSLATIONS client request is an alternative to --smc-check=all that requires more programmer effort but allows Valgrind to run your program faster, by telling it precisely when translations need to be re-made.

--smc-check=all-non-file provides a cheaper but more limited version of --smc-check=all. It adds checks to any translations that do not originate from file-backed memory mappings. Typical applications that generate code, for example JITs in web browsers, generate code into anonymous mmaped areas, whereas the 'fixed' code of the browser always lives in file-backed mappings. --smc-check=all-non-file takes advantage of this observation, limiting the overhead of checking to code which is likely to be JIT generated.

Some architectures (including ppc32, ppc64, ARM and MIPS) require programs which create code at runtime to flush the instruction cache in between code generation and first use. Valgrind observes and honours such instructions. Hence, on ppc32/Linux, ppc64/Linux and ARM/Linux, Valgrind always provides complete, transparent support for self-modifying code. It is only on platforms such as x86/Linux, AMD64/Linux, x86/Darwin and AMD64/Darwin that you need to use this option.

--read-var-info=<yes|no> [default: no]
When enabled, Valgrind will read information about variable types and locations from DWARF3 debug info. This slows Valgrind down and makes it use more memory, but for the tools that can take advantage of it (Memcheck, Helgrind, DRD) it can result in more precise error messages. For example, here are some standard errors issued by Memcheck:
And here are the same errors with --read-var-info=yes:
--vgdb-poll=<number> [default: 5000]
As part of its main loop, the Valgrind scheduler will poll to check if some activity (such as an external command or some input from a gdb) has to be handled by gdbserver. This activity poll will be done after having run the given number of basic blocks (or slightly more than the given number of basic blocks). This poll is quite cheap so the default value is set relatively low. You might further decrease this value if vgdb cannot use the ptrace system call to interrupt Valgrind when all threads are (most of the time) blocked in a system call.
--vgdb-shadow-registers=<no|yes> [default: no]
When activated, gdbserver will expose the Valgrind shadow registers to GDB. With this, the value of the Valgrind shadow registers can be examined or changed using GDB. Exposing shadow registers only works with GDB version 7.1 or later.
--vgdb-prefix=<prefix> [default: /tmp/vgdb-pipe]
To communicate with gdb/vgdb, the Valgrind gdbserver creates 3 files (2 named FIFOs and a mmap shared memory file). The prefix option controls the directory and prefix for the creation of these files.
--run-libc-freeres=<yes|no> [default: yes]
This option is only relevant when running Valgrind on Linux.

The GNU C library (libc.so), which is used by all programs, may allocate memory for its own uses. Usually it doesn't bother to free that memory when the program ends: there would be no point, since the Linux kernel reclaims all process resources when a process exits anyway, so it would just slow things down.

The glibc authors realised that this behaviour causes leak checkers, such as Valgrind, to falsely report leaks in glibc, when a leak check is done at exit. In order to avoid this, they provided a routine called __libc_freeres specifically to make glibc release all memory it has allocated. Memcheck therefore tries to run __libc_freeres at exit.

Unfortunately, in some very old versions of glibc, __libc_freeres is sufficiently buggy to cause segmentation faults. This was particularly noticeable on Red Hat 7.1. So this option is provided in order to inhibit the run of __libc_freeres. If your program seems to run fine on Valgrind, but segfaults at exit, you may find that --run-libc-freeres=no fixes that, although at the cost of possibly falsely reporting space leaks in libc.so.

--sim-hints=hint1,hint2,...
Pass miscellaneous hints to Valgrind which slightly modify the simulated behaviour in nonstandard or dangerous ways, possibly to help the simulation of strange features. By default no hints are enabled. Use with caution! Currently known hints are:
• lax-ioctls: Be very lax about ioctl handling; the only assumption is that the size is correct. Doesn't require the full buffer to be initialized when writing. Without this, using some device drivers with a large number of strange ioctl commands becomes very tiresome.
• enable-outer: Enable some special magic needed when the program being run is itself Valgrind.
• no-inner-prefix: Disable printing a prefix > in front of each stdout or stderr output line in an inner Valgrind being run by an outer Valgrind. This is useful when running Valgrind regression tests in an outer/inner setup. Note that the prefix > will always be printed in front of the inner debug logging lines.
• fuse-compatible: Enable special handling for certain system calls that may block in a FUSE file-system. This may be necessary when running Valgrind on a multi-threaded program that uses one thread to manage a FUSE file-system and another thread to access that file-system.
--fair-sched=<no|yes|try> [default: no]
The --fair-sched option controls the locking mechanism used by Valgrind to serialise thread execution. The locking mechanism controls the way the threads are scheduled, and different settings give different trade-offs between fairness and performance. For more details about the Valgrind thread serialisation scheme and its impact on performance and thread scheduling, see the Valgrind user manual.
• The value --fair-sched=yes activates a fair scheduler. In short, if multiple threads are ready to run, the threads will be scheduled in a round-robin fashion. This mechanism is not available on all platforms or Linux versions. If not available, using --fair-sched=yes will cause Valgrind to terminate with an error.

You may find this setting improves overall responsiveness if you are running an interactive multithreaded program, for example a web browser, on Valgrind.

• The value --fair-sched=try activates fair scheduling if available on the platform. Otherwise, it will automatically fall back to --fair-sched=no.
• The value --fair-sched=no activates a scheduler which does not guarantee fairness between threads ready to run, but which in general gives the highest performance.
--kernel-variant=variant1,variant2,...
Handle system calls and ioctls arising from minor variants of the default kernel for this platform. This is useful for running on hacked kernels or with kernel modules which support nonstandard ioctls, for example. Use with caution. If you don't understand what this option does then you almost certainly don't need it. Currently known variants are:
• bproc: Support the sys_bproc system call on x86. This is for running on BProc, which is a minor variant of standard Linux which is sometimes used for building clusters.
--show-emwarns=<yes|no> [default: no]
When enabled, Valgrind will emit warnings about its CPU emulation in certain cases. These are usually not interesting.
--require-text-symbol=:sonamepatt:fnnamepatt
When a shared object whose soname matches sonamepatt is loaded into the process, examine all the text symbols it exports. If none of those match fnnamepatt, print an error message and abandon the run. This makes it possible to ensure that the run does not continue unless a given shared object contains a particular function name.

Both sonamepatt and fnnamepatt can be written using the usual ? and * wildcards. For example: ':*libc.so*:foo?bar'. You may use characters other than a colon to separate the two patterns. It is only important that the first character and the separator character are the same. For example, the above example could also be written 'Q*libc.so*Qfoo?bar'. Multiple --require-text-symbol flags are allowed, in which case shared objects that are loaded into the process will be checked against all of them.

The purpose of this is to support reliable usage of marked-up libraries. For example, suppose we have a version of GCC's libgomp.so which has been marked up with annotations to support Helgrind. It is only too easy and confusing to load the wrong, un-annotated libgomp.so into the application. So the idea is: add a text symbol in the marked-up library, for example annotated_for_helgrind_3_6, and then give the flag --require-text-symbol=:*libgomp*so*:annotated_for_helgrind_3_6 so that when libgomp.so is loaded, Valgrind scans its symbol table, and if the symbol isn't present the run is aborted, rather than continuing silently with the un-marked-up library. Note that you should put the entire flag in quotes to stop shells expanding the * and ? wildcards.
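The separator rule and the wildcard check described above can be modelled in a few lines of Python. This is only a sketch of the documented behaviour, not Valgrind's code: the function names parse_spec and object_is_acceptable are invented for illustration, and Python's fnmatch also accepts [...] character classes, which the option description does not mention.

```python
from fnmatch import fnmatchcase


def parse_spec(spec):
    """Split ':sonamepatt:fnnamepatt' into its two patterns.

    The first character of the value is taken as the separator, so
    ':a:b' and 'QaQb' describe the same pair of patterns.
    """
    separator = spec[0]
    soname_patt, fn_patt = spec[1:].split(separator, 1)
    return soname_patt, fn_patt


def object_is_acceptable(spec, soname, text_symbols):
    """Return False when the run should be abandoned for this object."""
    soname_patt, fn_patt = parse_spec(spec)
    if not fnmatchcase(soname, soname_patt):
        return True  # the requirement does not apply to this object
    # At least one exported text symbol must match the function pattern.
    return any(fnmatchcase(sym, fn_patt) for sym in text_symbols)
```

With the libgomp example from the text, an object named libgomp.so.1 that exports only un-annotated symbols would be rejected, while one that exports annotated_for_helgrind_3_6 would be accepted.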

--soname-synonyms=syn1=pattern1,syn2=pattern2,...
When a shared library is loaded, Valgrind checks for functions in the library that must be replaced or wrapped. For example, Memcheck replaces all malloc related functions (malloc, free, calloc, ...) with its own versions. Such replacements are done by default only in shared libraries whose soname matches a predefined soname pattern (e.g. libc.so* on Linux). By default, no replacement is done for a statically linked library or for alternative libraries such as tcmalloc. In some cases, the replacements allow --soname-synonyms to specify one additional synonym pattern, giving flexibility in the replacement.

Currently, this flexibility is only allowed for the malloc related functions, using the synonym somalloc. This synonym is usable for all tools doing standard replacement of malloc related functions (e.g. memcheck, massif, drd, helgrind, exp-dhat, exp-sgcheck).

• Alternate malloc library: to replace the malloc related functions in an alternate library with soname mymalloclib.so, give the option --soname-synonyms=somalloc=mymalloclib.so. A pattern can be used to match the sonames of multiple libraries. For example, --soname-synonyms=somalloc=*tcmalloc* will match the soname of all variants of the tcmalloc library (native, debug, profiled, ... tcmalloc variants).

Note: the soname of an ELF shared library can be retrieved using the readelf utility.

• Replacements in a statically linked library are done by using the NONE pattern. For example, if you link with libtcmalloc.a, memcheck will properly work when you give the option --soname-synonyms=somalloc=NONE. Note that a NONE pattern will match the main executable and any shared library having no soname.
• To run a 'default' Firefox build for Linux, in which JEMalloc is linked in to the main executable, use --soname-synonyms=somalloc=NONE.
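As a rough illustration of how the somalloc synonym widens the set of objects whose malloc functions get replaced, the following Python sketch models the matching rules stated above. The function name malloc_is_replaced is hypothetical, the default pattern libc.so* is the Linux example given in the text, and a soname of None is an assumption used here to stand for the main executable or a shared library without a soname.

```python
from fnmatch import fnmatchcase


def malloc_is_replaced(soname, synonyms="", default_pattern="libc.so*"):
    """Decide whether malloc-family functions in an object are replaced.

    soname   -- the object's soname, or None for the main executable or
                a shared library that has no soname (modelling assumption)
    synonyms -- the value given to --soname-synonyms, e.g. 'somalloc=NONE'
    """
    patterns = [default_pattern]
    for item in filter(None, synonyms.split(",")):
        name, _, pattern = item.partition("=")
        if name == "somalloc":  # the only synonym the text describes
            patterns.append(pattern)

    for pattern in patterns:
        if pattern == "NONE":
            if soname is None:
                return True
        elif soname is not None and fnmatchcase(soname, pattern):
            return True
    return False
```

In this model a statically linked allocator is only covered once somalloc=NONE is given, which mirrors the libtcmalloc.a and Firefox cases above.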

Debugging Valgrind Options

There are also some options for debugging Valgrind itself. You shouldn't need to use them in the normal run of things. If you wish to see the list, use the --help-debug option.

Memcheck Options

--leak-check=<no|summary|yes|full> [default: summary]

When enabled, search for memory leaks when the client program finishes. If set to summary, it says how many leaks occurred. If set to full or yes, it also gives details of each individual leak.
--show-possibly-lost=<yes|no> [default: yes]
When disabled, the memory leak detector will not show 'possibly lost' blocks.
--leak-resolution=<low|med|high> [default: high]
When doing leak checking, determines how willing Memcheck is to consider different backtraces to be the same for the purposes of merging multiple leaks into a single leak report. When set to low, only the first two entries need match. When med, four entries have to match. When high, all entries need to match.

For hardcore leak debugging, you probably want to use --leak-resolution=high together with --num-callers=40 or some such large number.

Note that the --leak-resolution setting does not affect Memcheck's ability to find leaks. It only changes how the results are presented.
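The effect of the three settings on how leaks are grouped can be sketched in Python. This is a simplified model under stated assumptions: merge_leaks is a made-up name, each leak is just a (backtrace, bytes) pair with the innermost frame first, and other criteria a real leak checker may use when merging (such as the kind of leak) are ignored.

```python
from collections import defaultdict

# Number of leading backtrace entries that must match; None means all.
MATCH_DEPTH = {"low": 2, "med": 4, "high": None}


def merge_leaks(leaks, resolution="high"):
    """Group (backtrace, nbytes) records into merged leak reports."""
    depth = MATCH_DEPTH[resolution]
    merged = defaultdict(int)
    for backtrace, nbytes in leaks:
        key = tuple(backtrace[:depth])  # [:None] keeps the whole trace
        merged[key] += nbytes
    return dict(merged)
```

Two leaks whose backtraces share only their first two frames collapse into one report at low, but stay separate at med and high. The total number of leaked bytes is the same in every case, which matches the note that the setting only changes presentation.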

--show-reachable=<yes|no> [default: no]
When disabled, the memory leak detector only shows 'definitely lost' and 'possibly lost' blocks. When enabled, the leak detector also shows 'reachable' and 'indirectly lost' blocks. (In other words, it shows all blocks, except suppressed ones, so --show-all would be a better name for it.)
--undef-value-errors=<yes|no> [default: yes]
Controls whether Memcheck reports uses of undefined value errors. Set this to no if you don't want to see undefined value errors. It also has the side effect of speeding up Memcheck somewhat.
--track-origins=<yes|no> [default: no]
Controls whether Memcheck tracks the origin of uninitialised values. By default, it does not, which means that although it can tell you that an uninitialised value is being used in a dangerous way, it cannot tell you where the uninitialised value came from. This often makes it difficult to track down the root problem.

When set to yes, Memcheck keeps track of the origins of all uninitialised values. Then, when an uninitialised value error is reported, Memcheck will try to show the origin of the value. An origin can be one of the following four places: a heap block, a stack allocation, a client request, or miscellaneous other sources (e.g. a call to brk).

For uninitialised values originating from a heap block, Memcheck shows where the block was allocated. For uninitialised values originating from a stack allocation, Memcheck can tell you which function allocated the value, but no more than that -- typically it shows you the source location of the opening brace of the function. So you should carefully check that all of the function's local variables are initialised properly.

Performance overhead: origin tracking is expensive. It halves Memcheck's speed and increases memory use by a minimum of 100MB, and possibly more. Nevertheless it can drastically reduce the effort required to identify the root cause of uninitialised value errors, and so is often a programmer productivity win, despite running more slowly.

Accuracy: Memcheck tracks origins quite accurately. To avoid very large space and time overheads, some approximations are made. It is possible, although unlikely, that Memcheck will report an incorrect origin, or not be able to identify any origin.

Note that the combination --track-origins=yes and --undef-value-errors=no is nonsensical. Memcheck checks for and rejects this combination at startup.

--partial-loads-ok=<yes|no> [default: no]
Controls how Memcheck handles word-sized, word-aligned loads from addresses for which some bytes are addressable and others are not. When yes, such loads do not produce an address error. Instead, loaded bytes originating from illegal addresses are marked as uninitialised, and those corresponding to legal addresses are handled in the normal way.

When no, loads from partially invalid addresses are treated the same as loads from completely invalid addresses: an illegal-address error is issued, and the resulting bytes are marked as initialised.

Note that code that behaves in this way is in violation of the ISO C/C++ standards, and should be considered broken. If at all possible, such code should be fixed. This option should be used only as a last resort.
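The two behaviours can be compared with a small per-byte model in Python. It is a sketch of the rules as worded above, not of Memcheck's shadow-memory implementation; partial_load and its list-of-booleans representation are assumptions made for illustration, and the load is assumed to be word-sized and word-aligned.

```python
def partial_load(addressable, defined, partial_loads_ok=False):
    """Model one word-sized, word-aligned load.

    addressable -- per-byte flags: may the byte be accessed at all?
    defined     -- per-byte flags: does the byte hold an initialised value?
    Returns (address_error_reported, per-byte 'initialised' flags of the
    loaded value).
    """
    if all(addressable):
        return False, list(defined)  # ordinary, fully valid load

    if partial_loads_ok and any(addressable):
        # No address error; bytes from illegal addresses become
        # uninitialised, the rest keep their normal state.
        return False, [a and d for a, d in zip(addressable, defined)]

    # Partially (with the option off) or completely invalid load:
    # report the error and mark the result as initialised.
    return True, [True] * len(addressable)
```

For a four-byte load whose last two bytes lie past the end of a block, the yes setting reports nothing and yields two uninitialised bytes, while the no setting reports an illegal-address error.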

--freelist-vol=<number> [default: 20000000]
When the client program releases memory using free (in C) or delete (C++), that memory is not immediately made available for re-allocation. Instead, it is marked inaccessible and placed in a queue of freed blocks. The purpose is to defer as long as possible the point at which freed-up memory comes back into circulation. This increases the chance that Memcheck will be able to detect invalid accesses to blocks for some significant period of time after they have been freed.

This option specifies the maximum total size, in bytes, of the blocks in the queue. The default value is twenty million bytes. Increasing this increases the total amount of memory used by Memcheck but may detect invalid uses of freed blocks which would otherwise go undetected.

--freelist-big-blocks=<number> [default: 1000000]
When making blocks from the queue of freed blocks available for re-allocation, Memcheck gives priority to re-circulating blocks with a size greater than or equal to --freelist-big-blocks. This ensures that freeing big blocks (in particular, freeing blocks bigger than --freelist-vol) does not immediately lead to a re-circulation of all (or a lot of) the small blocks in the free list. In other words, this option increases the likelihood of discovering dangling pointers for the 'small' blocks, even when big blocks are freed.

Setting a value of 0 means that all the blocks are re-circulated in a FIFO order.
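How --freelist-vol and --freelist-big-blocks interact can be modelled with two FIFO queues. The Python class below is an illustrative sketch only: FreedBlockQueue and its attributes are invented names, and the real bookkeeping inside Memcheck may differ. It assumes that blocks are released oldest first within each size class, and that big blocks are released before any small one.

```python
from collections import deque


class FreedBlockQueue:
    """Toy model of the queue of freed blocks."""

    def __init__(self, freelist_vol=20_000_000, big_blocks=1_000_000):
        self.freelist_vol = freelist_vol  # max total bytes kept queued
        self.big_blocks = big_blocks      # size threshold for 'big'
        self.big = deque()                # FIFO of (addr, size)
        self.small = deque()              # FIFO of (addr, size)
        self.total = 0

    def free(self, addr, size):
        """Queue a freed block; return addresses made re-allocatable."""
        queue = self.big if size >= self.big_blocks else self.small
        queue.append((addr, size))
        self.total += size

        released = []
        while self.total > self.freelist_vol:
            # Big blocks are re-circulated in priority over small ones.
            victim_queue = self.big or self.small
            victim_addr, victim_size = victim_queue.popleft()
            self.total -= victim_size
            released.append(victim_addr)
        return released
```

With a threshold of 0 every block counts as big, so the model degenerates into a single FIFO queue, which corresponds to the behaviour described for a value of 0.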

--workaround-gcc296-bugs=<yes|no> [default: no]
When enabled, assume that reads and writes some small distance below the stack pointer are due to bugs in GCC 2.96, and do not report them. The 'small distance' is 256 bytes by default. Note that GCC 2.96 is the default compiler on some ancient Linux distributions (RedHat 7.X) and so you may need to use this option. Do not use it if you do not have to, as it can cause real errors to be overlooked. A better alternative is to use a more recent GCC in which this bug is fixed.

You may also need to use this option when working with GCC 3.X or 4.X on 32-bit PowerPC Linux. This is because GCC generates code which occasionally accesses below the stack pointer, particularly for floating-point to/from integer conversions. This is in violation of the 32-bit PowerPC ELF specification, which makes no provision for locations below the stack pointer to be accessible.

--ignore-ranges=0xPP-0xQQ[,0xRR-0xSS]
Any ranges listed in this option (and multiple ranges can be specified, separated by commas) will be ignored by Memcheck's addressability checking.
--malloc-fill=<hexnumber>
Fills blocks allocated by malloc, new, etc, but not by calloc, with the specified byte. This can be useful when trying to shake out obscure memory corruption problems. The allocated area is still regarded by Memcheck as undefined -- this option only affects its contents. Note that --malloc-fill does not affect a block of memory when it is used as argument to client requests VALGRIND_MEMPOOL_ALLOC or VALGRIND_MALLOCLIKE_BLOCK.
--free-fill=<hexnumber>
Fills blocks freed by free, delete, etc, with the specified byte value. This can be useful when trying to shake out obscure memory corruption problems. The freed area is still regarded by Memcheck as not valid for access -- this option only affects its contents. Note that --free-fill does not affect a block of memory when it is used as argument to client requests VALGRIND_MEMPOOL_FREE or VALGRIND_FREELIKE_BLOCK.

Cachegrind Options

--I1=<size>,<associativity>,<line size>

--dump-before=<function>
Dump when entering function.
--zero-before=<function>
Zero all costs when entering function.
--dump-after=<function>
Dump when leaving function.
--instr-atstart=<yes|no> [default: yes]
Specify if you want Callgrind to start simulation and profiling from the beginning of the program. When set to no, Callgrind will not be able to collect any information, including calls, but it will have at most a slowdown of around 4, which is the minimum Valgrind overhead. Instrumentation can be interactively enabled via callgrind_control -i on.

Note that the resulting call graph will most probably not contain main, but will contain all the functions executed after instrumentation was enabled. Instrumentation can also be programmatically enabled/disabled. See the Callgrind include file callgrind.h for the macro you have to use in your source code.

For cache simulation, results will be less accurate when switching on instrumentation later in the program run, as the simulator starts with an empty cache at that moment. Switch on event collection later to cope with this error.

--collect-atstart=<yes|no> [default: yes]
Specify whether event collection is enabled at beginning of the profile run.

To only look at parts of your program, you have two possibilities:

1. Zero event counters before entering the program part you want to profile, and dump the event counters to a file after leaving that program part.
2. Switch on/off collection state as needed to only see event counters happening while inside of the program part you want to profile.
The second option can be used if the program part you want to profile is called many times. Option 1, i.e. creating a lot of dumps, is not practical here.

Collection state can be toggled at entry and exit of a given function with the option --toggle-collect. If you use this option, collection state should be disabled at the beginning. Note that the specification of --toggle-collect implicitly sets --collect-atstart=no.

Collection state can also be toggled by inserting the client request CALLGRIND_TOGGLE_COLLECT; at the needed code positions.

--toggle-collect=<function>
Toggle collection on entry/exit of function.
--collect-jumps=<no|yes> [default: no]
This specifies whether information for (conditional) jumps should be collected. As above, callgrind_annotate currently is not able to show you the data. You have to use KCachegrind to get jump arrows in the annotated code.
--collect-systime=<no|yes> [default: no]
This specifies whether information for system call times should be collected.
--collect-bus=<no|yes> [default: no]
This specifies whether the number of global bus events executed should be collected. The event type 'Ge' is used for these events.
--cache-sim=<yes|no> [default: no]
Specify if you want to do full cache simulation. By default, only instruction read accesses will be counted ('Ir'). With cache simulation, further event counters are enabled: Cache misses on instruction reads ('I1mr'/'ILmr'), data read accesses ('Dr') and related cache misses ('D1mr'/'DLmr'), data write accesses ('Dw') and related cache misses ('D1mw'/'DLmw'). For more information, see the Valgrind user manual.
--branch-sim=<yes|no> [default: no]
Specify if you want to do branch prediction simulation. Further event counters are enabled: Number of executed conditional branches and related predictor misses ('Bc'/'Bcm'), executed indirect jumps and related misses of the jump address predictor ('Bi'/'Bim').

Helgrind Options

--free-is-write=no|yes [default: no]

When enabled (not the default), Helgrind treats freeing of heap memory as if the memory was written immediately before the free. This exposes races where memory is referenced by one thread, and freed by another, but there is no observable synchronisation event to ensure that the reference happens before the free.

This functionality is new in Valgrind 3.7.0, and is regarded as experimental. It is not enabled by default because its interaction with custom memory allocators is not well understood at present. User feedback is welcomed.

--track-lockorders=no|yes [default: yes]
When enabled (the default), Helgrind performs lock order consistency checking. For some buggy programs, the large number of lock order errors reported can become annoying, particularly if you're only interested in race errors. You may therefore find it helpful to disable lock order checking.
--history-level=none|approx|full [default: full]
--history-level=full (the default) causes Helgrind to collect enough information about 'old' accesses that it can produce two stack traces in a race report -- both the stack trace for the current access, and the trace for the older, conflicting access. To limit memory usage, 'old' accesses stack traces are limited to a maximum of 8 entries, even if --num-callers value is bigger.

Collecting such information is expensive in both speed and memory, particularly for programs that do many inter-thread synchronisation events (locks, unlocks, etc). Without such information, it is more difficult to track down the root causes of races. Nonetheless, you may not need it in situations where you just want to check for the presence or absence of races, for example, when doing regression testing of a previously race-free program.

--history-level=none is the opposite extreme. It causes Helgrind not to collect any information about previous accesses. This can be dramatically faster than --history-level=full.

--history-level=approx provides a compromise between these two extremes. It causes Helgrind to show a full trace for the later access, and approximate information regarding the earlier access. This approximate information consists of two stacks, and the earlier access is guaranteed to have occurred somewhere between program points denoted by the two stacks. This is not as useful as showing the exact stack for the previous access (as --history-level=full does), but it is better than nothing, and it is almost as fast as --history-level=none.

--conflict-cache-size=N [default: 1000000]
This flag only has any effect at --history-level=full.

Information about 'old' conflicting accesses is stored in a cache of limited size, with LRU-style management. This is necessary because it isn't practical to store a stack trace for every single memory access made by the program. Historical information on not recently accessed locations is periodically discarded, to free up space in the cache.

This option controls the size of the cache, in terms of the number of different memory addresses for which conflicting access information is stored. If you find that Helgrind is showing race errors with only one stack instead of the expected two stacks, try increasing this value.

The minimum value is 10,000 and the maximum is 30,000,000 (thirty times the default value). Increasing the value by 1 increases Helgrind's memory requirement by very roughly 100 bytes, so the maximum value will easily eat up three extra gigabytes or so of memory.
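The memory estimate above is simple arithmetic, shown here as a small Python helper. The figure of 100 bytes per entry is the "very rough" number from the text, and estimated_cache_bytes is a hypothetical name, so the result should be read as an order-of-magnitude estimate only.

```python
MIN_ENTRIES = 10_000
DEFAULT_ENTRIES = 1_000_000
MAX_ENTRIES = 30_000_000
BYTES_PER_ENTRY = 100  # "very roughly", per the option description


def estimated_cache_bytes(entries=DEFAULT_ENTRIES):
    """Rough memory cost of a given --conflict-cache-size value."""
    if not MIN_ENTRIES <= entries <= MAX_ENTRIES:
        raise ValueError("value outside the documented 10,000..30,000,000 range")
    return entries * BYTES_PER_ENTRY


if __name__ == "__main__":
    for n in (MIN_ENTRIES, DEFAULT_ENTRIES, MAX_ENTRIES):
        print(f"{n:>10,} entries: about {estimated_cache_bytes(n) / 1e9:.3f} GB")
```

Under this estimate the default costs on the order of 100 MB and the maximum about 3 GB, consistent with the "three extra gigabytes or so" quoted above.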

--check-stack-refs=no|yes [default: yes]
By default Helgrind checks all data memory accesses made by your program. This flag enables you to skip checking for accesses to thread stacks (local variables). This can improve performance, but comes at the cost of missing races on stack-allocated data.

Drd Options

--check-stack-var=<yes|no> [default: no]

• Don't enable this option when using reference-counted objects because that will result in false positives, even when that code has been annotated properly with ANNOTATE_HAPPENS_BEFORE and ANNOTATE_HAPPENS_AFTER. See e.g. the output of the following command for an example: valgrind --tool=drd --free-is-write=yes drd/tests/annotate_smart_pointer.
--report-signal-unlocked=<yes|no> [default: yes]
Whether to report calls to pthread_cond_signal and pthread_cond_broadcast where the mutex associated with the signal through pthread_cond_wait or pthread_cond_timedwait is not locked at the time the signal is sent. Sending a signal without holding a lock on the associated mutex is a common programming error which can cause subtle race conditions and unpredictable behavior. There exist some uncommon synchronization patterns however where it is safe to send a signal without holding a lock on the associated mutex.
--segment-merging=<yes|no> [default: yes]
Controls segment merging. Segment merging is an algorithm to limit memory usage of the data race detection algorithm. Disabling segment merging may improve the accuracy of the so-called 'other segments' displayed in race reports but can also trigger an out of memory error.
--segment-merging-interval=<n> [default: 10]
Perform segment merging only after the specified number of new segments have been created. This is an advanced configuration option that lets you choose whether to minimize DRD's memory usage by choosing a low value or to let DRD run faster by choosing a slightly higher value. The optimal value for this parameter depends on the program being analyzed. The default value works well for most programs.
--shared-threshold=<n> [default: off]
Print an error message if a reader lock has been held longer than the specified time (in milliseconds). This option enables the detection of lock contention.
--show-confl-seg=<yes|no> [default: yes]
Show conflicting segments in race reports. Since this information can help to find the cause of a data race, this option is enabled by default. Disabling this option makes the output of DRD more compact.
--show-stack-usage=<yes|no> [default: no]
Print stack usage at thread exit time. When a program creates a large number of threads it becomes important to limit the amount of virtual memory allocated for thread stacks. This option makes it possible to observe how much stack memory has been used by each thread of the client program. Note: the DRD tool itself allocates some temporary data on the client thread stack. The space necessary for this temporary data must be allocated by the client program when it allocates stack memory, but is not included in stack usage reported by DRD.
--trace-addr=<address> [default: none]
Trace all load and store activity for the specified address. This option may be specified more than once.
--ptrace-addr=<address> [default: none]
Trace all load and store activity for the specified address and keep doing that even after the memory at that address has been freed and reallocated.
--trace-alloc=<yes|no> [default: no]
Trace all memory allocations and deallocations. May produce a huge amount of output.
--trace-barrier=<yes|no> [default: no]
Trace all barrier activity.
--trace-cond=<yes|no> [default: no]
Trace all condition variable activity.
--trace-fork-join=<yes|no> [default: no]
Trace all thread creation and all thread termination events.
--trace-hb=<yes|no> [default: no]
Trace execution of the ANNOTATE_HAPPENS_BEFORE(), ANNOTATE_HAPPENS_AFTER() and ANNOTATE_HAPPENS_DONE() client requests.
--trace-mutex=<yes|no> [default: no]
Trace all mutex activity.
--trace-rwlock=<yes|no> [default: no]
Trace all reader-writer lock activity.
--trace-semaphore=<yes|no> [default: no]
Trace all semaphore activity.

Massif Options

--heap=<yes|no> [default: yes]

Specifies whether heap profiling should be done.
--heap-admin=<size> [default: 8]
If heap profiling is enabled, gives the number of administrative bytes per block to use. This should be an estimate of the average, since it may vary. For example, the allocator used by glibc on Linux requires somewhere between 4 and 15 bytes per block, depending on various factors. That allocator also requires admin space for freed blocks, but Massif cannot account for this.
--stacks=<yes|no> [default: no]
Specifies whether stack profiling should be done. This option slows Massif down greatly, and so is off by default. Note that Massif assumes that the main stack has size zero at start-up. This is not true, but doing otherwise accurately is difficult. Furthermore, starting at zero better indicates the size of the part of the main stack that a user program actually has control over.
--pages-as-heap=<yes|no> [default: no]
Tells Massif to profile memory at the page level rather than at the malloc'd block level. See above for details.
--depth=<number> [default: 30]
Maximum depth of the allocation trees recorded for detailed snapshots. Increasing it will make Massif run somewhat more slowly, use more memory, and produce bigger output files.
--alloc-fn=<name>
Functions specified with this option will be treated as though they were a heap allocation function such as malloc. This is useful for functions that are wrappers to malloc or new, which can fill up the allocation trees with uninteresting information. This option can be specified multiple times on the command line, to name multiple functions.

Note that the named function will only be treated this way if it is the top entry in a stack trace, or just below another function treated this way. For example, if you have a function malloc1 that wraps malloc, and malloc2 that wraps malloc1, just specifying --alloc-fn=malloc2 will have no effect. You need to specify --alloc-fn=malloc1 as well. This is a little inconvenient, but the reason is that checking for allocation functions is slow, and it saves a lot of time if Massif can stop looking through the stack trace entries as soon as it finds one that doesn't match rather than having to continue through all the entries.
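The "top entry, or just below another function treated this way" rule amounts to stripping a leading run of allocation functions from the stack trace and stopping at the first frame that does not match. A minimal Python sketch of that rule follows; attributed_frames is an invented name, and the trace is assumed to be listed innermost frame first.

```python
def attributed_frames(trace, alloc_fns):
    """Drop the leading run of allocation functions from a stack trace.

    trace     -- function names, innermost (top) frame first
    alloc_fns -- names treated as allocation functions, i.e. the built-in
                 ones plus those given with --alloc-fn
    Scanning stops at the first frame that is not an allocation function,
    so a wrapper further down the trace is never examined.
    """
    index = 0
    while index < len(trace) and trace[index] in alloc_fns:
        index += 1
    return trace[index:]
```

For the trace malloc, malloc1, malloc2, main, naming only malloc2 leaves malloc1 as the first reported frame, so the option has no visible effect. Naming both malloc1 and malloc2 attributes the allocation to main.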

Note that C++ names are demangled. Note also that overloaded C++ names must be written in full. Single quotes may be necessary to prevent the shell from breaking them up. For example:

--ignore-fn=<name>
Any direct heap allocation (i.e. a call to malloc, new, etc, or a call to a function named by an --alloc-fn option) that occurs in a function specified by this option will be ignored. This is mostly useful for testing purposes. This option can be specified multiple times on the command line, to name multiple functions.

Any realloc of an ignored block will also be ignored, even if the realloc call does not occur in an ignored function. This avoids the possibility of negative heap sizes if ignored blocks are shrunk with realloc.

The rules for writing C++ function names are the same as for --alloc-fn above.

--threshold=<m.n> [default: 1.0]
The significance threshold for heap allocations, as a percentage of total memory size. Allocation tree entries that account for less than this will be aggregated. Note that this should be specified in tandem with ms_print's option of the same name.
--peak-inaccuracy=<m.n> [default: 1.0]
Massif does not necessarily record the actual global memory allocation peak; by default it records a peak only when the global memory allocation size exceeds the previous peak by at least 1.0%. This is because there can be many local allocation peaks along the way, and doing a detailed snapshot for every one would be expensive and wasteful, as all but one of them will be later discarded. This inaccuracy can be changed (even to 0.0%) via this option, but Massif will run drastically slower as the number approaches zero.
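The thresholding rule can be illustrated with a short Python sketch that walks a series of total allocation sizes and keeps only those that would count as a new peak. The function recorded_peaks is hypothetical and ignores everything else Massif does when taking snapshots; it only models the "exceeds the previous peak by at least N%" test.

```python
def recorded_peaks(sizes, peak_inaccuracy=1.0):
    """Return the sizes that would be recorded as new peaks.

    sizes           -- global allocation size after each change, in bytes
    peak_inaccuracy -- percentage by which a size must exceed the last
                       recorded peak before it is recorded itself
    """
    peaks = []
    last_peak = 0
    for size in sizes:
        threshold = last_peak * (1 + peak_inaccuracy / 100.0)
        if size > last_peak and size >= threshold:
            peaks.append(size)
            last_peak = size
    return peaks
```

With the default of 1.0, small increases over the last recorded peak are skipped. With 0.0 every new maximum is recorded, which is why the run becomes much slower as the value approaches zero.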
--time-unit=<i|ms|B> [default: i]
The time unit used for the profiling. There are three possibilities: instructions executed (i), which is good for most cases; real (wallclock) time (ms, i.e. milliseconds), which is sometimes useful; and bytes allocated/deallocated on the heap and/or stack (B), which is useful for very short-run programs, and for testing purposes, because it is the most reproducible across different machines.
--detailed-freq=<n> [default: 10]
Frequency of detailed snapshots. With --detailed-freq=1, every snapshot is detailed.
--max-snapshots=<n> [default: 100]
The maximum number of snapshots recorded. If set to N, for all programs except very short-running ones, the final number of snapshots will be between N/2and N.
--massif-out-file=<file> [default: massif.out.%p]
Write the profile data to file rather than to the default output file, massif.out.<pid>. The %p and %q format specifiers can be used to embed the process ID and/or the contents of an environment variable in the name, as is the case for the core option --log-file.

Sgcheck Options

BBV Options

--bb-out-file=<name> [default: bb.out.%p]

--instr-count-only [default: no]
This option tells the tool to only display instruction count totals, and to not generate the actual basic block vector file. This is useful for debugging, and for gathering instruction count info without generating the large basic block vector files.

Lackey Options

--basic-counts=<no|yes> [default: yes]

5. Ratios between some of these counts.
6. The exit code of the client program.
--detailed-counts=<no|yes> [default: no]
When enabled, Lackey prints a table containing counts of loads, stores and ALU operations, differentiated by their IR types. The IR types are identified by their IR name ('I1', 'I8', ... 'I128', 'F32', 'F64', and 'V128').
--trace-mem=<no|yes> [default: no]
When enabled, Lackey prints the size and address of almost every memory access made by the program. See the comments at the top of the file lackey/lk_main.c for details about the output format, how it works, and inaccuracies in the address trace. Note that this option produces immense amounts of output.
--trace-superblocks=<no|yes> [default: no]
When enabled, Lackey prints out the address of every superblock (a single entry, multiple exit, linear chunk of code) executed by the program. This is primarily of interest to Valgrind developers. See the comments at the top of the file lackey/lk_main.c for details about the output format. Note that this option produces large amounts of output.
--fnname=<name> [default: main]
Changes the function for which calls are counted when --basic-counts=yes is specified.

See Also

cg_annotate(1), callgrind_annotate(1), callgrind_control(1), ms_print(1), $INSTALL/share/doc/valgrind/html/index.html or http://www.valgrind.org/docs/manual/index.html.

Author

The Valgrind developers.

This manpage was written by Andres Roldan <aroldan@debian.org> and the Valgrind developers.

Table of Contents

3.1. The Client Request mechanism
3.2. Debugging your program using Valgrind gdbserver and GDB
3.2.1. Quick Start: debugging in 3 steps
3.2.2. Valgrind gdbserver overall organisation
3.2.3. Connecting GDB to a Valgrind gdbserver
3.2.4. Connecting to an Android gdbserver
3.2.5. Monitor command handling by the Valgrind gdbserver
3.2.6. Valgrind gdbserver thread information
3.2.7. Examining and modifying Valgrind shadow registers
3.2.8. Limitations of the Valgrind gdbserver
3.2.9. vgdb command line options
3.2.10. Valgrind monitor commands
3.3. Function wrapping
3.3.1. A Simple Example
3.3.2. Wrapping Specifications
3.3.3. Wrapping Semantics
3.3.4. Debugging
3.3.5. Limitations - control flow
3.3.6. Limitations - original function signatures
3.3.7. Examples

This chapter describes advanced aspects of the Valgrind core services, which are mostly of interest to power users who wish to customise and modify Valgrind's default behaviours in certain useful ways. The subjects covered are:

  • The 'Client Request' mechanism

  • Debugging your program using Valgrind's gdbserver and GDB

  • Function Wrapping

3.1. The Client Request mechanism

Valgrind has a trapdoor mechanism via which the client program can pass all manner of requests and queries to Valgrind and the current tool. Internally, this is used extensively to make various things work, although that's not visible from the outside.

For your convenience, a subset of these so-called client requests is provided to allow you to tell Valgrind facts about the behaviour of your program, and also to make queries. In particular, your program can tell Valgrind about things that it otherwise would not know, leading to better results.

Clients need to include a header file to make this work. Which header file depends on which client requests you use. Some client requests are handled by the core, and are defined in the header file valgrind/valgrind.h. Tool-specific header files are named after the tool, e.g. valgrind/memcheck.h. Each tool-specific header file includes valgrind/valgrind.h so you don't need to include it in your client if you include a tool-specific header. All header files can be found in the include/valgrind directory of wherever Valgrind was installed.

The macros in these header files have the magical property that they generate code in-line which Valgrind can spot. However, the code does nothing when not run on Valgrind, so you are not forced to run your program under Valgrind just because you use the macros in this file. Also, you are not required to link your program with any extra supporting libraries.

The code added to your binary has negligible performance impact: on x86, amd64, ppc32, ppc64 and ARM, the overhead is 6 simple integer instructions and is probably undetectable except in tight loops. However, if you really wish to compile out the client requests, you can compile with -DNVALGRIND (analogous to -DNDEBUG's effect on assert).

You are encouraged to copy the valgrind/*.h headers into your project's include directory, so your program doesn't have a compile-time dependency on Valgrind being installed. The Valgrind headers, unlike most of the rest of the code, are under a BSD-style license so you may include them without worrying about license incompatibility.

Here is a brief description of the macros available in valgrind.h, which work with more than one tool (see the tool-specific documentation for explanations of the tool-specific macros).

RUNNING_ON_VALGRIND:

Returns 1 if running on Valgrind, 0 if running on the real CPU. If you are running Valgrind on itself, returns the number of layers of Valgrind emulation you're running on.

VALGRIND_DISCARD_TRANSLATIONS:

Discards translations of code in the specified address range. Useful if you are debugging a JIT compiler or some other dynamic code generation system. After this call, attempts to execute code in the invalidated address range will cause Valgrind to make new translations of that code, which is probably the semantics you want. Note that code invalidations are expensive because finding all the relevant translations quickly is very difficult, so try not to call it often. Note that you can be clever about this: you only need to call it when an area which previously contained code is overwritten with new code. You can choose to write code into fresh memory, and just call this occasionally to discard large chunks of old code all at once.

Alternatively, for transparent self-modifying-code support, use --smc-check=all, or run on ppc32/Linux, ppc64/Linux or ARM/Linux.

VALGRIND_COUNT_ERRORS:

Returns the number of errors found so far by Valgrind. Can be useful in test harness code when combined with the --log-fd=-1 option; this runs Valgrind silently, but the client program can detect when errors occur. Only useful for tools that report errors, e.g. it's useful for Memcheck, but for Cachegrind it will always return zero because Cachegrind doesn't report errors.

VALGRIND_MALLOCLIKE_BLOCK:

If your program manages its own memory instead of using the standard malloc / new / new[], tools that track information about heap blocks will not do nearly as good a job. For example, Memcheck won't detect nearly as many errors, and the error messages won't be as informative. To improve this situation, use this macro just after your custom allocator allocates some new memory. See the comments in valgrind.h for information on how to use it.

VALGRIND_FREELIKE_BLOCK:

This should be used in conjunction with VALGRIND_MALLOCLIKE_BLOCK. Again, see valgrind.h for information on how to use it.

VALGRIND_RESIZEINPLACE_BLOCK:

Informs a Valgrind tool that the size of an allocated block has been modified but not its address. See valgrind.h for more information on how to use it.

VALGRIND_CREATE_MEMPOOL, VALGRIND_DESTROY_MEMPOOL, VALGRIND_MEMPOOL_ALLOC, VALGRIND_MEMPOOL_FREE, VALGRIND_MOVE_MEMPOOL, VALGRIND_MEMPOOL_CHANGE, VALGRIND_MEMPOOL_EXISTS:

These are similar to VALGRIND_MALLOCLIKE_BLOCK and VALGRIND_FREELIKE_BLOCK but are tailored towards code that uses memory pools. See Memory Pools for a detailed description.

VALGRIND_NON_SIMD_CALL[0123]:

Executes a function in the client program on the real CPU, not the virtual CPU that Valgrind normally runs code on. The function must take an integer (holding a thread ID) as the first argument and then 0, 1, 2 or 3 more arguments (depending on which client request is used). These are used in various ways internally to Valgrind. They might be useful to client programs.

Warning: Only use these if you really know what you are doing. They aren't entirely reliable, and can cause Valgrind to crash. See valgrind.h for more details.

VALGRIND_PRINTF(format, ...):

Print a printf-style message to the Valgrind log file. The message is prefixed with the PID between a pair of ** markers. (Like all client requests, nothing is output if the client program is not running under Valgrind.) Output is not produced until a newline is encountered, or subsequent Valgrind output is printed; this allows you to build up a single line of output over multiple calls. Returns the number of characters output, excluding the PID prefix.

VALGRIND_PRINTF_BACKTRACE(format, ...):

Like VALGRIND_PRINTF (in particular, the return value is identical), but prints a stack backtrace immediately afterwards.

VALGRIND_MONITOR_COMMAND(command):

Execute the given monitor command (a string). Returns 0 if command is recognised. Returns 1 if command is not recognised. Note that some monitor commands provide access to a functionality also accessible via a specific client request. For example, a Memcheck leak search can be requested from the client program using VALGRIND_DO_LEAK_CHECK or via the monitor command 'leak_check'. Note that the syntax of the command string is only verified at run-time. So, if it exists, it is preferable to use a specific client request to have better compile time verifications of the arguments.

VALGRIND_CLO_CHANGE(option):

Changes the value of a dynamically changeable option (a string). See Dynamically Change Options.

VALGRIND_STACK_REGISTER(start, end):

Registers a new stack. Informs Valgrind that the memory range between start and end is a unique stack. Returns a stack identifier that can be used with other VALGRIND_STACK_* calls.

Valgrind will use this information to determine if a change to the stack pointer is an item pushed onto the stack or a change over to a new stack. Use this if you're using a user-level thread package and are noticing crashes in stack trace recording or spurious errors from Valgrind about uninitialized memory reads.

Warning: Unfortunately, this client request is unreliable and best avoided.

VALGRIND_STACK_DEREGISTER(id):

Deregisters a previously registered stack. Informs Valgrind that the previously registered memory range with stack id id is no longer a stack.

Warning: Unfortunately, this client request is unreliable and best avoided.

VALGRIND_STACK_CHANGE(id, start, end):

Changes a previously registered stack. Informs Valgrind that the previously registered stack with stack id id has changed its start and end values. Use this if your user-level thread package implements stack growth.

Warning: Unfortunately, this client request is unreliable and best avoided.

3.2. Debugging your program using Valgrind gdbserver and GDB

A program running under Valgrind is not executed directly by the CPU. Instead it runs on a synthetic CPU provided by Valgrind. This is why a debugger cannot debug your program when it runs on Valgrind.

This section describes how GDB can interact with the Valgrind gdbserver to provide a fully debuggable program under Valgrind. Used in this way, GDB also provides an interactive usage of Valgrind core or tool functionalities, including incremental leak search under Memcheck and on-demand Massif snapshot production.

3.2.1. Quick Start: debugging in 3 steps

The simplest way to get started is to run Valgrind with the flag --vgdb-error=0. Then follow the on-screen directions, which give you the precise commands needed to start GDB and connect it to your program.

Otherwise, here's a slightly more verbose overview.

If you want to debug a program with GDB when using the Memcheck tool, start Valgrind like this:

In another shell, start GDB:

Then give the following command to GDB:

You can now debug your program e.g. by inserting a breakpoint and then using the GDB continue command.
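The three steps can be sketched as shell functions, one per terminal. This is an illustration under stated assumptions rather than a verbatim copy of any official example: ./prog is a placeholder for your own executable and the function names are invented here. Passing the target command to GDB with -ex has the same effect as typing it at the (gdb) prompt.

```shell
# PROG is a placeholder for the program being debugged.
PROG=${PROG:-./prog}

# Step 1 (first shell): run the program under Memcheck with the
# gdbserver active before the first instruction is executed.
step1_valgrind() {
    valgrind --vgdb=yes --vgdb-error=0 "$PROG"
}

# Steps 2 and 3 (second shell): start GDB on the same executable and
# connect it to the Valgrind gdbserver through the vgdb relay.
step2_gdb() {
    gdb -ex 'target remote | vgdb' "$PROG"
}
```

Run step1_valgrind in one terminal and step2_gdb in another; once GDB is connected, set a breakpoint and issue continue.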

This quick start information is enough for basic usage of the Valgrind gdbserver. The sections below describe more advanced functionality provided by the combination of Valgrind and GDB. Note that the command line flag --vgdb=yes can be omitted, as this is the default value.

3.2.2. Valgrind gdbserver overall organisation

The GNU GDB debugger is typically used to debug a process running on the same machine. In this mode, GDB uses system calls to control and query the program being debugged. This works well, but only allows GDB to debug a program running on the same computer.

GDB can also debug processes running on a different computer. To achieve this, GDB defines a protocol (that is, a set of query and reply packets) that facilitates fetching the value of memory or registers, setting breakpoints, etc. A gdbserver is an implementation of this 'GDB remote debugging' protocol. To debug a process running on a remote computer, a gdbserver (sometimes called a GDB stub) must run at the remote computer side.

The Valgrind core provides a built-in gdbserver implementation, which is activated using --vgdb=yes or --vgdb=full. This gdbserver allows the process running on Valgrind's synthetic CPU to be debugged remotely. GDB sends protocol query packets (such as 'get register contents') to the Valgrind embedded gdbserver. The gdbserver executes the queries (for example, it will get the register values of the synthetic CPU) and gives the results back to GDB.

GDB can use various kinds of channels (TCP/IP, serial line, etc) to communicate with the gdbserver. In the case of Valgrind's gdbserver, communication is done via a pipe and a small helper program called vgdb, which acts as an intermediary. If no GDB is in use, vgdb can also be used to send monitor commands to the Valgrind gdbserver from a shell command line.

3.2.3. Connecting GDB to a Valgrind gdbserver

To debug a program 'prog' running under Valgrind, you must ensure that the Valgrind gdbserver is activated by specifying either --vgdb=yes or --vgdb=full. A secondary command line option, --vgdb-error=number, can be used to tell the gdbserver only to become active once the specified number of errors have been shown. A value of zero will therefore cause the gdbserver to become active at startup, which allows you to insert breakpoints before starting the run. For example:
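A minimal sketch of such an invocation follows. The function name and its optional argument are invented for the example; ./prog stands for your own program. Calling it without an argument uses --vgdb-error=0, so the gdbserver is active at startup.

```shell
# Start ./prog (placeholder) under Memcheck. The gdbserver becomes
# active once N errors have been shown; N defaults to 0 (at startup).
vg_with_gdbserver() {
    n=${1:-0}
    valgrind --tool=memcheck --vgdb=yes --vgdb-error="$n" ./prog
}
```

For instance, vg_with_gdbserver 3 lets the program run until the third reported error before the gdbserver waits for GDB.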

The Valgrind gdbserver is invoked at startup and indicates it is waiting for a connection from a GDB:

GDB (in another shell) can then be connected to the Valgrind gdbserver.For this, GDB must be started on the program prog:

You then indicate to GDB that you want to debug a remote target:

GDB then starts a vgdb relay application to communicate with the Valgrind embedded gdbserver:

Note that vgdb is provided as part of the Valgrind distribution. You do not need to install it separately.

If vgdb detects that there are multiple Valgrind gdbservers that can be connected to, it will list all such servers and their PIDs, and then exit. You can then reissue the GDB 'target' command, but specifying the PID of the process you want to debug:
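The GDB side of the connection can be sketched as below. The function names are invented, ./prog is a placeholder, and the PID passed to the second function is whatever vgdb listed for the process you want (2479 in the usage note is only an example value).

```shell
# Connect GDB to the only running Valgrind gdbserver.
gdb_connect() {
    gdb -ex 'target remote | vgdb' ./prog
}

# Connect GDB to one specific Valgrind process when several are
# running. $1 is the PID reported by vgdb.
gdb_connect_pid() {
    gdb -ex "target remote | vgdb --pid=$1" ./prog
}
```

For example, gdb_connect_pid 2479 starts GDB and issues target remote | vgdb --pid=2479.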

Once GDB is connected to the Valgrind gdbserver, it can be used in the same way as if you were debugging the program natively:

  • Breakpoints can be inserted or deleted.

  • Variables and register values can be examined or modified.

  • Signal handling can be configured (printing, ignoring).

  • Execution can be controlled (continue, step, next, stepi, etc).

  • Program execution can be interrupted using Control-C.

And so on. Refer to the GDB user manual for a complete description of GDB's functionality.

3.2.4. Connecting to an Android gdbserver

When developing applications for Android, you will typically use a development system (on which the Android NDK is installed) to compile your application. An Android target system or emulator will be used to run the application. In this setup, Valgrind and vgdb will run on the Android system, while GDB will run on the development system. GDB will connect to the vgdb running on the Android system using the Android NDK 'adb forward' application.

Example: on the Android system, execute the following:

On the development system, execute the following commands:
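Both halves of this setup can be sketched as follows, under a few assumptions: port 1234 is an arbitrary example, ./prog and prog are placeholders for the application on the device and its copy on the host, the function names are invented, and vgdb is told to listen on TCP with its --port option instead of using a pipe.

```shell
# Example port; any free port works.
PORT=${PORT:-1234}
# GDB from the Android NDK (name as mentioned in this section).
NDK_GDB=${NDK_GDB:-arm-linux-androideabi-gdb}

# On the Android system: run the program under Valgrind, then, in
# another shell on the device, let vgdb listen for GDB on a TCP port.
android_valgrind() { valgrind --vgdb-error=0 --vgdb=yes ./prog; }
android_vgdb()     { vgdb --port="$PORT"; }

# On the development system: forward the port over adb, then point
# the NDK's GDB at the forwarded local port.
host_forward() { adb forward "tcp:$PORT" "tcp:$PORT"; }
host_gdb()     { "$NDK_GDB" -ex "target remote :$PORT" prog; }
```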

GDB will use a local tcp/ip connection to connect to the Android adb forwarder. Adb will establish a relay connection between the host system and the Android target system. Be sure to use the GDB delivered in the Android NDK system (typically, arm-linux-androideabi-gdb), as the host GDB is probably not able to debug Android arm applications. Note that the local port number (used by GDB) need not be equal to the port number used by vgdb: adb can forward tcp/ip between different port numbers.

In the current release, the GDB server is not enabled by default for Android, due to problems in establishing a suitable directory in which Valgrind can create the necessary FIFOs (named pipes) for communication purposes. You can still try to use the GDB server, but you will need to explicitly enable it using the flag --vgdb=yes or --vgdb=full.

Additionally, you will need to select a temporary directory which is (a) writable by Valgrind, and (b) supports FIFOs. This is the main difficult point. Often, /sdcard satisfies requirement (a), but fails for (b) because it is a VFAT file system and VFAT does not support pipes. Possibilities you could try are /data/local, /data/local/Inst (if you installed Valgrind there), or /data/data/name.of.my.app, if you are running a specific application and it has its own directory of that form. This last possibility may have the highest probability of success.

You can specify the temporary directory to use either via the --with-tmpdir= configure time flag, or by setting environment variable TMPDIR when running Valgrind (on the Android device, not on the Android NDK development host). Another alternative is to specify the directory for the FIFOs using the --vgdb-prefix= Valgrind command line option.

We hope to have a better story for temporary directory handling on Android in the future. The difficulty is that, unlike in standard Unixes, there is no single temporary file directory that reliably works across all devices and scenarios.

3.2.5. Monitor command handling by the Valgrind gdbserver

The Valgrind gdbserver provides additional Valgrind-specific functionality via 'monitor commands'. Such monitor commands can be sent from the GDB command line or from the shell command line or requested by the client program using the VALGRIND_MONITOR_COMMAND client request. See Valgrind monitor commands for the list of the Valgrind core monitor commands available regardless of the Valgrind tool selected.

The following tools provide tool-specific monitor commands:

An example of a tool specific monitor command is the Memcheck monitor command leak_check full reachable any. This requests a full reporting of the allocated memory blocks. To have this leak check executed, use the GDB command:

GDB will send the leak_check command to the Valgrind gdbserver. The Valgrind gdbserver will execute the monitor command itself, if it recognises it to be a Valgrind core monitor command. If it is not recognised as such, it is assumed to be tool-specific and is handed to the tool for execution. For example:

As with other GDB commands, the Valgrind gdbserver will accept abbreviated monitor command names and arguments, as long as the given abbreviation is unambiguous. For example, the above leak_check command can also be typed as:
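The full and abbreviated forms can be put side by side in a small GDB command file, as sketched below. The temporary file, the variable CMDS and the function name are assumptions made for the example, as is the placeholder ./prog; the monitor lines are the commands discussed in the text. Each line of the file is what you would otherwise type at the (gdb) prompt.

```shell
# Write a small GDB command file (a temporary file; the name is arbitrary).
CMDS=$(mktemp)
cat > "$CMDS" <<'EOF'
target remote | vgdb
monitor leak_check full reachable any
mo l f r a
EOF

# Start GDB on the placeholder ./prog and run the commands from the file.
gdb_leak_check() {
    gdb -x "$CMDS" ./prog
}
```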

The letters mo are recognised by GDB as being an abbreviation for monitor. So GDB sends the string l f r a to the Valgrind gdbserver. The letters provided in this string are unambiguous for the Valgrind gdbserver. This therefore gives the same output as the unabbreviated command and arguments. If the provided abbreviation is ambiguous, the Valgrind gdbserver will report the list of commands (or argument values) that can match:

Instead of sending a monitor command from GDB, you can also send these from a shell command line. For example, the following command lines, when given in a shell, will cause the same leak search to be executed by the process 3145:
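A sketch of the two equivalent shell invocations, wrapped in functions whose names are invented here; 3145 is the example PID used in the text.

```shell
# Full command and argument names.
leak_search_full()  { vgdb --pid=3145 leak_check full reachable any; }

# Same request, using unambiguous abbreviations.
leak_search_short() { vgdb --pid=3145 l f r a; }
```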

Note that the Valgrind gdbserver automatically continues the execution of the program after a standalone invocation of vgdb. Monitor commands sent from GDB do not cause the program to continue: the program execution is controlled explicitly using GDB commands such as 'continue' or 'next'.

Many monitor commands (e.g. v.info location, memcheck who_points_at, ...) require an address argument and an optional length: <addr> [<len>]. The arguments can also be provided by using a 'C array like syntax' by providing the address followed by the length between square brackets.

For example, the following two monitor commands provide the same information:
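The two spellings can be sketched with Memcheck's xb monitor command, which takes <addr> [<len>]. The PID and address below are example values only, and the function names are invented; substitute a PID and an address from your own session.

```shell
PID=${PID:-3145}          # example PID of the Valgrind process
ADDR=${ADDR:-0x804E728}   # example address

# Address and length given as two separate arguments.
xb_two_args() { vgdb --pid="$PID" xb "$ADDR" 10; }

# Same request using the array-like syntax. The argument is quoted so
# the shell does not treat the square brackets as a glob pattern.
xb_array()    { vgdb --pid="$PID" xb "${ADDR}[10]"; }
```

From the GDB prompt the same pair would be written as monitor xb followed by either form of the arguments.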

3.2.6. Valgrind gdbserver thread information

Valgrind's gdbserver enriches the output of the GDB info threads command with Valgrind-specific information. The operating system's thread number is followed by Valgrind's internal index for that thread ('tid') and by the Valgrind scheduler thread state:

3.2.7. Examining and modifying Valgrind shadow registers

When the option --vgdb-shadow-registers=yes is given, the Valgrind gdbserver will let GDB examine and/or modify Valgrind's shadow registers. GDB version 7.1 or later is needed for this to work. For x86 and amd64, GDB version 7.2 or later is needed.

For each CPU register, the Valgrind core maintains two shadow register sets. These shadow registers can be accessed from GDB by giving a postfix s1 or s2 for respectively the first and second shadow register. For example, the x86 register eax and its two shadows can be examined using the following commands:
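A sketch of both sides follows. The register names come from the text; the function names, the temporary command file and the placeholder ./prog are assumptions made for the example. The p lines are what you would type at the (gdb) prompt.

```shell
# Valgrind side: expose the shadow registers to GDB.
vg_shadow() {
    valgrind --vgdb=yes --vgdb-error=0 --vgdb-shadow-registers=yes ./prog
}

# GDB side: print eax and its two shadow registers (x86 names).
SHADOW_CMDS=$(mktemp)
cat > "$SHADOW_CMDS" <<'EOF'
target remote | vgdb
p $eax
p $eaxs1
p $eaxs2
EOF

gdb_shadow() {
    gdb -x "$SHADOW_CMDS" ./prog
}
```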

Float shadow registers are shown by GDB as unsigned integer values instead of float values, as it is expected that these shadow values are mostly used for memcheck validity bits.

Intel/amd64 AVX registers ymm0 to ymm15 also have their shadow registers. However, GDB presents the shadow values using two 'half' registers. For example, the half shadow registers for ymm9 are xmm9s1 (lower half for set 1), ymm9hs1 (upper half for set 1), xmm9s2 (lower half for set 2), ymm9hs2 (upper half for set 2). Note the inconsistent notation for the names of the half registers: the lower part starts with an x, the upper part starts with a y and has an h before the shadow postfix.

The special presentation of the AVX shadow registers is due to the fact that GDB independently retrieves the lower and upper half of the ymm registers. GDB does not however know that the shadow half registers have to be shown combined.

3.2.8. Limitations of the Valgrind gdbserver

Debugging with the Valgrind gdbserver is very similar to native debugging. Valgrind's gdbserver implementation is quite complete, and so provides most of the GDB debugging functionality. There are however some limitations and peculiarities:

  • Precision of 'stop-at' commands.

    GDB commands such as 'step', 'next', 'stepi', breakpoints and watchpoints, will stop the execution of the process. With the option --vgdb=yes, the process might not stop at the exact requested instruction. Instead, it might continue execution of the current basic block and stop at one of the following basic blocks. This is linked to the fact that Valgrind gdbserver has to instrument a block to allow stopping at the exact instruction requested. Currently, re-instrumentation of the block currently being executed is not supported. So, if the action requested by GDB (e.g. single stepping or inserting a breakpoint) implies re-instrumentation of the current block, the GDB action may not be executed precisely.

    This limitation applies when the basic block currently being executed has not yet been instrumented for debugging. This typically happens when the gdbserver is activated due to the tool reporting an error or to a watchpoint. If the gdbserver block has been activated following a breakpoint, or if a breakpoint has been inserted in the block before its execution, then the block has already been instrumented for debugging.

    If you use the option --vgdb=full, then GDB 'stop-at' commands will be obeyed precisely. The downside is that this requires each instruction to be instrumented with an additional call to a gdbserver helper function, which gives considerable overhead (+500% for memcheck) compared to --vgdb=no. Option --vgdb=yes has negligible overhead compared to --vgdb=no.

  • Processor registers and flags values.

    When Valgrind gdbserver stops on an error, on a breakpoint or when single stepping, registers and flags values might not be always up to date due to the optimisations done by the Valgrind core. The default value --vex-iropt-register-updates=unwindregs-at-mem-access ensures that the registers needed to make a stack trace (typically PC/SP/FP) are up to date at each memory access (i.e. memory exception points). Disabling some optimisations using the following values will increase the precision of registers and flags values (a typical performance impact for memcheck is given for each option).

    • --vex-iropt-register-updates=allregs-at-mem-access (+10%) ensures that all registers and flags are up to date at each memory access.

    • --vex-iropt-register-updates=allregs-at-each-insn (+25%) ensures that all registers and flags are up to date at each instruction.

    Note that --vgdb=full (+500%, see above Precision of 'stop-at' commands) automatically activates --vex-iropt-register-updates=allregs-at-each-insn.

  • Hardware watchpoint support by the Valgrind gdbserver.

    The Valgrind gdbserver can simulate hardware watchpoints if the selected tool provides support for it. Currently, only Memcheck provides hardware watchpoint simulation. The hardware watchpoint simulation provided by Memcheck is much faster than GDB software watchpoints, which are implemented by GDB checking the value of the watched zone(s) after each instruction. Hardware watchpoint simulation also provides read watchpoints. The hardware watchpoint simulation by Memcheck has some limitations compared to real hardware watchpoints. However, the number and length of simulated watchpoints are not limited.

    Typically, the number of (real) hardware watchpoints is limited. For example, the x86 architecture supports a maximum of 4 hardware watchpoints, each watchpoint watching 1, 2, 4 or 8 bytes. The Valgrind gdbserver does not have any limitation on the number of simulated hardware watchpoints. It also has no limitation on the length of the memory zone being watched. Using GDB version 7.4 or later allows full use of the flexibility of the Valgrind gdbserver's simulated hardware watchpoints. Previous GDB versions do not understand that Valgrind gdbserver watchpoints have no length limit.

    Memcheck implements hardware watchpoint simulation by marking the watched address ranges as being unaddressable. When a hardware watchpoint is removed, the range is marked as addressable and defined. Hardware watchpoint simulation of addressable-but-undefined memory zones works properly, but has the undesirable side effect of marking the zone as defined when the watchpoint is removed.

    Write watchpoints might not be reported at the exact instruction that writes the monitored area, unless option --vgdb=full is given. Read watchpoints will always be reported at the exact instruction reading the watched memory.

    It is better to avoid setting hardware watchpoints on memory that is not (yet) addressable: in such a case, GDB will fall back to extremely slow software watchpoints. Also, if you do not quit GDB between two debugging sessions, the hardware watchpoints of the previous sessions will be re-inserted as software watchpoints if the watched memory zone is not addressable at program startup.

  • Stepping inside shared libraries on ARM.

    For unknown reasons, stepping inside shared libraries on ARM may fail. A workaround is to use the ldd command to find the list of shared libraries and their loading address and inform GDB of the loading address using the GDB command 'add-symbol-file'. Example:

  • GDB version needed for ARM and PPC32/64.

    You must use a GDB version which is able to read XML target description sent by a gdbserver. This is the standard setup if GDB was configured and built with the 'expat' library. If your GDB was not configured with XML support, it will report an error message when using the 'target' command. Debugging will not work because GDB will then not be able to fetch the registers from the Valgrind gdbserver. For ARM programs using the Thumb instruction set, you must use a GDB version of 7.1 or later, as earlier versions have problems with next/step/breakpoints in Thumb code.

  • Stack unwinding on PPC32/PPC64.

    On PPC32/PPC64, stack unwinding for leaf functions (functions that do not call any other functions) works properly only when you give the option --vex-iropt-register-updates=allregs-at-mem-access or --vex-iropt-register-updates=allregs-at-each-insn. You must also pass this option in order to get a precise stack when a signal is trapped by GDB.

  • Breakpoints encountered multiple times.

    Some instructions (e.g. x86 'rep movsb') are translated by Valgrind using a loop. If a breakpoint is placed on such an instruction, the breakpoint will be encountered multiple times -- once for each step of the 'implicit' loop implementing the instruction.

  • Execution of Inferior function calls by the Valgrind gdbserver.

    GDB allows the user to 'call' functions inside the process being debugged. Such calls are named 'inferior calls' in the GDB terminology. A typical use of an inferior call is to execute a function that prints a human-readable version of a complex data structure. To make an inferior call, use the GDB 'print' command followed by the function to call and its arguments. As an example, the following GDB command causes an inferior call to the libc 'printf' function to be executed by the process being debugged:

    The Valgrind gdbserver supports inferior function calls. Whilst an inferior call is running, the Valgrind tool will report errors as usual. If you do not want to have such errors stop the execution of the inferior call, you can use v.set vgdb-error to set a big value before the call, then manually reset it to its original value when the call is complete.

    To execute inferior calls, GDB changes registers such as the program counter, and then continues the execution of the program. In a multithreaded program, all threads are continued, not just the thread instructed to make the inferior call. If another thread reports an error or encounters a breakpoint, the evaluation of the inferior call is abandoned.

    Note that inferior function calls are a powerful GDB feature, but should be used with caution. For example, if the program being debugged is stopped inside the function 'printf', forcing a recursive call to printf via an inferior call will very probably create problems. The Valgrind tool might also add another level of complexity to inferior calls, e.g. by reporting tool errors during the inferior call or due to the instrumentation done.

  • Connecting to or interrupting a Valgrind process blocked in a system call.

    Connecting to or interrupting a Valgrind process blocked in a system call requires the 'ptrace' system call to be usable. This may be disabled in your kernel for security reasons.

    When running your program, Valgrind's scheduler periodically checks whether there is any work to be handled by the gdbserver. Unfortunately this check is only done if at least one thread of the process is runnable. If all the threads of the process are blocked in a system call, then the checks do not happen, and the Valgrind scheduler will not invoke the gdbserver. In such a case, the vgdb relay application will 'force' the gdbserver to be invoked, without the intervention of the Valgrind scheduler.

    Such forced invocation of the Valgrind gdbserver is implemented by vgdb using ptrace system calls. On a properly implemented kernel, the ptrace calls done by vgdb will not influence the behaviour of the program running under Valgrind. If however they do, giving the option --max-invoke-ms=0 to the vgdb relay application will disable the usage of ptrace calls. The consequence of disabling ptrace usage in vgdb is that a Valgrind process blocked in a system call cannot be woken up or interrupted from GDB until it executes enough basic blocks to let the Valgrind scheduler's normal checking take effect.

    When ptrace is disabled in vgdb, you can increase the responsiveness of the Valgrind gdbserver to commands or interrupts by giving a lower value to the option --vgdb-poll. If your application is blocked in system calls most of the time, using a very low value for --vgdb-poll will cause the gdbserver to be invoked sooner. The gdbserver polling done by Valgrind's scheduler is very efficient, so the increased polling frequency should not cause significant performance degradation.

    When ptrace is disabled in vgdb, a query packet sent by GDB may take significant time to be handled by the Valgrind gdbserver. In such cases, GDB might encounter a protocol timeout. To avoid this, you can increase the value of the timeout by using the GDB command 'set remotetimeout'.

    Ubuntu versions 10.10 and later may restrict the scope of ptrace to the children of the process calling ptrace. As the Valgrind process is not a child of vgdb, such restricted scoping causes the ptrace calls to fail. To avoid that, Valgrind will automatically allow all processes belonging to the same userid to 'ptrace' a Valgrind process, by using PR_SET_PTRACER.

    Unblocking processes blocked in system calls is not currently implemented on Mac OS X and Android. So you cannot connect to or interrupt a process blocked in a system call on Mac OS X or Android.

    Unblocking processes blocked in system calls is implemented via an agent thread on Solaris. This is quite a different approach from using ptrace on Linux, but leads to an equivalent result: the Valgrind gdbserver is invoked. Note that the agent thread is a Solaris OS feature and cannot be disabled.

  • Changing register values.

    The Valgrind gdbserver will only modify the values of the thread's registers when the thread is in status Runnable or Yielding. In other states (typically, WaitSys), attempts to change register values will fail. Amongst other things, this means that inferior calls are not executed for a thread which is in a system call, since the Valgrind gdbserver does not implement system call restart.

  • Unsupported GDB functionality.

    GDB provides a lot of debugging functionality and not all of it is supported. Specifically, the following are not supported: reversible debugging and tracepoints.

  • Unknown limitations or problems.

    The combination of GDB, Valgrind and the Valgrind gdbserver probably has unknown other limitations and problems. If you encounter strange or unexpected behaviour, feel free to report a bug. But first please verify that the limitation or problem is not inherent to GDB or the GDB remote protocol. You may be able to do so by checking the behaviour when using standard gdbserver part of the GDB package.

Usage: vgdb [OPTION]... [[-c] COMMAND]...

vgdb ('Valgrind to GDB') is a small program that is used as an intermediary between Valgrind and GDB or a shell. Therefore, it has two usage modes:

  1. As a standalone utility, it is used from a shell command line to send monitor commands to a process running under Valgrind. For this usage, the vgdb OPTION(s) must be followed by the monitor command to send. To send more than one command, separate them with the -c option.

  2. In combination with GDB 'target remote |' command, it is used as the relay application between GDB and the Valgrind gdbserver. For this usage, only OPTION(s) can be given, but no COMMAND can be given.

vgdb accepts the following options:

--pid=<number>

Specifies the PID of the process to which vgdb must connect. This option is useful in case more than one Valgrind gdbserver can be connected to. If the --pid argument is not given and multiple Valgrind gdbserver processes are running, vgdb will report the list of such processes and then exit.

--vgdb-prefix

Must be given to both Valgrind and vgdb if you want to change the default prefix for the FIFOs (named pipes) used for communication between the Valgrind gdbserver and vgdb.

--wait=<number>

Instructs vgdb to search for available Valgrind gdbservers for the specified number of seconds. This makes it possible to start a vgdb process before starting the Valgrind gdbserver with which you intend the vgdb to communicate. This option is useful when used in conjunction with a --vgdb-prefix that is unique to the process you want to wait for. Also, if you use the --wait argument in the GDB 'target remote' command, you must set the GDB remotetimeout to a value bigger than the --wait argument value. See option --max-invoke-ms (just below) for an example of setting the remotetimeout value.

--max-invoke-ms=<number>

Gives the number of milliseconds after which vgdb will force the invocation of gdbserver embedded in Valgrind. The default value is 100 milliseconds. A value of 0 disables forced invocation. The forced invocation is used when vgdb is connected to a Valgrind gdbserver, and the Valgrind process has all its threads blocked in a system call.

If you specify a large value, you might need to increase the GDB 'remotetimeout' value from its default value of 2 seconds. You should ensure that the timeout (in seconds) is bigger than the --max-invoke-ms value. For example, for --max-invoke-ms=5000, the GDB command set remotetimeout 6 is suitable.
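The relationship between the two values is plain unit arithmetic: --max-invoke-ms is given in milliseconds, the GDB remotetimeout in seconds. As a minimal sketch (the helper name min_remotetimeout_secs is hypothetical and is not part of Valgrind or GDB), the smallest whole-second timeout that is strictly bigger than a given --max-invoke-ms value can be computed like this:

```c
/* Hypothetical helper, not part of Valgrind or GDB: returns the smallest
   whole number of seconds that is strictly bigger than max_invoke_ms
   milliseconds. */
unsigned long min_remotetimeout_secs(unsigned long max_invoke_ms)
{
    /* Integer division rounds down, so adding 1 always lands above. */
    return max_invoke_ms / 1000UL + 1UL;
}
```

For --max-invoke-ms=5000 this gives 6, so any remotetimeout of 6 seconds or more satisfies the rule.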

--cmd-time-out=<number>

Instructs a standalone vgdb to exit if the Valgrind gdbserver it is connected to does not process a command in the specified number of seconds. The default value is to never time out.

--port=<portnr>

Instructs vgdb to use tcp/ip and listen for GDB on the specified port nr rather than to use a pipe to communicate with GDB. Using tcp/ip allows GDB to run on one computer while debugging a Valgrind process running on another target computer. Example: on the target computer, start vgdb with vgdb --port=1234; then, on the computer which hosts GDB, execute the GDB command target remote targetip:1234, where targetip is the ip address or hostname of the target computer.

-c

To give more than one command to a standalone vgdb, separate the commands by an option -c. Example: vgdb v.set log_output -c leak_check any

-l

Instructs a standalone vgdb to report the list of the Valgrind gdbserver processes running and then exit.

-T

Instructs vgdb to add timestamps to vgdb information messages.

-D

Instructs a standalone vgdb to show the state of the shared memory used by the Valgrind gdbserver. vgdb will exit after having shown the Valgrind gdbserver shared memory state.

-d

Instructs vgdb to produce debugging output. Give multiple -d args to increase the verbosity. When giving -d to a relay vgdb, you should redirect the standard error (stderr) of vgdb to a file to avoid interaction between GDB and vgdb debugging output.

This section describes the Valgrind monitor commands, available regardless of the Valgrind tool selected. For the tool specific commands, refer to Memcheck Monitor Commands, Helgrind Monitor Commands, Callgrind Monitor Commands and Massif Monitor Commands.

The monitor commands can be sent either from a shell command line, by using a standalone vgdb, or from GDB, by using GDB's 'monitor' command (see Monitor command handling by the Valgrind gdbserver). They can also be launched by the client program, using the VALGRIND_MONITOR_COMMAND client request.

  • help [debug] instructs Valgrind's gdbserver to give the list of all monitor commands of the Valgrind core and of the tool. The optional 'debug' argument tells to also give help for the monitor commands aimed at Valgrind internals debugging.

  • v.info all_errors shows all errors found so far.

  • v.info last_error shows the last error found.

  • v.info location <addr> outputs information about the location <addr>. Possibly, the following are described: global variables, local (stack) variables, allocated or freed blocks, ... The information produced depends on the tool and on the options given to valgrind. Some tools (e.g. memcheck and helgrind) produce more detailed information for client heap blocks. For example, these tools show the stacktrace where the heap block was allocated. If a tool does not replace the malloc/free/... functions, then client heap blocks will not be described. Use the option --read-var-info=yes to obtain more detailed information about global or local (stack) variables.

  • v.info n_errs_found [msg] shows the number of errors found so far, the nr of errors shown so far and the current value of the --vgdb-error argument. The optional msg (one or more words) is appended. Typically, this can be used to insert markers in a process output file between several tests executed in sequence by a process started only once. This allows to associate the errors reported by Valgrind with the specific test that produced these errors.

  • v.info open_fds shows the list of open file descriptors and details related to the file descriptor. This only works if --track-fds=yes or --track-fds=all (to include stdin, stdout and stderr) was given at Valgrind startup.

  • v.clo <clo_option>... changes one or more dynamic command line options. If no clo_option is given, lists the dynamically changeable options. See Dynamically Change Options.

  • v.set {gdb_output | log_output | mixed_output} allows redirection of the Valgrind output (e.g. the errors detected by the tool). The default setting is mixed_output.

    With mixed_output, the Valgrind output goes to the Valgrind log (typically stderr) while the output of the interactive GDB monitor commands (e.g. v.info last_error) is displayed by GDB.

    With gdb_output, both the Valgrind output and the interactive GDB monitor commands output are displayed by GDB.

    With log_output, both the Valgrind output and the interactive GDB monitor commands output go to the Valgrind log.

  • v.wait [ms (default 0)] instructs Valgrind gdbserver to sleep 'ms' milli-seconds and then continue. When sent from a standalone vgdb, if this is the last command, the Valgrind process will continue the execution of the guest process. The typical usage of this is to use vgdb to send a 'no-op' command to a Valgrind gdbserver so as to continue the execution of the guest process.

  • v.kill requests the gdbserver to kill the process. This can be used from a standalone vgdb to properly kill a Valgrind process which is currently expecting a vgdb connection.

  • v.set vgdb-error <errornr> dynamically changes the value of the --vgdb-error argument. A typical usage of this is to start with --vgdb-error=0 on the command line, then set a few breakpoints, set the vgdb-error value to a huge value and continue execution.

  • xtmemory [<filename> default xtmemory.kcg.%p.%n] requests the tool (Memcheck, Massif, Helgrind) to produce an xtree heap memory report. See Execution Trees for a detailed explanation about execution trees.
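The marker technique described for v.info n_errs_found can also be driven from inside the client program, through the VALGRIND_MONITOR_COMMAND client request. The following is a minimal sketch, not taken from the Valgrind sources: the helper names format_marker and mark_test are hypothetical, and the header location <valgrind/valgrind.h> is an assumption that varies between installations. When the header is not found at build time, the request is simply skipped.

```c
#include <stdio.h>

/* Assumption: valgrind.h is installed as <valgrind/valgrind.h>. */
#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#  endif
#endif

/* Hypothetical helper: writes "v.info n_errs_found <test_name>" into buf.
   Returns the snprintf result, i.e. the length the full command needs. */
int format_marker(char *buf, size_t size, const char *test_name)
{
    return snprintf(buf, size, "v.info n_errs_found %s", test_name);
}

/* Hypothetical helper: asks Valgrind to report its error counts followed
   by test_name. It does nothing if valgrind.h was not available when
   compiling. */
void mark_test(const char *test_name)
{
    char cmd[128];
    int len = format_marker(cmd, sizeof cmd, test_name);

    if (len < 0 || (size_t)len >= sizeof cmd)
        return;                 /* name too long: skip the marker */
#ifdef VALGRIND_MONITOR_COMMAND
    (void) VALGRIND_MONITOR_COMMAND(cmd);
#endif
}
```

A test driver would call mark_test("test1") after the first test, mark_test("test2") after the second, and so on, so that the errors reported in the Valgrind output can be associated with the test that produced them.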

The following Valgrind monitor commands are useful for investigating the behaviour of Valgrind or its gdbserver in case of problems or bugs.

  • v.do expensive_sanity_check_general executes various sanity checks. In particular, the sanity of the Valgrind heap is verified. This can be useful if you suspect that your program and/or Valgrind has a bug corrupting Valgrind data structure. It can also be used when a Valgrind tool reports a client error to the connected GDB, in order to verify the sanity of Valgrind before continuing the execution.

  • v.info gdbserver_status shows the gdbserver status. In case of problems (e.g. of communications), this shows the values of some relevant Valgrind gdbserver internal variables. Note that the variables related to breakpoints and watchpoints (e.g. the number of breakpoint addresses and the number of watchpoints) will be zero, as GDB by default removes all watchpoints and breakpoints when execution stops, and re-inserts them when resuming the execution of the debugged process. You can change this GDB behaviour by using the GDB command set breakpoint always-inserted on.

  • v.info memory [aspacemgr] shows the statistics of Valgrind's internal heap management. If option --profile-heap=yes was given, detailed statistics will be output. With the optional argument aspacemgr, the segment list maintained by the Valgrind address space manager will be output. Note that this list of segments is always output on the Valgrind log.

  • v.info exectxt shows information about the 'executable contexts' (i.e. the stack traces) recorded by Valgrind. For some programs, Valgrind can record a very high number of such stack traces, causing a high memory usage. This monitor command shows all the recorded stack traces, followed by some statistics. This can be used to analyse the reason for having a big number of stack traces. Typically, you will use this command if v.info memory has shown significant memory usage by the 'exectxt' arena.

  • v.info scheduler shows various information about threads. First, it outputs the host stack trace, i.e. the Valgrind code being executed. Then, for each thread, it outputs the thread state. For non terminated threads, the state is followed by the guest (client) stack trace. Finally, for each active thread or for each terminated thread slot not yet re-used, it shows the max usage of the valgrind stack.

    Showing the client stack traces allows to compare the stack traces produced by the Valgrind unwinder with the stack traces produced by GDB+Valgrind gdbserver. Pay attention that GDB and Valgrind scheduler status have their own thread numbering scheme. To make the link between the GDB thread number and the corresponding Valgrind scheduler thread number, use the GDB command info threads. The output of this command shows the GDB thread number and the valgrind 'tid'. The 'tid' is the thread number output by v.info scheduler. When using the callgrind tool, the callgrind monitor command status outputs internal callgrind information about the stack/call graph it maintains.

  • v.info stats shows various valgrind core and tool statistics. With this, Valgrind and tool statistics can be examined while running, even without option --stats=yes.

  • v.info unwind <addr> [<len>] shows the CFI unwind debug info for the address range [addr, addr+len-1]. The default value of <len> is 1, giving the unwind information for the instruction at <addr>.

  • v.set debuglog <intvalue> sets the Valgrind debug log level to <intvalue>. This allows to dynamically change the log level of Valgrind e.g. when a problem is detected.

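  • v.set hostvisibility [yes*|no] The value 'yes' indicates to gdbserver that GDB can look at the Valgrind 'host' (internal) status/memory. 'no' disables this access. When hostvisibility is activated, GDB can e.g. look at Valgrind global variables. As an example, to examine a Valgrind global variable of the memcheck tool on an x86, first enable host visibility with monitor v.set hostvisibility yes, then load the symbols of the tool executable memcheck-x86-linux into GDB with the add-symbol-file command, giving the path to the executable and the address at which its text is loaded.

    After that, variables defined in memcheck-x86-linux can be accessed with the usual GDB commands, e.g. print.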

  • v.translate <address> [<traceflags>] shows the translation of the block containing address with the given trace flags. The traceflags value bit patterns have similar meaning to Valgrind's --trace-flags option. It can be given in hexadecimal (e.g. 0x20) or decimal (e.g. 32) or in binary 1s and 0s bit (e.g. 0b00100000). The default value of the traceflags is 0b00100000, corresponding to 'show after instrumentation'. The output of this command always goes to the Valgrind log.

    The additional bit flag 0b100000000 (bit 8) has no equivalent in the --trace-flags option. It enables tracing of the gdbserver specific instrumentation. Note that this bit 8 can only enable the addition of gdbserver instrumentation in the trace. Setting it to 0 will not disable the tracing of the gdbserver instrumentation if it is active for some other reason, for example because there is a breakpoint at this address or because gdbserver is in single stepping mode.
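The three notations accepted for <traceflags> by v.translate denote the same bit pattern. As an illustration only (this is not Valgrind's own parser, and parse_traceflags is a hypothetical name), the following C sketch converts each notation to its numeric value:

```c
#include <stdlib.h>

/* Hypothetical helper: parses text such as "0x20", "32" or "0b00100000".
   Returns the value, or -1 if the text is not a valid number. */
long parse_traceflags(const char *text)
{
    char *end;
    long value;

    if (text[0] == '0' && (text[1] == 'b' || text[1] == 'B')) {
        /* Binary: skip the "0b" prefix by hand, then parse in base 2. */
        if (text[2] == '\0')
            return -1;
        value = strtol(text + 2, &end, 2);
    } else {
        /* Base 0 lets strtol recognise a "0x" prefix, otherwise decimal.
           Note that base 0 also treats a plain leading "0" as octal. */
        value = strtol(text, &end, 0);
    }
    /* Reject empty input and trailing garbage. */
    return (end != text && *end == '\0') ? value : -1;
}
```

With this helper, 0x20, 32 and 0b00100000 all yield 32, and the gdbserver-specific bit 8, 0b100000000, is 256 (0x100).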

Valgrind allows calls to some specified functions to be intercepted and rerouted to a different, user-supplied function. This can do whatever it likes, typically examining the arguments, calling onwards to the original, and possibly examining the result. Any number of functions may be wrapped.

Function wrapping is useful for instrumenting an API in some way. For example, Helgrind wraps functions in the POSIX pthreads API so it can know about thread status changes, and the core is able to wrap functions in the MPI (message-passing) API so it can know of memory status changes associated with message arrival/departure. Such information is usually passed to Valgrind by using client requests in the wrapper functions, although the exact mechanism may vary.

Suppose we want to wrap some function, for example int foo(int x, int y) { return x + y; }.

A wrapper is a function of identical type, but with a special name which identifies it as the wrapper for foo. Wrappers need to include supporting macros from valgrind.h. Here is a simple wrapper which prints the arguments and return value:
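A minimal sketch of such a wrapper follows. It is reconstructed from the description in this section, under two assumptions: foo is a two-argument int function, and valgrind.h is installed as <valgrind/valgrind.h> (the location varies between systems). If the header or its wrapping macros are unavailable, only foo itself is compiled.

```c
#include <stdio.h>

/* Assumption: valgrind.h is installed as <valgrind/valgrind.h>. */
#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#  endif
#endif

/* The function to be wrapped (example function, assumed for this sketch). */
int foo(int x, int y)
{
    return x + y;
}

#if defined(I_WRAP_SONAME_FNNAME_ZU) && defined(CALL_FN_W_WW)
/* Wrapper for foo in any object whose soname is NONE, e.g. the main
   executable. Valgrind recognises it by its encoded symbol name. */
int I_WRAP_SONAME_FNNAME_ZU(NONE, foo)(int x, int y)
{
    int    result;
    OrigFn fn;

    /* Fetch the original's address before calling anything else wrapped. */
    VALGRIND_GET_ORIG_FN(fn);
    printf("foo's wrapper: args %d %d\n", x, y);
    /* Call the original without being redirected back to this wrapper. */
    CALL_FN_W_WW(result, fn, x, y);
    printf("foo's wrapper: result %d\n", result);
    return result;
}
#endif
```

When the program runs natively, the wrapper is never called and foo behaves as usual. Under Valgrind, calls to foo are expected to go through the wrapper, which prints the arguments and the result around the original call.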

To become active, the wrapper merely needs to be present in a text section somewhere in the same process' address space as the function it wraps, and for its ELF symbol name to be visible to Valgrind. In practice, this means either compiling to a .o and linking it in, or compiling to a .so and LD_PRELOADing it in. The latter is more convenient in that it doesn't require relinking.

All wrappers have approximately the above form. There are three crucial macros:

I_WRAP_SONAME_FNNAME_ZU: this generates the real name of the wrapper. This is an encoded name which Valgrind notices when reading symbol table information. What it says is: I am the wrapper for any function named foo which is found in an ELF shared object with an empty ('NONE') soname field. The specification mechanism is powerful in that wildcards are allowed for both sonames and function names. The details are discussed below.

VALGRIND_GET_ORIG_FN: once in the wrapper, the first priority is to get hold of the address of the original (and any other supporting information needed). This is stored in a value of opaque type OrigFn. The information is acquired using VALGRIND_GET_ORIG_FN. It is crucial to make this macro call before calling any other wrapped function in the same thread.

CALL_FN_W_WW: eventually we will want to call the function being wrapped. Calling it directly does not work, since that just gets us back to the wrapper and leads to an infinite loop. Instead, the result lvalue, OrigFn and arguments are handed to one of a family of macros of the form CALL_FN_*. These cause Valgrind to call the original and avoid recursion back to the wrapper.

This scheme has the advantage of being self-contained. A library of wrappers can be compiled to object code in the normal way, and does not rely on an external script telling Valgrind which wrappers pertain to which originals.

Each wrapper has a name which, in the most general case says: I am the wrapper for any function whose name matches FNPATT and whose ELF 'soname' matches SOPATT. Both FNPATT and SOPATT may contain wildcards (asterisks) and other characters (spaces, dots, @, etc) which are not generally regarded as valid C identifier names.

This flexibility is needed to write robust wrappers for POSIX pthread functions, where typically we are not completely sure of either the function name or the soname, or alternatively we want to wrap a whole set of functions at once.

For example, pthread_create in GNU libpthread is usually a versioned symbol - one whose name ends in, eg, @GLIBC_2.3. Hence we are not sure what its real name is. We also want to cover any soname of the form libpthread.so*. So the header of the wrapper will be int I_WRAP_SONAME_FNNAME_ZZ(libpthreadZdsoZd0,pthreadZucreateZAZa)( ... formals ... ) { ... body ... }

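In order to write unusual characters as valid C function names, a Z-encoding scheme is used. Names are written literally, except that a capital Z acts as an escape character, with the following encoding: Za encodes *, Zp encodes +, Zc encodes :, Zd encodes ., Zu encodes _, Zh encodes -, Zs encodes a space, ZA encodes @, ZZ encodes Z, ZL encodes ( and ZR encodes ).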

Hence libpthreadZdsoZd0 is an encoding of the soname libpthread.so.0 and pthreadZucreateZAZa is an encoding of the function name pthread_create@*.
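The encoding is mechanical, so it can be applied by a small routine instead of by hand. The sketch below is illustrative only: z_encode is a hypothetical helper, not something provided by valgrind.h, and its escape table reflects the Z-encoding as documented for Valgrind, so it should be checked against the valgrind.h in use.

```c
#include <stddef.h>

/* Hypothetical helper: Z-encodes src into dst, whose capacity dst_size
   includes the terminating NUL. Returns 0 on success, or -1 if dst is
   too small. */
int z_encode(const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;

    for (; *src != '\0'; src++) {
        char code = 0;          /* second character of the Z-escape, if any */
        size_t need;

        switch (*src) {
        case '*': code = 'a'; break;
        case '+': code = 'p'; break;
        case ':': code = 'c'; break;
        case '.': code = 'd'; break;
        case '_': code = 'u'; break;
        case '-': code = 'h'; break;
        case ' ': code = 's'; break;
        case '@': code = 'A'; break;
        case 'Z': code = 'Z'; break;
        case '(': code = 'L'; break;
        case ')': code = 'R'; break;
        default:  break;        /* ordinary characters are copied literally */
        }

        need = code ? 2 : 1;
        if (out + need + 1 > dst_size)   /* keep room for the NUL */
            return -1;
        if (code) {
            dst[out++] = 'Z';
            dst[out++] = code;
        } else {
            dst[out++] = *src;
        }
    }
    if (out + 1 > dst_size)
        return -1;
    dst[out] = '\0';
    return 0;
}
```

Encoding libpthread.so.0 with this routine gives libpthreadZdsoZd0, and encoding pthread_create@* gives pthreadZucreateZAZa, matching the two names above.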

The macro I_WRAP_SONAME_FNNAME_ZZ constructs a wrapper name in which both the soname (first component) and function name (second component) are Z-encoded. Encoding the function name can be tiresome and is often unnecessary, so a second macro, I_WRAP_SONAME_FNNAME_ZU, can be used instead. The _ZU variant is also useful for writing wrappers for C++ functions, in which the function name is usually already mangled using some other convention in which Z plays an important role. Having to encode a second time quickly becomes confusing.

Since the function name field may contain wildcards, it can be anything, including just *. The same is true for the soname. However, some ELF objects - specifically, main executables - do not have sonames. Any object lacking a soname is treated as if its soname was NONE, which is why the original example above had a name I_WRAP_SONAME_FNNAME_ZU(NONE,foo).

Note that the soname of an ELF object is not the same as its file name, although it is often similar. You can find the soname of an object libfoo.so using the command readelf -a libfoo.so | grep soname.

The ability for a wrapper to replace an infinite family of functions is powerful but brings complications in situations where ELF objects appear and disappear (are dlopen'd and dlclose'd) on the fly. Valgrind tries to maintain sensible behaviour in such situations.

For example, suppose a process has dlopened (an ELF object with soname) object1.so, which contains function1. It starts to use function1 immediately.

After a while it dlopens wrappers.so, which contains a wrapper for function1 in (soname) object1.so. All subsequent calls to function1 are rerouted to the wrapper.

If wrappers.so is later dlclose'd, calls to function1 are naturally routed back to the original.

Alternatively, if object1.so is dlclose'd but wrappers.so remains, then the wrapper exported by wrappers.so becomes inactive, since there is no way to get to it - there is no original to call any more. However, Valgrind remembers that the wrapper is still present. If object1.so is eventually dlopen'd again, the wrapper will become active again.

In short, valgrind inspects all code loading/unloading events to ensure that the set of currently active wrappers remains consistent.

A second possible problem is that of conflicting wrappers. It is easily possible to load two or more wrappers, both of which claim to be wrappers for some third function. In such cases Valgrind will complain about conflicting wrappers when the second one appears, and will honour only the first one.

Figuring out what's going on given the dynamic nature of wrapping can be difficult. The --trace-redir=yes option makes this possible by showing the complete state of the redirection subsystem after every mmap/munmap event affecting code (text).

There are two central concepts:

  • A 'redirection specification' is a binding of a (soname pattern, fnname pattern) pair to a code address. These bindings are created by writing functions with names made with the I_WRAP_SONAME_FNNAME_{ZZ,ZU} macros.

  • An 'active redirection' is a code-address to code-address binding currently in effect.

The state of the wrapping-and-redirection subsystem comprises a set of specifications and a set of active bindings. The specifications are acquired/discarded by watching all mmap/munmap events on code (text) sections. The active binding set is (conceptually) recomputed from the specifications, and all known symbol names, following any change to the specification set.

--trace-redir=yes shows the contents of both sets following any such event.

-v prints a line of text each time an active specification is used for the first time.

Hence for maximum debugging effectiveness you will need to use both options.
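The relationship between the two sets can be pictured with a toy model. This is a conceptual sketch only and not Valgrind's implementation: real specifications and bindings refer to code addresses, whereas here wrappers are identified by name, and the types and functions (struct spec, struct symbol, wild_match, active_wrapper) are all hypothetical. Each known symbol is matched against the specification set, and the first matching specification supplies the active binding, in line with the rule that only the first of several conflicting wrappers is honoured.

```c
#include <stddef.h>

/* A redirection specification: (soname pattern, fnname pattern) -> wrapper. */
struct spec {
    const char *so_pat;
    const char *fn_pat;
    const char *wrapper;        /* stands in for the wrapper's code address */
};

/* A symbol currently present in the process. */
struct symbol {
    const char *soname;
    const char *fnname;
};

/* Returns 1 if text matches pat, where '*' matches any run of characters
   (including none); returns 0 otherwise. */
int wild_match(const char *pat, const char *text)
{
    if (*pat == '\0')
        return *text == '\0';
    if (*pat == '*')
        return wild_match(pat + 1, text) ||
               (*text != '\0' && wild_match(pat, text + 1));
    return *pat == *text && wild_match(pat + 1, text + 1);
}

/* Computes the active binding for one symbol: the first specification
   whose two patterns both match wins. Returns the wrapper, or NULL if
   the symbol is not redirected. */
const char *active_wrapper(const struct spec *specs, size_t nspecs,
                           const struct symbol *sym)
{
    size_t i;

    for (i = 0; i < nspecs; i++) {
        if (wild_match(specs[i].so_pat, sym->soname) &&
            wild_match(specs[i].fn_pat, sym->fnname))
            return specs[i].wrapper;
    }
    return NULL;
}
```

In this model, loading or unloading an object corresponds to adding or removing entries in the specification array or the symbol list and then calling active_wrapper again for every known symbol, which mirrors the conceptual recomputation described above.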

One final comment. The function-wrapping facility is closely tied to Valgrind's ability to replace (redirect) specified functions, for example to redirect calls to malloc to its own implementation. Indeed, a replacement function can be regarded as a wrapper function which does not call the original. However, to make the implementation more robust, the two kinds of interception (wrapping vs replacement) are treated differently.

--trace-redir=yes shows specifications and bindings for both replacement and wrapper functions. To differentiate the two, replacement bindings are printed using R-> whereas wraps are printed using W->.

For the most part, the function wrapping implementation is robust. The only important caveat is: in a wrapper, get hold of the OrigFn information using VALGRIND_GET_ORIG_FN before calling any other wrapped function. Once you have the OrigFn, arbitrary calls between, recursion between, and longjumps out of wrappers should work correctly. There is never any interaction between wrapped functions and merely replaced functions (eg malloc), so you can call malloc etc safely from within wrappers.

The above comments are true for {x86,amd64,ppc32,arm,mips32,s390}-linux. On ppc64-linux function wrapping is more fragile due to the (arguably poorly designed) ppc64-linux ABI. This mandates the use of a shadow stack which tracks entries/exits of both wrapper and replacement functions. This gives two limitations: firstly, longjumping out of wrappers will rapidly lead to disaster, since the shadow stack will not get correctly cleared. Secondly, since the shadow stack has finite size, recursion between wrapper/replacement functions is only possible to a limited depth, beyond which Valgrind has to abort the run. This depth is currently 16 calls.

For all platforms ({x86,amd64,ppc32,ppc64,arm,mips32,s390}-linux) all the above comments apply on a per-thread basis. In other words, wrapping is thread-safe: each thread must individually observe the above restrictions, but there is no need for any kind of inter-thread cooperation.

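As shown in the above example, to call the original you must use a macro of the form CALL_FN_*. For technical reasons it is impossible to create a single macro to deal with all argument types and numbers, so a family of macros covering the most common cases is supplied. In what follows, 'W' denotes a machine-word-typed value (a pointer or a C long), and 'v' denotes C's void type. In each macro name, the letter after CALL_FN_ gives the return type and the part after the next underscore gives the arguments. The currently available macros are: CALL_FN_v_v, CALL_FN_W_v, CALL_FN_v_W, CALL_FN_W_W, CALL_FN_v_WW, CALL_FN_W_WW, CALL_FN_v_WWW, CALL_FN_W_WWW, CALL_FN_W_WWWW, CALL_FN_W_5W, CALL_FN_W_6W and so on, up to CALL_FN_W_12W.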

The set of supported types can be expanded as needed. It is regrettable that this limitation exists. Function wrapping has proven difficult to implement, with a certain apparently unavoidable level of ickiness. After several implementation attempts, the present arrangement appears to be the least-worst tradeoff. At least it works reliably in the presence of dynamic linking and dynamic code loading/unloading.

You should not attempt to wrap a function of one type signature with a wrapper of a different type signature. Such trickery will surely lead to crashes or strange behaviour. This is not a limitation of the function wrapping implementation, merely a reflection of the fact that it gives you sweeping powers to shoot yourself in the foot if you are not careful. Imagine the instant havoc you could wreak by writing a wrapper which matched any function name in any soname - in effect, one which claimed to be a wrapper for all functions in the process.

In the source tree, memcheck/tests/wrap[1-8].c provide a series of examples, ranging from very simple to quite advanced.

mpi/libmpiwrap.c is an example of wrapping a big, complex API (the MPI-2 interface). This file defines almost 300 different wrappers.