Aggregation

The Backtrace debugger generates stand-alone structured snapshot files that contain significantly more information than a callstack, without the bloat of a traditional core dump. All crashes are automatically aggregated and deduplicated so your team can respond to instability more quickly and triage more effectively. Learn more about deduplication here.

Automated Analysis and Assistive Debugging

Backtrace continuously improves the automated analysis capabilities of the Backtrace Debugger based on real-world fault data.

Accuracy

The Backtrace debugger supports a large portion of DWARF2, DWARF3, DWARF4 and DWARF5 debug information, including popular GNU extensions. It also supports multiple unwinding personalities, allowing for behaviors matching both the GCC ABI and the standard DWARF ABI (don't be plagued by unnecessarily optimized out variables).

If heuristics are applied to frame unwinding, suspect frames are clearly marked as such so developers are not left guessing at the integrity of the fault data. If variables are optimized out, rather than providing invalid values as other debuggers might, Backtrace will indicate that the variable is optimized out and why, so your engineers are not left investigating down the wrong path.

Automatically Highlight Variables associated with an Invalid Access

One of the first things any developer will do is manually chase pointers and object hierarchies in their debugger to find the relevant set of variables. The process has to be repeated when analyzing every single dump and since it is manual, it is also error-prone. It is easy to miss aliases to the same variable or region of memory across other threads, for example. Backtrace helps you save time by automatically chasing down these aliases for you.

An example is below. A crash (due to a use-after-free) occurs accessing the name variable. Backtrace automatically tags the variable for quick reference, and will do so across all threads, automatically chasing object references.

Backtrace achieves this by examining executable code, variable addresses and variable values within an acceptable error margin (allowing heuristic matches).

Fault Disambiguation

Backtrace disambiguates the type of fault and helps identify important bugs. In many cases, a traditional debugger is unable to distinguish an invalid memory read from a write, a stack overflow or an alignment error. Backtrace examines memory, registers and executable code to provide this disambiguation automatically.

Below is an example of what an alignment bug might look like with a traditional debugger. Unless the engineer has an understanding of the underlying instruction, this looks like the program is crashing while accessing perfectly valid memory locations.

[Switching to Thread 0x7ffff6d4d700 (LWP 6876)]
0x0000000000403c29 in ck_pr_cas_64_2 (set=0x7ffff6d4c8b0, compare=0x7ffff6d4c8b0, target=0x7ffff6d4c8b8) at /opt/backtrace/include/gcc/x86_64/ck_pr.h:461
461 __asm__ __volatile__("movq 0(%4), %%rax;"
(gdb) bt
#0 0x0000000000403c29 in ck_pr_cas_64_2 (set=0x7ffff6d4c8b0, compare=0x7ffff6d4c8b0, target=0x7ffff6d4c8b8) at /opt/backtrace/include/gcc/x86_64/ck_pr.h:461
#1 scenario_align_fault (e=0x61f2c0) at align.c:15
#2 0x0000000000402d01 in recurse (state=state@entry=0x61f2c0) at crash.c:301
#3 0x0000000000402d22 in inlined_recurse_b (e=0x61f2c0) at crash.c:114
#4 inlined_recurse_a (e=0x61f2c0) at crash.c:122
#5 inlined_recurse (e=0x61f2c0) at crash.c:130
#6 recurse (state=state@entry=0x61f2c0) at crash.c:296
#7 0x0000000000402e58 in crash_thread_begin (ep=0x61f2c0) at crash.c:321
#8 0x00007ffff74276aa in start_thread (arg=0x7ffff6d4d700) at pthread_create.c:333
#9 0x00007ffff715ceed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
(gdb) p *target
$1 = 2
(gdb) p *compare
$2 = 1

Backtrace validates the constraints of the underlying instruction against registers and memory, and automatically identifies this as an alignment bug at snapshot generation time. Below is the same dump viewed using the terminal-based snapshot viewer of Backtrace.

Backtrace disambiguates all sorts of faults, including failed assertions, alignment errors, different manifestations of stack overflows, security-related bugs, dispatch errors, machine check errors, memory accesses and more.
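
To make the alignment case concrete, the sketch below checks the addresses from the GDB session above against a 16-byte alignment requirement. It assumes that ck_pr_cas_64_2 is implemented with the x86-64 cmpxchg16b instruction, whose memory operand must be 16-byte aligned. The helper names are hypothetical, and this is an illustration of the constraint rather than a description of how Backtrace performs the check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed alignment requirement of the cmpxchg16b memory operand. */
enum { CAS_64_2_ALIGNMENT = 16 };

/* Returns true when addr is a multiple of alignment (a power of two). */
bool is_aligned(uint64_t addr, size_t alignment)
{
    return (addr & (uint64_t)(alignment - 1)) == 0;
}

/* Hypothetical check: is the target of a 128-bit CAS suitably aligned? */
bool cas_64_2_target_ok(uint64_t target)
{
    return is_aligned(target, CAS_64_2_ALIGNMENT);
}
```

Under these assumptions, target=0x7ffff6d4c8b8 fails the check because its low four bits are 0x8, while compare=0x7ffff6d4c8b0 would pass. The memory is valid and readable, which is why printing *target in GDB succeeds even though the instruction faults.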

Highlight Invalid Function Calls

Backtrace ships with a ruleset specifying constraints for how functions should be called. The ruleset is itself extensible and users can add their own rules. For example, a rule can be defined for memcpy(a, b, c) specifying the following invariants:

  • [a, a + c) must be writable.
  • [b, b + c) must be readable.
  • c must be an integer that looks sane with respect to the address space constraints of the host architecture.

If a crash were to occur and a thread contains a memcpy call violating these constraints, it will be explicitly flagged. As another example, a rule is specified for realloc(a, b) requiring that:

  • a is NULL or allocator-managed memory.
  • b is an integer and must appear sane with respect to the host architecture's address space.
  • It must not execute in signal context (it is not signal safe).

If a crash were to occur and a thread contains a call to realloc violating any of these constraints, it will be flagged.
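
The memcpy invariants listed earlier can be sketched as a small check over a table of mapped memory regions. Everything here is hypothetical: the region struct, the function names and the 47-bit sanity bound are assumptions made for illustration, and the sketch only accepts ranges that fall inside a single region. It does not reflect the format of Backtrace's ruleset.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one mapped memory region. */
struct region {
    uint64_t start;   /* first address in the region */
    uint64_t length;  /* size in bytes */
    bool readable;
    bool writable;
};

/* Assumed sanity bound: 47-bit user address space, typical of x86-64. */
#define MAX_SANE_LENGTH (UINT64_C(1) << 47)

/* True if [addr, addr + len) lies inside one region with the wanted access. */
bool range_has_access(const struct region *map, size_t n,
                      uint64_t addr, uint64_t len, bool want_write)
{
    for (size_t i = 0; i < n; i++) {
        const struct region *r = &map[i];
        if (addr < r->start || len > r->length)
            continue;
        /* Overflow-free form of addr + len <= start + length. */
        if (addr - r->start > r->length - len)
            continue;
        if (want_write ? r->writable : r->readable)
            return true;
    }
    return false;
}

/* Applies the three memcpy(a, b, c) invariants to one recorded call. */
bool memcpy_call_ok(const struct region *map, size_t n,
                    uint64_t dst, uint64_t src, uint64_t len)
{
    if (len > MAX_SANE_LENGTH)
        return false;                                  /* c must look sane */
    return range_has_access(map, n, dst, len, true)    /* [a, a + c) writable */
        && range_has_access(map, n, src, len, false);  /* [b, b + c) readable */
}
```

A call whose destination lies in a read-only mapping, or whose length runs past the end of the source mapping, would be rejected by memcpy_call_ok and could then be reported alongside the faulting thread.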

By default, Backtrace has rules for a host of standardized functions and this list is regularly updated and maintained.

Disambiguate and Detect Heap Corruption

Backtrace will analyze registers, variables, memory and memory allocator metadata to detect many forms of heap corruption, all from a single application snapshot and without any impact on the run-time performance of your application.

Backtrace integrates into the following memory allocators:

  • jemalloc
  • ptmalloc
  • tcmalloc
  • FreeBSD Kernel UMA Allocator

It is able to detect many forms of heap corruption, including but not limited to:

  • Double free, where a dynamically allocated object is erroneously freed twice.
  • Invalid free, where an invalid address is requested for reclamation.
  • Use after free, where an active alias exists to a reclaimed object.
  • Type violation, where the size of the allocation being pointed to does not meet the size requirements of the pointer's underlying type.
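
As a rough illustration of how allocator metadata makes these classes distinguishable, the sketch below classifies a free request against a table of known chunks and checks an allocation against the size of the type that points to it. The chunk table, enum names and functions are hypothetical; real allocators such as jemalloc or ptmalloc keep this information in their own internal structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum chunk_state { CHUNK_LIVE, CHUNK_FREED };

/* Hypothetical per-allocation record reconstructed from allocator metadata. */
struct chunk {
    uint64_t addr;           /* address returned by the allocator */
    uint64_t size;           /* usable size in bytes */
    enum chunk_state state;
};

enum free_verdict { FREE_OK, FREE_DOUBLE, FREE_INVALID };

/* Classifies a request to free addr against the known chunks. */
enum free_verdict classify_free(const struct chunk *chunks, size_t n,
                                uint64_t addr)
{
    for (size_t i = 0; i < n; i++) {
        if (chunks[i].addr != addr)
            continue;
        return chunks[i].state == CHUNK_LIVE ? FREE_OK : FREE_DOUBLE;
    }
    return FREE_INVALID;  /* not the start of any known allocation */
}

/* Type violation: the allocation is smaller than the type pointing at it. */
bool is_type_violation(const struct chunk *c, uint64_t type_size)
{
    return c->size < type_size;
}
```

Because all of the inputs are read from a snapshot, a check of this kind adds no cost to the running application.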

Backtrace can also be run on live processes. This helps root cause investigation of faults and can uncover latent bugs that may eventually cause instability.

See examples and learn more here.

Environment and Architectural Constraints

The debugger will automatically highlight if environment or architecture conditions lead to a fault, potentially reducing hours of research into a red herring. Identify quickly that an assertion around a failed resource acquisition or memory lock was due to an environment or architectural limit being exceeded.

The debugger will validate the resource utilization of your application at fault time and immediately highlight any violations. This includes limits defined by getrlimit(3).
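
For readers unfamiliar with these limits, the following sketch shows how a soft limit can be queried with getrlimit and compared against an observed usage figure. The function names and the idea of passing in a precomputed usage value are assumptions for illustration; the sketch says nothing about how Backtrace gathers utilization data.

```c
#include <stdbool.h>
#include <sys/resource.h>

/* True if usage has reached the soft limit; an unlimited resource never does. */
bool limit_reached(const struct rlimit *lim, rlim_t usage)
{
    if (lim->rlim_cur == RLIM_INFINITY)
        return false;
    return usage >= lim->rlim_cur;
}

/*
 * Looks up the current soft limit for resource (for example RLIMIT_NOFILE
 * or RLIMIT_MEMLOCK) and compares it with usage. Returns false if the
 * limit cannot be read.
 */
bool resource_limit_reached(int resource, rlim_t usage)
{
    struct rlimit lim;

    if (getrlimit(resource, &lim) != 0)
        return false;
    return limit_reached(&lim, usage);
}
```

For example, a failed mlock call in a process that already has RLIMIT_MEMLOCK bytes locked points to an environment limit rather than a logic error in the application.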

Stale Pointer Detection

A stale pointer is subtle and, among memory errors, one of the most difficult to debug. It happens when a memory region is freed or reallocated but the old references (aliases) to the memory are not updated properly. Subsequent use of these stale references has unpredictable consequences: they can be benign, trigger intermittent failures, or cause an outright crash. Even if there isn’t any observable error, a stale pointer is a ticking time bomb. With a slight context change or new code added, it may become catastrophic. In multi-threaded programs, race conditions usually result in stale pointers, which adds another dimension of complexity to debugging.

If --stale-pointer is passed to the Backtrace debugger, it will apply a process-wide mark-sweep on a faulting memory location, up to a configurable number of levels deep, to detect back references to the faulting memory address. After detecting these aliases, Backtrace applies heuristics using type information and allocator metadata to determine the potential types that are aliasing each other. For example, it may identify that a struct A is a stale reference to a region of memory that is also pointed to by a reference of type struct B. This helps isolate the family of data structures and subsystems that have invalid memory management.
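
The core of the back-reference search can be sketched as a scan over pointer-sized words for values that land inside the faulting region. This is a single level of the search only, with a hypothetical function name and a flat array standing in for process memory; the multi-level traversal and the type heuristics described above are not modeled.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Scans n_words pointer-sized values for ones that point into [lo, hi).
 * The index of each match is stored in out, up to max_out entries.
 * Returns the total number of matches, which may exceed max_out.
 */
size_t find_back_references(const uint64_t *words, size_t n_words,
                            uint64_t lo, uint64_t hi,
                            size_t *out, size_t max_out)
{
    size_t found = 0;

    for (size_t i = 0; i < n_words; i++) {
        if (words[i] >= lo && words[i] < hi) {
            if (found < max_out)
                out[found] = i;
            found++;
        }
    }
    return found;
}
```

Repeating the scan with each match's own address as the new target would give the next level of references.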

The screenshot below presents such an example, where a stale pointer is isolated down to seven potential types.

See examples and learn more here.

Security and Malware Analysis

The Backtrace debugger includes several leading features related to the detection of security vulnerabilities, exploitable crash conditions, and more. This allows you to identify malware in executables retroactively, and to identify post-mortem the crashes that may actually present security problems for your software. By default, Backtrace will automatically highlight faults that involve:

  • Execution attempts of non-executable pages.
  • Faults involving instructions that control execution flow.
  • Stack overflow and fortification assertion failures, indicative of heap and stack smashing possibilities.

If --security is passed to ptrace, advanced security and malware analysis features can be enabled. These features are configurable and include:

  • Detection of dynamically shared objects and other file-backed memory regions containing executable code that are inconsistent with linker data associated with the executable.
  • Detection of loaded memory segments that are both writeable and executable.
  • Detection of inconsistencies between the PLT/GOT, constructors and destructors and the linker and executable information, to detect infected functions.
  • Detection of modifications to executable code and the text segment, identifying potential malware.
  • Analysis of executable code and instruction pointer to detect the presence of ROP chains and shellcode, indicative of potential exploitation attempts. This currently requires a snapshot be generated from a live process.
  • Detection of invalid constructor and destructor functions.
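
One of the checks above, finding memory segments that are both writable and executable, is easy to illustrate on Linux, where each line of /proc/<pid>/maps carries a four-character permission field such as "r-xp". The sketch below parses one such line. It is an assumption-laden example (Linux-only input format, hypothetical function name) and not a description of how Backtrace inspects a snapshot.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns true if a /proc/<pid>/maps line describes a mapping that is
 * both writable and executable. Expected line shape (Linux):
 *   start-end perms offset dev inode [path]
 */
bool maps_line_is_wx(const char *line)
{
    char perms[5];

    /* Skip "start-end", then read the permission field, e.g. "rwxp". */
    if (sscanf(line, "%*[0-9a-fA-F]-%*[0-9a-fA-F] %4s", perms) != 1)
        return false;
    if (strlen(perms) != 4)
        return false;
    return perms[1] == 'w' && perms[2] == 'x';
}
```

Feeding every line of a process's maps file through this function yields the set of mappings worth a closer look. JIT compilers legitimately create such mappings, so a match is a signal for review rather than proof of compromise.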

Performance

There are two components impacting performance of application recovery in a post-mortem state. One is memory dump generation and the other is symbolic analysis for extracting a callstack and application state. 

Let us compare the performance of the Backtrace debugger to that of GDB and LLDB on a complex C++ project, such as Google Chrome. In this experiment, we are using Chromium 35.0.1916.144 with 466 mapped segments and 1 thread. There is approximately 2.6GB worth of debug data in a single executable here. We will request a backtrace of a running process.

GDB takes 2.6GB of resident memory and 54 seconds.
LLDB takes 3.0GB of resident memory and 130 seconds.
Backtrace takes 0.46GB of resident memory and 0.61 seconds.

This demonstrates complexity as the size of debug information scales. Performance is also affected as the number of memory segments and threads scale. Below is a comparison of Backtrace with and without variables (bt and bt-nv respectively) against GDB, LLDB and Glider.

The memory dump generation process can be avoided altogether by having Backtrace snapshot live processes, only generating a full dump on disk if necessary. As far as debugger performance is concerned, the Backtrace debugger is orders of magnitude faster than industry-standard debuggers such as GDB and LLDB. This performance allows for faster recovery times and enables the debugger to perform additional analysis.

Detailed Snapshots

Custom Attributes

Attach any number of custom attributes to your dump files, such as version, data center, hardware model and more. There is no limit to the number of custom attributes. Used with the query builder, this allows you to quickly assess the impact of instability by aggregating on custom attributes any way you please. In a few seconds, see in which version a crash was introduced, which data centers are affected, or how many events were dropped.

Environment Information

Aggregate on scheduler, process, memory and all manner of other statistics to quickly identify external factors that may have led to software stability problems. By default, there are 50 attributes available to query against, including memory utilization, scheduler utilization and more.

File Attachments

You are able to attach files and directory trees to the snapshot file with ease. These can then be extracted and downloaded directly through the web browser or terminal, making it easy to provide all the information needed for triage and investigation of instability.

Variable Information

All snapshots can include global variables, static variables, thread-local variables and will include all variables reachable on the stack. Objects will also be dereferenced and serialized into the snapshot. This allows developers to access variable information at the time of fault, greatly improving mean time to resolution.

View Dumps in your Terminal

The Backtrace hydra tool allows you to view Backtrace snapshots in your terminal and includes advanced source-code integration, regular expression search and a lot more. Learn more here.

View Dumps in your Web Browser

The Backtrace web debugger and web console allow you to view dumps directly in your web browser, without having to install any additional tools. Your dumps are automatically aggregated into a centralized object store so your engineers can access dumps with a click of a button. Learn more here.
