Hydra takes a snapshot generated by running our ptrace tracer on an application and organizes the data in an easy-to-grok terminal interface.
It's designed to focus the engineer on the most relevant information pertaining to the root cause of the failure condition. The faulting frame will be immediately selected; suspect variables will be annotated inline; the entire crash will be classified into a type of fault (null dereference, memory write, etc.). Various commands and integrations are provided to ease navigation through application state, giving you more information faster and less tediously.
Sound pretty useful? Good. Let's take a closer look at how Hydra helps you squash the many-headed bugs that affect your production applications.
Feeding the hand that bites you
First things first: how do we run hydra?
$ hydra <snapshot file>
Assuming you have configured coroner, you can also use:
$ coroner view <snapshot file>
This will instantly print the exact root cause of the crash, and will even commit a fix for you. You can go home.
Well, maybe not yet. What'll really happen is you'll see a colorful ncurses interface (unless you passed -m to hydra, in which case you'll see just black and white. Happy?), which should hopefully remind you of some of your favorite tools, like tig. So how is this organized?
Pane? You don't know what pane is
The interface is split into a series of panes, each one containing distinct portions of an application's state.
The panes, in order, are threads, frames, variables, and router.
We'll get more into what the router pane contains later, but for now, know that it has various bits of metadata about the application: system information (memory/cpu usage), registers, kernel frames, etc. There are also some really cool, configurable integrations like source code management.
The only pane that doesn't really change its contents is the threads pane (I say really, here, because you can still change the way threads are displayed; we'll get into thread grouping and pane maximization later). The frames, variables, and router panes all change depending on the context the user is in; we call these panes context-aware. Frames will be populated according to the current thread selected; variables and registers will be populated according to the current frame; and so on.
Rules, rules, so many rules
There are also rules in between each pane. Without them, the panes would just bleed into each other, it'd be ugly and confusing, you wouldn't see any pretty colors, and you'd probably just go back to using gdb. They happen to contain useful information, too.
The rules, in order, are (they don't really have logical names):
- Title (list of some navigation hotkeys, Hydra version)
- Application name and time of trace
- For the current frame: Basename of the object file, and instruction/stack pointers
- Process map entry for the current variable, assembly instruction for the current frame
- Router pane title bar (tab list and current pane's name)
- Context - this contains a brief indication/description of the current context you're in. It'll usually have a position indicator on the right side (i.e., you're currently on variable 30 out of 340 million; go faster). On the left side, it might show the current thread's TID, or the current variable's type. This all changes if you run a command, which we'll get into later; a status message may be temporarily displayed here so you know what's going on if your command fails (e.g., if you run /samy_bank_password, you'll get That's private!).
So given that we have so many contextual panes, how do you actually change the context (e.g., how do you change focus from threads to frames, or from frames to the router pane)? Well, navigation is pretty vim-like (sorry, Emacs users; I have small hands and don't want repetitive strain injury). Movement within a pane is handled by the expected hjkl keys; some special keys like H (go to the top of the current view of a list) and L (same as H, but bottom) are also supported. Page up/page down do the usual. Switching panes is handled by either tab (switches focus to the next pane; this wraps around if you reach the last pane) or pre-set marks: 1 for threads, 2 for frames, 3 for variables, and 4 for router.
What does all this actually look like? I could write a thousand more words, or I can just show you. Here:
This is the initial view. What's nice is you're immediately focused on the faulting frame, and can see the signal information directly under it. No more parsing a gdb stacktrace to find what frame to jump to.
Below that, you can clearly see all the variables of the faulting frame (remember the mention of contexts above? Well, this is currently showing the variable context for the faulted frame, since that's what's selected). What's that colorful text below that variable? Those are inlined annotations. We'll get into that later, but basically, our tracer automatically deduced you did something naughty with that variable. Here, it looks like you dereferenced a NULL pointer. How dare you!
Wait, you could swear your application actually had five threads, but you see only three. And what's that funny symbol next to one of the threads? That, my friends, is two features in one -- thread grouping and item collapsing. We'll get into that later (boy do I have a lot of explaining to do), but to give you an idea, Hydra has automatically determined that a group of your application's threads (three of them, in this case) are pretty much the same, and thus you probably need to look at only one of them. Unless you need to look at more of them, of course, at which point you can expand that group and investigate.
What's that in the bottom pane? Is that...is that your code? Why yes, that's another cool feature: source code integration. That's the default router pane tab that's opened when you first start Hydra. We'll get into configuration of this feature later, but this will show the faulting line (along with the entire file, not just the function call and line, like what gdb shows by default). It'll even pull in the correct version of your code according to the tag/version of the particular crash!
Cool, commonly used features
Those are the basics, but there is so much more you can do with Hydra. Let's take the red pill and go deeper...because what's the point of a fancy ncurses UI without some cool features?
Source code integration
Let's take a deeper look into one of the first things you'll see (and ultimately, want to see) when opening a crash -- source code.
Configuring source code integration
You can configure hydra to show relevant sections of source code in the peripheral pane.
In your ~/.hydra.cf file, you need to add a [scm] section. Example:
crash_app.trigger=/home/user/projects/crash_app,version,git -C %s checkout -q %0
<app_name>.map commands map an object (as in object file, not an instance of a class) or a function name to a source code folder for the application <app_name>. After the colon is a regular expression to match the name of the object or function, followed by a corresponding source code path. The commands are processed in order from top to bottom, and the first match that meets the criteria determines the path that hydra will search for the source code for that object or function.
without a regular expression match is a wildcard - it will match anything in the same .
<app_name>.ignore= followed by a regular expression will instruct hydra to ignore any matches for any following lines. So in the example above, matches on object files that start with lib (and haven't already matched one of the earlier .map=object lines) will stop the search, and hydra will not associate that object file with source code.
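To make the first-match rule concrete, here is a minimal sketch of how an ordered list of map and ignore rules could be resolved. It is not hydra's implementation: the in-memory rule format, the `resolve_source_path` name, the use of an unanchored regex search, and the `/home/user/projects/misc` path are all assumptions made for illustration.

```python
import re
from typing import Optional

# Hypothetical in-memory form of the [scm] rules for one application.
# Each rule is (kind, pattern, path); kind is "map" or "ignore".
# A pattern of None stands in for a map line with no regular expression.
RULES = [
    ("map", r"^crash_app$", "/home/user/projects/crash_app"),
    ("ignore", r"^lib", None),
    ("map", None, "/home/user/projects/misc"),  # wildcard, hypothetical path
]

def resolve_source_path(object_name: str, rules=RULES) -> Optional[str]:
    """Return the source path for object_name, or None if ignored or unmatched."""
    for kind, pattern, path in rules:
        matched = pattern is None or re.search(pattern, object_name) is not None
        if not matched:
            continue
        # The first matching rule wins; an ignore rule stops the search
        # without associating any source code with the object.
        return path if kind == "map" else None
    return None
```

With these rules, `crash_app` resolves to its project folder, anything starting with `lib` is left without source code, and everything else falls through to the wildcard.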
You can also trigger a command to run with <app_name>.trigger=. The most common use case for this is to trigger a git checkout of the correct branch when using hydra.
In a trigger, the first parameter is the source code path and the last parameter is the command to execute. Between these, you can specify one or more KVs whose values are used as positional variables in the trigger command. In the example above, the version KV value maps to parameter %0. If you list additional KVs, those would be %1, %2 and so forth. %s refers to the project path.
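If the placeholder rules feel abstract, the following sketch expands a trigger line the way the text describes. It is an illustration only: `expand_trigger` is a hypothetical helper, it assumes the command itself contains no commas, and the `v1.2.0` value is made up.

```python
import re

def expand_trigger(line: str, kvs: dict) -> str:
    """Expand %s and %0, %1, ... in a trigger line using snapshot KV values."""
    _, value = line.split("=", 1)
    fields = value.split(",")
    # First field: source path. Last field: command. In between: KV names.
    path, kv_names, command = fields[0], fields[1:-1], fields[-1]
    values = [kvs[name] for name in kv_names]

    def substitute(match):
        token = match.group(1)
        # %s is the project path; %N is the value of the N-th listed KV.
        return path if token == "s" else values[int(token)]

    return re.sub(r"%(s|\d+)", substitute, command)

line = "crash_app.trigger=/home/user/projects/crash_app,version,git -C %s checkout -q %0"
command = expand_trigger(line, {"version": "v1.2.0"})
```

Here `command` ends up as a git checkout of the tag recorded in the snapshot, run against the project folder.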
If it doesn't look like a trigger that you've set up is firing, keep in mind the following caveats:
A trigger will not fire until code from the specified folder is accessed by hydra, which happens when a frame that uses that code is highlighted in hydra. Also, a trigger will only fire if:
- The path in the trigger line was previously matched in a .map= line,
- At least one KV attribute is specified in the .trigger= line, and
- Listed attributes are present in the snapshot (i.e., ptrace was called with the --kv= flag with the attributes).
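When debugging a silent trigger, it can help to walk through these conditions as a checklist. The sketch below encodes them as one predicate; the function and parameter names are hypothetical and only mirror the caveats listed above, not hydra's internals.

```python
def trigger_should_fire(trigger_path, kv_names, mapped_paths, snapshot_kvs, accessed_paths):
    """Return True only if every documented firing condition holds."""
    return (
        trigger_path in accessed_paths      # a frame using this code was highlighted
        and trigger_path in mapped_paths    # the path was matched by an earlier map line
        and len(kv_names) > 0               # at least one KV is listed in the trigger
        and all(name in snapshot_kvs for name in kv_names)  # KVs exist in the snapshot
    )

path = "/home/user/projects/crash_app"
fires = trigger_should_fire(path, ["version"], {path}, {"version": "v1.2.0"}, {path})
```

If `fires` is False for your setup, the failing clause tells you which caveat applies.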
Context: Any list-type pane with an indicated hierarchy (
Any item with an indicated hierarchy (i.e., a - character next to the item) may be collapsed or expanded to hide or show, respectively, an item's "children." In the thread pane, children may be members of a particular thread group; in the variable pane, members of a struct or array; in the process tab, structured heap metadata (arenas, thread caches, etc.); and so forth.
The one exception to this default collapsing behavior regards the display of inlined variable annotations. See Inlined annotations for details.
Any annotations on a variable will be displayed directly under the variable. If a variable chain is collapsed, but one of the variables in the chain is annotated, the minimum number of variables necessary will be displayed along with the annotation itself (i.e., the annotated variable and its owners).
Of course, variables across frames may be annotated; even within a single frame, there may be thousands of variables, obscuring the view of any annotations. Annotation jumping is useful here; simply open the 'Warnings' tab of the bottom pane (by pressing 'w'), scroll through the list to the annotation you're interested in, and press 'enter'. The thread, frame, and variable views will update to the position of the annotation's owner.
Context: Any pane
- M (shift + m)
All panes support maximization. Certain panes, when maximized, may have a context associated with them (e.g. a maximized thread pane will have a frame pane context); the maximized pane will take the majority of the space, while the contextual panes will occupy the rest. All other panes will be hidden. To restore all panes and sizes, press either M again or a macro movement hotkey to one of the hidden panes (moving between shown panes will not force size restoration).
Context: Any list-type pane
- n - go to next search result
- N - go to previous search result
All list-type panes support regex searches. All columns will be searched independently (e.g. in the thread display,
top frame symbol will each be searched).
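"Searched independently" means the pattern is tried against each column on its own rather than against the whole row joined together. A minimal sketch of that behavior, assuming rows are tuples of column strings (the `search_rows` helper and the sample rows are hypothetical):

```python
import re

def search_rows(rows, pattern):
    """Return indices of rows where at least one column matches the regex."""
    regex = re.compile(pattern)
    return [
        index
        for index, columns in enumerate(rows)
        if any(regex.search(column) for column in columns)
    ]

# Hypothetical thread rows: (tid, top frame symbol)
rows = [("10", "main"), ("11", "poll"), ("12", "main")]
hits = search_rows(rows, r"^main$")
```

Because each column is matched separately, a pattern that spans two columns, such as `10 main`, finds nothing.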
Context: Any pane
- :<index position>
All panes support index jumping. All but the source code management pane are 0-based; the SCM pane is 1-based (like any vim file). Indices less than the first element or greater than the last will jump to the first or last elements, respectively.
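The clamping rule can be sketched in a few lines. This is an illustration under assumptions, not hydra's code: `jump_index` is a hypothetical name, and the pane is assumed to hold at least one item.

```python
def jump_index(requested: int, item_count: int, one_based: bool = False) -> int:
    """Clamp a requested index to the valid range of a pane.

    Most panes are 0-based; pass one_based=True for the source code pane.
    Assumes item_count is at least 1.
    """
    first = 1 if one_based else 0
    last = first + item_count - 1
    return max(first, min(requested, last))
```

For a ten-item pane, `jump_index(99, 10)` lands on the last element and `jump_index(0, 10, one_based=True)` lands on line 1 of the source view.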
Context: Any pane for commands, thread pane for display
- :group <group-type>
- Supported group-types: callstack
- :sort [sort-type]
- Supported sort-types: tid
Threads are automatically grouped according to the current group-type. Within each group, they are sorted according to the sort-type.
According to callstack grouping, threads with identical callstacks will be grouped together. With tid sorting, threads are ordered according to their thread ids. There are currently no other supported group- and sort-types.
Threads will be grouped by callstack and sorted by tid by default.
:ungroup will ungroup all threads; :rsort will reverse the current sorting order within each group (e.g. threads can be reverse sorted by tid).
Faulted threads are always grouped separately from non-faulted threads, and will always appear first in the thread list (faulted threads have an F indicator next to them).
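Here is a rough sketch of the default behavior: group by callstack, sort by tid within each group, and list faulted groups first. The `Thread` type and `group_threads` function are hypothetical, and the order of non-faulted groups relative to each other (first-seen order here) is an assumption the text does not specify.

```python
from collections import defaultdict
from typing import NamedTuple

class Thread(NamedTuple):
    tid: int
    callstack: tuple
    faulted: bool = False

def group_threads(threads, reverse=False):
    """Group threads by callstack, sort each group by tid, faulted groups first."""
    groups = defaultdict(list)
    for thread in threads:
        # Faulted threads never share a group with non-faulted ones.
        groups[(thread.faulted, thread.callstack)].append(thread)
    result = [
        sorted(members, key=lambda t: t.tid, reverse=reverse)  # reverse mimics :rsort
        for members in groups.values()
    ]
    # Stable sort: faulted groups move to the front, other groups keep their order.
    result.sort(key=lambda group: not group[0].faulted)
    return result

threads = [
    Thread(12, ("poll", "worker")),
    Thread(10, ("main",)),
    Thread(11, ("poll", "worker")),
    Thread(14, ("crash", "main"), faulted=True),
    Thread(13, ("poll", "worker")),
]
grouped = [[t.tid for t in group] for group in group_threads(threads)]
```

Five threads collapse into three rows here: the faulted thread, one group of three identical workers, and the main thread.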
By default, Hydra looks for a configuration file at
Below is a sample configuration for the
crash.trigger=/home/djoseph/projects/crash,version,git -C %s checkout -q %0
editor=vim +%l %s
A Deep(er) Dive
Remember all those explanations we punted earlier? Well, if they still haven't been clarified, here they are!
- j - Move down one item
- k - Move up one item
- h - Scroll left (if there is text off the screen)
- l - Scroll right (if there is text off the screen)
- H - Move to the top of the current view (vim-behavior)
- L - Move to the bottom of the current view
- PgUp - Move up one full page of items
- PgDn - Move down one full page of items
- Home - Move to the first item
- End - Move to the last item
- :<index> - Jump to the specified position
- tab - Move to the next pane
- 1-4 - Move to the pane associated with the number
- 1 - Threads
- 2 - Frames
- 3 - Variables
- 4 - Router
Context: Any non-router pane
- u - Show position of current selection of current pane
- :j <position> - refocus to the provided position
Any state of the top three panes (threads, frames, and variables) may be immediately refocused by "jumping" to its position, similar to annotation jumping. Press u to show the position URL of the current selection, and feed that into the global :j <position> command to refocus to that state.
- :s <root>
- :! <command>
- :j <position>
- :sort <sort-type>
- :group <group-type>
Immediately run global commands
Context: Hydra command-line parameter
- -e "<command> <args>"
All global commands, excluding regex search, may be run immediately on start-up. This is useful for sharing state with other users: provide them a snapshot and a position URL (via the u command), and have them open it by doing:
hydra <snapshot> -e "j <position>"
General application and system statistics at the time of the crash. Some examples:
- Number of threads
- Process Memory
- RSS Peak
- System Memory
- CPU Usage
- I/O Wait
Any contextual data associated with a particular variable, e.g. heap allocation statistics.
Any process-wide metadata associated with the application/crash. This will contain all trace-wide annotations, e.g. heap metadata/statistics.
All key-value attributes associated with the application/crash. Some of these are automatically generated, but others may be specified by the user via ptrace (see ptrace documentation for more details). Some examples:
- process age
- process tag/version
All registers for the currently selected frame.
The process map entries for the application (e.g. from /proc/<pid>/maps on Linux). The selected entry will change whenever the variable selection changes (to the entry containing the variable).
- <enter> - save file to disk
- :s <root> - save all attached files to <root>
All files attached to the snapshot via ptrace (see ptrace documentation for instructions). Metadata will be shown along with the full path of the file.
The classification(s) of the crash, generated by ptrace (e.g. whether this crash was a segmentation violation, a null dereference, a memory write error, etc.).
The stack of the most recent kernel frames for the current thread (these were not necessarily executed after the thread's last user-space frame).
The variables with global and thread-local storage that were requested at the time of the trace (via ptrace; see ptrace documentation for how to do this). Variables will be organized into a hierarchy of [Thread (for TLS variables)]-[Object]-[CU].
Source code integration
- <enter> - open source file in the configured editor
- <space> - center view on the last-executed line of the frame
Source code for the currently selected frame. Index jumping is supported, but regex searches are not. The initial line selected will be the last-executed line of the frame.
- <enter> - jump to annotation owner (i.e. Hydra will refocus onto the thread, frame, or variable owning the annotation; backtrace annotations are not jumpable)
All annotations, excluding those of the JSON type, contained in the snapshot. Users may jump to a selected annotation by pressing <enter>. JSON-type annotations are shown in either the Process or Context router tabs; see those sections for details.
Columns for each pane, in order from left to right (panes with a single column or containing simple key-value lists are omitted):
- Thread state
  - F - faulted
  - s - sleeping
  - S - stopped
  - D - disk
  - T - traced
  - Z - zombie
  - X - dead
  - ? - unknown
- Basename of the object file of the top frame
- Thread name
- (Only appears for kernel-space core files) Thread's PID
- Symbol of the top frame
- Frame number
- Symbol (or address if no symbol could be resolved)
- Source code location (directory/source file/line)
- Signal information will be shown under the faulting frame but won't follow the frames pane's column specifications