Useful Guides and Articles on Kernel Debugging
This is a collection of useful information about how to debug LITMUS^RT and Linux in general. Please add your own tips or links to your favorite debugging tutorials.
Oops Decoding
Your kernel crashed and you are staring at some kind of crash dump (called "the oops"). Now what?
Al Viro adds some further insight on Oops decoding and debugs an example.
Al Viro goes through a second Oops and explains the process.
Disassemble a kernel image
You have figured out that something bad occurred at some address. Unfortunately, you don't know what part of your code caused the crash. You can use this method to determine the line of C code that generated a given assembly instruction (assuming you have debugging symbols enabled in your kernel configuration):
objdump -S -d vmlinux | less
Copy and paste the offending address into the search prompt of less (press /) and you should see the offending source code. Be patient: the kernel is large, so this may take a few seconds on slow machines.
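If disassembling the whole image is too slow, objdump can be restricted to an address range with its --start-address and --stop-address options. The following is only a sketch of how such a command line could be built: the helper name objdump_command, the window size, and the example address are made up for illustration. Note that on x86 a start address that falls in the middle of an instruction can make the first few decoded lines meaningless.

```python
import shlex


def objdump_command(fault_addr, window=0x40, image="vmlinux"):
    """Build an objdump argument list covering fault_addr +/- window bytes.

    The window size is an arbitrary choice for this sketch; widen it if the
    surrounding function does not fit.
    """
    start = max(fault_addr - window, 0)
    stop = fault_addr + window
    return [
        "objdump", "-S", "-d",
        f"--start-address={start:#x}",
        f"--stop-address={stop:#x}",
        image,
    ]


if __name__ == "__main__":
    # 0xc0123456 is a hypothetical address; use the one from your oops.
    print(shlex.join(objdump_command(0xc0123456)))
```

The printed command can be pasted into a shell and piped into less as before.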
Well-known addresses
When you look at a kernel oops you may spot some special addresses. These special addresses are canaries used by the kernel. Their intent is to raise awareness of an erroneous state by triggering (noisy) faults instead of failing silently.
Here's a short table of the most common ones and what they mean.
Address | Meaning | Caused by | Constant |
0x6b6b6b6b | use after free | slab poisoning | POISON_FREE |
0x5a5a5a5a | use of uninitialized memory | slab poisoning | POISON_INUSE |
0x00100100 | use of invalid next pointer | list poisoning | LIST_POISON1 |
0x00200200 | use of invalid prev pointer | list poisoning | LIST_POISON2 |
0xcccccccc | use of an address before or beyond allocated memory | probably an off-by-one error / buffer overrun | SLUB_RED_ACTIVE |
0xcccccccc | use of uninitialized init memory | you might have accessed a per-cpu variable of an offline CPU | POISON_FREE_INITMEM |
Have a look at include/linux/poison.h for additional, less commonly encountered poison values.
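As a rough illustration, matching an address from an oops against these patterns can be sketched in a few lines of Python. This is a heuristic under stated assumptions, not a definitive tool. It assumes a 32-bit kernel (no POISON_POINTER_DELTA) and the byte values 0x6b (POISON_FREE), 0x5a (POISON_INUSE), and 0xcc (SLUB_RED_ACTIVE, POISON_FREE_INITMEM) as defined in include/linux/poison.h; check them against your kernel version. The function name classify and the slack parameter are made up for this example.

```python
# Repeated-byte poison patterns (byte value -> hint).
BYTE_POISON = {
    0x6B: "POISON_FREE (use after free)",
    0x5A: "POISON_INUSE (use of uninitialized memory)",
    0xCC: "SLUB_RED_ACTIVE or POISON_FREE_INITMEM",
}

# Full-word list poison values (32-bit, assuming no POISON_POINTER_DELTA).
LIST_POISON = {
    0x00100100: "LIST_POISON1 (invalid next pointer)",
    0x00200200: "LIST_POISON2 (invalid prev pointer)",
}


def classify(addr, slack=0x1000):
    """Return a hint for a 32-bit faulting address, or None if nothing matches."""
    # A poisoned pointer is usually dereferenced at a small member offset,
    # so accept anything within `slack` bytes above a list poison value.
    for base, hint in LIST_POISON.items():
        if base <= addr < base + slack:
            return hint
    # For byte patterns, compare only the two high bytes so that a small
    # offset added to the poisoned pointer does not hide the pattern.
    high = (addr >> 16) & 0xFFFF
    top, second = high >> 8, high & 0xFF
    if top == second:
        return BYTE_POISON.get(top)
    return None


if __name__ == "__main__":
    for value in (0x6B6B6B6B, 0x6B6B6B8B, 0x00100104, 0xC0123456):
        print(f"{value:#010x}: {classify(value)}")
```

A match is only a hint: an ordinary pointer can coincidentally resemble a poison pattern.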
Dump the TRACE() buffer
When the kernel crashes, panics, or locks up inside KVM with gdb attached, it is possible to rescue the buffer that stores the TRACE() debug messages (provided that TRACE() is enabled in the kernel configuration).
In gdb, dump the memory buffer to a file using the dump memory command.
dump binary memory dump.bin debug_buffer.buf debug_buffer.buf+100000
Extract the strings from the dump in dump.bin.
strings dump.bin > trace.txt
The log messages are now in trace.txt. Note, however, that the buffer may have wrapped around. Use the sequence number that is part of each message to figure out what the last message(s) before the crash were.
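Finding the last messages by hand is tedious when the buffer has wrapped, so a small script can do the ordering. The sketch below assumes that each message line starts with its decimal sequence number followed by whitespace; lines without such a prefix (binary junk found by strings, or a message cut in half at the wrap-around point) are skipped. The function name last_messages and the sample messages in the usage note are made up for illustration.

```python
import re

# Assumed layout: "<sequence number> <rest of message>".
SEQ_RE = re.compile(r"^\s*(\d+)\s")


def last_messages(lines, count=10):
    """Return the `count` messages with the highest sequence numbers, oldest first."""
    numbered = []
    for line in lines:
        match = SEQ_RE.match(line)
        if match:
            numbered.append((int(match.group(1)), line.rstrip("\n")))
    # Sorting by sequence number undoes the wrap-around of the ring buffer.
    numbered.sort(key=lambda item: item[0])
    return [text for _, text in numbered[-count:]]


if __name__ == "__main__":
    # errors="replace" keeps stray non-UTF-8 bytes from aborting the read.
    with open("trace.txt", errors="replace") as trace:
        for message in last_messages(trace):
            print(message)
```

For example, given the lines "5 P0: e", "6 P1: f", "garbage", "3 P0: c", "4 P1: d" in that file order, last_messages(lines, 2) returns the messages numbered 5 and 6. Junk lines that happen to start with digits will be misread as messages, so sanity-check the output against the raw file.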