= Useful Guides and Articles on Kernel Debugging =

This is a collection of useful information about how to debug LITMUS^RT^ and Linux in general. Please add your own tips or links to your favorite debugging tutorials.

== Oops Decoding ==

Your kernel crashed and you are staring at some kind of crash dump (called "the oops"). Now what?

 * [[http://kerneltrap.org/mailarchive/linux-kernel/2008/1/8/546623|Linus Torvalds explains some Oops decoding techniques.]]
 * [[http://kerneltrap.org/mailarchive/linux-kernel/2008/1/8/546884|Al Viro adds some further insight on Oops decoding and debugs an example.]]
 * [[http://kerneltrap.org/mailarchive/linux-kernel/2008/1/14/567425|Al Viro goes through a second Oops and explains the process.]]

== Disassemble a kernel image ==

You have figured out that something bad occurred at some address, but you don't know which part of your code caused the crash. You can use the following method to find the line of C code that generated a given assembly instruction (assuming that debugging symbols are enabled in your kernel configuration):

{{{
objdump -S -d vmlinux | less
}}}

Copy and paste the offending address into the search prompt of `less` and you should see the offending source code. (Be patient: the kernel is large, so this may take a few seconds on slow machines.)

== Well-known addresses ==

When you look at a kernel oops you may spot some special addresses. These special addresses are canaries used by the kernel. Their intent is to raise awareness of an erroneous state by triggering (noisy) faults instead of failing silently. Here's a short table of the most common ones and what they mean.
|| '''Address''' || '''Meaning''' || '''Caused by''' || '''Constant''' ||
|| `0x6b6b6b6b` || use after free || slab poisoning || `POISON_FREE` ||
|| `0x5a5a5a5a` || use of uninitialized memory || slab poisoning || `POISON_INUSE` ||
|| `0x00100100` || use of invalid next pointer || list poisoning || `LIST_POISON1` ||
|| `0x00200200` || use of invalid prev pointer || list poisoning || `LIST_POISON2` ||
|| `0xcccccccc` || use of an address before or beyond allocated memory; or || probably an off-by-one error / buffer overrun || `SLUB_RED_ACTIVE` ||
|| `0xcccccccc` || use of uninitialized init memory || you might have accessed a per-cpu variable of an offline CPU || `POISON_FREE_INITMEM` ||

Have a look at `include/linux/poison.h` for additional, less commonly encountered poison values.

== Dump the TRACE() buffer ==

When the kernel crashes, panics, or locks up inside KVM with `gdb` attached, it is possible to rescue the buffer that stores the `TRACE()` debug messages (if `TRACE()` is enabled in the kernel configuration, of course).

 1. In `gdb`, dump the memory buffer to a file using the `dump binary memory` command. Each address expression must be free of whitespace.
 {{{
 dump binary memory dump.bin (debug_buffer.buf) (debug_buffer.buf+100000)
 }}}
 2. Extract the strings from the dump in `dump.bin`.
 {{{
 strings dump.bin > trace.txt
 }}}
 3. The log messages are now in `trace.txt`. Note, however, that the buffer may have wrapped around. Use the sequence number that is part of each message to figure out which messages were the last ones before the crash.
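
Because the buffer may have wrapped around, the newest messages are not necessarily at the end of `trace.txt`. The following is a minimal sketch of how the last step could be automated. It assumes that every message line begins with a decimal sequence number followed by a space; check this against your actual `TRACE()` output and adjust the pattern if it differs. The function name `last_trace_msgs` is hypothetical and not part of LITMUS^RT^.

```shell
#!/bin/sh
# Hypothetical helper (not part of LITMUS^RT): print the newest TRACE()
# messages from a strings(1) dump in sequence order.
# ASSUMPTION: each message line starts with "<decimal sequence number> ".
last_trace_msgs() {
    file=$1
    count=${2:-20}   # number of messages to show, default 20

    # 1. Keep only lines that start with a sequence number; this drops
    #    binary debris and any line that lost its beginning.
    # 2. Sort numerically on the first field to undo the wrap-around.
    # 3. Keep the highest sequence numbers, i.e. the newest messages.
    grep -E '^[0-9]+ ' "$file" | sort -n -k1,1 | tail -n "$count"
}
```

For example, `last_trace_msgs trace.txt 50` would print the 50 messages with the highest sequence numbers. A message that was cut in half at the wrap-around point is either filtered out or shows up truncated, so treat the oldest lines in the output with some suspicion.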