\input{configpres}

\title{Kernel-Debugging}
\maketitle

\section{Kernel Configuration}

\begin{frame}
\frametitle{Kernel hacking / printk and dmesg options}
\begin{itemize}
\item Show timing information on printks
\item Default message log level (1-7)
\item Enable dynamic printk() support
\end{itemize}
\end{frame}

\begin{frame}
\frametitle{Kernel hacking / Compile-time checks and compiler options}
\begin{itemize}
\item Compile the kernel with debug info
\item Debug Filesystem
\end{itemize}
\end{frame}

\begin{frame}
\frametitle{Kernel hacking / Memory Debugging}
\begin{itemize}
\item Kernel memory leak detector
\end{itemize}
Reports leaks in /sys/kernel/debug/kmemleak.

See also: Documentation/kmemleak.txt
\end{frame}

\begin{frame}
\frametitle{Kernel hacking / Debug Lockups and Hangs}
\begin{itemize}
\item Debug Lockups and Hangs
\begin{description}
\item[Softlockup] CPU loops in kernel mode without giving other tasks a chance to run
\item[Hardlockup] CPU loops in kernel mode without letting IRQs run
\item[Hung task] task is stuck in uninterruptible sleep (D state)
\end{description}
\end{itemize}
A stack trace is printed on detection.
\end{frame}

\begin{frame}
\frametitle{Kernel hacking / Lock Debugging}
\begin{itemize}
\item RT Mutex debugging, deadlock detection
\item Lock debugging: prove locking correctness, see Documentation/lockdep-design.txt
\end{itemize}
\end{frame}

\begin{frame}
\frametitle{Kernel hacking / Tracing}
\begin{itemize}
\item Kernel Function (Graph) Tracer
\item Scheduling Latency Tracer
\item Enable [k/u]probes-based dynamic events
\item Enable/disable function tracing dynamically
\item Ring buffer benchmark stress tester (!!don't use it!!)
\end{itemize}
\end{frame}

\begin{frame}
\frametitle{Kernel hacking}
\begin{description}
\item[Remote debugging over FireWire] Documentation/debugging-via-ohci1394.txt
\end{description}
\end{frame}

\section{printk}

\begin{frame}[fragile]
\frametitle{printk is your friend!!}
Usage is similar to printf() in userspace.
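
The loglevel is not a separate argument: it is a string literal that the
compiler concatenates with the format string. A minimal user-space sketch of
this, assuming the \verb|"<6>"| prefix listed for \verb|KERN_INFO| on the next
slide; \verb|format_info()| is a hypothetical helper, not a kernel function:
\begin{lstlisting}[language=C]
#include <stdio.h>

/* Assumption: same prefix string as KERN_INFO in the loglevel table */
#define KERN_INFO "<6>"

/* Build the string that printk(KERN_INFO "value = %d\n", value)
 * would be handed; returns the number of characters written. */
int format_info(char *buf, size_t len, int value)
{
    /* adjacent literals are joined: "<6>" "value = %d\n" */
    return snprintf(buf, len, KERN_INFO "value = %d\n", value);
}
\end{lstlisting}
Calling \verb|format_info(buf, sizeof(buf), 42)| stores
\verb|"<6>value = 42\n"| in \verb|buf|. This is why no comma is written
between the loglevel and the format string.
\end{frame}

\begin{frame}[fragile]
\frametitle{printk loglevels}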
Different loglevels:
\small
\begin{verbatim}
KERN_EMERG    "<0>"  /* system is unusable               */
KERN_ALERT    "<1>"  /* action must be taken immediately */
KERN_CRIT     "<2>"  /* critical conditions              */
KERN_ERR      "<3>"  /* error conditions                 */
KERN_WARNING  "<4>"  /* warning conditions               */
KERN_NOTICE   "<5>"  /* normal but significant condition */
KERN_INFO     "<6>"  /* informational                    */
KERN_DEBUG    "<7>"  /* debug-level messages             */
\end{verbatim}
\end{frame}

\begin{frame}[fragile]
\frametitle{Example}
\begin{verbatim}
printk(KERN_EMERG "Fatal error!\n");
\end{verbatim}
Setting the loglevel via kernel commandline:
\begin{verbatim}
loglevel=7
\end{verbatim}
Loglevel in procfs:
\begin{verbatim}
# console_loglevel, default_message_loglevel,
# minimum_console_loglevel and default_console_loglevel
$ cat /proc/sys/kernel/printk
4       4       1       7
\end{verbatim}
\end{frame}

\section{dynamic printk}

\begin{frame}
\frametitle{dynamic printk}
Controlled via debugfs: dynamic\_debug/control

Format of each entry: filename:lineno [module]function flags format
\begin{description}
\item[filename] source file of the debug statement
\item[lineno] line number of the debug statement
\item[module] module that contains the debug statement
\item[function] function that contains the debug statement
\item[flags] `=p' means the statement is turned on for printing
\item[format] the format string used by the debug statement
\end{description}
Use pr\_debug() and dev\_dbg() in your code.
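\end{frame}

\begin{frame}[fragile]
\frametitle{dynamic printk: writing a query}
A debug statement is switched on by writing a query such as
\verb|module mymod +p| to the control file. A minimal user-space sketch,
assuming debugfs is mounted at \verb|/sys/kernel/debug|;
\verb|dyndbg_write()| and the module name \verb|mymod| are hypothetical:
\begin{lstlisting}[language=C,basicstyle=\scriptsize\ttfamily]
#include <stdio.h>

/* Assumed path: debugfs mounted at /sys/kernel/debug */
#define DYNDBG_CONTROL "/sys/kernel/debug/dynamic_debug/control"

/* Write one query to the control file; 0 on success, -1 on error. */
int dyndbg_write(const char *control, const char *query)
{
    FILE *f = fopen(control, "w");
    int failed;

    if (!f)
        return -1;
    failed = fputs(query, f) < 0;
    /* stdio flushes on fclose(); a failed write is reported there */
    if (fclose(f) != 0)
        failed = 1;
    return failed ? -1 : 0;
}
\end{lstlisting}
Usage (as root): \verb|dyndbg_write(DYNDBG_CONTROL, "module mymod +p\n");|
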
See also: Documentation/dynamic-debug-howto.txt
\end{frame}

\section{Logging messages}

\begin{frame}[fragile]
\frametitle{Serial Console}
Kernel configuration (example):
\begin{verbatim}
Device Drivers --->
  Character devices --->
    Serial drivers --->
      [*] Console on 8250/16550 and compatible serial port
\end{verbatim}
Kernel commandline (examples):
\begin{verbatim}
console=ttyS0,115200
\end{verbatim}
\begin{verbatim}
console=ttyS0,115200n8
\end{verbatim}
\begin{verbatim}
console=ttyAM0,115200
\end{verbatim}
\end{frame}

\begin{frame}[fragile]
\frametitle{Netconsole}
Kernel commandline:
\begin{verbatim}
netconsole=[src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr]
\end{verbatim}
src-port defaults to 6665. tgt-port defaults to 6666.
\begin{verbatim}
netconsole=@/,@10.10.0.2/
\end{verbatim}
On the host side:
\begin{verbatim}
$ nc -l -u -p 6666
# or (depending on your netcat version)
$ nc -l -u 6666
\end{verbatim}
\end{frame}

\section{Analyzing Backtraces}

\begin{frame}[fragile]
\frametitle{Oops...Something went wrong ;-)}
Note: PC is at c0009378
\tiny
\begin{verbatim}
CPU: 0    Not tainted  (2.6.37 #9)
PC is at prepare_namespace+0x170/0x1d4
LR is at do_unlinkat+0x10c/0x14c
pc : [<c0009378>]    lr : []    psr: 80000013
sp : c783dfc8  ip : c783df20  fp : c783dfe0
r10: 00000000  r9 : 00000000  r8 : 00000000
r7 : 00000013  r6 : c0054a1c  r5 : c0022995  r4 : c03969a4
r3 : 00000000  r2 : 00000000  r1 : c78c7000  r0 : 00000000
Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 00093177  Table: 00004000  DAC: 00000017
Process swapper (pid: 1, stack limit = 0xc783c260)
Stack: (0xc783dfc8 to 0xc783e000)
dfc0:                   c783dfd4 c0396940 c00084fc c783dff4 c783dfe4 c0008610
dfe0: c0009214 00000000 00000000 c783dff8 c0054a1c c0008508 00000000 00000000
Backtrace:
[] (prepare_namespace+0x0/0x1d4) from [] (kernel_init+0x114/0x154)
 r5:c00084fc r4:c0396940
[] (kernel_init+0x0/0x154) from [] (do_exit+0x0/0x660)
 r4:00000000
Code: e3500000 13a03601 15843000 e3a03000 (e5d31000)
---[ end trace 4ed5c061b76895d8 ]---
\end{verbatim}
\end{frame}

\begin{frame}[fragile]
\frametitle{Analyzing Backtraces: addr2line}
If you compiled your kernel with debug info, you can use addr2line to
translate the address into a source file and line:
\small
\begin{verbatim}
$ arm-none-linux-gnueabi-addr2line -e vmlinux c0009378
linux-2.6.37/init/do_mounts.c:488
\end{verbatim}
\end{frame}

\section{Debugging early crashes}

\begin{frame}[fragile]
\frametitle{Early printk}
\begin{verbatim}
Kernel hacking --->
  [*] Kernel low-level debugging functions
  [*] Early printk
\end{verbatim}
\begin{verbatim}
earlyprintk=serial,ttyAMA0,115200,keep \
console=ttyAMA0,115200
\end{verbatim}
\end{frame}

\section{Magic SysRQ}

\begin{frame}
\frametitle{Magic Sysrequest Keys}
Press SysRq (Alt-Print), or send a break signal on a serial console,
followed by a command key:
\begin{itemize}
\item t: Show task states
\item p: Show registers
\item q: Show all timers
\item b: Force reboot
\item e: Terminate all tasks
\item l: Show backtrace of all active CPUs
\item s: Sync
\item h: Help
\end{itemize}
A sysrequest can also be triggered by echoing the command key to /proc/sysrq-trigger.
\end{frame}

\section{ftrace}

\begin{frame}[fragile]
\frametitle{trace\_printk()}
\begin{itemize}
\item Writes to the tracing ring buffer
\item Can be used in any context
\item Useful for debugging ``high volume areas''
\item Syntax similar to printk():
\begin{verbatim}
trace_printk("my_var -> %d\n", my_var);
\end{verbatim}
\end{itemize}
\end{frame}

\begin{frame}[fragile]
\frametitle{ftrace\_dump\_on\_oops}
Example:
\begin{verbatim}
# Kernel Commandline:
ftrace=sched_switch ftrace_dump_on_oops
\end{verbatim}
\tiny
\begin{verbatim}
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c0004000
[00000000] *pgd=00000000
Internal error: Oops: 35 [#1]
last sysfs file:
Dumping ftrace buffer:
---------------------------------
 swapper-1     0dN...  9034us :  1:120:R   + [000]  2:120:S kthreadd
 swapper-1     0..... 10044us :  1:120:D ==> [000]  2:120:R kthreadd
kthreadd-2     0d.... 10529us :  2:120:R   + [000] 15:120:R kswapd0
kthreadd-2     0..... 10577us :  2:120:S ==> [000] 14:120:R kworker/0:1
kworker/-14    0d.... 10623us : 14:120:D   + [000]  4:120:D kworker/0:0
kworker/-14    0..... 10656us : 14:120:D ==> [000]  4:120:R kworker/0:0
kworker/-4     0d.... 10848us :  4:120:R   + [000] 14:120:D kworker/0:1
kworker/-4     0..... 11943us :  4:120:S ==> [000] 14:120:R kworker/0:1
kworker/-14    0..... 12004us : 14:120:S ==> [000] 15:120:R kswapd0
[...]
\end{verbatim}
\end{frame}

\section{Qemu}

\begin{frame}[fragile]
\frametitle{Kernel debugging with Qemu}
\begin{verbatim}
$ qemu-system-arm -M versatilepb -m 128 \
  -S -s -kernel zImage
\end{verbatim}
\begin{verbatim}
$ arm-none-linux-gnueabi-gdb vmlinux
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
0x00000000 in ?? ()
(gdb) break start_kernel
(gdb) c
Continuing.

Breakpoint 1, start_kernel () at linux-2.6.37-rc4/init/main.c:539
539         smp_setup_processor_id();
\end{verbatim}
\end{frame}

\section{KGDB}

\begin{frame}
\frametitle{KGDB}
\begin{itemize}
\item KDB: simple shell-style interface (inspect registers, process list, ...)
\item KGDB: source-level debugging with a remote gdb
\end{itemize}
\end{frame}

\begin{frame}
\frametitle{KDB}
\begin{itemize}
\item NOT a source level debugger!!
\item Enter KDB with SysRQ-g: echo g $>$ /proc/sysrq-trigger
\end{itemize}
\end{frame}

\begin{frame}[fragile]
\frametitle{KGDB}
Kernel configuration:
\begin{verbatim}
Kernel hacking --->
  [*] KGDB: kernel debugging with remote gdb --->
    <*> KGDB: use kgdb over the serial console (NEW)
\end{verbatim}
Kernel Commandline:
\begin{verbatim}
kgdboc=ttyAMA0 kgdbwait
\end{verbatim}
\end{frame}

\begin{frame}[fragile]
\frametitle{KGDB}
1)\\
\begin{verbatim}
qemu-system-arm -M versatilepb -m 128 \
  -serial tcp:localhost:2345,server \
  -kernel zImage -append "kgdboc=ttyAMA0 kgdbwait"
\end{verbatim}
2)\\
telnet localhost 2345\\
\begin{verbatim}
kgdb: Waiting for connection from remote gdb...
\end{verbatim}
Leave telnet again with CTRL-] and quit, so that gdb can connect to the port.
\end{frame}

\begin{frame}[fragile]
\frametitle{KGDB}
3)\\
\begin{verbatim}
arm-none-linux-gnueabi-gdb vmlinux
(gdb) target remote localhost:2345
Remote debugging using localhost:2345
kgdb_breakpoint () at linux-2.6.37-rc4/kernel/debug/debug_core.c:959
959         arch_kgdb_breakpoint();
(gdb)
\end{verbatim}
\end{frame}

\section{User Mode Linux}

\begin{frame}
\frametitle{UML: User Mode Linux}
\begin{alertblock}{What is User Mode Linux?}
User Mode Linux allows you to run the Linux kernel as a normal userspace process.
\end{alertblock}
\end{frame}

\begin{frame}[fragile]
\frametitle{UML: Build and boot a UML kernel}
\begin{lstlisting}[language=bash]
$ make ARCH=um defconfig
$ make ARCH=um
$ file linux
linux: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV),
dynamically linked (uses shared libs), for GNU/Linux 2.6.8, not stripped
$ ./linux
\end{lstlisting}
\end{frame}

\begin{frame}[fragile]
\frametitle{UML: Parameters}
\begin{lstlisting}[language=bash]
$ ./linux ubd0=my_disk.img \
    eth0=slirp,,slirp \
    hostfs=${HOME}
\end{lstlisting}
\begin{itemize}
\item ubd0: The ubdX parameter maps a file on the host filesystem to a block
device inside UML. This is how the filesystem(s) for UML are specified.
\item eth0=slirp,,slirp: Gives UML limited access to the network without
requiring root permissions.
\item hostfs: Tells UML which host directory may be mounted inside the UML
environment. For example, ``mount none /mnt/myhost -t hostfs'' inside UML
mounts that directory (here \$\{HOME\}; without the parameter, the host's /)
on /mnt/myhost.
\end{itemize}
\end{frame}

\begin{frame}[fragile]
\frametitle{Running UML in gdb}
\begin{lstlisting}[language=bash]
$ gdb ./linux
(gdb) handle SIGSEGV pass nostop noprint
Signal        Stop      Print   Pass to program Description
SIGSEGV       No        No      Yes             Segmentation fault
(gdb) b start_kernel
Breakpoint 1 at 0x80493ca: file /home/devel/images/linux-2.6.37/init/main.c,
line 539.
(gdb) r
Starting program: /home/devel/images/build_um/linux
Locating the bottom of the address space ... 0x1000
Locating the top of the address space ... 0xc0000000
Core dump limits :
        soft - 0
        hard - NONE
[...]
Adding 10047488 bytes to physical memory to account for exec-shield gap

Breakpoint 1, start_kernel () at /home/devel/images/linux-2.6.37/init/main.c:539
539         smp_setup_processor_id();
(gdb) c
Continuing.
Linux version 2.6.37 (devel@ltx) (gcc version 4.3.2 (Debian 4.3.2-1.1) )
Thu Feb 10 06:25:23 UTC 2011
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 10561
[...]
\end{lstlisting}
\end{frame}

\input{tailpres}