I have never really been a big fan of debuggers. It’s not that I think they aren’t useful; I just never really had a use for them. I typically get all the debugging I need out of print or trace debugging (I once heard it called scaffolding, but I can’t find any references to that on the internet). Most of the code I have written on my own has been short, simple exercises for school. I did one really big project that spanned 3 courses and 2 semesters, but I still did print debugging because it was what I was used to. It got the job done.

Since I started working at StorageCraft, the projects I’m working on are orders of magnitude larger than anything I’ve worked on before. The code base was easy to jump into because the code I was working on is written in small chunks. I was able to get really good mileage out of a test-driven development model because the code was already working on Windows, and I just needed to port it to Linux without introducing any bugs on the Windows side. When all my unit tests passed and it was time to build the full product line and test its features, using a debugger saved me hours because I could easily set breakpoints and run the code again, instead of adding a print statement, recompiling, and then running it again. I was able to get Code::Blocks working on Linux as a GUI debugger.

On to the good part. Doing kernel development has been a lot of fun, but I couldn’t use my newly acquired debugging skills with GDB, so I went back to using printk to debug my kernel module. I am running RHEL6 Workstation on my machine. I obviously don’t want to actively develop kernel modules on my workstation, so I am using the live CD creator tools to build a CentOS 6 ISO with just enough software to build the kernel modules for me. I chose a live CD because I don’t have to worry about corrupting my root filesystem when I cause a kernel panic. Every time I boot the machine, it comes up in a clean state, with full read-write access to the OS. Perfect, in my mind. I use KVM to boot the VM and access it via the serial console, so I don’t have to depend on moving my mouse around to reach the root login prompt on the VM. Everything happens from gnome-terminal.

Well, we’re at the point now where debugging the entire kernel module with printk statements is getting difficult. My co-worker figured out how to get KGDB working in a Gentoo VM. Using his instructions as a starting point, I found out that RHEL 6 has everything I needed just sitting there waiting for me to use it. Here’s the kickstart I’m using to generate a live CD:

# Kickstart file automatically generated by anaconda.
#sudo livecd-creator --verbose --fslabel=centos_kerneldev --cache=/var/tmp/livecache --config=kerneldev-ks.cfg

url --url=http://mirror/centos/6/os/x86_64
lang en_US.UTF-8
keyboard us
network --device eth0 --bootproto dhcp
rootpw  --iscrypted #################
firewall --service=ssh
authconfig --enableshadow --passalgo=sha512 --enablefingerprint
selinux --disabled
timezone --utc America/Denver
bootloader --timeout=3 --location=mbr --driveorder=vda --append="console=ttyS0,115200 kgdboc=ttyS1"
services --disabled kdump,libvirt-guests,lvm2-monitor,mdmonitor,postfix,rpcbind,rpcgssd,iptables,ip6tables,abrt
firstboot --disable

repo --name="local"  --baseurl=http://mirror/local/ --cost=50
repo --name="CentOS"  --baseurl=http://mirror/centos/6/os/x86_64/ --cost=100
repo --name="CentOS Updates"  --baseurl=http://mirror/centos/6/updates/x86_64/ --cost=1000
repo --name="EPEL"  --baseurl=http://mirror/fedora-epel/6/x86_64/ --cost=1000


%post
echo '*.* @@rsyslog.hostname' >> /etc/rsyslog.conf
# I echo an NFS mount point into /etc/fstab for my source code here.
%end

I put a comment in the kickstart to remind me how to build the ISO :). I also put some other things in %post that are specific to how I want to access my kernel module and source code from inside my VM. You don’t want to see any of that 🙂 I also have a local repository that I can drop packages into when I want to override an upstream RPM. For instance, I ran into a bug where the new version of GDB did not work well with KGDB, so I downloaded the SRPM for my kernel (2.6.32-131.6.1.el6), added the patch, and rebuilt it. The Bugzilla bug says it should be fixed in an upcoming kernel version, so eventually I won’t need to keep my kernel packages in my personal repo anymore.

In my kickstart, you’ll notice that I run a console on ttyS0 and kgdboc on ttyS1. We need to expose both of these serial ports to the host OS. Here’s my libvirt XML definition for one such VM:

<domain type='kvm' id='32'>
  <os>
    <type arch='x86_64' machine='rhel6.1.0'>hvm</type>
    <boot dev='cdrom'/>
    <bootmenu enable='no'/>
  </os>
  <clock offset='localtime'/>
  <devices>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw' cache='none' io='threads'/>
      <source file='/var/lib/libvirt/images/centos_kerneldev.iso'/>
      <target dev='hdc' bus='ide'/>
      <alias name='ide0-1-0'/>
      <address type='drive' controller='0' bus='1' unit='0'/>
    </disk>
    <controller type='ide' index='0'>
      <alias name='ide0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <interface type='network'>
      <mac address='52:54:00:b0:f9:10'/>
      <source network='KernelDev'/>
      <target dev='vnet0'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/14'/>
      <target port='0'/>
      <alias name='serial0'/>
    </serial>
    <serial type='tcp'>
      <source mode='bind' host='' service='4555'/>
      <protocol type='raw'/>
      <target port='1'/>
      <alias name='serial1'/>
    </serial>
    <console type='pty' tty='/dev/pts/14'>
      <source path='/dev/pts/14'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='5900' autoport='yes'/>
    <sound model='ich6'>
      <alias name='sound0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </sound>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </memballoon>
  </devices>
</domain>

You may find it easier to configure the VM from the GUI, but taking screenshots and walking you through that interface just isn’t interesting to me. It should be easy for you to read the <serial> portions of the configuration file and apply the same options in the GUI.

Next, we need the host machine to have the source code and vmlinux image to debug. All I needed on RHEL6 was to install ‘kernel-debug-debuginfo’, which pulled in ‘kernel-debuginfo-common’ as a dependency.

yum -y install kernel-debug-debuginfo

Now, the only concern here is making sure the kernel-debug-debuginfo package you install matches the version of the kernel you’re running inside your VM. This is one more reason why the local repository is useful. Just build or download the RPMs for the kernel version you need, and stick them all in the same place. Then install them from that one place in both your VM and your host.
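One way to catch a version mismatch before starting gdb is a small check on the host. The sketch below is illustrative only: `vmlinux_for_release` is a made-up helper name, and it assumes the debuginfo package puts vmlinux under /usr/lib/debug/lib/modules/<release>/, where <release> is what `uname -r` prints inside the VM.

```shell
#!/bin/sh
# Hypothetical helper, not part of any package.
# vmlinux_for_release RELEASE [DEBUG_ROOT]
# Prints the path of the debuginfo vmlinux for RELEASE and returns 0,
# or prints an error and returns 1 if no such file exists on this host.
vmlinux_for_release() {
    release=$1
    debug_root=${2:-/usr/lib/debug}   # assumed default debuginfo location
    vmlinux="$debug_root/lib/modules/$release/vmlinux"

    if [ -f "$vmlinux" ]; then
        printf '%s\n' "$vmlinux"
    else
        echo "no debuginfo vmlinux for kernel $release under $debug_root" >&2
        return 1
    fi
}

# Example (on the host, with the release string copied from the VM):
#   vmlinux_for_release 2.6.32-131.6.1.el6.x86_64.debug
```

If the function fails, the host and the VM are not on the same kernel build, and any symbols gdb loads would not line up with the running kernel.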

With all that in place, we’re finally ready to boot our VM. If you want to debug the kernel at boot, you need to add “kgdbwait” to the bootloader option in the kickstart (and rebuild the ISO, of course). I don’t need to, so I let the VM come all the way up. Then I build and insmod my kernel module from the command line (or from a unit test script).

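insmod mymodule.ko                        # load the module under test
cat /sys/module/mymodule/sections/.text   # print the address the module's .text section was loaded at
echo g > /proc/sysrq-trigger              # SysRq-g: stop the kernel and wait for gdb to attach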

With these three commands, I have everything I need to finally run gdb. I need the contents of the .text file to tell gdb where to load my kernel module’s symbols; otherwise breakpoints will not have the correct addresses. gdb has an -x parameter that you can use to automatically run some gdb commands from a file. For instance, my gdb commands file looks like this:

file /usr/lib/debug/lib/modules/2.6.32-131.6.1.el6.x86_64.debug/vmlinux
dir .
add-symbol-file mymodule.ko 0xffffffffa0398000
target remote localhost:4555

The second argument on the add-symbol-file line is the address read from the .text file inside the VM. My unit test will probably start generating a gdb commands file eventually, so I can just run ‘./unit_test.sh’ from inside my VM, which writes out a gdb commands file to an NFS share on my host machine, and then from my host I can run gdb -x commands_file. I haven’t reached that point yet, but I really wanted to get what I have done so far written down.
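As a rough idea of what that generation step could look like, here is a minimal sketch. The function name `write_gdb_commands` and its argument order are invented for illustration; the four lines it prints simply mirror the commands file shown above, and the default remote of localhost:4555 matches the TCP serial port in the VM definition.

```shell
#!/bin/sh
# Hypothetical helper, meant to be called from a script inside the VM.
# write_gdb_commands MODULE_KO TEXT_ADDR VMLINUX [REMOTE]
# Prints a gdb commands file to stdout; returns 1 on a bad address.
write_gdb_commands() {
    module_ko=$1
    text_addr=$2
    vmlinux=$3
    remote=${4:-localhost:4555}

    # Only accept 0x followed by hex digits, so an empty or failed read
    # of the sysfs file cannot produce a bogus add-symbol-file line.
    case $text_addr in
        0x*[!0-9a-fA-F]*|0x) bad=1 ;;
        0x*) bad=0 ;;
        *) bad=1 ;;
    esac
    if [ "$bad" -eq 1 ]; then
        echo "bad .text address: '$text_addr'" >&2
        return 1
    fi

    printf 'file %s\n' "$vmlinux"
    printf 'dir .\n'
    printf 'add-symbol-file %s %s\n' "$module_ko" "$text_addr"
    printf 'target remote %s\n' "$remote"
}

# Example (inside the VM, after insmod; the output path is up to you):
#   write_gdb_commands mymodule.ko \
#       "$(cat /sys/module/mymodule/sections/.text)" \
#       /usr/lib/debug/lib/modules/2.6.32-131.6.1.el6.x86_64.debug/vmlinux \
#       > commands_file
```

Writing to stdout keeps the helper independent of where the NFS share happens to be mounted; the caller decides where the file goes.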

One last note about breakpoints. You can use gdb to set breakpoints right after running the ‘target remote’ command and before running the ‘continue’ command. Alternatively, you can call the function kgdb_breakpoint() from inside your code to trigger a breakpoint. This is useful when you want to break into your code in an exceptional case, instead of using BUG() to crash your kernel and dump a call trace (which I have done A LOT).
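The first approach also fits into a commands file: break lines placed after ‘target remote’ and followed by ‘continue’ are run by gdb -x in order. A small sketch, where `gdb_breakpoint_lines` is a made-up helper and the function names in the example are placeholders rather than anything from my module:

```shell
#!/bin/sh
# Hypothetical helper.
# gdb_breakpoint_lines FUNCTION...
# Prints one "break" line per function name, then "continue", ready to
# be appended to a gdb commands file after its "target remote" line.
gdb_breakpoint_lines() {
    for fn in "$@"; do
        printf 'break %s\n' "$fn"
    done
    printf 'continue\n'
}

# Example (placeholder function names):
#   gdb_breakpoint_lines mymodule_init mymodule_ioctl >> commands_file
```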