Sun, Jul. 15th, 2007, 05:47 pm
All praise systemtap

Systemtap is an easy and powerful - yet kludgy - framework for instrumenting Linux kernel internals. It lets you define probes triggered at function entry or exit, and even permits dereferencing function arguments (i.e. you can dig all the way down through structure members). Probes are written in a concise scripting language and can be loaded at runtime (you can run a new script tracing the kernel without rebooting).

It has proven quite useful for my current needs (detecting power hogs on a running Linux desktop). Here's an example. I wanted to pinpoint all applications making block devices spin for no reason. This includes not only applications reading and writing files, but also those causing inode metadata changes (like atime updates), as long as they actually spin the disks (reading cached metadata is OK).

Linux offers an ugly procfs interface for this purpose: 
echo 1 > /proc/sys/vm/block_dump

This logs every application causing block device accesses to ... the kernel ring buffer. So you end up with a polluted dmesg and klogd/syslogd logging like mad (causing new disk activity, and so on). Knowing nothing about the kernel's internals, I just grepped for block_dump to find every instrumented function, and emulated it with the following systemtap script:

#! stap
# Display block I/O consumers (doing reads, writes and dirtied inodes),
# exactly as "echo 1 > /proc/sys/vm/block_dump"
# but on stdout rather than polluting kernel ring buffer (dmesg).

probe kernel.function("submit_bio") {
        op = $rw & 1 ? "write" : "read"
        printf("%s(%d) %s on device %s\n", execname(), pid(), op,
                kernel_string($bio->bi_bdev->bd_disk->disk_name))
}

probe kernel.function("__mark_inode_dirty") {
        s_id = kernel_string($inode->i_sb->s_id)
        # the kernel tests strcmp(s_id, "bdev"), which is true when the names differ
        if (($inode->i_state & $flags) != $flags && ($inode->i_ino || s_id != "bdev")) {
                printf("%s(%d) dirtied inode %d on device %s\n",
                        execname(), pid(), $inode->i_ino, s_id)
        }
}

Simple, isn't it? So I started cooking a top(1)-like utility to trace the same things. Problem: I don't know how to clear the screen without this ugly system("clear"). Any thoughts?
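One possible direction, as a rough sketch rather than a finished tool: count events per process in a global array, then redraw from a timer probe, clearing the terminal with an ANSI escape sequence instead of system("clear"). The 5-second interval, the 20-line limit and the column layout are arbitrary choices for illustration, and this assumes a stap version that accepts octal escapes in string literals and an ANSI-capable terminal.

```
#! stap
# Sketch of a top(1)-like view of block I/O submitters.
# Assumptions: ANSI terminal, octal escapes accepted in string literals.

global io   # io[execname, pid] = number of submit_bio calls in the interval

probe kernel.function("submit_bio") {
        io[execname(), pid()]++
}

probe timer.s(5) {
        # ESC[2J clears the screen, ESC[H moves the cursor to the top left
        printf("\033[2J\033[H")
        printf("%-16s %6s %8s\n", "COMMAND", "PID", "BIOS")
        # "io-" iterates by decreasing count; keep only the 20 busiest
        foreach ([name, p] in io- limit 20)
                printf("%-16s %6d %8d\n", name, p, io[name, p])
        delete io   # start a fresh interval
}
```

The dirtied-inode probe could feed a second array in the same way. Resetting the array on every tick shows per-interval activity; dropping the delete would show cumulative totals instead.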

PS: would it be acceptable to convert the block_dump interface to something more like /proc/timer_stats?

Sun, Jul. 15th, 2007, 12:15 pm
Legs on the road to power efficiency

Intel's PowerTOP utility made me aware of the Linux power consumption mess. There's a lot of low-hanging fruit here. A short list of things I'll investigate:
  • NetworkManager. Freakin' power drain, hard to fix. More on this in a later post.
  • SCIM. This one sucks power on all Asian users' Linux desktops. Bad SCIM, bad.
  • Red Hat bug 204948 aka "Userspace sucks (wakeups)"
  • Ubuntu's misnamed power-management-in-ubuntu blueprint
  • thinkpad-keys, in Ubuntu's hotkey-setup. Unneeded from kernel 2.6.22 onward: ensure it's replaced by proper ACPI event handling.
  • High Resolution Timers patchset. The "force enable hpet" series makes quite a difference on my ICH4-M system.
  • Sensible defaults in distro setups, like the AC97 power-saving feature, an efficient frequency scaling governor, ...
  • Why the hell isn't the thinkpad_acpi (formerly ibm_acpi) kernel module autoloaded?
  • Write tools to track relevant things that PowerTOP doesn't show (like block I/O and DMA activity). Systemtap will be of use.
On my X40 laptop, the default Ubuntu Gutsy desktop drains ~14W (thanks to NetworkManager's behavior, it can't even enter the C3 or C4 ACPI C-states). Manual tweaks bring it down to ~11W: this should be the default setup. Despite Arjan's recent LKML post, there's still room for improvement.

Sun, Jul. 15th, 2007, 11:53 am
First post

First post
