profiling the resume path
Marcelo Tosatti
marcelo at kvack.org
Thu Sep 6 03:16:27 EDT 2007
Hi Thomas,
I found something that looks strange behaviour while profiling the
resume-from-RAM path on OLPC.
A little explanation on traces: they have been gathered using Arnaldo's
refactoring of latencytrace patches
>From 3e81bb2749ab132f7ff68a22880754e206d7986a Mon Sep 17 00:00:00 2001
From: Arnaldo Carvalho de Melo <acme at ghostprotocols.net>
Date: Wed, 25 Jul 2007 15:12:22 -0300
Subject: [PATCH 1/9] [MCOUNT]: Add basic support for gcc profiler
instrumentation
If CONFIG_MCOUNT is selected and /proc/sys/kernel/mcount_enabled is set to a
non-zero value the mcount routine will be called everytime we enter a kernel
function that is not marked with the "notrace" attribute.
Then it records that into a ring buffer.
so,
This "no need resched!" printk was added inside the
while (!need_resched()) {
void (*idle)(void);
...
}
loop of cpu_idle().
And I gathered two different traces of the exact same kernel.
Trace #1:
[ 0.059910] ohci_hcd 0000:00:0f.4: powerup ports
[ 0.060119] no need resched!
[ 0.070230] no need resched!
[ 0.097732] no need resched!
[ 0.106032] tick_nohz_stop_sched_tick: next_jiffies = 4294947976, last_jiffies = 4294947974 jiffies = 4294947974
[ 0.106174] tick_nohz_stop_sched_tick: expires=[106,800000000]
[ 0.106270] tick_nohz_stop_sched_tick: now=[106,786802612]
[ 0.106369] no need resched!
[ 0.119478] tick_nohz_stop_sched_tick: next_jiffies = 4294947978, last_jiffies = 4294947976 jiffies = 4294947976
[ 0.119618] tick_nohz_stop_sched_tick: expires=[106,820000000]
[ 0.119714] tick_nohz_stop_sched_tick: now=[106,800253059]
[ 0.119813] no need resched!
Then the resume path restarts at (this is from mctrace, I added another
msleep right after msleep(20) for diagnosis).
[ 0.119336] ohci_pci_resume+0x8c/0xa0 -> msleep+0xb/0x27
Trace #2:
Another run with the same kernel.
[ 0.059903] ohci_hcd 0000:00:0f.4: powerup ports
[ 0.060108] no need resched!
[ 0.070225] no need resched!
[ 0.097726] no need resched!
[ 0.125226] no need resched!
[ 0.152728] no need resched!
[ 0.180230] no need resched!
[ 0.206887] tick_nohz_stop_sched_tick: next_jiffies = 4294942086,
last_jiffies = 4294942084 jiffies = 4294942084
[ 0.207028] tick_nohz_stop_sched_tick: expires=[47,900000000]
[ 0.207123] tick_nohz_stop_sched_tick: now=[47,886800917]
Which restarts at
[ 0.220191] ohci_pci_resume+0x8c/0xa0 -> msleep+0xb/0x27
Sorry, but how the hell that can happen?
Both runs look totally hosed since the msleep(20) call from
ohci_pci_resume happens at 59ms, and we have an interrupt at 97ms (on
both traces) which should get the system out of the idle loop.
Full traces:
http://dev.laptop.org/~marcelo/mctrace/trace1.txt
http://dev.laptop.org/~marcelo/mctrace/trace2.txt
Whoever sets TIF_NEED_RESCHED is not doing its job properly?
More information about the Devel
mailing list