#10802 NORM Not Tri: suspend sometimes hangs with several blocked tasks
Zarro Boogs per Child
bugtracker at laptop.org
Sat Apr 2 04:53:43 EDT 2011
#10802: suspend sometimes hangs with several blocked tasks
--------------------------+-------------------------------------------------
Reporter: sascha_silbe | Owner: buytenh
Type: defect | Status: new
Priority: normal | Milestone: Not Triaged
Component: kernel | Version: Development source as of this date
Keywords: | Next_action: diagnose
Verified: 0 | Deployment_affected:
Blockedby: | Blocking:
--------------------------+-------------------------------------------------
With a kernel based on OLPC 2.6.35 (commit 2447ae65) suspend sometimes
hangs (magic SysRq via serial console still works):
{{{
[ 2346.129716] Restarting tasks ... done.
[ 2359.363513] PM: Syncing filesystems ... done.
[ 2359.377963] Freezing user space processes ... (elapsed 0.01 seconds)
done.
[ 2359.403030] Freezing remaining freezable tasks ... (elapsed 0.01
seconds) done.
[ 2359.425706] dcon_source_switch to DCON
[ 2359.465513] olpc-dcon: The DCON has control
[ 2359.470673] HDA Intel 0000:00:14.0: PCI INT A disabled
[ 2359.476266] ehci_hcd 0000:00:10.4: PCI INT D disabled
[ 2359.481681] uhci_hcd 0000:00:10.2: PCI INT C disabled
[ 2359.487075] uhci_hcd 0000:00:10.1: PCI INT B disabled
[ 2359.492489] uhci_hcd 0000:00:10.0: PCI INT A disabled
[ 2359.497920] viafb_suspend!
[ 2359.501522] viafb 0000:00:01.0: PCI INT A disabled
[ 2359.506666] mmc1:0001:1: suspend: PM flags = 0x3
[ 2523.010056] INFO: task kjournald:1438 blocked for more than 120
seconds.
[ 2523.017190] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[ 2523.025536] kjournald D ea21ee60 2596 1438 2 0x00000000
[ 2523.032169] ea21ee80 00000046 0b300001 ea21ee60 ea21ee64 b0420fcf
00000001 ea35d530
[ 2523.040558] a63754ab 00000226 ea35d530 ea35d530 00000000 ea21eecc
ea21ee8c b088cdf0
[ 2523.048942] ea21eec4 ea21ee94 b04ce488 ea21eeb0 b088d2d7 b04ce455
b19cc9b8 b04ce455
[ 2523.057297] Call Trace:
[ 2523.059912] [<b0420fcf>] ? get_parent_ip+0xb/0x31
[ 2523.065051] [<b088cdf0>] io_schedule+0x21/0x31
[ 2523.069874] [<b04ce488>] sync_buffer+0x33/0x37
[ 2523.074730] [<b088d2d7>] __wait_on_bit+0x39/0x60
[ 2523.079737] [<b04ce455>] ? sync_buffer+0x0/0x37
[ 2523.084683] [<b04ce455>] ? sync_buffer+0x0/0x37
[ 2523.089611] [<b088d39d>] out_of_line_wait_on_bit+0x9f/0xa7
[ 2523.095578] [<b043de1c>] ? wake_bit_function+0x0/0x3c
[ 2523.101062] [<b04ce3e9>] __wait_on_buffer+0x1e/0x21
[ 2523.106356] [<b0540051>] journal_commit_transaction+0x9e8/0xf5b
[ 2523.112772] [<b088cda7>] ? schedule+0x448/0x470
[ 2523.117688] [<b0890ae4>] ? sub_preempt_count+0x9/0x8e
[ 2523.123183] [<b0420fcf>] ? get_parent_ip+0xb/0x31
[ 2523.128286] [<b05430c5>] kjournald+0xc1/0x1e1
[ 2523.133053] [<b043dde8>] ? autoremove_wake_function+0x0/0x34
[ 2523.139155] [<b0543004>] ? kjournald+0x0/0x1e1
[ 2523.143992] [<b043dab0>] kthread+0x63/0x68
[ 2523.148442] [<b043da4d>] ? kthread+0x0/0x68
[ 2523.153003] [<b0402cf6>] kernel_thread_helper+0x6/0x10
[ 2523.158625] INFO: task rtcwake:895 blocked for more than 120 seconds.
[ 2523.165502] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[ 2523.173839] rtcwake D e9c89e50 2744 895 2399 0x00000000
[ 2523.180481] e9c89e70 00000086 00000001 e9c89e50 b0890b5c 00000246
e9c89e5c ea2ce0c0
[ 2523.188853] 5e5aa79f 00000225 ea2ce0c0 58a5589b 00000225 e9c89e84
e9c89ea4 b044344d
[ 2523.197214] 00000000 0000156a b0a961b0 00000000 ea2ce0c0 b043dde8
b0a961a8 b0a961a8
[ 2523.205618] Call Trace:
[ 2523.208241] [<b0890b5c>] ? sub_preempt_count+0x81/0x8e
[ 2523.213828] [<b044344d>] async_synchronize_cookie_domain+0xaf/0x103
[ 2523.220594] [<b043dde8>] ? autoremove_wake_function+0x0/0x34
[ 2523.226708] [<b04434b3>] async_synchronize_cookie+0x12/0x14
[ 2523.232751] [<b04434cd>] async_synchronize_full+0x18/0x36
[ 2523.238590] [<b06adbbc>] dpm_suspend_start+0x344/0x36b
[ 2523.244182] [<b0452790>] suspend_devices_and_enter+0x4a/0x17f
[ 2523.250392] [<b0452988>] enter_state+0xc3/0x10a
[ 2523.255311] [<b0452269>] state_store+0x98/0xac
[ 2523.260155] [<b04521d1>] ? state_store+0x0/0xac
[ 2523.265069] [<b060a175>] kobj_attr_store+0x16/0x22
[ 2523.270283] [<b04f302b>] sysfs_write_file+0xc4/0xef
[ 2523.275564] [<b04b07f5>] vfs_write+0x8f/0x117
[ 2523.280320] [<b04f2f67>] ? sysfs_write_file+0x0/0xef
[ 2523.285695] [<b04b091e>] sys_write+0x40/0x65
[ 2523.290360] [<b040278c>] sysenter_do_call+0x12/0x22
[ 2523.295638] INFO: task async/1:897 blocked for more than 120 seconds.
[ 2523.302499] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[ 2523.310839] async/1 D 00000225 3368 897 2 0x00000000
[ 2523.317433] ea35be88 00000046 5e5a98f8 00000225 ea7925b0 eaa7561c
b0a93b70 eaa755f0
[ 2523.325843] 5e5a77f1 00000225 eaa755f0 eb1209d4 7fffffff 7fffffff
ea35bedc b088cf5c
[ 2523.334184] 00000000 86ced74a 0000001c 00000000 eaa7561c 00000001
ea7925e4 b0a93b70
[ 2523.342596] Call Trace:
[ 2523.345223] [<b088cf5c>] schedule_timeout+0x1c/0x1f2
[ 2523.350617] [<b0420fcf>] ? get_parent_ip+0xb/0x31
[ 2523.355714] [<b0890b5c>] ? sub_preempt_count+0x81/0x8e
[ 2523.361289] [<b088c876>] wait_for_common+0xb9/0x101
[ 2523.366570] [<b0425f25>] ? default_wake_function+0x0/0x12
[ 2523.372426] [<b06acd4d>] ? dpm_wait_fn+0x0/0x14
[ 2523.377344] [<b088c95d>] wait_for_completion+0x17/0x19
[ 2523.382934] [<b06acd2f>] dpm_wait+0x27/0x29
[ 2523.387487] [<b06acd5d>] dpm_wait_fn+0x10/0x14
[ 2523.392333] [<b06a6f6d>] device_for_each_child+0x2d/0x57
[ 2523.398089] [<b06ad7ba>] __device_suspend+0x22/0xe0
[ 2523.403393] [<b06adbff>] async_suspend+0x1c/0x4a
[ 2523.408404] [<b044324f>] async_thread+0xf0/0x1d3
[ 2523.413432] [<b0425f25>] ? default_wake_function+0x0/0x12
[ 2523.419274] [<b044315f>] ? async_thread+0x0/0x1d3
[ 2523.424388] [<b043dab0>] kthread+0x63/0x68
[ 2523.433414] [<b0402cf6>] kernel_thread_helper+0x6/0x10
}}}
This is on an XO-1.5 running from an external, ext3 formatted SD card. To
reproduce the bug, I ran powerd with a 5 second timeout and triggered a
wake-up via IPv6 ping with an interval of 30s. It took over an hour to
trigger; during regular use, it happens about once per day.
Diff against log messages of a "good" suspend ("-" = good, "+" = hang):
{{{
Restarting tasks ... done.
PM: Syncing filesystems ... done.
-PM: Preparing system for mem sleep
-Freezing user space processes ... (elapsed 0.02 seconds) done.
+Freezing user space processes ... (elapsed 0.01 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
-PM: Entering mem sleep
dcon_source_switch to DCON
olpc-dcon: The DCON has control
HDA Intel 0000:00:14.0: PCI INT A disabled
@@ -14,9 +12,251 @@ uhci_hcd 0000:00:10.0: PCI INT A disabled
viafb_suspend!
viafb 0000:00:01.0: PCI INT A disabled
mmc1:0001:1: suspend: PM flags = 0x3
-sdhci_reset: slot wakeups enabled
-sdhci_pci_probe: Enable PME set to 0x103
-PM: suspend of devices complete after 101.093 msecs
-PM: late suspend of devices complete after 78.596 msecs
-ACPI: Preparing to enter system sleep state S3
-PM: Saving platform NVS memory
+INFO: task kjournald:1438 blocked for more than 120 seconds.
[...]
}}}
--
Ticket URL: <http://dev.laptop.org/ticket/10802>
One Laptop Per Child <http://laptop.org/>
OLPC bug tracking system
More information about the Bugs
mailing list