#10802 NORM Not Tri: suspend sometimes hangs with several blocked tasks

Zarro Boogs per Child bugtracker at laptop.org
Sat Apr 2 04:53:43 EDT 2011


#10802: suspend sometimes hangs with several blocked tasks
--------------------------+-------------------------------------------------
 Reporter:  sascha_silbe  |                 Owner:  buytenh                           
     Type:  defect        |                Status:  new                               
 Priority:  normal        |             Milestone:  Not Triaged                       
Component:  kernel        |               Version:  Development source as of this date
 Keywords:                |           Next_action:  diagnose                          
 Verified:  0             |   Deployment_affected:                                    
Blockedby:                |              Blocking:                                    
--------------------------+-------------------------------------------------
 With a kernel based on OLPC 2.6.35 (commit 2447ae65) suspend sometimes
 hangs (magic SysRq via serial console still works):

 {{{
 [ 2346.129716] Restarting tasks ... done.
 [ 2359.363513] PM: Syncing filesystems ... done.
 [ 2359.377963] Freezing user space processes ... (elapsed 0.01 seconds)
 done.
 [ 2359.403030] Freezing remaining freezable tasks ... (elapsed 0.01
 seconds) done.
 [ 2359.425706] dcon_source_switch to DCON
 [ 2359.465513] olpc-dcon: The DCON has control
 [ 2359.470673] HDA Intel 0000:00:14.0: PCI INT A disabled
 [ 2359.476266] ehci_hcd 0000:00:10.4: PCI INT D disabled
 [ 2359.481681] uhci_hcd 0000:00:10.2: PCI INT C disabled
 [ 2359.487075] uhci_hcd 0000:00:10.1: PCI INT B disabled
 [ 2359.492489] uhci_hcd 0000:00:10.0: PCI INT A disabled
 [ 2359.497920] viafb_suspend!
 [ 2359.501522] viafb 0000:00:01.0: PCI INT A disabled
 [ 2359.506666] mmc1:0001:1: suspend: PM flags = 0x3
 [ 2523.010056] INFO: task kjournald:1438 blocked for more than 120
 seconds.
 [ 2523.017190] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
 this message.
 [ 2523.025536] kjournald     D ea21ee60  2596  1438      2 0x00000000
 [ 2523.032169]  ea21ee80 00000046 0b300001 ea21ee60 ea21ee64 b0420fcf
 00000001 ea35d530
 [ 2523.040558]  a63754ab 00000226 ea35d530 ea35d530 00000000 ea21eecc
 ea21ee8c b088cdf0
 [ 2523.048942]  ea21eec4 ea21ee94 b04ce488 ea21eeb0 b088d2d7 b04ce455
 b19cc9b8 b04ce455
 [ 2523.057297] Call Trace:
 [ 2523.059912]  [<b0420fcf>] ? get_parent_ip+0xb/0x31
 [ 2523.065051]  [<b088cdf0>] io_schedule+0x21/0x31
 [ 2523.069874]  [<b04ce488>] sync_buffer+0x33/0x37
 [ 2523.074730]  [<b088d2d7>] __wait_on_bit+0x39/0x60
 [ 2523.079737]  [<b04ce455>] ? sync_buffer+0x0/0x37
 [ 2523.084683]  [<b04ce455>] ? sync_buffer+0x0/0x37
 [ 2523.089611]  [<b088d39d>] out_of_line_wait_on_bit+0x9f/0xa7
 [ 2523.095578]  [<b043de1c>] ? wake_bit_function+0x0/0x3c
 [ 2523.101062]  [<b04ce3e9>] __wait_on_buffer+0x1e/0x21
 [ 2523.106356]  [<b0540051>] journal_commit_transaction+0x9e8/0xf5b
 [ 2523.112772]  [<b088cda7>] ? schedule+0x448/0x470
 [ 2523.117688]  [<b0890ae4>] ? sub_preempt_count+0x9/0x8e
 [ 2523.123183]  [<b0420fcf>] ? get_parent_ip+0xb/0x31
 [ 2523.128286]  [<b05430c5>] kjournald+0xc1/0x1e1
 [ 2523.133053]  [<b043dde8>] ? autoremove_wake_function+0x0/0x34
 [ 2523.139155]  [<b0543004>] ? kjournald+0x0/0x1e1
 [ 2523.143992]  [<b043dab0>] kthread+0x63/0x68
 [ 2523.148442]  [<b043da4d>] ? kthread+0x0/0x68
 [ 2523.153003]  [<b0402cf6>] kernel_thread_helper+0x6/0x10
 [ 2523.158625] INFO: task rtcwake:895 blocked for more than 120 seconds.
 [ 2523.165502] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
 this message.
 [ 2523.173839] rtcwake       D e9c89e50  2744   895   2399 0x00000000
 [ 2523.180481]  e9c89e70 00000086 00000001 e9c89e50 b0890b5c 00000246
 e9c89e5c ea2ce0c0
 [ 2523.188853]  5e5aa79f 00000225 ea2ce0c0 58a5589b 00000225 e9c89e84
 e9c89ea4 b044344d
 [ 2523.197214]  00000000 0000156a b0a961b0 00000000 ea2ce0c0 b043dde8
 b0a961a8 b0a961a8
 [ 2523.205618] Call Trace:
 [ 2523.208241]  [<b0890b5c>] ? sub_preempt_count+0x81/0x8e
 [ 2523.213828]  [<b044344d>] async_synchronize_cookie_domain+0xaf/0x103
 [ 2523.220594]  [<b043dde8>] ? autoremove_wake_function+0x0/0x34
 [ 2523.226708]  [<b04434b3>] async_synchronize_cookie+0x12/0x14
 [ 2523.232751]  [<b04434cd>] async_synchronize_full+0x18/0x36
 [ 2523.238590]  [<b06adbbc>] dpm_suspend_start+0x344/0x36b
 [ 2523.244182]  [<b0452790>] suspend_devices_and_enter+0x4a/0x17f
 [ 2523.250392]  [<b0452988>] enter_state+0xc3/0x10a
 [ 2523.255311]  [<b0452269>] state_store+0x98/0xac
 [ 2523.260155]  [<b04521d1>] ? state_store+0x0/0xac
 [ 2523.265069]  [<b060a175>] kobj_attr_store+0x16/0x22
 [ 2523.270283]  [<b04f302b>] sysfs_write_file+0xc4/0xef
 [ 2523.275564]  [<b04b07f5>] vfs_write+0x8f/0x117
 [ 2523.280320]  [<b04f2f67>] ? sysfs_write_file+0x0/0xef
 [ 2523.285695]  [<b04b091e>] sys_write+0x40/0x65
 [ 2523.290360]  [<b040278c>] sysenter_do_call+0x12/0x22
 [ 2523.295638] INFO: task async/1:897 blocked for more than 120 seconds.
 [ 2523.302499] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
 this message.
 [ 2523.310839] async/1       D 00000225  3368   897      2 0x00000000
 [ 2523.317433]  ea35be88 00000046 5e5a98f8 00000225 ea7925b0 eaa7561c
 b0a93b70 eaa755f0
 [ 2523.325843]  5e5a77f1 00000225 eaa755f0 eb1209d4 7fffffff 7fffffff
 ea35bedc b088cf5c
 [ 2523.334184]  00000000 86ced74a 0000001c 00000000 eaa7561c 00000001
 ea7925e4 b0a93b70
 [ 2523.342596] Call Trace:
 [ 2523.345223]  [<b088cf5c>] schedule_timeout+0x1c/0x1f2
 [ 2523.350617]  [<b0420fcf>] ? get_parent_ip+0xb/0x31
 [ 2523.355714]  [<b0890b5c>] ? sub_preempt_count+0x81/0x8e
 [ 2523.361289]  [<b088c876>] wait_for_common+0xb9/0x101
 [ 2523.366570]  [<b0425f25>] ? default_wake_function+0x0/0x12
 [ 2523.372426]  [<b06acd4d>] ? dpm_wait_fn+0x0/0x14
 [ 2523.377344]  [<b088c95d>] wait_for_completion+0x17/0x19
 [ 2523.382934]  [<b06acd2f>] dpm_wait+0x27/0x29
 [ 2523.387487]  [<b06acd5d>] dpm_wait_fn+0x10/0x14
 [ 2523.392333]  [<b06a6f6d>] device_for_each_child+0x2d/0x57
 [ 2523.398089]  [<b06ad7ba>] __device_suspend+0x22/0xe0
 [ 2523.403393]  [<b06adbff>] async_suspend+0x1c/0x4a
 [ 2523.408404]  [<b044324f>] async_thread+0xf0/0x1d3
 [ 2523.413432]  [<b0425f25>] ? default_wake_function+0x0/0x12
 [ 2523.419274]  [<b044315f>] ? async_thread+0x0/0x1d3
 [ 2523.424388]  [<b043dab0>] kthread+0x63/0x68
 [ 2523.433414]  [<b0402cf6>] kernel_thread_helper+0x6/0x10
 }}}


 This is on an XO-1.5 running from an external, ext3 formatted SD card. To
 reproduce the bug, I ran powerd with a 5 second timeout and triggered a
 wake-up via IPv6 ping with an interval of 30s. It took over an hour to
 trigger; during regular use, it happens about once per day.

 Diff against log messages of a "good" suspend ("-" = good, "+" = hang):

 {{{
  Restarting tasks ... done.
  PM: Syncing filesystems ... done.
 -PM: Preparing system for mem sleep
 -Freezing user space processes ... (elapsed 0.02 seconds) done.
 +Freezing user space processes ... (elapsed 0.01 seconds) done.
  Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
 -PM: Entering mem sleep
  dcon_source_switch to DCON
  olpc-dcon: The DCON has control
  HDA Intel 0000:00:14.0: PCI INT A disabled
 @@ -14,9 +12,251 @@ uhci_hcd 0000:00:10.0: PCI INT A disabled
  viafb_suspend!
  viafb 0000:00:01.0: PCI INT A disabled
  mmc1:0001:1: suspend: PM flags = 0x3
 -sdhci_reset: slot wakeups enabled
 -sdhci_pci_probe: Enable PME set to 0x103
 -PM: suspend of devices complete after 101.093 msecs
 -PM: late suspend of devices complete after 78.596 msecs
 -ACPI: Preparing to enter system sleep state S3
 -PM: Saving platform NVS memory
 +INFO: task kjournald:1438 blocked for more than 120 seconds.
 [...]
 }}}

-- 
Ticket URL: <http://dev.laptop.org/ticket/10802>
One Laptop Per Child <http://laptop.org/>
OLPC bug tracking system


More information about the Bugs mailing list