#8301 BLOC 8.2.0: Fast suspend/resume cycle causes a libertas crash

Zarro Boogs per Child bugtracker at laptop.org
Fri Sep 5 19:38:01 EDT 2008


#8301: Fast suspend/resume cycle causes a libertas crash
------------------------+---------------------------------------------------
   Reporter:  cjb       |       Owner:  dsaxena                        
       Type:  defect    |      Status:  new                            
   Priority:  blocker   |   Milestone:  8.2.0 (was Update.2)           
  Component:  kernel    |     Version:  not specified                  
 Resolution:            |    Keywords:  blocks-:8.2.0 cjbfor8.2 relnote
Next_action:  diagnose  |    Verified:  0                              
  Blockedby:            |    Blocking:                                 
------------------------+---------------------------------------------------

Comment(by dcbw):

 Awesome.

 So the lbs_cmd_async() in lbs_stop_card() is probably coming from
 lbs_set_mcast_worker(), which is scheduled from lbs_eth_stop() which is
 called via netif_carrier_off().  Not a problem, but shouldn't be
 happening.  We should be checking priv->surpriseremoved before sending
 any new commands to the card from stop/remove paths.
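
 A minimal userspace sketch of the kind of guard meant above, not driver
 code.  struct fake_priv and fake_cmd_async() are hypothetical stand-ins
 invented for illustration; only the surpriseremoved flag comes from the
 driver.

```c
#include <stdio.h>

/* Hypothetical stand-in for the driver's private state; only the flag matters. */
struct fake_priv {
    int surpriseremoved;   /* set once the card is gone or being torn down */
    int cmds_sent;         /* counts commands that were actually queued */
};

/* Hypothetical command helper: refuses to queue anything after removal. */
static int fake_cmd_async(struct fake_priv *priv, const char *name)
{
    if (priv->surpriseremoved) {
        printf("dropping %s: card removed\n", name);
        return -1;
    }
    priv->cmds_sent++;
    printf("queued %s\n", name);
    return 0;
}
```

 With a check like this at the top of the command path, a multicast update
 scheduled during teardown would be dropped instead of reaching the card.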

 The return value of lbs_stop_card() can be ignored because nothing
 actually touches it, and in fact it now returns void on 2.6.27.

 I think the lbs_set_mac_control() and lbs_cmd_async() bits during
 lbs_remove_card() are the same thing since that code also does
 netif_carrier_off().

 We can ignore the lack of a 'leave' for lbs_remove_rtap(), which is also
 fixed in 2.6.27.

 The core problem looks like kthread_should_stop() isn't returning true in
 lbs_thread().  If it were, the loop would either break out before the
 priv->surpriseremoved check in lbs_thread(), or set shouldsleep to 0
 afterwards and break out on the next iteration.
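
 To make that reasoning concrete, here is a simplified userspace model of
 the loop's control flow, not the real lbs_thread().  All names are
 hypothetical, should_stop stands in for what kthread_should_stop() would
 report, and the "sleep while surpriseremoved is set" branch is an
 assumption about how the loop behaves.

```c
/* Possible outcomes of the modelled loop. */
enum loop_result { LOOP_EXITED, LOOP_ASLEEP, LOOP_STILL_RUNNING };

struct model_state {
    int should_stop;      /* stand-in for kthread_should_stop() */
    int surpriseremoved;  /* stand-in for priv->surpriseremoved */
    int iterations;       /* how many passes the loop made */
};

static enum loop_result model_thread(struct model_state *s, int max_iterations)
{
    s->iterations = 0;
    while (s->iterations < max_iterations) {
        int shouldsleep;

        s->iterations++;

        if (s->should_stop)
            shouldsleep = 0;          /* asked to die: don't sleep */
        else if (s->surpriseremoved)
            shouldsleep = 1;          /* assumed: wait until told to die */
        else
            shouldsleep = 0;          /* model: work is always pending */

        if (shouldsleep)
            return LOOP_ASLEEP;       /* model: nobody ever wakes us */

        if (s->should_stop)
            return LOOP_EXITED;       /* break out before the removed check */

        if (s->surpriseremoved)
            continue;                 /* skip all card work */

        /* normal command/event processing would go here */
    }
    return LOOP_STILL_RUNNING;
}
```

 In this model a thread with surpriseremoved set only exits once
 should_stop is also true; otherwise it stays asleep, which matches the
 suspicion that kthread_stop() is never being reached.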

 Can you put something like this in, rebuild and reproduce?  We need to see
 if it's getting blocked in one of the timer/queue cancellation calls in
 lbs_remove_card() before it's able to hit the kthread_stop() near the
 bottom.

 {{{
         lbs_remove_mesh(priv);
 +lbs_deb_main("%s: 1\n", __func__);
         lbs_remove_rtap(priv);

         dev = priv->dev;

 +lbs_deb_main("%s: 2\n", __func__);
         cancel_delayed_work_sync(&priv->scan_work);
 +lbs_deb_main("%s: 3\n", __func__);
         cancel_delayed_work_sync(&priv->assoc_work);
 +lbs_deb_main("%s: 4\n", __func__);
         cancel_work_sync(&priv->mcast_work);
 +lbs_deb_main("%s: 5\n", __func__);
         destroy_workqueue(priv->work_thread);
 +lbs_deb_main("%s: 6\n", __func__);

         if (priv->psmode == LBS802_11POWERMODEMAX_PSP) {
                 priv->psmode = LBS802_11POWERMODECAM;
 +lbs_deb_main("%s: 7\n", __func__);
                 lbs_ps_wakeup(priv, CMD_OPTION_WAITFORRSP);
         }

 +lbs_deb_main("%s: 8\n", __func__);
         memset(wrqu.ap_addr.sa_data, 0xaa, ETH_ALEN);
         wrqu.ap_addr.sa_family = ARPHRD_ETHER;
         wireless_send_event(priv->dev, SIOCGIWAP, &wrqu, NULL);

         /* Stop the thread servicing the interrupts */
 +lbs_deb_main("%s: 9\n", __func__);
         priv->surpriseremoved = 1;
         kthread_stop(priv->main_thread);
 +lbs_deb_main("%s: 10\n", __func__);

         lbs_free_adapter(priv);
 }}}

-- 
Ticket URL: <http://dev.laptop.org/ticket/8301#comment:25>
One Laptop Per Child <http://laptop.org/>
OLPC bug tracking system

