#6528 NORM Never A: Packets that wake the laptop from suspend are often lost

Zarro Boogs per Child bugtracker at laptop.org
Wed Feb 20 08:50:16 EST 2008


#6528: Packets that wake the laptop from suspend are often lost
----------------------+-----------------------------------------------------
 Reporter:  gnu       |       Owner:  dilinger                         
     Type:  defect    |      Status:  new                              
 Priority:  normal    |   Milestone:  Never Assigned                   
Component:  kernel    |     Version:  Development build as of this date
 Keywords:  libertas  |    Verified:  0                                
 Blocking:            |   Blockedby:                                   
----------------------+-----------------------------------------------------
 I found this problem while trying to reproduce #4616 on modern hardware
 and software.

     Setup:

     * Two XOs, both MP G1G1 machines. One is running build 656; the
 other is running update.1-691 (the "target" machine).

     * In the NetworkManager screen, put both laptops on my local access
 point (TrendNET TEW432-BRP).  Wait a few minutes for things to settle
 down.  Go to the donut screen and make sure both of them say they're on
 the access point.

     * Start a terminal on each laptop. Become root.

     * On the update.1-691 machine, do "ethtool -s eth0 wol um".  This
 enables wakeups on multicast packets.

     * "ping6 -I eth0 ff02::1" on the other machine.

     * This will ping the all-nodes multicast address. The laptop that
 sends this should get back a unicast IPv6 ping response from each node on
 the network. Keep moving the mouse on the update.1-691 laptop to avoid
 suspending.

     * Each laptop can see itself when it pings (btw, ping6 prints its own
 address on its first line of output): it prints a very low-latency
 response (e.g. 0.154 ms) from its own kernel.  It should print one or
 more "(DUP!)" responses from the other laptop as well.

     * Now let the update.1-691 laptop automatically suspend itself.
 Verify that the power LED is off with an occasional blink.

     * Rerun "ping6 -I eth0 ff02::1" from the other laptop.  Examine the
 output from ping6.  The "icmp_seq=1" packet never produces any responses
 from the update.1-691 laptop.  However, icmp_seq=2 gets one or more
 responses from the update.1-691 laptop.

 The bug is that the icmp_seq=1 packet was not responded to (or that the
 response packet got swallowed somewhere and lost).
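
 To make the multicast symptom easy to spot, the saved ping6 output can be
 checked mechanically.  The following is only a minimal sketch, assuming
 the usual "64 bytes from ADDR: icmp_seq=N ..." reply lines.  The addresses
 fe80::1 and fe80::2 are hypothetical placeholders, and the sample text is
 illustrative, not a captured log (only the 0.154 ms figure comes from the
 report above).

```python
import re
from collections import defaultdict

# Matches reply lines such as
#   64 bytes from fe80::2: icmp_seq=2 ttl=64 time=4.20 ms (DUP!)
REPLY = re.compile(r"from (\S+): icmp_seq=(\d+)")


def repliers_by_seq(ping_output):
    """Map each icmp_seq to the set of source addresses that answered it."""
    seen = defaultdict(set)
    for line in ping_output.splitlines():
        match = REPLY.search(line)
        if match:
            seen[int(match.group(2))].add(match.group(1))
    return dict(seen)


def unanswered_by(ping_output, target):
    """Return the sequence numbers that got replies, but none from target."""
    by_seq = repliers_by_seq(ping_output)
    return sorted(seq for seq, addrs in by_seq.items() if target not in addrs)


# Hypothetical addresses: SELF is the pinging laptop, TARGET the suspended one.
SELF = "fe80::1"
TARGET = "fe80::2"

# Illustrative output shaped like the symptom described in this ticket.
SAMPLE = """\
64 bytes from fe80::1: icmp_seq=1 ttl=64 time=0.154 ms
64 bytes from fe80::1: icmp_seq=2 ttl=64 time=0.160 ms
64 bytes from fe80::2: icmp_seq=2 ttl=64 time=4.20 ms (DUP!)
"""

print(unanswered_by(SAMPLE, TARGET))  # [1]
```

 A sequence number is only reported if at least one node (normally the
 pinging laptop itself) answered it, which is always the case in the
 multicast test above.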

 This can also be reproduced using unicast packets, producing a slightly
 different effect.  Set up on the access point as above.  You can skip the
 ethtool command.

     *  On the update.1-691 machine, run "/sbin/ifconfig".  Note the IPv4
 address assigned to eth0 (192.168.1.105 in my case).  Do not allow the
 laptop to suspend before you run the next command.

     *  On the other laptop, run "ping 192.168.1.105".  This will do an ARP
 exchange between the two laptops (telling the other laptop your MAC-level
 address) and preload its ARP cache.  It should also produce output
 showing that each sequence-numbered packet was received and echoed by
 the update.1-691 laptop.  This is normal.  Stop the ping command.

     * Now let the update.1-691 laptop suspend.  Soon thereafter, rerun the
 ping command on the other laptop.  What I saw was that icmp_seq=1 was
 echoed (in 1013 ms, i.e. a second), then icmp_seq=2 was lost, then
 icmp_seq=3 and all subsequent packets were echoed with timings of 3 to 5
 ms.

     * When I tried it again (letting the target suspend, and re-pinging),
 the first two packets were both dropped; the first response printed was
 for icmp_seq=3.
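
 The unicast pattern (a slow first reply, then a gap) can be pulled out of
 the ping output in the same way.  Again, this is just a sketch: the
 100 ms "slow" threshold is an arbitrary assumption, and the sample text
 is illustrative, built from the timings quoted above (1013 ms for
 icmp_seq=1, then 3 to 5 ms), not a captured log.

```python
import re

# Matches reply lines such as
#   64 bytes from 192.168.1.105: icmp_seq=1 ttl=64 time=1013 ms
REPLY = re.compile(r"icmp_seq=(\d+) .*?time=([\d.]+) ms")


def summarize(ping_output, slow_ms=100.0):
    """Return (lost, slow): sequence numbers with no reply at all, and
    sequence numbers whose first reply took at least slow_ms milliseconds.

    Only gaps before the last reply seen are counted as lost; packets
    dropped after the final reply cannot be detected from reply lines.
    """
    times = {}
    for line in ping_output.splitlines():
        match = REPLY.search(line)
        if match:
            # Keep the first reply for each sequence number.
            times.setdefault(int(match.group(1)), float(match.group(2)))
    if not times:
        return [], []
    lost = [seq for seq in range(1, max(times) + 1) if seq not in times]
    slow = sorted(seq for seq, ms in times.items() if ms >= slow_ms)
    return lost, slow


# Illustrative output; the individual 3 to 5 ms values are made up.
SAMPLE = """\
64 bytes from 192.168.1.105: icmp_seq=1 ttl=64 time=1013 ms
64 bytes from 192.168.1.105: icmp_seq=3 ttl=64 time=4.12 ms
64 bytes from 192.168.1.105: icmp_seq=4 ttl=64 time=3.40 ms
64 bytes from 192.168.1.105: icmp_seq=5 ttl=64 time=3.95 ms
"""

lost, slow = summarize(SAMPLE)
print("lost:", lost, "slow:", slow)  # lost: [2] slow: [1]
```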

 Proposed fix:  Every packet received by the Libertas during suspend should
 be buffered and handed to the kernel when possible.  No packets should
 ordinarily be dropped because the laptop is in suspend.

 This could be a kernel problem, or could be a Libertas firmware problem.
 I'm initially reporting it as a kernel problem (partly because there's no
 category for mesh or Libertas bugs).  It should be possible to produce a
 smoking gun in the kernel logs that shows whether the chip ever gives the
 kernel the missing packets.

 By the way, if you do this test when configured to use the Mesh rather
 than an access point, and you use multicast, you encounter a worse bug,
 #6527.  On the other hand, if you use unicast ping on the mesh, it ends up
 looking about the same (icmp_seq=1 is responded to in 1016ms; =2 dropped,
 =3 and onward in 3 to 4 ms).

-- 
Ticket URL: <http://dev.laptop.org/ticket/6528>
One Laptop Per Child <http://dev.laptop.org>
OLPC bug tracking system


