#6528 NORM Never A: Packets that wake the laptop from suspend are often lost
Zarro Boogs per Child
bugtracker at laptop.org
Wed Feb 20 08:50:16 EST 2008
#6528: Packets that wake the laptop from suspend are often lost
----------------------+-----------------------------------------------------
Reporter:  gnu        | Owner:     dilinger
Type:      defect     | Status:    new
Priority:  normal     | Milestone: Never Assigned
Component: kernel     | Version:   Development build as of this date
Keywords:  libertas   | Verified:  0
Blocking:             | Blockedby:
----------------------+-----------------------------------------------------
I found this problem while trying to reproduce #4616 on modern hardware
and software.
Setup:
* Two XOs, MP G1G1s. One is running build 656, the other update.1-691
(the "target" machine).
* In the NetworkManager screen, put both laptops on my local access point
(TrendNET TEW432-BRP). Wait a few minutes for things to settle down. Go
to the donut screen and make sure both of them say they're on the access
point.
* Start a terminal on each laptop. Become root.
* On the update.1-691 machine, do "ethtool -s eth0 wol um". This
enables wakeups on unicast (u) and multicast (m) packets.
* Run "ping6 -I eth0 ff02::1" on the other machine.
* This will ping the all-nodes multicast address. The laptop that
sends this should get back a unicast IPv6 ping response from each node on
the network. Keep moving the mouse on the update.1-691 laptop to avoid
suspending.
* Each laptop can see itself (btw, ping6 prints its own address on its
first line of output): ping6 prints a very low latency response packet
(e.g. 0.154 ms) from its own kernel. It should also print one or more
"(DUP!)" packets from the other laptop.
* Now let the update.1-691 laptop automatically suspend itself.
Verify that the power LED is off with an occasional blink.
* Rerun "ping6 -I eth0 ff02::1" from the other laptop. Examine the
output from ping6. The "icmp_seq=1" packet never produces any responses
from the update.1-691 laptop. However, icmp_seq=2 has one or more
responses received from the update.1-691 laptop.
The bug is that the icmp_seq=1 packet was not responded to (or that the
response packet got swallowed somewhere and lost).
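The check in the last step can be automated. The following is a minimal
sketch, not part of the original test: it assumes the iputils-style reply
line format ("64 bytes from ADDR: icmp_seq=N ..."). The function names and
the sample addresses (fe80::1 for the sender, fe80::2 for the target) are
hypothetical, and the sample output is made up for illustration.

```python
import re
from collections import defaultdict

# Assumed iputils-style reply line, e.g.
# "64 bytes from fe80::2: icmp_seq=2 ttl=64 time=4.21 ms (DUP!)"
REPLY_RE = re.compile(r"bytes from (\S+): icmp_seq=(\d+)")


def responders_by_seq(ping_output):
    """Map each icmp_seq number to the set of addresses that answered it."""
    seen = defaultdict(set)
    for line in ping_output.splitlines():
        match = REPLY_RE.search(line)
        if match:
            seen[int(match.group(2))].add(match.group(1))
    return dict(seen)


def seqs_missing(ping_output, target_addr, last_seq):
    """Return the sequence numbers 1..last_seq that target_addr never answered."""
    seen = responders_by_seq(ping_output)
    return [seq for seq in range(1, last_seq + 1)
            if target_addr not in seen.get(seq, set())]


# Illustrative (made-up) output: the sender answers itself on every
# sequence number, the suspended target only from icmp_seq=2 onward.
SAMPLE = """\
64 bytes from fe80::1: icmp_seq=1 ttl=64 time=0.154 ms
64 bytes from fe80::1: icmp_seq=2 ttl=64 time=0.150 ms
64 bytes from fe80::2: icmp_seq=2 ttl=64 time=4.21 ms (DUP!)
"""

if __name__ == "__main__":
    print(seqs_missing(SAMPLE, "fe80::2", 2))
```

Feeding it captured ping6 output and the target's link-local address gives
the list of sequence numbers the target failed to answer.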
This can also be reproduced using unicast packets, producing a slightly
different effect. Set up on the access point as above. You can skip the
ethtool command.
* On the update.1-691 machine, run "/sbin/ifconfig". Note the IPv4
address assigned to eth0 (192.168.1.105 in my case). Do not allow the
laptop to suspend before you run the next command.
* On the other laptop, run "ping 192.168.1.105". This will do an ARP
exchange between the two laptops (telling the other laptop your MAC-level
address) and preload its ARP cache. It should also produce output
showing that each sequence-numbered packet was received and echoed by
the update.1-691 laptop. This is normal. Stop the ping command.
* Now let the update.1-691 laptop suspend. Soon thereafter, rerun the
ping command on the other laptop. What I saw was that icmp_seq=1 was
echoed (in 1013 ms, i.e. a second), then icmp_seq=2 was lost, then
icmp_seq=3 and all subsequent packets were echoed with timings of 3 to 5
ms.
* When I tried it again (letting the target suspend, and re-pinging),
the first two packets were both dropped; the first response printed was
for icmp_seq=3.
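The unicast symptoms (a lost packet plus a reply delayed by about a second)
can be picked out of saved ping output mechanically. This is only a sketch
under assumptions: it expects iputils-style reply lines with a "time=... ms"
field, the function name and the 100 ms threshold for "slow" are
hypothetical, and the sample times for icmp_seq=3 and 4 are made up within
the 3 to 5 ms range reported above.

```python
import re

# Assumed iputils-style reply line, e.g.
# "64 bytes from 192.168.1.105: icmp_seq=1 ttl=64 time=1013 ms"
TIME_RE = re.compile(r"icmp_seq=(\d+) .*?time=([\d.]+) ms")


def summarize_unicast(ping_output, last_seq, slow_ms=100.0):
    """Return (lost, slow): unanswered sequence numbers in 1..last_seq,
    and a dict of seq -> round-trip time for replies slower than slow_ms."""
    times = {}
    for line in ping_output.splitlines():
        match = TIME_RE.search(line)
        if match:
            # Keep the first reply seen for each sequence number.
            times.setdefault(int(match.group(1)), float(match.group(2)))
    lost = [seq for seq in range(1, last_seq + 1) if seq not in times]
    slow = {seq: rtt for seq, rtt in times.items() if rtt > slow_ms}
    return lost, slow


# Illustrative output shaped like the first unicast run described above.
SAMPLE = """\
64 bytes from 192.168.1.105: icmp_seq=1 ttl=64 time=1013 ms
64 bytes from 192.168.1.105: icmp_seq=3 ttl=64 time=4.02 ms
64 bytes from 192.168.1.105: icmp_seq=4 ttl=64 time=3.51 ms
"""

if __name__ == "__main__":
    lost, slow = summarize_unicast(SAMPLE, last_seq=4)
    print("lost:", lost)
    print("slow:", slow)
```

On the sample this reports icmp_seq=2 as lost and icmp_seq=1 as slow, which
matches the pattern seen in the first run.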
Proposed fix: Every packet received by the Libertas during suspend should
be buffered and handed to the kernel when possible. No packets should
ordinarily be dropped because the laptop is in suspend.
This could be a kernel problem, or could be a Libertas firmware problem.
I'm initially reporting it as a kernel problem (partly because there's no
category for mesh or Libertas bugs). It should be possible to produce a
smoking gun in the kernel logs that shows whether the chip ever gives the
kernel the missing packets.
By the way, if you do this test when configured to use the Mesh rather
than an access point, and you use multicast, you encounter a worse bug,
#6527. On the other hand, if you use unicast ping on the mesh, it ends up
looking about the same (icmp_seq=1 is responded to in 1016ms; =2 dropped,
=3 and onward in 3 to 4 ms).
--
Ticket URL: <http://dev.laptop.org/ticket/6528>
One Laptop Per Child <http://dev.laptop.org>
OLPC bug tracking system