powerd and wlanpacket [PATCH] revised patches

Paul Fox pgf at laptop.org
Sun Feb 26 10:46:23 EST 2012


hi jerry --

let me make sure i understand these.

jerry wrote:
 > 
 > Hey Paul:
 > I'm back, and I think I can explain this now once you patch powerd for
 > some better logging around set_wake_on_wlan, that would be patch 1. 

patch 1 has no functional effect, just logging.

 > 
 > Think it would be best to set_wake_on_wlan as early as possible. The
 > only time that set_wake_on_wlan was being run is just at wlan_associated
 > which I think is a little late. While tailing powerd.trace from tty2
 > with patch 1 you can see that wlan_associated is the only place that
 > it was getting set. The problem is just after that occurs you have DCON
 > taking over and I think the ethtool and DCON are in a race condition,
 > the userspace may become frozen too soon for ethtool to set the wlan
 > card correctly. 

first, to clarify:  "DCON" doesn't take over the system.  powerd uses
the DCON to blank the screen -- that's all it does.  it has no effect
whatsoever on wireless.  i don't believe there can be a race condition
there.

what _does_ happen, soon after we prepare for sleeping by setting the
wireless wakeup conditions with ethtool, is we call rtcwake.  that's
where the system suspends.

what makes you think there's a race condition between ethtool (which
i believe to be a synchronous operation), and suspending via rtcwake?
i'm not denying it might be happening, but i'm skeptical.  (i'm also
certainly willing to move the ethtool call earlier, just in case)
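
a minimal sketch of the ordering in question (not the actual powerd code):
the wakeup conditions are set with ethtool first, and rtcwake is only
started once ethtool has exited.  the run() wrapper, the DRY_RUN switch,
the interface name, the "wol u" flag and the sleep duration are all
assumptions made for illustration.

```shell
#!/bin/sh
# sketch only: names, flags and values are illustrative, not taken from powerd.

DRY_RUN=${DRY_RUN:-yes}     # hypothetical switch: print commands instead of running them

run()
{
    if [ "$DRY_RUN" = yes ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

suspend_with_wol()
{
    iface=$1
    secs=$2
    # the shell waits for ethtool to exit before starting the next command.
    # whether the driver has fully applied the setting by then is the open
    # question in this thread.
    run ethtool -s "$iface" wol u || return 1
    # only now do we suspend; rtcwake returns after the system resumes.
    run rtcwake -m mem -s "$secs"
}
```

with DRY_RUN=yes, "suspend_with_wol eth0 60" just prints the two commands
in the order they would run.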

 > 
 > Patch 2 does this coming out of suspend, 

so in patch 2, you call ethtool immediately after we return from
rtcwake, to be ready for the next suspend.  okay.


 > while patch 3 does the same thing
 > early in the initialization process.

i understand that to make sure the right conditions are set to start
with, we'd need to also set them when powerd starts.  but this isn't
the right place.  the dcon thaw has nothing to do with when wireless
should be ready.


 > Patch 4 covers when the network
 > may become unavailable and needs to be reset.

i'll take your word for it for now that this can be an issue.
 
 > Patch 5 is for improved
 > logging around wlanpacket to support my idea that the info should not be
 > acted upon. 

this is more than just a logging patch.  it removes the fake_useractive
event.  after we talked about this the other night on IRC, i've played
with this.  i'm worried that it will cause us to sleep too much, but
you have far more experience with this in real networks than i do.  it
will certainly let the screen dim and/or blank more often.
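
to make the behavior change concrete, here is a rough sketch of the two
ways a wakeup could be handled.  handle_wakeup and its "old"/"new" modes
are hypothetical and are not powerd functions; only the names wlanpacket,
rtcalarm and fake_useractive come from this thread, and the fallback case
is an assumption.

```shell
# sketch only: handle_wakeup and its "old"/"new" modes are hypothetical,
# not functions from powerd.
handle_wakeup()
{
    src=$1
    mode=$2
    case $src in
    wlanpacket)
        if [ "$mode" = old ]; then
            echo fake_useractive    # treated like a keypress: idle counters restart
        else
            echo ignore             # no event: dim/blank timers keep running
        fi
        ;;
    rtcalarm)
        echo next_stage             # timer expired: move on toward dim/blank/sleep
        ;;
    *)
        echo useractive             # assumption: anything else counts as user activity
        ;;
    esac
}
```

in "new" mode a wlanpacket wakeup produces no user-activity event, so
staying awake would depend entirely on the separate busy check.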

 > 
 > Patch 6 disables the call to set_wake_on_wlan at wlan_associated as this
 > wol has already been set, or to disable wol if you don't want blanked
 > wakeups.

this patch introduces a syntax error, so i doubt you've tested
it as-is.  i assume you changed the message wording after testing?

---------

i'll take a closer look at all these, but it seems your patches do
basically two things: 
    1) you've moved the setting of wol parameters as early as possible
	(patch 2), but then you've had to set them in other places, in
	case they get unset, or to undo what patch 2 did if it's no
	longer correct (patch 3, 4, 6).

	assuming you're right that there's a need to call ethtool to
	establish wake-on-lan conditions earlier, i believe we can
	find an "earlier" place that doesn't require all of these
	changes.  in 3.0 and later kernels, the wol settings are
	sticky, and don't get cleared after suspend/resume.  so i'd
	rather keep the special case(s) for older kernels as
	constrained as possible.

    2) you've changed the behavior after the system wakes due to
	an arriving packet:  we'll no longer fully wake up, as
	if the user has hit a key, and restart the idle counters, but
	instead will basically go right back to sleep.  if the packet
	was an "interesting" one, then we're relying on the IP tables
	filters to keep us from sleeping too soon.  i think this aspect
	needs to be strengthened.  i think i have a way to deal with that.
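
for point (1), one way the older-kernel special case could be kept
constrained is a single version test.  this is only a sketch:
wol_is_sticky is a hypothetical helper, not part of powerd, and it
assumes the kernel release string starts with a numeric major version.

```shell
# sketch only: wol_is_sticky is a hypothetical helper, not part of powerd.
# returns success when the kernel release is 3.0 or later.
wol_is_sticky()
{
    release=${1:-$(uname -r)}
    major=${release%%.*}          # "2.6.35.13" -> "2"
    case $major in
    ''|*[!0-9]*) return 1 ;;      # unparsable: assume the older behavior
    esac
    [ "$major" -ge 3 ]
}
```

a caller could then re-run ethtool around each suspend only when
wol_is_sticky fails, leaving newer kernels untouched.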


does the above sound about right?  if so, i'll float a patch for you
to try, probably tomorrow.

in the meantime:  can you say something about the actual effects of
these patches?  what specific differences in behavior have you
observed as a result of these changes?  is collaboration more
effective?  if so, how are you testing/measuring that?  as i've
understood from sridhar's posts, collaboration is the biggest issue
you're facing.  dimming the screen more aggressively will certainly
save power, but i thought networking was the bigger concern, for now.

paul

 > 
 > 
 > >  > >  > prepare_for_wakeupsource() was reading $WAKEUP_SOURCE which I believe is
 > >  > > 
 > >  > > i think you mean get_wakeupsource()?
 > >  > 
 > >  > Yes, sorry that is what I meant, easy to lose what you're looking at and
 > >  > trying to type after looking at tracing for too long. lol..
 > >  > 
 > >  > > 
 > >  > >  > feedback from the firmware. 
 > >  > 
 > >  > Since $WAKEUP_SOURCE exists on 2.6.35(XO-1.5), the wakeup type is
 > >  > recorded in a field in the kernel that tells the reason for the last
 > >  > wakeup. I don't think this field is reset until there is a new entry.
 > >  > There could be an entry several seconds old.
 > > 
 > > right.  it's not updated until the next wakeup.
 > > 
 > 
 > I think $wakeupsource should be discarded if it is wlanpacket, the WLAN
 > card already knows it is busy servicing the network. We just have to
 > trigger the needed rtcwake call.
 > 
 > >  > 
 > >  > >  > Noticed that the sleep was 6-8 seconds, just
 > >  > >  > a bit longer than BUSYCHECK. 
 > >  > 
 > >  > When BUSYCHECK is done its loop in cpu_or_network_busy() snooze() is
 > >  > called next where get_wakeupsource reads $WAKEUP_SOURCE. The entry
 > >  > present in $WAKEUP_SOURCE may in fact be from the last BUSYCHECK and
 > >  > actually older than the time spent in the just ended BUSYCHECK. 
 > > 
 > > get_wakeupsource is called after rtcwake returns -- i.e., after we've
 > > suspended.  so i don't see how it can ever be stale.  i.e., the flow is
 > >     sleep_action()
 > > 	laptop_busy
 > > 	    cpu_or_network_busy()
 > > 		BUSYCHECK loop
 > > 	snooze
 > > 	    set_wakeup_events
 > > 	    prepare_for_wakeupsource
 > > 	    ...
 > > 	    rtcwake
 > > 	    ...
 > > 	    get_wakeupsource
 > > 
 > > 
 > > perhaps there's still some dyslexia regarding the pair of
 > > prepare_for_wakeupsource() / get_wakeupsource() ?   the former
 > > is a null routine on xo-1 and 1.5.
 > > 
 > 
 > I think $wakeupsource should be discarded if it is wlanpacket
 > 
 > >  > 
 > >  > >  > That is why when you disable WAKE_ON_WLAN
 > >  > >  > the XOs will go to sleep, until-sleep_type does toggle to the next stage
 > >  > >  > as until_blank-soft after rtcalarm appears in $wakeupsource. 
 > >  > 
 > >  > Now if $wake_on_wlan is set, "wlanpacket" gets a fake_useractive, gets
 > >  > logged, you never get to "rtcalarm" and the XO doesn't dim or blank.
 > >  > Tried testing the XO where it's the only one on the AP to rule out a
 > >  > chatty network as the source of the wlanpacket with no change in the
 > >  > above noted 6-8 second wlanpacket break in snooze.    
 > > 
 > > so i gather you're seeing what you believe to be spurious wlanpacket
 > > wakeups.
 > > 
 > 
 > No, not really, the wakeups are happening, we just shouldn't trigger a
 > reset_idlecounters at this point. That's what will result if we break
 > from that loop.
 > 
 > >  > 
 > >  > >  > I believe
 > >  > >  > the message needs to be ignored at this point, and to just use
 > >  > >  > cpu_or_network_busy() to determine if the laptop blanks or not.
 > >  > > 
 > >  > 
 > >  > If there is net traffic present, $monitor_network_activity should be
 > >  > able to prevent until-sleep_type from progressing to the next stage. 
 > > 
 > > you may well be right about this.  but i'm still not sure how reducing
 > > the time we spend awake will improve the laptop's performance on the
 > > network.
 > > 
 > 
 > Actually we want to progress to the next stage, to allow the screen to
 > dim or blank. 
 > 
 > >  > 
 > >  > > sorry.  i've tried a couple of times to follow you through that
 > >  > > paragraph, and i keep losing you.  disabling WAKE_ON_WLAN will keep
 > >  > > the laptop from waking up on wlan traffic.  
 > >  > 
 > >  > Not really, `ethtool eth0` reports wol as 'd' anyway. (that is another
 > >  > issue), but $monitor_network_activity does catch active connections or
 > >  > pings.
 > > 
 > > on 2.6.35 kernels, the ethtool wol wakeup conditions are lost after
 > > every resume, so we're careful to reestablish those wakeups (i.e.,
 > > we call ethtool again) on every suspend.  this seems to be fixed in
 > > 3.0 and later kernels. 
 > > 
 > 
 > Patch 2 should take care of that.
 > 
 > >  > 
 > >  > > i'm pretty sure that if
 > >  > > you do that, you'll never get the 'wlanpacket' event that your patch
 > >  > > affects.  so i'm confused.
 > >  > > 
 > >  > 
 > >  > Well if wlanpacket wakes the XO needlessly, is there really a point to
 > >  > trusting its feedback? 
 > >  > 
 > >  > > perhaps i don't understand the goal.  i thought the problem you were having
 > >  > > with your deployments was that idle suspend was interfering with collaboration.
 > >  > > what's the problem we're trying to solve here?
 > >  > > 
 > >  > 
 > >  > DCON likes to kick in rendering the WLAN card unavailable, when it comes
 > >  > back online every 11 seconds or so NetworkManager is forced to renew its
 > >  > dhcp lease again. This is another issue. Think that network traffic is
 > >  > unneeded, and may lead to excessive network loading of the dhcp server. 
 > > 
 > > when you say "DCON likes to kick in rendering the WLAN card
 > > unavailable", are you saying that the wlan becomes unavailable when
 > > the laptop goes to sleep?  does the network light go out?  do you
 > > need to reassociate on every wakeup?
 > > 
 > 
 > Think this was related to the late setting of wol.
 > 
 > > this sounds like a networkmanager issue.  if you're reassociating and
 > > renewing your lease after every resume, i can well imagine you're having
 > > connectivity problems.
 > > 
 > > paul
 > > 
 > >  > 
 > >  > > (and, for completeness, on which laptop?  i think you're working with
 > >  > > 1.5 machines is that right?)
 > >  > > 
 > >  > 
 > >  > I'm working on XO-1.5s currently, but I have XO-1 and XO-1.75 to test
 > >  > with also. Hope I've been clearer in my explanation this time.
 > >  > 
 > >  > Jerry
 > >  > 
 > >  > 
 > > 
 > > =---------------------
 > >  paul fox, pgf at laptop.org
 > 
 > FWIW I'm running TIME_SLEEP="15" TIME_DIM="60" TIME_BLANK="60"
 > WAKE_ON_WLAN="yes" WLAN_WAKE_FROM_BLANK_IDLE_SLEEP="yes"
 > 
 > Jerry
 > 
 > 
 > 

=---------------------
 paul fox, pgf at laptop.org

