mesh portal discovery
david at lang.hm
Thu Jan 10 03:23:28 EST 2008
On Wed, 9 Jan 2008, John Watlington wrote:
> On Jan 9, 2008, at 6:50 PM, Mikus Grinbergs wrote:
>>> Just switch off the Legacy IP, as we should have
>>> done months ago, and get on with making things work properly.
>>> Anything else is a distraction.
>> I sympathize with how overworked OLPC developers are. But a number
>> of G1G1 systems are getting into the hands of articulate net-aware
>> people. If they become disenchanted by the Legacy IP performance of
>> the OLPC, what they say might result in hurting the whole project.
> You misunderstood our local IPv6 evangelist, he wasn't proposing to
> IPv4 on the laptop, just not to support it on the school server
> mesh. Given that
> all mesh capable devices will support IPv6, he's probably got a point.
> Here is my take-home summary of this thread:
> Short term solution is to turn off IPv6 on the mesh, and tell kids
> that if their
> network performance degrades, they should "click on the circle again"
> which will trigger an IPv4 DHCP discovery of the nearest MPP.
> Long term solution is probably to move to IPv6 only, using a user space
> agent to decide which RAs to listen to. This user space agent can
> Javier's suggestion to avoid flapping between MPPs. Mobile IPv6 would
> be frosting on the cake, but doesn't help with the primary problem of
I'm trying to make sure I fully understand the problem
it sounds as if you have a good mechanism in the mesh for the laptops to
send packets to the nearest MPP
the problem is that if they get an IP address from an MPP that is a long
way away (either initially due to a problem or over time as the laptop
moves more hops away from the MPP) the fact that reply packets will always
go to the MPP that gave out the IP address (due to normal IP routing)
results in a slow reply as these packets start taking longer to get from
the MPP to the laptop.
is this correct so far?
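To put numbers on that asymmetry, here is a minimal sketch. The MPP names and hop counts are made up for illustration, and it assumes (as described above) that outbound packets always leave through the nearest MPP while replies come back through whichever MPP issued the address.

```python
# Hypothetical hop counts from each MPP to one laptop (made-up numbers).
hops_to_laptop = {"mpp_a": 1, "mpp_b": 4}

def outbound_mpp(hops):
    """The mesh sends outbound packets to the nearest MPP."""
    return min(hops, key=hops.get)

def inbound_mpp(address_owner):
    """Normal IP routing returns replies via the MPP that issued the address."""
    return address_owner

def round_trip_hops(hops, address_owner):
    """Wireless hops used by one request plus its reply."""
    return hops[outbound_mpp(hops)] + hops[inbound_mpp(address_owner)]

print(round_trip_hops(hops_to_laptop, "mpp_a"))  # address came from the near MPP
print(round_trip_hops(hops_to_laptop, "mpp_b"))  # address came from the far MPP
```

With these numbers the request always costs one hop, but the reply costs four hops instead of one when the address was handed out by the far MPP.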
this problem is further complicated by the IPv6 equivalent of DHCP making
it more likely that the initial 'registration' with an MPP is less than
optimal. and to top things off, since the replies are typically larger than
the requests (which is why people live with DSL that is only 512Kb outbound,
but is 1.5Mb inbound) the additional delays on the inbound leg are
I am making the assumption that the MPP machines know the wireless
topology from each of their points of view, i.e. not only do they know
how to get to the wireless nodes from themselves (and how many hops away
they are), but they also know this information for each of the other MPP
nodes. If this assumption is not currently true, a daemon would need to be
run to keep the MPP boxes in agreement over who is the best gateway to the
If I am on track so far let me see if I can divide the resulting problem
into three cases
1. the MPP boxes involved are 'owned' by separate entities and may not know
about each other over the wired network.
2. the MPP boxes involved are associated with each other, but may be two
or more network hops away from each other, but all managed as part of the
same set (with egress filters configured so that outbound traffic could
come from any of the MPP boxes)
3. the MPP boxes involved with the mesh network are tightly coupled (all
connected with a high-speed wired network on the same subnet (no routing
between them, all on the same broadcast domain))
addressing these one at a time.
for #1 I can't think of any reasonable way to move a machine from talking
to one MPP to another short of true mobile IP solutions.
for #2 the basic approach is the same as LVS uses in tunneling mode; see
http://www.linuxvirtualserver.org/VS-IPTunneling.html for a diagram and
This is basically what I was suggesting earlier: don't worry about the
outbound traffic, just bounce the inbound traffic to the closest node (via
a tunnel) before sending it over the air. this could be a matter of
using the existing LVS code and replacing the server selection logic with
something that is aware of the wireless topology.
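As a rough illustration of what a topology-aware selection step might look like, here is a sketch in Python. It is not LVS code: the `mesh_hops` table, the MPP names, and the addresses are all hypothetical, and it assumes every MPP has the same hop table available.

```python
# Hypothetical hop table: mesh_hops[mpp_name][laptop_ip] -> wireless hops.
mesh_hops = {
    "mpp_a": {"10.0.0.5": 3, "10.0.0.9": 1},
    "mpp_b": {"10.0.0.5": 1, "10.0.0.9": 2},
}

def closest_mpp(mesh_hops, dest_ip, default):
    """Pick the MPP with the fewest wireless hops to dest_ip.

    Falls back to `default` when no MPP knows a path. Ties go to the
    alphabetically first MPP so every box reaches the same answer.
    """
    candidates = {
        mpp: table[dest_ip]
        for mpp, table in mesh_hops.items()
        if dest_ip in table
    }
    if not candidates:
        return default
    return min(sorted(candidates), key=candidates.get)

print(closest_mpp(mesh_hops, "10.0.0.5", default="mpp_a"))   # mpp_b
print(closest_mpp(mesh_hops, "10.0.0.9", default="mpp_a"))   # mpp_a
print(closest_mpp(mesh_hops, "10.0.0.77", default="mpp_a"))  # mpp_a (unknown laptop)
```

The only input is the destination address of the inbound packet, which is why the outbound direction can be left alone.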
to avoid a routing loop where the packet gets bounced back and forth
between MPP boxes, you should be able to set things up so that the load
balancing is only done on packets coming in from the outside (I don't know
if iptables can do this stock, but it should be a simple, if ugly hack to
make packets arriving through a tunnel bypass the LVS code and get
inserted just past it in the IP stack)
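The loop-avoidance rule can be sketched as follows. This is a toy model in Python, not iptables or LVS configuration; `handle_inbound`, `deliver`, and the two disagreeing `views` are hypothetical names invented to show that a packet which arrived through a tunnel is never balanced a second time.

```python
def handle_inbound(self_name, dest_ip, via_tunnel, pick_mpp):
    """Decide what one MPP does with a packet headed for a laptop."""
    if via_tunnel:
        # Another MPP already made the choice; never re-balance.
        return ("send_over_air", self_name)
    target = pick_mpp(dest_ip)
    if target == self_name:
        return ("send_over_air", self_name)
    return ("tunnel_to", target)

# Worst case: the two boxes disagree, each believing the other is closer.
views = {
    "mpp_a": lambda ip: "mpp_b",
    "mpp_b": lambda ip: "mpp_a",
}

def deliver(first_mpp, dest_ip, max_steps=10):
    """Follow a packet from the wired side until it goes over the air."""
    path, current, via_tunnel = [], first_mpp, False
    for _ in range(max_steps):
        path.append(current)
        action, target = handle_inbound(
            current, dest_ip, via_tunnel, views[current]
        )
        if action == "send_over_air":
            return path
        current, via_tunnel = target, True
    raise RuntimeError("routing loop")

print(deliver("mpp_a", "10.0.0.5"))  # ['mpp_a', 'mpp_b']
```

Even with the two boxes disagreeing, the packet is tunneled at most once and then goes over the air from wherever it landed.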
the worst case with this model should be that some inbound packets get
relayed to the wrong MPP and make more hops than they need to over the
for #3 I am looking at other server load balancing options, specifically
the CLUSTERIP target available in iptables
what this does is define an IP address that exists on all machines and
uses a multicast MAC address; this forces the switch to send the packet to
every port of the switch. The systems then run a match on the incoming
packet to decide if they should deal with it or ignore it (in the existing
code a hash on sourceip, sourceip-sourceport, or
sourceip-sourceport-destport). if this instead did a lookup in the mesh
information to decide whether this was the closest node or not, each node
could go ahead and route the packet over to the wireless if it is the
closest, and drop it if not.
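To make the contrast concrete, here is a small Python sketch of the two per-node decisions. It only models the idea: `zlib.crc32` is a stand-in hash and not whatever the kernel module uses, and the node names, addresses, and shared `mesh_hops` table are hypothetical.

```python
import zlib

NODES = ["mpp_a", "mpp_b", "mpp_c"]

def hash_accepts(node_index, src_ip, total_nodes):
    """Hash-style decision: exactly one index matches any given source."""
    return zlib.crc32(src_ip.encode()) % total_nodes == node_index

# Hypothetical table shared by all nodes: mesh_hops[node][laptop_ip] -> hops.
mesh_hops = {
    "mpp_a": {"10.0.0.5": 3},
    "mpp_b": {"10.0.0.5": 1},
    "mpp_c": {"10.0.0.5": 2},
}

def mesh_accepts(self_name, dest_ip, mesh_hops):
    """Mesh-style decision: accept only if this node is closest to dest_ip."""
    known = {n: t[dest_ip] for n, t in mesh_hops.items() if dest_ip in t}
    if not known:
        return False  # nobody knows a path, so every node drops the packet
    return min(sorted(known), key=known.get) == self_name

# Every node sees every packet; each decides on its own whether to keep it.
by_hash = [n for i, n in enumerate(NODES)
           if hash_accepts(i, "192.0.2.10", len(NODES))]
by_mesh = [n for n in NODES if mesh_accepts(n, "10.0.0.5", mesh_hops)]
print(by_hash)  # exactly one node, chosen by the source address
print(by_mesh)  # ['mpp_b'], chosen by distance to the destination
```

The hash looks at the source of the packet, while the mesh lookup looks at the destination, which is the laptop on the wireless side.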
note that there is a race condition where one node may decide it's not the
closest before another decides that it is. If node moves are infrequent
this may not require further attention (TCP packets will get resent if
they get dropped), if they are too frequent the retries will cause too
many delays and the race would need to be narrowed or eliminated.
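A toy model of that drop window, assuming each node decides from its own (possibly stale) copy of the hop table. All names and hop counts are hypothetical.

```python
def accepts(self_name, view, dest_ip):
    """A node accepts only if its own view says it is the closest."""
    return min(sorted(view), key=lambda n: view[n][dest_ip]) == self_name

def acceptors(views, dest_ip):
    """List the nodes that would keep a packet, given each node's view."""
    return [n for n in sorted(views) if accepts(n, views[n], dest_ip)]

ip = "10.0.0.5"
before = {"mpp_a": {ip: 1}, "mpp_b": {ip: 3}}  # laptop near mpp_a
after = {"mpp_a": {ip: 3}, "mpp_b": {ip: 1}}   # laptop has moved near mpp_b

# mpp_a has already seen the move; mpp_b has not caught up yet.
print(acceptors({"mpp_a": after, "mpp_b": before}, ip))  # [] -> packet dropped

# Once both views agree, exactly one node keeps the packet again.
print(acceptors({"mpp_a": after, "mpp_b": after}, ip))   # ['mpp_b']
```

The empty list is the window in which the packet is lost and has to be retransmitted.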
the race condition where two nodes both decide that they are closest (one
getting added before the other removes the entry) is not nearly the same
problem as all it would result in is an extra copy of the packet being
sent over the air (which does eat up bandwidth, but should not cause
both #2 and #3 above only work really well if there is no NAT taking place
on the MPP boxes (NAT can take place between the MPP boxes and the
Internet, as long as the MPP boxes can pretend it's not there)
If NAT is running on the MPP box then when a node migrates from one to the
other the state of any connections would need to migrate as well. This is
the same problem that is faced by a HA pair of firewalls that don't want
to lose connections when they fail over, and there are tools to deal with
this, see http://people.netfilter.org/pablo/conntrack-tools/ This is not a
nice approach, and with the dependency on userspace to replicate the data
it is prone to gaps in coverage as data is migrated around, but if moves
are infrequent enough this could be acceptable.
in some ways #2 is the nicest as there is only one copy of the packet
around, but the need to set up the tunnels and the more extensive
configuration needed are drawbacks. #3 is simple, but is far more likely
to end up sending extra copies of packets over the air, and definitely
will impact switch performance (as it effectively turns your switch into a
both of them involve (relatively) simple changes to existing kernel code,
taking a chunk of code that's making a decision and replacing it with
code that looks at the mesh info instead.
I have successfully avoided using IPv6 to this point ;-) but I don't know
of any reason why these strategies shouldn't work with it just as well as
Unfortunately I don't know anything about the mesh code, so I can't begin
trying to code this myself. Since this only needs to look at the destination
IP address and look it up in the mesh table, I would take a stab at it if I
understood where to find the mesh data. My guess is that it would be
less work for someone who understands the mesh data to try this than it
would be to educate me on the mesh data, as you are running against a
deadline (and the fact that I have to fly from LA to Atlanta this weekend
won't help matters any)