mesh portal discovery

david at lang.hm
Thu Jan 10 03:23:28 EST 2008


On Wed, 9 Jan 2008, John Watlington wrote:

> On Jan 9, 2008, at 6:50 PM, Mikus Grinbergs wrote:
>
>>> Just switch off the Legacy IP, as we should have
>>> done months ago, and get on with making things work properly.
>>> Anything else is a distraction.
>>
>> I sympathize with how overworked OLPC developers are.  But a number
>> of G1G1 systems are getting into the hands of articulate net-aware
>> people.  If they become disenchanted by the Legacy IP performance of
>> the OLPC, what they say might result in hurting the whole project.
>
> You misunderstood our local IPv6 evangelist, he wasn't proposing to
> disable
> IPv4 on the laptop, just not to support it on the school server
> mesh.  Given that
> all mesh capable devices will support IPv6, he's probably got a point.
>
>
> Here is my take-home summary of this thread:
>
> Short term solution is to turn off IPv6 on the mesh, and tell kids
> that if their
> network performance degrades, they should "click on the circle again"
> which will trigger an IPv4 DHCP discovery of the nearest MPP.
>
> Long term solution is probably to move to IPv6 only, using a user space
> agent to decide which RAs to listen to.  This user space agent can
> implement
> Javier's suggestion to avoid flapping between MPPs.   Mobile IPv6 would
> be frosting on the cake, but doesn't help with the primary problem of
> MPP
> selection.

I'm trying to make sure I fully understand the problem.

It sounds as if you have a good mechanism in the mesh for the laptops to 
send packets to the nearest MPP.

The problem is that if a laptop gets an IP address from an MPP that is a 
long way away (either initially, due to a problem, or over time as the 
laptop moves more hops away from the MPP), reply packets will always go 
back to the MPP that gave out the IP address (due to normal IP routing), 
so replies slow down as those packets take more hops to get from the MPP 
to the laptop.

is this correct so far?

This problem is further complicated by the IPv6 equivalent of DHCP making 
it more likely that the initial 'registration' with an MPP is suboptimal.

And to top things off, since replies are typically larger than requests 
(which is why people live with DSL that is only 512Kb/s outbound but 
1.5Mb/s inbound), the additional delays on the inbound leg hurt 
significantly more.


I am making the assumption that the MPP machines know the wireless 
topology from each of their points of view, i.e. not only do they know 
how to get to the wireless nodes from themselves (and how many hops away 
they are), but they also know this information for each of the other MPP 
nodes. If this assumption is not true currently, a daemon would need to be 
run to keep the MPP boxes in agreement over which of them is the best 
gateway to each laptop.



If I am on track so far, let me see if I can divide the resulting problem 
into three cases:


1. the MPP boxes involved are 'owned' by separate entities and may not 
know about each other over the wired network.

2. the MPP boxes involved are associated with each other and all managed 
as part of the same set, but may be two or more network hops away from 
each other (with egress filters configured so that outbound traffic could 
come from any of the MPP boxes).

3. the MPP boxes involved with the mesh network are tightly coupled: all 
connected with a high-speed wired network on the same subnet (no routing 
between them, all in the same broadcast domain).


Addressing these one at a time:

For #1, I can't think of any reasonable way to move a machine from talking 
to one MPP to another short of a true Mobile IP solution.

For #2, the basic approach is the same one LVS uses in tunneling mode; see 
http://www.linuxvirtualserver.org/VS-IPTunneling.html for a diagram and 
explanation.

This is basically what I was suggesting earlier: don't worry about the 
outbound traffic, just bounce the inbound traffic to the closest node (via 
a tunnel) before sending it over the air. This could be a matter of 
using the existing LVS code and replacing the server selection logic with 
something that is aware of the wireless topology.
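To make the idea concrete, here is a minimal Python sketch of that replacement selection logic, assuming the MPPs share a table mapping each laptop's IP to its hop count from every MPP. All the names and the table format here are hypothetical illustrations, not actual LVS or mesh code:

```python
# Hypothetical sketch of a topology-aware "scheduler" for inbound packets.
# Assumes the MPPs agree on a shared table: {laptop_ip: {mpp_name: hops}}.

def pick_closest_mpp(dest_ip, hop_table):
    """Return the name of the MPP that is the fewest wireless hops
    from dest_ip, or None if the laptop is unknown."""
    hops_by_mpp = hop_table.get(dest_ip)
    if not hops_by_mpp:
        return None                       # unknown laptop: no redirect
    return min(hops_by_mpp, key=hops_by_mpp.get)

def route_inbound(dest_ip, local_mpp, hop_table):
    """Decide whether to deliver a packet over the air from this MPP
    or tunnel it to a closer one (as LVS tunneling mode would)."""
    best = pick_closest_mpp(dest_ip, hop_table)
    if best is None or best == local_mpp:
        return ("deliver", local_mpp)     # send over the air from here
    return ("tunnel", best)               # IPIP-tunnel to the closer MPP

# example: laptop has drifted three hops from mpp-a but is one hop from mpp-b
table = {"10.0.0.7": {"mpp-a": 3, "mpp-b": 1}}
```

The real version would live in the kernel's server-selection path; the point is only that the whole decision reduces to a minimum-hops lookup.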

To avoid a routing loop where the packet gets bounced back and forth 
between MPP boxes, you should be able to set things up so that the load 
balancing is only done on packets coming in from the outside (I don't know 
if stock iptables can do this, but it should be a simple, if ugly, hack to 
make packets arriving through a tunnel bypass the LVS code and get 
inserted just past it in the IP stack).

The worst case with this model should be that some inbound packets get 
relayed to the wrong MPP and make more hops than they need to over the 
air.

For #3 I am looking at other server load balancing options, specifically 
the CLUSTERIP target available in iptables: 
http://flaviostechnotalk.com/wordpress/index.php/2005/06/12/loadbalancer-less-clusters-on-linux

What this does is define an IP address that exists on all machines and 
uses a multicast MAC address, which forces the switch to send the packet 
to every port. The systems then run a match on each incoming packet to 
decide whether to deal with it or ignore it (in the existing code, a hash 
on source IP; source IP and source port; or source IP, source port, and 
destination port). If this instead did a lookup in the mesh information to 
decide whether this node is the closest one, the node could route the 
packet over to the wireless if so, and drop it if not.
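As a rough Python sketch of what that replacement match might look like, assuming every node consults the same shared hop table (the table format and names are my assumptions, not real CLUSTERIP code):

```python
# Hypothetical sketch: replace CLUSTERIP's hash with a mesh-table lookup.
# Every MPP sees every inbound packet (multicast MAC); each node runs this
# check, and only the closest node forwards the packet over the air.

def should_accept(dest_ip, my_name, hop_table):
    """Accept the packet only if this node is the closest MPP to dest_ip.

    Ties on hop count are broken by node name, so that (given identical
    table copies) exactly one node accepts each packet."""
    hops = hop_table.get(dest_ip)
    if hops is None:
        return False                          # unknown laptop: drop
    best = min(hops, key=lambda mpp: (hops[mpp], mpp))
    return best == my_name
```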

Note that there is a race condition where one node may decide it's not the 
closest before another decides that it is. If node moves are infrequent 
this may not require further attention (TCP packets will get resent if 
they are dropped); if they are too frequent, the retries will cause too 
many delays and the race would need to be narrowed or eliminated.

The opposite race, where two nodes both decide that they are closest (one 
adding the entry before the other removes it), is not nearly as much of a 
problem, as all it would result in is an extra copy of the packet being 
sent over the air (which does eat up bandwidth, but should not cause 
other problems).
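A small sketch of the two race windows, assuming each node works from its own (possibly stale) copy of the hop table during a laptop's move from one MPP to another (all names hypothetical):

```python
# Hypothetical illustration of the two race windows when MPP nodes hold
# stale copies of the hop table while a laptop moves from mpp-a to mpp-b.

def accepters(dest_ip, tables_by_node):
    """Which nodes accept the packet, given each node's own table copy."""
    out = []
    for name, table in tables_by_node.items():
        hops = table.get(dest_ip, {})
        if hops and min(hops, key=lambda m: (hops[m], m)) == name:
            out.append(name)
    return sorted(out)

old = {"10.0.0.7": {"mpp-a": 1, "mpp-b": 4}}   # table before the move
new = {"10.0.0.7": {"mpp-a": 4, "mpp-b": 1}}   # table after the move

# window 1: mpp-a has updated but mpp-b has not -> nobody accepts (drop,
#           recovered by TCP retransmit)
# window 2: mpp-b has updated but mpp-a has not -> both accept (duplicate
#           copy over the air, wasted bandwidth but otherwise harmless)
```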


Both #2 and #3 above only work really well if there is no NAT taking place 
on the MPP boxes (NAT can take place between the MPP boxes and the 
Internet, as long as the MPP boxes can pretend it's not there).

If NAT is running on the MPP box, then when a node migrates from one box 
to the other, the state of any connections would need to migrate as well. 
This is the same problem faced by an HA pair of firewalls that don't want 
to lose connections when they fail over, and there are tools to deal with 
this; see http://people.netfilter.org/pablo/conntrack-tools/ . This is not 
a nice approach, and with the dependency on userspace to replicate the 
data it is prone to gaps in coverage as data is migrated around, but if 
moves are infrequent enough this could be acceptable.


In some ways #2 is the nicest, as there is only one copy of the packet 
around, but the need to set up the tunnels and the more extensive 
configuration needed are drawbacks. #3 is simpler, but is far more likely 
to end up sending extra copies of packets over the air, and will 
definitely impact switch performance (as it effectively turns your switch 
into a hub).

Both of them involve (relatively) simple changes to existing kernel code: 
taking a chunk of code that's making a decision and replacing it with 
code that looks at the mesh info instead.

I have successfully avoided using IPv6 to this point ;-) but I don't know 
of any reason why these strategies shouldn't work with it just as well as 
with IPv4.

Unfortunately I don't know enough about the mesh code to begin trying to 
code this myself. Since this only needs to look at the destination IP 
address and look it up in the mesh table, I would take a stab at it if I 
knew where to find the mesh data. My guess is that it would be less work 
for someone who understands the mesh data to try this than it would be to 
educate me on it, since you are running against a deadline (and the fact 
that I have to fly from LA to Atlanta this weekend won't help matters any).

David Lang
