IPv6 Multihoming/PI using Tunnel Endpoint Lookup (pi-in-6)
Version 1.0
2005-10-17

Discussion:

The simplicity and flexibility of 6to4 (RFC 3056/3068) make it tempting to adapt it for multihoming/PI purposes in IPv6. The idea is that dynamic stateless tunnels could be used to isolate end site addresses from the Default Free Zone (DFZ). The limitations of the 6to4 method are the inability to embed full IPv6 addresses inside IPv6 addresses and the desire to have PI based addresses. To solve this we could use reverse DNS to look up tunnel endpoints and assign unique /48 prefixes out of special "unroutable" space.

Operation:

A unique /48 out of special "multihoming/PI" space is assigned to an end site. For discussion let's say this space is in 3000::/16. The corresponding section of 0.0.0.3.ip6.arpa is then delegated to the end site.

Outbound traffic from those addresses is not tunneled and needs no special handling unless the destination is also in 3000::/16. The site can use conventional BGP feeds, default routes, PPLB etc. for outbound path selection. No prefixes are inserted into the DFZ to facilitate this. The source address would be from within the site's /48 in 3000::/16.

Packets sent to 3000::/16 addresses would be encapsulated in another IPv6 packet (6in6) with a tunnel endpoint as the destination of the outer header. The tunnel endpoint is determined by a DNS lookup in 0.0.0.3.ip6.arpa for one or more "TEP" records. The TEP records are set and modified by the end site as desired. Tunnel endpoints would typically be addresses on the edge routers of the multihomed/PI site.

A device encapsulating a pi-in-6 packet should try to cache the TEP record(s) for some minimum time (say five minutes) or the DNS TTL, whichever is longer. Refresh of the TEP record(s) should be attempted before they expire from the cache so packet flow is normally uninterrupted. Attempting a refresh every 60 seconds might be reasonable.

Like 6to4, the encapsulation could theoretically occur anywhere on the Internet but will be most efficient when done closest to the node originating the packet, optimally at the sending node itself. 3000::/16 could be anycast (relay routers performing the encapsulation, as with the RFC 3068 6to4 relays) to facilitate migration until this is widely deployed. If the node doing the encapsulation has access to routing information it may use that information to pick which TEP to use, but it is not expected to have a specific route for the TEP address. Various methods of testing availability of a given TEP could be used, but it is ultimately the responsibility of the multihomed/PI end site to make the TEP records reflect current availability/preference.

When the packet reaches the tunnel endpoint (or optionally any router that has a /48 or longer route for the inner header destination address) it is extracted and forwarded as a regular packet to the destination.
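As a concrete sketch of the lookup step, the Python fragment below (using the dnspython library) derives the ip6.arpa name for an inner destination address and fetches its TEP records. No "TEP" DNS record type exists today, so the sketch assumes the records are published as TXT records whose data is a single tunnel endpoint address; lookup_teps() and pick_tep() are illustrative names, not part of this proposal.

    import random

    import dns.resolver      # dnspython
    import dns.reversename

    def lookup_teps(inner_dst):
        """Return (tunnel endpoint addresses, DNS TTL) for a pi-in-6 address."""
        # from_address("3000:0:A::1") yields the 32-nibble reverse name
        # under ip6.arpa that appears in the example below.
        qname = dns.reversename.from_address(inner_dst)
        answer = dns.resolver.resolve(qname, "TXT")   # stand-in for "TEP"
        teps = [rdata.strings[0].decode() for rdata in answer]
        return teps, answer.rrset.ttl

    def pick_tep(teps):
        # When more than one TEP is determined (or assumed) reachable,
        # choose among them at random.
        return random.choice(teps)

A site might cover every address in its /48 with a single wildcard entry in the delegated reverse zone, so that one record set answers for all of its hosts.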
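The caching and refresh behavior can be sketched the same way: hold each entry for the longer of five minutes or the DNS TTL, and re-resolve entries as they approach expiry so packet flow is not interrupted by lookups. Structure and names below are illustrative only.

    import time

    MIN_HOLD = 300          # five minutes, the suggested minimum cache time
    REFRESH_WINDOW = 60     # attempt a refresh this long before expiry

    class TepCache:
        def __init__(self):
            self._entries = {}              # inner address -> (teps, expiry)

        def put(self, key, teps, dns_ttl):
            # Cache for max(5 minutes, DNS TTL).
            hold = max(MIN_HOLD, dns_ttl)
            self._entries[key] = (teps, time.monotonic() + hold)

        def get(self, key):
            entry = self._entries.get(key)
            if entry and entry[1] > time.monotonic():
                return entry[0]
            return None                     # missing or expired: look up again

        def refresh_candidates(self):
            # Entries nearing expiry; a periodic task (say every 60 seconds)
            # would re-run lookup_teps() for each of these.
            now = time.monotonic()
            return [k for k, (_, exp) in self._entries.items()
                    if exp - now < REFRESH_WINDOW]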
Example:

Site A is assigned 3000:0:A::/48 and assigns 3000:0:A::1 to a node (node A). Site A also has two upstream ISPs and is assigned 2001:db8:1::/48 from one and 2001:db8:2::/48 from the other. On its site exit router it has loopback addresses of 2001:db8:1::1 and 2001:db8:2::1 assigned. Site B is assigned 2001:db8:B::/48 and assigns 2001:db8:B::1 to a node (node B).

For node A to send a packet to node B it simply creates a normal IPv6 packet with 3000:0:A::1 as the source and 2001:db8:B::1 as the destination. No special handling or tunneling is necessary. At node A's site exit router(s) the decision of which path to use is made the same way that those decisions are made in IPv4 today. A site exit router may have full BGP feeds from upstream ISPs though no prefixes are advertised to those ISPs.

Fig. 1  Packets to normal addresses need no special handling:

                  2001:db8:1::1
                  +--------+
   3000:0:A::1  +-| router |-+                    2001:db8:B::1
   +--------+   | +--------+ |                    +--------+
   | node A |---+            +---- Internet ------| node B |
   +--------+   | +--------+ |                 |  +--------+
                +-| router |-+                 |
                  +--------+                   |
                  2001:db8:2::1                |
                                               |
                              3000:0:A::1->2001:db8:B::1

For node B to send a packet to node A, it also creates a normal IPv6 packet with 2001:db8:B::1 as the source and 3000:0:A::1 as the destination. The routing/forwarding system on that node, or perhaps the next hop router from that node, sees that the destination is in 3000::/16 and encapsulates the packet inside another IPv6 packet. To determine the destination address (tunnel endpoint) of the outer header, the encapsulating node would perform a DNS lookup for "TEP" records at 1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.A.0.0.0.0.0.0.0.0.0.3.ip6.arpa, the reverse name of 3000:0:A::1 inside the zone delegated from 0.0.0.3.ip6.arpa. In this example TEP records of "2001:db8:1::1" and "2001:db8:2::1" are found. If both addresses are determined (or assumed) to be reachable, the destination address of the outer header is set to one of the two at random. We'll pick 2001:db8:1::1 for this example. Any TEP records found are cached by the encapsulating node to speed up future processing. The source address of the outer header is set to any valid globally unique unicast address on the node performing encapsulation. The encapsulated packet is then forwarded to 2001:db8:1::1 where the inner packet is extracted and forwarded to 3000:0:A::1.

Fig. 2  If encapsulation is done on the originating node:

                              2001:db8:B::1->3000:0:A::1
                                                       |
                                                       |
                                         2001:db8:1::1 |
                                          +--------+   |
 2001:db8:B::1                          +-| router |-+ | 3000:0:A::1
 +--------+                             | +--------+ | |  +--------+
 | node B |------------------ Internet -+            +---| node A |
 +--------+                             |            |   +--------+
                                        | +--------+ |
                                        +-| router |-+
                   |                      +--------+
                   |                      2001:db8:2::1
                   |
    2001:db8:B::1->2001:db8:1::1[2001:db8:B::1->3000:0:A::1]
                              or
    2001:db8:B::1->2001:db8:2::1[2001:db8:B::1->3000:0:A::1]

Fig. 3  If encapsulation is done on the next hop router:

  2001:db8:B::1->3000:0:A::1    2001:db8:B::1->3000:0:A::1
               |                                         |
               |                                         |
               |                           2001:db8:1::1 |
               |                            +--------+   |
 2001:db8:B::1 |  2001:db8:B::2           +-| router |-+ | 3000:0:A::1
 +--------+    |  +--------+              | +--------+ | | +--------+
 | node B |-------| router |--- Internet -+            +---| node A |
 +--------+       +--------+              |            |   +--------+
                                          | +--------+ |
                                          +-| router |-+
                                  |         +--------+
                                  |         2001:db8:2::1
                                  |
      2001:db8:B::1->2001:db8:1::1[2001:db8:B::1->3000:0:A::1]
                               or
      2001:db8:B::1->2001:db8:2::1[2001:db8:B::1->3000:0:A::1]
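The decision logic of the example condenses to a few lines. The sketch below reuses the hypothetical lookup_teps()/pick_tep() helpers from the Operation section; Packet6 is a stand-in for a real IPv6 header rather than anything this proposal defines.

    from dataclasses import dataclass
    from ipaddress import IPv6Address, IPv6Network

    PI_SPACE = IPv6Network("3000::/16")   # the illustrative multihoming/PI space

    @dataclass
    class Packet6:
        src: IPv6Address
        dst: IPv6Address
        payload: object = None            # inner packet or upper-layer data

    def outbound(pkt, encap_src):
        """Encapsulate only when the destination is in pi-in-6 space."""
        if pkt.dst not in PI_SPACE:
            return pkt                    # normal address: no special handling
        teps, _ttl = lookup_teps(str(pkt.dst))
        outer_dst = IPv6Address(pick_tep(teps))
        # The outer source is any valid globally unique unicast address on
        # the encapsulating node; the original packet rides inside (6in6).
        return Packet6(src=encap_src, dst=outer_dst, payload=pkt)

    def at_tunnel_endpoint(pkt):
        # The TEP (or any router with a /48 or longer route for the inner
        # destination) extracts the inner packet and forwards it normally.
        return pkt.payload

Running node B's side of Fig. 2 through outbound() with inner source 2001:db8:B::1 and inner destination 3000:0:A::1 yields an outer header of 2001:db8:B::1->2001:db8:1::1 or 2001:db8:B::1->2001:db8:2::1, matching the two encapsulated forms shown above.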
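On the wire, the encapsulation is simply an outer IPv6 header carrying Next Header 41 (IPv6-in-IPv6) in front of the unmodified original packet. A scapy fragment for the first alternative in Fig. 2, with an assumed ICMPv6 echo request as the inner payload:

    from scapy.layers.inet6 import IPv6, ICMPv6EchoRequest

    inner = IPv6(src="2001:db8:b::1", dst="3000:0:a::1") / ICMPv6EchoRequest()
    outer = IPv6(src="2001:db8:b::1", dst="2001:db8:1::1", nh=41) / inner
    # outer is 2001:db8:B::1->2001:db8:1::1[2001:db8:B::1->3000:0:A::1]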
Problems:

Obviously there are concerns about these supposedly "unroutable" prefixes leaking into the DFZ. Since nobody has control over what prefixes others accept or filter, at least this method would allow multihoming to keep working when those routes are (hopefully) filtered.

Then there is the issue of the delay in looking up the tunnel endpoint and the resources consumed in looking up and caching the results. For typical end user workstations this should not be a significant problem. Servers communicating with large numbers of pi-in-6 clients may need specialized hardware to optimize the lookup and encapsulation, but it is not unusual for such operations to already involve specialized hardware (load balancers, caches, etc.) between the servers and clients. At least traffic from pi-in-6 addresses to non-pi-in-6 addresses can be sent natively without any lookup or tunneling.

Consideration should also be given to the impact this would have on the global DNS system. The suggestion of widely anycasting 3000::/16 is intended only as a transitional step. The goal should be for any node that generates IPv6 packets to support lookup and encapsulation, or at least to have them occur at the first hop.

Kevin Loch
kloch@hotnic.net