Draft                                                            M. Py
Ipv6mh working document                                 April 29, 2002

Multi Homing Aliasing Protocol (MHAP)
draft-py-mhap-01a.txt

Status of this Memo

This document is a working document of the ipv6mh mailing list. The ipv6mh web page is http://arneill-py.sacramento.ca.us/ipv6mh

Copyright Notice

Copyright (C) Michel Py (2002). All Rights Reserved.

Abstract

This document describes a protocol for IPv6 Network Layer multihoming (MHAP) that does not affect the size of the routing table in the IPv6 DFZ (Default Free Zone) and does not use tunnels. MHAP is a router solution and covers home/soho to very large environments. Acknowledging the imperfections inherent to its design, MHAP's goal is to facilitate the initial deployment of IPv6 by providing a base for multihoming support.

MHAP is a Network Layer protocol, and the "Home" of MHAP is an IPv6 address. MHAP is a multi-address multi-homing protocol (MAMH).

MHAP provides fault tolerance, very good application compatibility, and simple configuration. It can be described as a semi-symmetric, end-to-end, address aliasing protocol. Based on BGP4+ routing information, MHAP aliases twice (see 6.1.24), leaving end-to-end traffic unchanged.

MHAP is a concept that has not yet been implemented. However, its building blocks are well known, which should facilitate rapid development.

Table of contents

1. Conventions used in this document
2. Introduction
3. Problem
4. Goals and non-goals
5. Protocol design
   5.1 Use of address aliasing vs. Tunnels
   5.2 Network layer solution
   5.3 Centralized multihoming routing table
   5.4 Faster lookup with fixed-sized prefix
   5.5 Separation of the routing tables
       - Distinction between singlehomed and multihomed traffic
       - Simplified routing on high-bandwidth routers
       - Low load on routers containing the multihoming table
       - Restriction of the multihoming table distribution
       - Use of geographical PI addresses
   5.6 Distribution of the aliasing load
   5.7 Use of BGP4+ to determine the best path
   5.8 Stateful protocol
6. Protocol definition, description, and requirements
   6.1 Terminology and descriptions of terms
   6.2 Protocol requirements and implementation
   6.3 MHAP requests, replies and other datagrams
   6.4 Compromises
   6.5 Flowcharts
7. Fault tolerance
8. Load balancing
9. Application compatibility
10. Security considerations
11. IANA Considerations
12. Registry considerations
13. Datagram structure
14. Topology
15. Statement of direction
16. Revision history
17. Acknowledgements
18. Compliance with the requirements
19. Full Copyright Statement
20. References
21. Editor's address

1. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC-2119].

2. Introduction

Companies have been driven to multihome their IPv4 networks because of four important features that multihoming offers:

- Fault tolerance and continuous connectivity to the Internet by the means of several links (often from different providers).
- Better response times by being logically closer to the customer's network and avoiding bottlenecks, such as congested interconnects and Network Access Points, which also provides natural load balancing among different providers.
- Cost: Many ISPs and organizations have found it attractive to peer with other non-transit entities. For example, large content providers such as portals will typically reduce costs by peering with tier-2 ISPs that do not charge for transit.
- Provider-independent addressing.

This document assumes that fault tolerance, better response times, cost savings and PI addressing are the cornerstones of multihoming and proposes a protocol (MHAP) for IPv6 multihoming at the Network Layer.

3. Problem

Network Layer multihoming is achieved in IPv4 by requesting a block of addresses (commonly called a PI block) independent of the providers' address space. The fault tolerance is achieved by multiple links advertising this block to multiple providers.
Propagated in the DFZ by multiple sources, this block achieves better response time (and natural load balancing) because the traffic will follow the path deemed best by the routing protocol (BGP4+).

This approach has successfully provided IPv4 Network Layer multihoming and is in danger of succumbing to its own success. At the time of writing, the number of networks in the routing table of DFZ routers can reach 200,000 (some 120,000 public, plus one's own networks). Widespread issues linked to the size of the routing table have arisen, such as:

- Memory needed: A large number of installed routers have a 128MB DRAM limit, which will not contain the current growth of the DFZ's routing table for very long.
- Processing power and delay needed to handle updates: Despite optimized algorithms and hardware assistance, updating/indexing a table with 200,000 rows is no trivial task.
- Lookup speed / forwarding speed: The cost of a lookup in a large routing table is aggravated by the longest match rule.

At the time of writing, the long-term evolution of the Internet is largely unknown. Whether or not we will ever have household appliances that are IPv6 enabled and multihomed remains to be seen, but the potential exists.

The multihoming mechanisms that are currently in place for IPv4 could easily be applied to IPv6, at the expense of scalability. If today one can purchase or build a router that handles 200,000 IPv4 routes at OC-192 speeds, the model that handles 500,000 IPv4 and 10,000,000 IPv6 networks at OC-768 speeds has not been manufactured yet.

4. Goals and non-goals

The goal of this document is to provide a mid-term design for an IPv6 Network Layer multihoming solution that is:

- Scalable beyond short-term needs.
- Easy to configure and administer.
- Transparent to the upper layers.

There is no goal at this time to provide an IPv4 solution.
Although the protocol could be adapted to IPv4, the editor does not think that it can realistically be implemented on today's Internet without a successful IPv6 implementation, and, by the time that is realized, solving the IPv4 problem might be a non-issue.

There is no goal to replace BGP4+. To the contrary, MHAP heavily relies on BGP4+ as the routing protocol.

5. Protocol design

The guidelines that have driven MHAP's design are as follows:

5.1 Use of address aliasing vs. Tunnels

Tunnels have been commonly discussed as a solution to the IPv6 multihoming problem. The editor's analysis is that tunnels that use the same protocol both as the payload and the encapsulation protocol are using tunneling mechanisms to achieve functions that are related to addressing matters. Tunneling IPv6 into IPv4 is fine; the tunneling mechanism is required to transport a protocol (IPv6) over a network (IPv4) that does not understand it. However, tunneling IPv6 into IPv6 for the purpose of solving what is indeed an addressing issue might not be the optimal solution. The editor is familiar with tunnels and has used them since the early '90s to transport IPX over IP on Novell servers.

The use of tunnels for IPv6 multihoming purposes is indeed a way to hide the real IPv6 address from the network. There are two potential problems with tunnels:

- The encapsulation process reduces the MTU.
- It is generally admitted that the migration to IPv6 will involve large amounts of tunneling IPv6 into IPv4. Adding a second layer of IPv6 into IPv6 tunneling, although feasible, might produce some implementation challenges, especially if encryption is to be used.

The editor thinks that a semi-symmetric, end-to-end address aliasing solution can be more efficient (because of the encapsulation overhead, or the lack thereof), more logical (because the multihoming problem is an addressing issue), and globally simpler than a tunneling solution.
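The MTU argument above can be made concrete with a little arithmetic. The following Python sketch is illustrative only: it assumes a 1500-byte link MTU, plain IPv6-in-IPv4 and IPv6-in-IPv6 encapsulation with no additional tunnel or extension headers, and a hypothetical helper name (effective_mtu). Address aliasing rewrites an existing header field and therefore adds no bytes.

```python
IPV6_HEADER = 40  # fixed IPv6 header length in bytes
IPV4_HEADER = 20  # IPv4 header length in bytes, without options


def effective_mtu(link_mtu, encapsulations):
    """Return the MTU left for the inner packet after each outer header."""
    return link_mtu - sum(encapsulations)


LINK_MTU = 1500  # assumed link MTU for illustration

aliasing_only = effective_mtu(LINK_MTU, [])                          # no encapsulation
six_in_four = effective_mtu(LINK_MTU, [IPV4_HEADER])                 # IPv6 in IPv4
stacked = effective_mtu(LINK_MTU, [IPV4_HEADER, IPV6_HEADER])        # IPv6 in IPv6 in IPv4
```

Under these assumptions, each extra layer of IPv6-in-IPv6 tunneling costs a further 40 bytes of MTU, while aliasing leaves the MTU untouched.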
5.2 Network Layer solution

Transport Layer solutions have been commonly discussed as a solution to the IPv6 multihoming problem. There are three potential problems with Transport Layer multihoming solutions:

- They tend to be extensions of connection-oriented mechanisms (mostly TCP) that might not be optimal for connectionless protocols, such as UDP.
- They might not be completely in line with the OSI layer separation idea; a Transport Layer solution does not affect ICMP (a layer 3 protocol). A multihoming solution without ICMP capabilities would be extremely difficult to troubleshoot.
- They might not be completely in line with broader ideas, such as end-to-end connectivity and the commonly admitted fact that routing is a layer 3 topic.

The editor thinks that a semi-symmetric, end-to-end address aliasing solution can be easier to troubleshoot (because of separation between layers), and more logical (because it contains the multihoming problem within layer 3, where it belongs) than a Transport Layer based solution.

5.3 Centralized multihoming routing table

The one design element that has made today's IPv4 multihoming possible is the presence of each multihomed block in the DFZ's routing table. It is not issue-free, but it has delivered so far. The design proposed in this document does not intend to suppress the centralized routing table completely but rather tries to minimize the inconveniences that it causes. The PI address used by MHAP can be:

- An MHAP block, targeted to large multinational organizations.
- A geographically aggregatable PI address, for other multihomed sites.

5.4 Faster lookup with fixed-sized prefix

One of the aggravating factors of the size of the routing table is the longest match rule. This document proposes a fixed size prefix for multihoming purposes, which will allow faster routing table lookups by skipping the longest match rule process.
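To illustrate why a fixed prefix length simplifies lookups, the following Python sketch contrasts a naive longest-match scan with a single exact-match lookup keyed on the first 48 bits of the destination. It is a sketch only: the table contents, next-hop names and function names are hypothetical, and real routers use tries or dedicated hardware rather than dictionaries.

```python
import ipaddress

# Hypothetical variable-length table: longest match is required.
VARIABLE_TABLE = {
    ipaddress.ip_network("2001:db8::/32"): "hop-a",
    ipaddress.ip_network("2001:db8:1::/48"): "hop-b",
}


def key48(addr):
    """Return the top 48 bits of an IPv6 address as an integer key."""
    return int(ipaddress.ip_address(addr)) >> 80


# Hypothetical fixed-length table: every entry is a /48.
FIXED_TABLE = {
    key48("2345:0:1::"): "hop-c",
    key48("2345:0:2::"): "hop-d",
}


def longest_match(addr):
    """Scan every prefix and keep the most specific one that matches."""
    ip = ipaddress.ip_address(addr)
    best = None
    for net, hop in VARIABLE_TABLE.items():
        if ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, hop)
    return best[1] if best else None


def fixed_lookup(addr):
    """One exact-match lookup; no comparison of prefix lengths is needed."""
    return FIXED_TABLE.get(key48(addr))
```

With all prefixes at /48, the lookup reduces to extracting a fixed number of bits and performing one exact match, which is the property this section relies on.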
5.5 Separation of the routing tables

A widely recognized problem is the size of the DFZ's routing table, which causes issues both in terms of lookup speed and time required to process updates. These issues are aggravated by the fact that the routers that need to handle the DFZ's table are required to process extremely large numbers of packets per second and accommodate multiple, very high capacity circuits.

This document proposes a solution that dramatically reduces the size of the DFZ's table (for routers that need to process the bulk of the traffic) and dramatically reduces the traffic on routers that need to handle a large routing table by the following means:

5.5.1. Distinction between singlehomed and multihomed traffic: Two separate global routing tables are to be kept: one for singlehomed traffic (the DFZ routing table) and the second one for multihomed networks (the MHAP routing table). This document formally distinguishes singlehomed and multihomed traffic. The main idea behind MHAP is to transform multihomed traffic into singlehomed traffic by the means of a semi-symmetric address aliasing process (described below).

5.5.2. Simplified routing on high-bandwidth routers: Since backbone routers would no longer need to handle multihomed traffic, the IPv6 DFZ could be summarized in the spirit that has guided its inception. To take the summarization to an unrealistically absurd level, the IPv6 backbone could be summarized at the /16 boundary (the so-called 8k DFZ's routing table), and the 6bone could be summarized at the /32 boundary without compromising the multihoming capabilities of MHAP.

5.5.3. Low load on routers containing the multihoming table: This document calls for a centralized routing table that would contain all centralized multihomed prefixes.
However, the routers ("MHAP rendezvous points") containing this table (the MHAP routing table that is likely to be bigger than the DFZ table itself) would not be required to process large amounts of traffic; only the very first packets of a session to a host on a multihomed network would hit these routers.

5.5.4. Restriction of the multihoming centralized table distribution to a reasonably small number of MHAP rendezvous points.

5.5.5. Further reduction of the multihoming routing table by the use of geographically aggregatable PI addresses. MHAP enables the existence of geographically aggregatable PI addresses without requiring the physical infrastructure that maps the aggregation boundaries.

5.6 Distribution of the aliasing load: Extreme scalability will be achieved by sharing the processing power required by MHAP on each end router ("MHAP client", a router close to the user's workstation or server being accessed). The processing requirements of MHAP are likely to be comparable to or lower than those of NAT, so a router capable of handling n egress user NAT sessions would also be able to handle n egress MHAP sessions.

5.7 Use of BGP4+ to determine the best path: MHAP is not a routing protocol and relies on BGP4+ for decisions regarding the selection of the optimal path. Implementation of MHAP should not change the way network administrators administer their BGP autonomous systems.

5.8 Stateful protocol: There is no design guideline that calls for a stateful protocol. However, MHAP is clearly a stateful protocol for MHAP clients and MHAP rendezvous points. This statefulness is not a problem regarding the breakup of end-to-end connections. If an MHAP router that maintains stateful information goes down, the state will be automatically re-established in the backup router. MHAP is not connection-oriented since only one side keeps track of the current aliasings.

6. Protocol definition, description, and requirements

6.1 Terminology and descriptions of terms

6.1.1. Singlehomed address space: Any globally aggregatable IPv6 address EXCEPT those reserved for multihomed addresses. This address space has been allocated to a transit provider and is non-portable.

6.1.2. Multihomed address space:

a) Two blocks of globally aggregatable IPv6 addresses reserved for centralized multihomed traffic (MHAP prefixes). This document uses 2345::/16 and 3FFE:FFFF::/32 for explanatory purposes. MHAP prefixes are allocated to an organization and are portable.

b) Two blocks of geographic PI address space: Address space reserved for geographically aggregatable multihomed traffic (geo PI prefixes). This document uses 2346::/16 and 3FFE:FD00::/24 for explanatory purposes. Geo PI prefixes are allocated to an MHAP area and are not portable outside the area.

6.1.3. Singlehomed traffic: Traffic whose destination IPv6 address is in the singlehomed address space.

6.1.4. Multihomed traffic: Traffic whose destination IPv6 address is in the multihomed address space.

6.1.5. Multihomed site: An organization that receives IPv6 connectivity from two transit providers from two different TLAs (or subTLAs or 6bone pTLAs). If an organization that has been allocated a TLA (or a 6bone pTLA) wants to be allocated an MHAP prefix, it also needs to have a block of singlehomed addresses from another TLA (pTLA), likely one of its direct competitors.

6.1.6. Multihomed prefix size: Both MHAP prefixes and geo PI prefixes are multihomed prefixes. The fixed size /48 has been chosen because it is perceived to be sufficient for most multihoming purposes.

6.1.7. MHAP prefix: A /48 block of centralized multihomed addresses. MHAP prefixes are provider-independent and allocated to the end customer directly by the registry authority or the 6bone. All connected MHAP prefixes are listed in the MHAP routing table.

6.1.8. Geo PI prefix: A /48 block of geographically aggregatable multihomed addresses. Geo PI prefixes are provider-independent and allocated to the end customer directly by the local aggregation authority (that can be one of the ISPs involved) or by the ipv6mh webmaster for 6bone addresses. Except within the set of routers that participate in an area's routing, Geo PI prefixes are not individually advertised in the routing table. Routers that participate in an area MUST send only the area's aggregate (a /32 for production, a /36 for the 6bone, see examples in 12.2) to routers that they peer with and that are not part of that area.

6.1.9. MHAP aliasing block: A /48 block of singlehomed addresses. Recipients of multihomed prefixes must, for each MHAP prefix or geo PI prefix, reserve an MHAP aliasing block for EACH TLA and pTLA from which they have been allocated addresses.

6.1.10. MHAP aliasing table: The dynamic table, in an MHAP client, that maps singlehomed addresses to multihomed addresses. The table contains /48 prefixes and does, in fact, map MHAP aliasing blocks to MHAP prefixes or geo PI prefixes. The MHAP aliasing table contains twelve columns: MHAP_prefix; the MHAP preferred aliasing block and its BGP4+ metric, followed by MHAP aliasing blocks #2, #3 and #4 and their BGP4+ metrics (respectively MHAP_TB_1, MHAP_metric_1, MHAP_TB_2, MHAP_metric_2, MHAP_TB_3, MHAP_metric_3, MHAP_TB_4, MHAP_metric_4); MHAP_requests_sent; two timers, MHAP_request_timer and MHAP_refresh_timer; and MHAP_key. The two timers are unsigned 16-bit integers, and the key is a 64-bit unsigned integer.

6.1.11. MHAP request: The request, sent from an MHAP client (or the client part of an MHAP endpoint) to an MHAP endpoint, contains a multihomed prefix that the client wants to alias into a singlehomed MHAP aliasing block. The PA addresses of the MHAP endpoint are unknown to the client when it sends the MHAP request.
The MHAP request will flow to the closest MHAP rendezvous point, which will alias it (for centralized PI) and forward it to the appropriate MHAP endpoint.

6.1.12. MHAP reply: The reply, sent from an MHAP endpoint to an MHAP client upon request, contains the singlehomed MHAP aliasing block that is optimal for the requesting client.

6.1.13. MHAP client: A router running MHAP in client mode. Customers that have not been assigned a multihomed prefix would run in this mode. The purpose of MHAP clients is to build and maintain the MHAP aliasing table (unique to each router). The MHAP client sends MHAP requests to MHAP endpoints and aliases (by looking up the MHAP aliasing table built with MHAP replies) the destination IPv6 address, for egress traffic, from a multihomed address to a singlehomed address. Aliasing happens only if there is no specific IGP route from the same organization to the multihomed address (that is, the only route is one of the default aggregates or a same-area route). If there is a specific route inside the organization (the case of type IV endpoints), that route is simply used.

6.1.14. MHAP multihomed client: An MHAP client that does NOT have a default route. The behavior of multihomed MHAP clients is the same as MHAP endpoints and will not be specifically addressed in this document.

6.1.15. MHAP endpoint: A router running MHAP in both client and endpoint mode. Multihomed sites that have been allocated an MHAP prefix or a geo PI prefix would run this mode. There is a special type of endpoint, type IV; these are not multihomed themselves but leverage the multihoming setup of a type I or type II endpoint belonging to the same organization. MHAP endpoints have two purposes:

a) To alias back into multihomed traffic the singlehomed traffic sent from MHAP clients and MHAP rendezvous points.
By a simple static subnet aliasing, the MHAP endpoint aliases ingress singlehomed traffic into multihomed traffic. The endpoint aliasing process is stateless and looks up a very small table issued from static configuration. The resources needed for MHAP endpoints are very low; basically, MHAP replaces the first 48 bits of the IPv6 address. This simple operation allows a single router to handle very large numbers of MHAP ingress packets without choking. Furthermore, the MHAP endpoint aliasing would be simple to implement in hardware and would enable IPv6 hardware-accelerated MHAP endpoints to handle ingress MHAP traffic at wire speed.

b) To provide MHAP clients with the information they need to build their MHAP aliasing table by replying to MHAP requests. MHAP type IV endpoints do not receive requests.

Note that MHAP endpoints also need to be MHAP clients to handle the aliasing of egress traffic since they do not contain the full MHAP routing table. The MHAP client running on an MHAP endpoint is similar but not completely identical to a client-only MHAP router.

6.1.16. MHAP rendezvous point: A router running MHAP in both client and rendezvous point mode. This is the mode that tier-1 transit providers (that have been allocated a TLA, a sub TLA or a 6bone pTLA) would run. MHAP rendezvous points have two purposes:

a) Alias and forward all MHAP requests from MHAP clients to appropriate MHAP endpoints.

b) Alias and proxy a controlled amount of multihomed traffic to appropriate MHAP endpoints.

As part of this aliasing / proxying, MHAP rendezvous points alias the destination IPv6 address of egress traffic to a singlehomed address like an MHAP client does, except that the MHAP rendezvous point uses the MHAP routing table instead of the MHAP aliasing table.

6.1.17. MHAP routing table: The MHAP routing table, present in full only in MHAP rendezvous points, contains all MHAP prefixes allocated and all MHAP areas allocated.
Technically, this is a BGP4+ routing table (MHAP rendezvous points are BGP4+ peers) with two extra characteristics: it contains only prefixes from the multihomed address space, and prefixes are of a fixed size, which allows skipping the longest match rule and the use of optimized algorithms for lookups.

   All prefixes within 2345::/16 are /48 (MHAP blocks)
   All prefixes within 2346::/16 are /32 (geo PI areas)
   All prefixes within 3FFE:FFFF::/32 are /48 (6bone MHAP blocks)
   All prefixes within 3FFE:FD00::/24 are /36 (6bone geo PI areas)

6.1.18. MHAP rendezvous point short-term table: A dynamic table that contains pairs of source prefixes and destination multihomed MHAP prefixes. The MHAP rendezvous point will build and maintain this table in order to limit the number of non-MHAP multihomed packets that can be proxied for each pair. This table contains five columns: SH_source_prefix, ASSOC_MHAP_prefix, PROXY_MHAP_packets, STATIC_maxpackets and COUNT_unused. No timers are associated with entries in the short-term table. The table is dynamically built on-demand; the value of PROXY_MHAP_packets is reset every second. Entries are deleted when there has been no traffic for a specific entry for 30 seconds. It is recommended that MHAP_maxproxy be at least a 32-bit unsigned integer and COUNT_unused a signed byte.

6.1.19. Egress unaliased multihomed packet: A multihomed packet that can be sent (by an MHAP client or an MHAP endpoint) to the best route to the multihomed address space (the closest MHAP rendezvous point) without a matching MHAP aliasing prefix in the MHAP aliasing table. The number of egress unaliased multihomed packets is strictly limited (both at the MHAP client/endpoint and at the MHAP rendezvous point).
The egress unaliased multihomed packet has a vague similarity with the TCP sliding window in the sense that it allows a certain amount of multihomed traffic to be sent to the MHAP rendezvous point (to be proxied, using a sub-optimal path) before waiting for an MHAP reply ("ships in the night") that will allow the client to transform the multihomed traffic into singlehomed traffic. The egress unaliased multihomed packet is a double-edged sword: If the destination prefix is valid, it will reduce the latency of the first packet(s) of a yet unresolved MHAP aliasing. Otherwise, it will waste bandwidth. In either case, egress unaliased multihomed packets are a burden that needs to be carried by MHAP rendezvous points.

6.1.20. MHAP proxying: The aliasing, by the rendezvous point, of an egress unaliased multihomed packet that an MHAP client sent at the same time as an MHAP request.

6.1.21. Non-MHAP proxying: The aliasing, by the rendezvous point, of an unaliased multihomed packet coming from a non-MHAP router.

6.1.22. MHAP area: An area defined as an aggregation point for geo PI addresses. MHAP areas are likely to be centered on metropolitan areas.

6.1.23. MHAP aggregator: A router aggregating routes for one or more MHAP areas. MHAP aggregators are standard routers (no MHAP-specific code required).

6.1.24. The general principle of address aliasing is as follows:

- Multihomed traffic is to be aliased and carried over the Internet as singlehomed traffic.
- When the IPv6 packet or datagram is processed by the first aliasing-capable router along its path (an MHAP client in most situations), the multihomed destination IPv6 address is replaced with one of the possible singlehomed PA addresses.
- When this aliased packet reaches the destination site's MHAP endpoint, the singlehomed destination PA address is replaced (unaliased) with the original multihomed destination address.
- Contrary to most NATs, MHAP replaces the destination address, not the source.

6.2 Protocol requirements and implementation

6.2.1. MHAP is a feature of routers. Implementing MHAP at the host level would greatly increase the load of both MHAP endpoints and MHAP rendezvous points. Successful deployment of MHAP requires that there is an MHAP client in the path of multihomed traffic, which probably means that the edge of each stub network is MHAP enabled. However, some configuration of MHAP rendezvous points will allow them to be used as MHAP proxies and enable prefixes that do not have MHAP-enabled routers to access multihomed networks.

6.2.2. Hosts that are sending traffic to a multihomed address are, as defined in 6.1, sending multihomed traffic. The main idea behind MHAP is that multihomed traffic, with the exception of the very first packets in a session, is transformed into singlehomed traffic at a router close to the source (the MHAP client) and transformed back into multihomed traffic at the last router (the MHAP endpoint).

6.2.3. Assignment of multihomed prefixes: The only requirement to qualify for a geo PI prefix (within 2346::/16 or 3FFE:FB00::/24) is to be multihomed in the same area in the relevant address space. The requirements to qualify for an MHAP prefix are as follows:

- For a production MHAP prefix (a /48 block in the 2345::/16 range): two or more separate physical connections to two or more transit providers from two or more different TLAs.
- Organizations requiring MHAP prefixes might be required to financially support the MHAP rendezvous point infrastructure. It is mandatory for organizations that require an MHAP prefix to be multihomed in two or more MHAP areas.
- For a 6bone MHAP prefix (a /48 block in the 3FFE:FFFF::/32 range): same as above except that physical connections can be replaced by tunnels.

6.2.4. The address space allocation requirements are as follows: For each multihomed prefix (MHAP or geo PI), a site must reserve a prefix of the same size (/48) from EACH of the different TLAs (pTLAs) from which the site receives IPv6 connectivity, for the sole purpose of MHAP prefix aliasing. Thus, if a site is multihomed to three different TLAs (pTLAs), the total amount of IPv6 addresses to allocate is four times the size of a multihomed prefix: one for the multihomed prefix itself (the block of addresses that is accessed by clients and resolved in DNS), and three for the aliasing blocks (one per transit provider), which are essentially wasted by the MHAP aliasing process. This space allocation problem, along with the fact that IPv4 multihomed customers would be very reluctant to discontinue using their IPv4 PI block, is why the editor thinks that MHAP is not a realistically deployable solution for IPv4.

6.2.5. MHAP clients do not require BGP4+. A static default route or any other routing mechanism is enough to configure a router as an MHAP client. MHAP clients, if BGP4+ enabled, receive only the four aggregates from the multihomed address space 2345::/16, 2346::/16, 3FFE:FB00::/24 and 3FFE:FFFF::/32 (except multihomed prefixes originating from their own AS). The best path to the multihomed address space is the path deemed best by BGP4+ for the multihomed prefixes. MHAP clients MUST NOT advertise multihomed prefixes outside of their autonomous system.

6.2.6. Routers at the edge of stub networks MUST discard ingress multihomed traffic (they should also discard singlehomed traffic whose destination address is not part of the address space they have been allocated).

6.2.7. There are four types of MHAP endpoints:

   Type I: BGP4+ peering with rendezvous point, uses MHAP blocks.
   Type II: BGP4+ peering with aggregator, uses geo PI blocks.
   Type III: RIP routing with aggregator, uses geo PI blocks.
   Type IV: iBGP and IGP routing with type I or type II endpoint belonging to the same organization.

Each site that has been allocated an MHAP prefix or geo PI prefix needs to have one or more MHAP endpoints. Technically, only one router is needed. However, having only one MHAP endpoint router would defeat the purpose for which the customer wants to be multihomed, except for type III setups.

MHAP endpoints must not have a default route. Type I, II and IV endpoints require a BGP4+ feed from all their transit providers. MHAP type I, II and III endpoints advertise their assigned multihomed prefixes to each of their transit providers' MHAP rendezvous points or aggregators and receive only the four aggregates from the multihomed address space 2345::/16, 2346::/16, 3FFE:FB00::/24 and 3FFE:FFFF::/32 (except multihomed prefixes originating from their ISP's AS). MHAP type IV endpoints do not advertise any multihomed routes through their eBGP connection to their ISP.

The multihomed routing setup for type I MHAP endpoints is:

   o No default route.
   o eBGP4+ peering with MHAP rendezvous points: advertise only MHAP prefixes owned, receive only the four aggregates from the multihomed address space 2345::/16, 2346::/16, 3FFE:FB00::/24 and 3FFE:FFFF::/32.
   o Private peering: advertise only MHAP prefixes that belong to self or customers.
   o Do not accept any BGP4+ multihomed routes other than those mentioned above.

The routing setup for type II MHAP endpoints is:

   o No default route.
   o eBGP4+ peering with MHAP aggregators: advertise only geo PI prefixes owned.
   o Private peering: advertise only MHAP prefixes and geo PI prefixes that belong to self or customers.
   o Do not accept any BGP4+ multihomed routes other than those mentioned above.

The routing setup for type III MHAP endpoints is:

   o No default route.
   o RIP routing with MHAP aggregators: advertise only geo PI prefixes owned.
   o Private peering: advertise only MHAP prefixes and geo PI prefixes that belong to self or customers.
   o Do not accept any BGP4+ multihomed routes other than those mentioned above.

The routing setup for type IV MHAP endpoints is:

   o No default route.
   o eBGP4+ peering with ISP: do not advertise any multihomed prefixes. Receive only the four aggregates from the multihomed address space 2345::/16, 2346::/16, 3FFE:FB00::/24 and 3FFE:FFFF::/32.
   o iBGP and IGP routing with MHAP type I or II endpoints belonging to the same organization: no restrictions, aggregation optional.
   o Private peering: advertise only MHAP prefixes and geo PI prefixes that belong to self or customers.
   o Do not accept any BGP4+ multihomed routes other than those mentioned above.

6.2.8. Each ISP that wishes to service an MHAP area MUST have one or more MHAP aggregators. The routing setup for MHAP aggregators is:

   o No default route.
   o BGP4+ peering with MHAP rendezvous points: advertise only aggregates for the areas serviced, receive only the four aggregates from the multihomed address space 2345::/16, 2346::/16, 3FFE:FB00::/24 and 3FFE:FFFF::/32.
   o BGP4+ peering with other MHAP aggregators in the same area: unrestricted (for multihomed addresses).
   o BGP4+ peering with MHAP type II endpoints: accept only geo PI prefixes, advertise only the four aggregates from the multihomed address space 2345::/16, 2346::/16, 3FFE:FB00::/24 and 3FFE:FFFF::/32.
   o RIP routing with MHAP type III endpoints: accept only geo PI prefixes, advertise only the four aggregates from the multihomed address space 2345::/16, 2346::/16, 3FFE:FB00::/24 and 3FFE:FFFF::/32. Redistribute RIP geo PI prefixes into BGP4+.
   o Private peering: advertise only MHAP prefixes and geo PI prefixes that belong to self or customers.
   o All other BGP4+ peers must be sent only the four aggregates from the multihomed address space 2345::/16, 2346::/16, 3FFE:FB00::/24 and 3FFE:FFFF::/32.
   o Do not accept any BGP4+ routes other than those mentioned above.

6.2.9. MHAP endpoints, aggregators and rendezvous points are BGP4+ routers. BGP requirements, such as full mesh of iBGP peers, use of route reflectors [6], and other BGP4+ topics, are fully applicable to all MHAP routers running BGP4+. BGP4+ configuration should not be affected by MHAP. The interaction between MHAP and BGP4+ is three-fold:

   a) MHAP type I and II endpoints look up their BGP4+ routing table in order to reply to MHAP aliasing requests from MHAP clients.
   b) Implementation of BGP4+ on MHAP rendezvous points could be optimized to take advantage of the specifics of the MHAP routing table, such as all prefixes being of the same length.
   c) MHAP rendezvous points look up the MHAP routing table in order to alias the destination address of proxied traffic (up to the amount allowed) and of MHAP requests.

6.2.10. The full MHAP routing table is present in MHAP rendezvous points only. In the same spirit that TLAs and pTLAs collaborate on BGP4+ peering for the DFZ routing table, they also need to collaborate on MHAP rendezvous point peering.

6.2.11. The MHAP routing table might be considered as the DFZ routing table for multihomed traffic.

6.2.12. Although technically possible, it is not recommended to configure a router as both an endpoint and a rendezvous point. This configuration would defeat the scalability feature of MHAP. Configuring a router as both an MHAP endpoint and a rendezvous point requires that router to run two completely separate instances of BGP4+.

6.2.13. Each TLA (pTLA) is required to have two or more MHAP rendezvous points.
Each MHAP rendezvous point exchanges the full MHAP routing table with other MHAP rendezvous points, typically all MHAP rendezvous points within the same TLA (pTLA) and with at least one (preferably two or more) MHAP rendezvous points from other TLAs (pTLAs) directly connected (tunneled). MHAP rendezvous points must not forward singlehomed traffic. They must not advertise any singlehomed routes and must discard ingress traffic with a singlehomed IPv6 destination address except their own. MHAP rendezvous points MUST NOT receive any BGP4+ routes from peers that are not MHAP endpoints or rendezvous points themselves. The recommended connection of an MHAP rendezvous point is two direct, high-speed links to core routers.

The routing setup for MHAP rendezvous points is:

   o BGP4+ peering with other MHAP rendezvous points: unrestricted (for multihomed prefixes). No aggregation.
   o BGP4+ peering with MHAP aggregators: accept only area aggregates, advertise only the four aggregates from the multihomed address space 2345::/16, 2346::/16, 3FFE:FB00::/24 and 3FFE:FFFF::/32.
   o BGP4+ peering with MHAP type I endpoints: accept only MHAP prefixes, advertise only the four aggregates from the multihomed address space 2345::/16, 2346::/16, 3FFE:FB00::/24 and 3FFE:FFFF::/32.
   o All other BGP4+ peers must be sent only the four aggregates from the multihomed address space 2345::/16, 2346::/16, 3FFE:FB00::/24 and 3FFE:FFFF::/32.
   o Do not accept any BGP4+ routes other than those mentioned above.
   o Use routes from an IGP such as OSPF or EIGRP (not to be redistributed) or a static route to ::/0 pointing to the directly connected core router(s) to provide egress singlehomed connectivity.

6.2.14. When reaching an MHAP rendezvous point, multihomed traffic to an MHAP prefix that is not present in the MHAP routing table MUST be discarded and an ICMP unreachable sent to the originating router.

6.2.15.
BGP4+ peering from or to a multihomed address is STRICTLY PROHIBITED in any situation. This prohibition applies both to BGP4+ peering and MHAP peering. It would likely result in a deadlock.

6.2.16. Routing within MHAP areas:

ISPs servicing an area are required to BGP4+ peer their aggregators with at least three other ISPs servicing that area. This BGP4+ peering is limited to geo PI addresses (2346::/16 and 3FFE:FD00::/24). When choosing peers, the size of the target ISP and the bandwidth and latency of the link are to be considered. IPv6 native links MUST be preferred over tunnels. A full mesh is not required. ISPs servicing an area advertise the area aggregate to MHAP rendezvous points.

This setup obliges any ISP that services an area to provide free transit to that area for geo PI addresses. Note that this traffic represents a very small part of the overall traffic (MHAP requests and unaliased egress multihomed packets). The bulk of the traffic is carried over singlehomed addresses. This peering for multihomed addresses is not related to private peering agreements (for example, between two large tier-2 ISPs) designed to save on transit costs.

Customers are required to exchange dynamic routes with their ISPs. This can be done with RIP (for type III) or BGP4+ (for types I and II). In either case, customers must advertise only their /48 geo PI aggregate to their ISPs. ISPs that receive geo PI multihomed RIP routes from customers MUST redistribute these routes into BGP4+.

6.2.17. RIP routing for MHAP endpoints:

Using RIP instead of BGP4+ for MHAP endpoints has pros and cons.

Pros:
   - No need for an ASN.
   - Might be considered simpler than BGP4+. Implementations are likely to be available on home/soho routers.
   - The peer's address does not need a static configuration. This would allow dial-up backup links.
Cons:
   Since there is no BGP4+ routing table, there is no way for the MHAP endpoint to choose the preferred singlehomed address based on routing information, voiding most of the load-balancing and performance features. MHAP endpoints, when replying to MHAP requests, use a statically configured preference. (Alternate, non-routing-based selection algorithms might be offered later.)

MHAP endpoints using RIP are designed to accommodate low-end multihomed setups that have a designated primary link. The convergence time of RIP is a non-issue. If the secondary link is dial-on-demand, there will be disruption anyway. If the secondary link is permanent, the survivability feature of MHAP will take care of the convergence issue. Even if RIP takes 1 minute to converge, MHAP reconverges in 2 seconds. The RIP option is available for geo PI only, not for MHAP blocks.

6.2.18. Allocation of MHAP areas:

The body to allocate MHAP areas is not determined yet, likely IANA. The purpose of MHAP areas is to geographically aggregate geo PI addresses. There is no need for the physical infrastructure to map the MHAP areas (two providers that service an area are not required to have a physical connection between them in that area). The decision to create an MHAP area is based on the predicted number of IPv6 multihomed sites in that area. An MHAP area can handle up to 65,535 multihomed sites (4,095 for 6bone MHAP areas). For example, the Antarctic continent is likely to be a single MHAP area, whereas the San-Francisco bay area might have three different areas: San-Francisco, Silicon Valley and East Bay. Section 12.2 contains tentative allocations.

6.3 MHAP requests, replies, and other datagrams

6.3.1. Singlehomed to multihomed traffic:

MHAP requests are sent by the MHAP client. What triggers the sending of an MHAP request is egress multihomed traffic that does not have a matching entry in the MHAP aliasing table.
The source address of the MHAP request is the address of the singlehomed host that sent the traffic that triggered the MHAP request, and the destination address is unchanged: it is the address of the destination multihomed host. Although not recommended, the sending of MHAP requests can be deactivated for some prefixes. In that situation, all egress traffic would be unaliased, which requires the multihomed prefix to be directly reachable. By default, all multihomed traffic triggers the sending of MHAP requests.

6.3.2. Multihomed to multihomed traffic:

It is STRICTLY PROHIBITED to send an MHAP request with a multihomed source address. When an egress packet whose source and destination IPv6 addresses are both multihomed arrives at an MHAP endpoint or MHAP multihomed client, that router behaves differently from a regular MHAP client: instead of sending one MHAP request, it sends one MHAP request per interface configured with a singlehomed MHAP aliasing block. The source address of each of these MHAP requests must be aliased to the singlehomed address that belongs to the interface the MHAP request is sent from.

6.3.3. When a type I or II MHAP endpoint receives an MHAP request, it looks up its BGP4+ routing table to find out which interface is the best, as deemed by BGP4+, to send traffic back to the requesting client. The MHAP reply contains the MHAP aliasing block associated with the egress interface used to send the reply back to the MHAP client, as well as up to three other MHAP aliasing blocks (from different interfaces) that match the MHAP address in the MHAP request, in order of their respective metrics. Type III MHAP endpoints use the statically configured preference instead of looking up the BGP4+ table. MHAP requests with a multihomed source address are discarded.

6.3.4.
When an MHAP client or endpoint receives an MHAP reply, it authenticates it to verify that it answers a request the router itself sent (explained in "Security considerations"), and then updates the entry of the aliasing table with the contents of the MHAP reply. In an MHAP client, the BGP4+ metric for each MHAP aliasing block is the same. MHAP clients, since they are not multihomed, are not able to use the load-balancing feature of MHAP and must not alter the order of the aliasing prefixes received in MHAP replies.

6.3.5. MHAP type I and II endpoints in client mode, since they send multiple MHAP requests, will receive multiple MHAP replies. MHAP endpoints must update the MHAP aliasing table by assigning the MHAP preferred aliasing block and MHAP aliasing blocks #2, #3 and #4 in the order of their respective metrics. MHAP type III endpoints use the statically configured preference instead. MHAP endpoints will optionally be able to load balance egress traffic to multihomed destinations.

6.3.6. No other IPv6 NAT

MHAP is an address aliasing mechanism. IPv6 to IPv6 NAT must be avoided at all costs for MHAP traffic (which includes both MHAP datagrams and egress unaliased packets). IPv4 NAT of IPv6-encapsulated packets needs to be studied on a case-by-case basis and could be workable if end-to-end IPv6 connectivity is maintained. In other words, IPv6 encapsulated into IPv4 NATted traffic, if working, could accommodate MHAP traffic as well as regular IPv6 traffic. However, the editor thinks that no kind of NAT should be used in combination with MHAP regardless of whether it can be worked out.

6.3.7. MHAP client and endpoint timers and counters

Two timers are associated with each MHAP prefix in the MHAP aliasing table of MHAP clients and endpoints: MHAP_request_timer and MHAP_refresh_timer. Their purpose and relation to the various timeout values are described below.
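As a rough illustration of the per-prefix state that subsections 6.3.7 to 6.3.10 describe, the following Python sketch models one aliasing table entry. The default values are the ones quoted in 6.3.8 to 6.3.10; the class name, field names, and the tick() helper are hypothetical and are not defined by MHAP. What happens once an incomplete entry expires is left to the caller, since this sketch only reports the expiry.

```python
from dataclasses import dataclass, field
from typing import List

# Defaults quoted from 6.3.8, 6.3.9 and 6.3.10.
MHAP_REQUEST_TIMEOUT_MS = 2000
MHAP_REFRESH_TIMEOUT_S = 120
MHAP_MAXPACKETS_UNALIASED = 5


@dataclass
class AliasingEntry:
    """One aliasing table row (illustrative layout, not the draft's)."""
    mhap_prefix: str                                          # e.g. "2345:1:1::/48"
    aliasing_blocks: List[str] = field(default_factory=list)  # up to four blocks
    request_timer_ms: int = 0   # MHAP_request_timer
    refresh_timer_s: int = 0    # MHAP_refresh_timer
    requests_sent: int = 0      # MHAP_requests_sent

    def is_complete(self) -> bool:
        # A new entry only has its MHAP prefix populated (6.3.7).
        return bool(self.aliasing_blocks)

    def may_send_unaliased(self) -> bool:
        """Check and update MHAP_requests_sent before sending (6.3.8)."""
        if self.requests_sent >= MHAP_MAXPACKETS_UNALIASED:
            return False
        self.requests_sent += 1
        return True

    def tick(self, elapsed_ms: int) -> bool:
        """Advance the request timer; True once an incomplete entry expired."""
        self.request_timer_ms += elapsed_ms
        return (not self.is_complete()
                and self.request_timer_ms >= MHAP_REQUEST_TIMEOUT_MS)
```

With the defaults, an incomplete entry allows five requests or unaliased packets and then refuses further ones until the 2,000 ms timeout elapses.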
The MHAP_request_timer starts when a new entry is added to the MHAP aliasing table (when a new entry is added, only the MHAP prefix field is populated). The MHAP_request_timer is reset each time an MHAP keepalive is received for the matching entry in the MHAP aliasing table. For a description of MHAP keepalives, see "Fault tolerance". The MHAP_refresh_timer is reset each time it expires and triggers another sending of MHAP request(s).

6.3.8. MHAP_request_timeout

The MHAP_request_timeout is the lifetime, expressed in milliseconds, of an incomplete MHAP aliasing table entry (an entry in the aliasing table, triggered by an MHAP request that never got an MHAP reply). The default value is 2,000 (2 s). For the duration of MHAP_request_timeout, only MHAP_maxpackets_unaliased MHAP requests and egress unaliased multihomed packets can be sent for the MHAP prefix matching the incomplete entry in the MHAP aliasing table. The MHAP client checks and updates the value of MHAP_requests_sent before sending an MHAP request or an MHAP unaliased packet. If MHAP_requests_sent reaches the value of MHAP_maxpackets_unaliased, no more MHAP requests or MHAP unaliased packets can be sent until MHAP_request_timeout expires. The MHAP_request_timeout also defines the interval between MHAP keepalives when a valid MHAP table entry is present.

6.3.9. MHAP_refresh_timeout

The MHAP_refresh_timeout is the value, expressed in seconds, that triggers the re-sending of MHAP refresh request(s) for a given entry in the MHAP aliasing table. The default value is 120 (2 min). MHAP refresh requests are identical to MHAP requests except that they are sent directly to the MHAP endpoint.

6.3.10. MHAP_maxpackets_unaliased

MHAP_maxpackets_unaliased is the number of MHAP requests and egress unaliased multihomed packets that can be sent for a given incomplete entry in the MHAP aliasing table. The default is 5.

6.3.11.
MHAP rendezvous point timers and counters

The timer for the MHAP rendezvous point short-term table is fixed to one second.

6.3.12. MHAP_shorterm_blocksize_backbone

MHAP_shorterm_blocksize_backbone is the size of the block in the SH_source_prefix column for an auto-created entry related to the backbone multihomed address space (2345::/16). The default is /16. Acceptable values are 16 to 48. Note that one must be careful setting values such as /48 because doing so has the potential of creating a huge short-term table.

6.3.13. MHAP_shorterm_blocksize_6bone

MHAP_shorterm_blocksize_6bone is the size of the block in the SH_source_prefix column for an auto-created entry related to the 6bone multihomed address space (3FFE:FFFF::/32). The default is /32. Acceptable values are 16 to 48. Note that one must be careful setting values such as /48 because doing so has the potential of creating a huge short-term table.

6.3.14. MHAP_shorterm_flush

MHAP_shorterm_flush is the number of seconds that an unused entry in the short-term table will remain before it is flushed. The COUNT_unused field is reset to MHAP_shorterm_flush each time the entry in the short-term table is matched, and is decremented every second. When it reaches zero, the entry is flushed.

6.3.15. MHAP_maxproxy is the number of non-MHAP multihomed packets that can be proxied, in one second, from and to a matching entry in the short-term table. The default is 100. Note that this value affects multihomed traffic only, NOT MHAP requests. The PROXY_MHAP_packets field is incremented each time the entry in the short-term table is matched, and is reset every second.
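The two per-entry counters described in 6.3.14 and 6.3.15 can be sketched as follows for a dynamic short-term table entry. This is a minimal Python sketch under stated assumptions: the class and method names are hypothetical, the draft gives no default for MHAP_shorterm_flush (so it is a required argument here), and the sketch assumes that a packet arriving after the per-second limit is reached is refused without incrementing PROXY_MHAP_packets further. Static entries, which never time out, are not modeled.

```python
MHAP_MAXPROXY = 100  # default from 6.3.15


class ShortTermEntry:
    """Dynamic short-term table entry (illustrative, not from the draft)."""

    def __init__(self, flush_seconds: int) -> None:
        # flush_seconds plays the role of MHAP_shorterm_flush; the draft
        # states no default, so the caller must choose one.
        self.flush_seconds = flush_seconds
        self.count_unused = flush_seconds   # COUNT_unused
        self.proxy_mhap_packets = 0         # PROXY_MHAP_packets

    def match(self) -> bool:
        """A non-MHAP multihomed packet matched; True if it may be proxied."""
        self.count_unused = self.flush_seconds  # entry is in use again
        if self.proxy_mhap_packets >= MHAP_MAXPROXY:
            return False                        # per-second limit reached
        self.proxy_mhap_packets += 1
        return True

    def tick_second(self) -> bool:
        """One-second timer; True when the entry should be flushed."""
        self.proxy_mhap_packets = 0
        self.count_unused -= 1
        return self.count_unused <= 0
```

A rendezvous point would call match() on each proxied packet and tick_second() from its fixed one-second timer (6.3.11), removing the entry when tick_second() returns True.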
Since the multihomed packets that are to be MHAP-proxied are no different from the ones that are to be non-MHAP proxied, a compromise to allow MHAP-proxied packets is to count the number of MHAP requests against the number of proxied packets, by decrementing the PROXY_MHAP_packets field in the matching short-term table entry. See "Compromises" for more details.

Setting this value to a very large number will effectively allow the rendezvous point to act as an MHAP client on behalf of other routers ("non-MHAP proxying") by allowing unconfigured MHAP clients or non-MHAP routers to send multihomed traffic that would be aliased and sent to the appropriate MHAP endpoint. That would effectively transform the rendezvous point into an MHAP transparent proxy for all multihomed traffic. This approach could be used, with care, to facilitate the initial deployment of MHAP. Changing the value of MHAP_maxproxy must be carefully thought through, however, because it will place a very high load on the MHAP rendezvous point. If one wants to enable proxying for a specific network, a static entry in the short-term table is preferred.

6.3.16. Static entries in the short-term table

Static entries can be configured in the short-term table in order to enable the rendezvous point to act as a proxy for specific networks or to deny any multihomed proxying for a specific network. Static entries are assigned the value -1 for the COUNT_unused field, and never time out.

6.3.17. Short-term table lookup

Static entries in the short-term table are checked first, in the order they were configured. When a match occurs, it is processed, and the lookup process stops.

   - If STATIC_maxpackets is configured to 0, all multihomed proxying from / to the configured prefixes will be denied.
   - If STATIC_maxpackets is configured to 1, MHAP_maxproxy will be used instead, as in a dynamic entry.
   - If STATIC_maxpackets is configured to 2, the PROXY_MHAP_packets field will not be updated and will not be checked against MHAP_maxproxy.
   - If STATIC_maxpackets is configured to any other number, the PROXY_MHAP_packets field will be incremented and checked against STATIC_maxpackets.

Static entries in the short-term table can have different prefix sizes than the size defined by MHAP_shorterm_blocksize. If ASSOC_MHAP_prefix is configured to zero, any multihomed traffic will match the entry.

Dynamic entries are checked next. Since dynamic entries are all of the same size, the order in which they are checked does not matter.

6.3.18. Examples of static short-term table entries:

Note that PROXY_MHAP_packets is incremented/reset by the router itself and COUNT_unused is always -1 for a static entry.

   +-+------------------+-------------------+-------------------+
   |#| SH_source_prefix | ASSOC_MHAP_prefix | STATIC_maxpackets |
   +-+------------------+-------------------+-------------------+
   |1| ::/0             | 0:0:0             | 0                 |
   +-+------------------+-------------------+-------------------+
   |2| 2541:3672::/32   | 0:0:0             | 1                 |
   +-+------------------+-------------------+-------------------+
   |3| ::/0             | 3FFE:FFFF:1234    | 2                 |
   +-+------------------+-------------------+-------------------+
   |4| ::/0             | 0:0:0             | 2                 |
   +-+------------------+-------------------+-------------------+

   #1: Deny any multihomed traffic proxying at all. This is not recommended since it will deny legitimate MHAP proxying as well.
   #2: Allow non-MHAP multihomed traffic from 2541:3672::/32 to any multihomed destination to be proxied up to the limit of MHAP_maxproxy packets per second. This limits the proxying rate from that prefix. With the default configuration, the proxying rate would have been MHAP_maxproxy packets per second per MHAP prefix.
   #3: Allow unlimited proxying to the MHAP prefix 3FFE:FFFF:1234::/48 regardless of the source.
   #4: Allow unlimited proxying.
       Note that such an entry is preferred to setting MHAP_maxproxy to a high value because it saves the rendezvous point the work of creating and checking the short-term table.

6.4 Compromises

The design of MHAP has required some compromises outlined below:

6.4.1. Sub-optimal path

The path of MHAP requests, of egress unaliased multihomed packets, and of non-MHAP proxied multihomed packets is not optimal. All of these will reach the closest MHAP rendezvous point, where they will be aliased or proxied and then sent to the appropriate MHAP endpoint. The number, location, bandwidth, and performance of MHAP rendezvous points can greatly affect multihomed performance. In the case of geo PI, it is predictable that the path of multihomed packets will not always be optimal.

6.4.2. Latency

The latency of the very first MHAP packets destined to a yet unresolved MHAP aliasing block is higher than normal because these packets need to transit the MHAP rendezvous point (except for geo PI).

6.4.3. Sub-optimal allocation of address space

For each multihomed prefix, if a given site is multihomed to n different TLAs / pTLAs, n MHAP aliasing blocks of the same size as the MHAP prefix are wasted by the MHAP process.

6.4.4. Proxying

An MHAP rendezvous point performs two types of proxying:

   - MHAP-proxying, which allows the very first multihomed packets from an MHAP client to be sent before the arrival of the MHAP reply that would allow that client to build the MHAP aliasing table.
   - Non-MHAP proxying, which allows easier deployment of MHAP by enabling any host or router that is not MHAP enabled to access multihomed addresses.

Proxying in MHAP rendezvous points is not a stateful operation and does not distinguish between the two types of proxying.
The approximation made by decrementing the number of proxied packets for each MHAP request received is statistically correct but would not prevent a flood of non-MHAP multihomed packets from a given prefix from maxing out the proxying limits and therefore denying MHAP proxying. The editor thinks that:

   a) A setup with only MHAP endpoints and MHAP rendezvous points (all proxying, no clients) is not a viable solution (sub-optimal paths, high latency, does not scale on the rendezvous point side). MHAP rendezvous points have not been designed to be aggregators of the multihomed traffic.
   b) Therefore, the amount of non-MHAP proxying must be kept under control, and the static entries in the rendezvous point short-term table will enable TLAs and pTLAs to set deadlines for their downstreams to make their networks MHAP compliant.

6.4.5. Scalability

MHAP does not completely solve the scalability issue. It improves it, first by splitting the routing table into two independent parts and then by allowing the DFZ to be strongly summarized.

6.5.1 Flowchart abbreviations

   SA is MHd?: Is the IPv6 source address multihomed?
   MHAP EndP?: Is this router an MHAP endpoint?
   MHAP RVPt?: Is this router an MHAP rendezvous point?
   MH Pref in RT: Is the MHAP prefix present in the MHAP routing table?
   MHAP Pckt?: Is this packet an MHAP datagram?
   MHAP Req?: Is this datagram an MHAP request?
   MHAP Repl?: Is this datagram an MHAP reply?
   Auth Pass?: Does this datagram pass a valid security check?
   MH Pref in TT?: Is the MHAP prefix in the MHAP aliasing table?
   MH TrBl in TT?: Is there at least one aliasing block in the corresponding MHAP prefix entry in the aliasing table?
   PX MAX Pkts?: Has the maximum number of proxied packets for the first source prefix/destination prefix match in the short-term table been reached?
   DA Mhomed, EndP,Clt: The destination address is multihomed and this router is an MHAP client or an MHAP endpoint.
   DA Mhomed, RV point: The destination address is multihomed and this router is an MHAP rendezvous point.
   DA Shomed, EndP,Clt: The destination address is singlehomed and this router is an MHAP client or an MHAP endpoint.
   DA Shomed, RV point: The destination address is singlehomed and this router is an MHAP rendezvous point.

6.5.2. Flowchart notes

The flowchart is a logical, high-level conceptual model of the way MHAP-related data flows inside an MHAP-enabled router. It has not been designed to closely map the actual configuration tasks. For example, the flowchart discards singlehomed traffic that hits a rendezvous point. In reality, no check is to be performed to decide if the router is a rendezvous point; an ingress traffic filter to discard singlehomed traffic, configured only on rendezvous points, would be simpler.

6.5.3 MHAP Flowchart

[Flowchart: an ingress packet passes through the decisions abbreviated in 6.5.1 and ends in one of the following actions: look up the routing table (types I and II) or the preference (type III) and populate the MHAP reply with the matching aliasing block and up to three other associated aliasing blocks; alias the destination address and forward; update the aliasing table; process a keepalive; send MHAP request(s), with endpoints sending multiple requests whose source address is aliased to a singlehomed address; send unaliased to the best rendezvous point; proceed with the packet flow; or discard (bit bucket).]

7. Fault tolerance

The purpose of MHAP is to alias multihomed traffic (to a multihomed prefix) to singlehomed traffic (to an MHAP aliasing block). If the preferred aliasing block (MHAP_TB_1) associated with the MHAP prefix becomes unavailable, MHAP should be able to recover, preferably fast enough for upper layer connections not to time out.

MHAP clients and endpoints send at periodic intervals (defined by MHAP_request_timeout) a keepalive datagram to each MHAP_TB_1. If the keepalive fails to return, traffic is immediately failed over to the aliasing block defined in MHAP_TB_2. If the keepalive fails to return three times in a row, a new MHAP request (which can be multiple in the case of an MHAP endpoint) is sent, and traffic to the affected multihomed prefix continues to be sent to MHAP_TB_2 in the meantime.
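The keepalive rule above can be sketched as a small state holder. This is a minimal Python sketch, not an implementation from the draft: the class and method names are hypothetical, and the behavior when a keepalive returns again after a miss (failing traffic back to MHAP_TB_1) is an assumption, since the text does not specify it.

```python
class KeepaliveFailover:
    """Tracks keepalive results for one MHAP prefix (illustrative only)."""

    def __init__(self, tb1: str, tb2: str) -> None:
        self.tb1 = tb1        # preferred aliasing block, MHAP_TB_1
        self.tb2 = tb2        # second aliasing block, MHAP_TB_2
        self.active = tb1     # block currently used for aliasing
        self.missed = 0       # consecutive keepalives that did not return

    def on_keepalive(self, returned: bool) -> bool:
        """Record one keepalive result; True means send new MHAP request(s)."""
        if returned:
            self.missed = 0
            self.active = self.tb1   # assumption: fail back once TB_1 answers
            return False
        self.missed += 1
        self.active = self.tb2       # fail over immediately on the first miss
        return self.missed == 3      # third miss in a row triggers a request
```

A caller would invoke on_keepalive() once per MHAP_request_timeout interval and keep aliasing to the block in `active` while any new MHAP request is outstanding.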
The bandwidth cost of keepalives is reasonable: they are sent only when there is other traffic to a specific multihomed prefix.

MHAP multihomed clients and MHAP endpoints also check the validity of the route (NLRI) of each MHAP_TB_1 at the interval defined by MHAP_request_timeout. If no valid route is found, they immediately (without waiting for the keepalive, which they cannot send if there is no route) fail over the traffic to MHAP_TB_2 and send a new MHAP request (which can be multiple in the case of an MHAP endpoint).

To ensure the validity and availability of MHAP_TB_2 when needed, MHAP clients and endpoints send at periodic intervals (defined by MHAP_refresh_timeout) an MHAP REFRESH request (which can be multiple in the case of an MHAP endpoint) to each MHAP_TB_2. MHAP refresh requests are identical to MHAP requests except that they are sent to an MHAP aliasing block directly and do not transit the MHAP rendezvous point.

8. Load balancing

MHAP can use two different types of load balancing:

   - Network load balancing, which is available to any network traffic. Since traffic aliased by MHAP is no different from any other traffic, the only requirement of network load balancing is making sure that the MHAP aliasing process occurs before the load balancing process.
   - MHAP load balancing. This future enhancement of MHAP, which leverages the fact that the MHAP aliasing table knows up to four MHAP aliasing blocks for each MHAP prefix, will be described in a later revision of this document.

9. Application compatibility

End-to-end traffic is unaware of MHAP. The MHAP client aliases the multihomed destination address into a singlehomed address, and the MHAP endpoint aliases it back to the same multihomed address that was sent by the originating end device. There is no way for an end device to know that the multihomed traffic it sends or receives has been or will be aliased twice.
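The double aliasing described above amounts to swapping the upper 48 bits of the destination address twice. The Python sketch below shows the round trip using the standard ipaddress module; the function name is hypothetical, the multihomed prefix is taken from the fictitious allocations in 12.1, and the aliasing block 2001:db8:aaaa::/48 is an assumed documentation prefix, not an address from this document.

```python
import ipaddress


def alias(address: str, to_prefix: str) -> str:
    """Replace the prefix bits of address with to_prefix, keeping the rest."""
    net = ipaddress.IPv6Network(to_prefix)
    low_bits = int(ipaddress.IPv6Address(address)) & int(net.hostmask)
    return str(ipaddress.IPv6Address(int(net.network_address) | low_bits))


# Hypothetical values: a host inside multihomed prefix 2345:1:1::/48 and
# one of the site's singlehomed aliasing blocks.
multihomed_dst = "2345:1:1:20::5"
on_the_wire = alias(multihomed_dst, "2001:db8:aaaa::/48")  # done by the client
delivered = alias(on_the_wire, "2345:1:1::/48")            # done by the endpoint
```

Because only the prefix bits change and the remaining 80 bits are preserved, the address delivered by the endpoint equals the one the originating device sent.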
Therefore, MHAP is transparent to the upper layers and should not require any modifications of applications or network components at the Transport level and above. The only two situations that have been identified so far as requiring special handling of MHAP are a) a firewall performing stateful packet inspection and/or dynamic stateful access filtering in case of asymmetric traffic and b) ICMP unreachable or related messages that would need to be reverse-aliased to reach the original host.

10. Security considerations

By modifying the MHAP aliasing table, MHAP reply datagrams can alter the destination of traffic. With such potential for abuse, MHAP clients and endpoints must not process any MHAP reply datagram that is not a reply to a request they sent. Each MHAP request, refresh request, and keepalive request contains a unique 64-bit random number, MHAP_key. The algorithm used to generate the key is left to each vendor as long as the key is unique and the sequence is unpredictable, even when the router boots.

BGP4+ peering between MHAP rendezvous points and other routers might bring security issues. These issues are not specific to MHAP and should be addressed the same way they are addressed for regular BGP4+ peering.

11. IANA Considerations

Since there is no MHAP running code at the time of writing, this document IMAGINES that the IANA has reserved the following:

   - UDP port number 7777.
   - 2345::/16: production MHAP block, /48 blocks to be allocated by various registry authorities.
   - 2346::/16: production geo PI block, /48 blocks to be allocated by the local aggregation authority.
   - 3FFE:FFFF::/32: 6bone MHAP block, /48 blocks to be allocated by the 6bone.
   - 3FFE:FD00::/24: 6bone geo PI block, /48 blocks to be allocated by the local aggregation authority or by the ipv6mh webmaster.

12. Registry considerations

This document does not intend to define any policy about IPv6 centralized multihomed address allocation.
The editor thinks that MHAP block allocation policy should be a separate document. The following addresses, besides being fictitious, merely provide a possible, not even suggested, allocation scheme.

However, geo PI addresses are to be administered by local aggregation authorities and are to follow the guidelines below. The body that will administer the allocation of aggregation areas is to be determined.

12.1 Centralized multihomed addresses:

2345::/16           production
2345:0::/32         Reserved
2345:1::/32         ARIN
2345:1:1::/48       American company
2345:2::/32         RIPE
2345:2:1::/48       European company
2345:3::/32         APNIC
2345:3:1::/48       Asian company

3FFE:FFFF::/32      6bone
3FFE:FFFF:0::/48    Reserved
3FFE:FFFF:1::/48    The first to provide MHAP running code
3FFE:FFFF:2::/48    The second to provide MHAP running code

12.2 Geographically aggregatable multihomed addresses: 2346::/16 and 3FFE:FD00::/24

2346:0000::/32    3FFE:FD00:0000::/36    Reserved
2346:0001::/32    3FFE:FD00:1000::/36    Sacramento, CA, US
2346:0002::/32    3FFE:FD00:2000::/36    Quebec, QC, CA
2346:0003::/32    3FFE:FD00:3000::/36    London, GB, UK
2346:0004::/32    3FFE:FD00:4000::/36    Paris, FR
2346:0005::/32    3FFE:FD00:5000::/36    Honolulu, HI, US
2346:0006::/32    3FFE:FD00:6000::/36    San-Francisco, CA, US
2346:0007::/32    3FFE:FD00:7000::/36    SF East Bay, CA, US
2346:0008::/32    3FFE:FD00:8000::/36    Silicon Valley, CA, US
2346:0009::/32    3FFE:FD00:9000::/36    Antarctic continent
2346:000A::/32    3FFE:FD00:A000::/36    Madrid, ES
2346:000B::/32    3FFE:FD00:B000::/36    Alpha space station
2346:000C::/32    3FFE:FD00:C000::/36    Dublin, IE
2346:000D::/32    3FFE:FD00:D000::/36    Leeds, GB, UK
2346:000E::/32    3FFE:FD00:E000::/36    Amsterdam, NL
2346:000F::/32    3FFE:FD00:F000::/36    Frankfurt, DE
2346:0010::/32    3FFE:FD01:0000::/36    Stockholm, SE
2346:0011::/32    3FFE:FD01:1000::/36    Ottawa, ON, CA
.
2346:0FFF::/32    3FFE:FDFF:F000::/36    Last 6bone geo PI area
.
2346:FFFF::/32                           Allocate another /16

12.3 Local allocations:

2346:0001::/32         3FFE:FD00:1000::/36    Sacramento, CA, USA
2346:0001:0000::/48    3FFE:FD00:1000::/48    Reserved for links
2346:0001:0001::/48    3FFE:FD00:1001::/48    Customer #1
2346:0001:0002::/48    3FFE:FD00:1002::/48    Customer #2
2346:0001:0003::/48    3FFE:FD00:1003::/48    Customer #3
.
2346:0001:0FFF::/48    3FFE:FD00:1FFF::/48    Last 6bone geo block
.
2346:0001:FFFF::/48                           Allocate another /32

2346:000C::/32         3FFE:FD00:C000::/36    Dublin, IE
2346:000C:0001::/48    3FFE:FD00:C001::/48    alphyra.ie

13. Datagram structure

13.1 MHAP datagram structure.

All MHAP datagrams begin with the same structure and are carried over UDP port 7777.

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version| Traffic Class |           Flow Label                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Payload Length        |  Next Header  |   Hop Limit   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                                                               +
|                                                               |
+                         Source Address                        +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                                                               +
|                                                               |
+                      Destination Address                      +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        UDP Source port        |     UDP Destination port      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          UDP Length           |         UDP Checksum          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MHAP version  |   MHAP type   |                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+          MHAP prefix          +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                            MHAP key                           +
|                                                               |
+-+-+-+-+-+-+-+- == end of keepalive datagrams == +-+-+-+-+-+-+-+
|                                                               |
+           MHAP_TB1            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               |                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+           MHAP_TB2            |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+           MHAP_TB3            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               |                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+           MHAP_TB4            |
|                                                               |
+-+-+-+-+-+- == end of request or reply datagrams == -+-+-+-+-+-+

Version                4-bit Internet Protocol version number = 6.
Payload Length         48 for requests and replies, 24 for keepalive requests and replies.
Next Header            8-bit unsigned integer. 17: UDP
UDP destination port   7777: MHAP
UDP length             48 for requests and replies, 24 for keepalive requests and replies. The UDP length includes the 8-byte UDP header; the MHAP data itself is 40 bytes for requests and replies and 16 bytes for keepalive requests and replies.
MHAP version           1
MHAP type              1 = request
                       2 = reply
                       3 = keepalive request
                       4 = keepalive reply

13.2 Packets or datagrams that have been aliased by MHAP are no different from the original packet or datagram except for the destination address. In fact, packets or datagrams that have been aliased twice (to singlehomed by the MHAP client and back to multihomed by the MHAP endpoint) should be identical to the original.

14. Topology

14.1 Traffic examples:

The general principle of traffic flow is as follows:

14.1.1. Rendezvous points, aggregators and endpoints exchange multihomed routes.
14.1.2. Source sends a packet to a multihomed address.
14.1.3. Client buffers the packet and sends a request to the same destination as the packet.
14.1.4. Client forwards the unchanged packet.
14.1.5. Rendezvous point aliases the request and proxies the packet to the destination endpoint.
14.1.6. Destination endpoint receives the request and informs the client how to do the aliasing from now on.
14.1.7. Destination endpoint receives the first data packet, de-aliases it and sends it to the destination host.
14.1.8. Client now aliases subsequent packets and sends them directly to the destination.
14.1.9. Session (on a per-prefix basis) ends.
14.1.10. Client state times out.

The following example details the flow of the traffic and the aliasing modifications.
- Host X off the singlehomed subnet of R18 has an IP address of 3653:1:2::2.
- ASN 65010 has been allocated the SH block 2134:5:6::/48 by ISP A.
- ASN 65010 has been allocated the SH block 3653:3:4::/48 by ISP B.
- ASN 65010 has been allocated the MH block 2345:9:8::/48 by ARIN.
- Host Y off the dual-homed subnet of R16 and R17 has an IP address of 2345:9:8::2.
- Host X wants to talk to host Y.
- Host X sends packet #1 to R18. SA=3653:1:2::2 DA=2345:9:8::2
- The best route R18 has for 2345:9:8::2 is 2345::/16 to R10 via R8 via R13.
- R18 does not have an entry for 2345:9:8::/48 in the MHAP aliasing table.
- R18 sends an MHAP request to R10 via R8. SA=3653:1:2::2 DA=2345:9:8::2
- R18 sends unaliased egress packet #1 to R10 via R8 via R13. SA=3653:1:2::2 DA=2345:9:8::2
- R10 has a full copy of the MHAP table and aliases both the MHAP request and packet #1.
- Depending on how hot the potato is, R10 determines that the best path to 2345:9:8::/48 (same prefix length, shorter AS path) is R17 via R13 via R8.
- R10 sends (proxies) both the MHAP request and packet #1 to R17 via R13 via R8. SA=3653:1:2::2 DA=3653:3:4::2
- R17 receives the MHAP request from R18 and answers with two SH prefixes: 3653:3:4::/48 (preferred, because that is the best route to R18) and 2134:5:6::/48. R17 knows where to send the MHAP reply from the SA, which has not changed. Note that this is very good in terms of traffic taking the same path both ways, because it will cause R18 to send the next packets to the SH address of ISP B.
- R17 receives packet #1. Since R17 knows by static configuration that MHAP block 2345:9:8::/48 is associated with SH block 3653:3:4::/48 (the DA), R17 unaliases packet #1 back to SA=3653:1:2::2 DA=2345:9:8::2
- R17 sends packet #1 to host Y. As far as host Y is concerned, packet #1 is the same as it was sent by host X (save the extra latency). Host Y can return traffic to host X.
- R18 receives the MHAP reply from R17 and builds an entry in the MHAP aliasing table.
- Host X sends packet #2 to host Y. SA=3653:1:2::2 DA=2345:9:8::2
- R18 now has an entry for 2345:9:8::/48 and aliases packet #2.
- R18 sends packet #2 to R17. SA=3653:1:2::2 DA=3653:3:4::2
- R17 receives packet #2. Since R17 knows by static configuration that MHAP block 2345:9:8::/48 is associated with SH block 3653:3:4::/48 (the DA), R17 unaliases packet #2 back to SA=3653:1:2::2 DA=2345:9:8::2
- R17 sends packet #2 to host Y. As far as host Y is concerned, packet #2 is the same as it was sent by host X. There is no extra latency.

14.2 Topology notes

14.2.1 The number of MHAP clients must be strictly controlled. For singlehomed customers, there is no advantage in having MHAP clients all over their network.

14.2.2 Customers connected with a single physical link must not have any MHAP clients and must send unaliased traffic to their transit provider.

14.2.3 Tier-2 transit providers are required to provide MHAP client services to their customers connected with a single physical link and are encouraged to provide MHAP client services to all of their customers using multihomed MHAP clients.
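As an illustration of the aliasing steps in the traffic example of 14.1, the following Python sketch swaps the leading 48 bits of a destination address and leaves the remaining 80 bits untouched. It is only a sketch under stated assumptions: the table layouts and the names swap_block, block_of, client_alias, endpoint_dealias, CLIENT_TABLE and ENDPOINT_TABLE are hypothetical and are not defined by this draft. The addresses are those of the example.

```python
import ipaddress

BLOCK_LEN = 48  # every MH and SH block in the 14.1 example is a /48


def swap_block(addr, new_block):
    """Return addr with its leading 48 bits replaced by those of new_block."""
    low_mask = (1 << (128 - BLOCK_LEN)) - 1
    low_bits = int(addr) & low_mask
    return ipaddress.IPv6Address(int(new_block.network_address) | low_bits)


def block_of(addr):
    """Return the /48 block containing addr (the fixed-size lookup key)."""
    return ipaddress.IPv6Network((addr, BLOCK_LEN), strict=False)


# Hypothetical aliasing table on client R18, as built from R17's MHAP reply:
# MH block -> SH blocks, preferred block first.
CLIENT_TABLE = {
    ipaddress.IPv6Network("2345:9:8::/48"): [
        ipaddress.IPv6Network("3653:3:4::/48"),
        ipaddress.IPv6Network("2134:5:6::/48"),
    ],
}

# Hypothetical static configuration on endpoint R17: SH block -> MH block.
ENDPOINT_TABLE = {
    ipaddress.IPv6Network("3653:3:4::/48"): ipaddress.IPv6Network("2345:9:8::/48"),
}


def client_alias(dst):
    """Alias a multihomed destination, or return None if no entry exists yet."""
    dst = ipaddress.IPv6Address(dst)
    sh_blocks = CLIENT_TABLE.get(block_of(dst))
    if not sh_blocks:
        # No entry: the client would send an MHAP request and forward unaliased.
        return None
    return swap_block(dst, sh_blocks[0])


def endpoint_dealias(dst):
    """Restore the multihomed destination; leave other traffic unchanged."""
    dst = ipaddress.IPv6Address(dst)
    mh_block = ENDPOINT_TABLE.get(block_of(dst))
    return dst if mh_block is None else swap_block(dst, mh_block)


aliased = client_alias("2345:9:8::2")    # what R18 does to packet #2
restored = endpoint_dealias(aliased)     # what R17 does on receipt
print(aliased, restored)
```

With the example's addresses, the destination of packet #2 becomes 3653:3:4::2 on R18 and is restored to 2345:9:8::2 on R17, which matches the SA/DA values listed in 14.1. Keying both tables on a fixed /48 block keeps the lookup to a single dictionary access, in the spirit of the fixed-size prefix lookup of 5.4.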
14.3 Sample topology diagram

[ASCII diagram not recoverable from this copy. The surviving labels show ISP A (2134::/16), ISP B (3653::/16), ASN 65001, ASN 65002 and ASN 65010; routers numbered up to 18 and labelled as endpoints (End), clients (Client) and rendezvous points (RV Point); multihomed (Mhomed) and singlehomed (Shomed) subnets; and hosts X and Y.]

14.4 Route distribution

There are different types of routers involved in an MHAP topology:

- Singlehomed MHAP client
- Multihomed MHAP client
- MHAP Type I endpoint
- MHAP Type II endpoint
- MHAP Type III endpoint
- MHAP aggregator
- MHAP rendezvous point
- Transit router

Each router contains singlehomed routes, multihomed routes, or both, as described below.

14.4.1 Singlehomed MHAP client

Singlehomed routes: default route.
Multihomed routes: default route.

14.4.2 Multihomed MHAP client

Singlehomed routes: no change.
Multihomed routes:
- MHAP blocks BGP aggregates and geo PI BGP aggregates from whoever, starting at the closest rendezvous point.
- Individual BGP MHAP prefixes or BGP geo PI prefixes from non-transit peers (not to be re-advertised to anybody).

14.4.3 MHAP Type I endpoint

Singlehomed routes: no change.
Multihomed routes:
- MHAP blocks BGP aggregates and geo PI BGP aggregates from the rendezvous points. Type I endpoints peer with rendezvous points.
- Directly connected MHAP prefix to be advertised to the rendezvous points.
- Individual BGP MHAP prefixes or BGP geo PI prefixes from non-transit peers (not to be re-advertised to anybody).

14.4.4 MHAP Type II endpoint

Singlehomed routes: no change.
Multihomed routes:
- MHAP blocks BGP aggregates and geo PI BGP aggregates from the aggregators. Type II endpoints peer with aggregators.
- Directly connected geo prefix to be advertised to the aggregators.
- Individual BGP geo PI prefixes from the MHAP area concerned.
- Individual BGP MHAP prefixes or BGP geo PI prefixes from non-transit peers (not to be re-advertised to anybody).

14.4.5 MHAP Type III endpoint

Singlehomed routes: no change.
Multihomed routes:
- MHAP blocks RIP aggregates and geo PI RIP aggregates from the aggregators. Type III endpoints exchange RIP routes with aggregators.
- Directly connected geo prefix to be announced to the aggregators.
- Individual RIP geo PI prefixes from the MHAP area concerned.

14.4.6 MHAP aggregator

Singlehomed routes: no change.
Multihomed routes:
- MHAP blocks BGP aggregates and geo PI BGP aggregates from the rendezvous points. Aggregators peer with rendezvous points.
- Individual BGP geo PI prefixes from Type II endpoints for the area.
- Individual RIP geo PI prefixes from Type III endpoints for the area.
- Individual BGP MHAP prefixes or BGP geo PI prefixes from non-transit peers (not to be re-advertised to anybody).

14.4.7 MHAP rendezvous point

Singlehomed routes:
- IGP or static only.
Multihomed routes: Rendezvous points peer with other rendezvous points, aggregators and Type I endpoints.
- Geo PI BGP aggregates from the aggregators.
- Individual BGP MHAP prefixes from other rendezvous points.
- Individual BGP MHAP prefixes from Type I endpoints.
14.4.8 Transit router

Singlehomed routes: no change.
Multihomed routes:
- MHAP blocks BGP aggregates and geo PI BGP aggregates from whoever, starting at the closest rendezvous point.

15. Statement of direction

- The editor thinks that there is no good, viable, universal, long-term IPv6 multihoming solution at the time of writing. MHAP addresses only one of the three address spaces of IPv6 multihoming (router, host, mobile).
- There is some danger in people requesting pTLAs/subTLAs for the sole purpose of having their prefix advertisable in the DFZ.
- To some extent, deployment of IPv6 has been or will be delayed by the lack of a solid IPv6 multihoming solution.

The motivations behind the design of MHAP are:

- A good waiting solution: Between the increased scalability provided by MHAP and the natural increase in router processing power and memory, MHAP will keep the IPv6 DFZ cleanly summarized until the perfect multihoming solution is invented.
- A proven mechanism: MHAP uses BGP4+, which is the only proven mechanism. MHAP is an evolution in the tradition of incremental changes rather than a revolution.
- A consensus builder: A middle solution, between multihoming the same way it is done in IPv4 and solutions that would be considered too radical by many.

16. Revision History

xx xx,2002:
- Presentation changes.
- Added 6.1.24.
- Replaced "regions" with "areas".
- Added type IV endpoints

March 28, 2002: renamed to MHAP version -01a
- Changed the name and related text.
- Added clarifications inspired by mailing list questions and private meetings at IETF-53.

February 28, 2002: version -02a
- Submitted as working document of ipv6mh.
- Editorial changes.
- Added traffic examples.
- Added definition and use of geo PI.

November 21, 2001:
- Submitted as http://search.ietf.org/internet-drafts/draft-py-multi6-mhtp-01.txt.
- Editorial changes.
- Changes regarding provider-independent motivations.
- Added: 18. Compliance with the requirements and 19. Full Copyright Statement.
- Moved 18. References to 20. and 19. Editor's address to 21.

August 20, 2001: version -01b
- Not submitted. Text available as: http://arneill-py.sacramento.ca.us/draft-py-multi6-mhtp-01.txt
- Minor editorial changes.
- Added: 16. Revision history and 17. Acknowledgements.
- Moved: 16. References and 17. Editor's address to: 18. References and 19. Editor's address

August 14, 2001: version -01a
- Not submitted. Text available as: http://arneill-py.sacramento.ca.us/draft-py-multi6-mhtp-01.txt
- Minor editorial changes.
- Added: 14. Topology and 15. Statement of direction.
- Moved: 14. References and 15. Editor's address to: 16. References and 17. Editor's address

August 6, 2001: version -00
- Original submission to the IETF as: http://search.ietf.org/internet-drafts/draft-py-multi6-mhtp-00.txt

17. Acknowledgements

- Pekka Savola for reviewing the draft and bringing up ICMP issues.
- Iljitsch Van Beijnum for the routing insights and his questions on the mailing list.
- The ipv6mh mailing list for its questions and feedback.

18. Compliance with the requirements

This chapter details the compliance of this document with [11. B. Black, V. Gill, J. Abley, "Requirements for IP Multihoming Architectures", work in progress, http://www.ietf.org/internet-drafts/draft-ietf-multi6-multihoming-requirements-02.txt, November 2001.] The section numbers below are those of that document.

3.1.1 Redundancy

MHAP does not present significant changes in terms of redundancy compared with the currently implemented IPv4 solution.

3.1.2 Load sharing

MHAP does not present significant changes in terms of load sharing compared with the currently implemented IPv4 solution.

3.1.3 Performance

MHAP does not present significant changes in terms of performance compared with the currently implemented IPv4 solution.
3.1.4 Policy

MHAP is simply added to the list of modules that look at policy (i.e. a local config table) as part of the routing process. That would be "process with packet flow" in 6.5.1 [MHAP Flowchart].

3.1.5 Simplicity

MHAP will be slightly more complex to implement than the current IPv4 solution because there will be a few more routers (the MHAP rendezvous points) to configure. This is more than offset by the fact that MHAP endpoints will not require a full MHAP table and will be simpler to configure. Overall, MHAP is not substantially more complex than current multihoming practices.

3.1.6 Transport-layer Survivability

MHAP provides a significant improvement in transport-layer survivability through the use of keepalives that are sent by the MHAP router, which is much closer to the host than in the current solution.

3.2.1 Scalability

- Scalability on most routers is improved by two orders of magnitude (100 times). The size of the routing table will be divided by 100. With a table of more than 120,000 routes at the time of writing, it is reasonable to assume that the same table, if summarized correctly, would hold 1,200 routes or fewer, which is two orders of magnitude smaller. MHAP makes possible an "8K routing table", which would be the maximum size when all 8,192 possible TLAs have been allocated.

- Scalability on MHAP rendezvous points is improved by three orders of magnitude:

   a) The size of the MHAP table shall be a hundredth of the existing public routing table. This number (1,200) is a tenth of the number of allocated ASNs at the time of writing (12,000), which dictates the number of multihomed sites. It is assumed here that no more than 10 percent of multihomed sites will use an MHAP prefix and the rest will use a geo PI prefix (that is aggregatable and therefore has little influence over the size of the routing table).
   b) MHAP rendezvous points do not process as much traffic (they process only the very first packets of a given session). Combined with the fact that the MHAP routing table has a fixed size, it is reasonable to assume that an MHAP router could service ten times more prefixes than a regular router.

MHAP will provide a solution that is at least two orders of magnitude, or about 100 times, more scalable than the current solution. That is 1,000,000 multihomed IPv6 sites with currently available hardware.

3.2.2 Impact on Routers

- Changes to the routers require an MHAP implementation. The main component of MHAP (BGP4+) already exists and requires only optimization (which could be delayed to accelerate deployment). The other components (NAT/lookup) are based on well-known technologies and should not require unreasonable development efforts.

Not all routers need to be changed, only clients, endpoints and rendezvous points. End customers and tier-2 transit providers that are not multihomed do not require MHAP-capable routers. Long-term scalability will require tier-2 transit providers to configure MHAP; however, the minimum requirements for MHAP initial deployment are two MHAP-capable routers for each TLA or pTLA.

Not only does MHAP not prevent single-homed operations, it also provides access to compliant multi-homed networks from unmodified single-homed networks.

3.2.3 Impact on Hosts

MHAP does not require any host modifications.

3.2.4 Interaction between Hosts and the Routing system

MHAP does not require any specific host-to-router communications.

3.2.5 Operations and management

Like every new protocol, MHAP will require monitoring commands and SNMP OIDs, among other things. At the current stage of development, there is no compelling reason to think that reasonable monitoring tools would not be developed as part of a vendor's implementation.

3.2.6 Cooperation between Transit Providers

MHAP does not require site-specific cooperation between transit providers.
MHAP requires area-specific cooperation.

4. Security

Refer to 10. Security considerations.

19. Full Copyright Statement

Copyright (C) Michel Py (2002). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice.

This document and the information contained herein is provided on an "AS IS" basis and THE AUTHOR DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

20. References

[RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", RFC 2119, Harvard University, March 1997.

[ADDRARCH] Deering, S. and R. Hinden, "IP Version 6 Addressing Architecture", RFC 2373, July 1998.

[RFC 2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, December 1998.

4. Carpenter, B., "Architectural Principles of the Internet", RFC 1958, June 1996.

5. Egevang, K. and Francis, P., "The IP Network Address Translator (NAT)", RFC 1631, May 1994.

6. T. Bates, R. Chandra, E. Chen, "BGP Route Reflection - An Alternative to Full Mesh IBGP", RFC 2796, April 2000.

7. P. Marques, F. Dupont, "Use of BGP-4 Multiprotocol Extensions for IPv6 Inter-Domain Routing", RFC 2545, March 1999.

8. A. Heffernan, "Protection of BGP Sessions via the TCP MD5 Signature Option", RFC 2385, August 1998.

9. Y. Rekhter, T. Li, "A Border Gateway Protocol 4 (BGP-4)", RFC 1771, March 1995.

[V6DOC] M. Blanchet, "IPv6 Address Space Reserved for Documentation", work in progress, http://search.ietf.org/internet-drafts/draft-blanchet-ngtrans-exampleaddr-01.txt, July 2001.

11. B. Black, V. Gill, J. Abley, "Requirements for IP Multihoming Architectures", work in progress, http://www.ietf.org/internet-drafts/draft-ietf-multi6-multihoming-requirements-02.txt, November 2001.

21. Editor's address

arn-py@arneill-py.sacramento.ca.us or mpy@ieee.org