Multihoming: A Comprehensive Guide to Uninterrupted Connectivity for Networks

In today’s always-on digital landscape, uninterrupted internet connectivity is a fundamental necessity for network operations. A single point of failure in an internet connection can lead to lost revenue, damage to reputation, and widespread user frustration. This is where multihoming emerges as a robust solution. It enables a network to connect to multiple Internet Service Providers (ISPs) simultaneously.

Multihoming enhances the resilience and performance of an organization’s internet connectivity by establishing multiple routes to a single location. At its best, such a configuration forms a redundant infrastructure that keeps functioning during outages and mitigates bottlenecks.

The Benefits of Multihoming

The advantages of multihoming go beyond redundancy. It offers an array of associated benefits:

Enhanced Reliability and Redundancy: This is the primary goal of most multihoming implementations. If one ISP experiences an outage (due to network issues, natural disaster, or maintenance), traffic can be automatically rerouted through an alternative ISP, which greatly improves the likelihood of business continuity. Multihoming also protects against more mundane failures, such as faulty cables or equipment malfunctions. With redundant routers, it can even safeguard against the unlikely but possible simultaneous failure of an ISP-side router and a customer-side router.
Improved Performance and Reduced Latency: In many scenarios, multihoming enables optimized traffic routing. Directing traffic to the ISP that offers the shortest path to a particular destination can improve performance noticeably, which is particularly beneficial for geographically dispersed users or latency-sensitive applications. Research suggests that multihoming can improve Round Trip Time (RTT) by as much as 25% and achieve up to 20% higher data transfer speeds when an enterprise connects to two or three strategically chosen ISPs.
Load Balancing: In some scenarios, traffic can be intelligently distributed across multiple ISP links. This prevents any single link from becoming saturated. This multihoming function optimizes bandwidth use and provides consistent performance, even during peak traffic periods.
Increased Bandwidth Capacity: By aggregating bandwidth from multiple providers, a network system can achieve a higher overall internet capacity than with a single provider. This is crucial for organizations with high bandwidth demands.
Negotiating Power and Cost Optimization: Having multiple ISP options provides greater leverage during contract negotiations, potentially leading to better service levels and pricing.
Disaster Recovery: In the event of a major regional outage affecting one ISP – or even an entire geographic region – the ability to failover to an ISP whose infrastructure is unaffected provides a critical layer of disaster recovery. This also protects against a variety of physical outages, scheduled maintenance windows, internal network management issues, routing problems, and peering disputes that could otherwise compromise a single provider’s service.
Vendor Diversity: Relying on a single ISP can create vendor lock-in. Multihoming promotes vendor diversity, reducing dependence on a single provider and offering greater flexibility for network growth and changes. The ability to adapt quickly to market changes or new technology from different providers is a long-term advantage.

Types of Multihoming

Multihoming can be implemented in several ways, each with its own operational characteristics, complexities, and levels of resilience. These implementations represent a spectrum of risk, cost, and difficulty versus reward.

Multiple Homing (Provider-Dependent/Single-Homed): This is the simplest form, where an enterprise has multiple connections, each to a different ISP and each using IP addresses assigned by that ISP. Both links connect to a single device on the client side, which holds two default routes: one to the primary ISP, and a second with a higher metric to the secondary ISP. If the primary link goes down, routing fails over to the backup link.

This configuration offers a basic level of redundancy at the ISP layer. If one ISP link fails, traffic can be rerouted through the remaining operational connection. Because failover is triggered by the state of the local link, it does not protect against failures deeper in the ISP’s network.

While it offers one level of redundancy, it lacks true independence from the ISPs’ addressing schemes. If an ISP’s link fails, the IP addresses associated with that ISP become unreachable, requiring changes to DNS records or client configurations. While various routing protocols can technically be used, Border Gateway Protocol (BGP) is generally the preferred choice even in this simpler setup.
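The failover behavior described for this provider-dependent setup can be sketched in a few lines of Python. This is a minimal illustration, not router configuration: the `DefaultRoute` type, the `select_default_route` helper, and the gateway addresses (taken from documentation ranges) are assumptions made for the example, and `link_up` stands in for whatever link-state detection the edge device actually performs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DefaultRoute:
    next_hop: str   # ISP gateway address (hypothetical, documentation range)
    metric: int     # lower metric is preferred
    link_up: bool   # stand-in for the device's link-state detection


def select_default_route(routes: List[DefaultRoute]) -> Optional[DefaultRoute]:
    """Return the usable default route with the lowest metric, or None."""
    usable = [r for r in routes if r.link_up]
    return min(usable, key=lambda r: r.metric, default=None)


primary = DefaultRoute("192.0.2.1", metric=10, link_up=True)
backup = DefaultRoute("198.51.100.1", metric=20, link_up=True)

# Normal operation: the primary wins because its metric is lower.
active = select_default_route([primary, backup])

# Primary link down: the higher-metric backup becomes the only candidate.
primary_down = DefaultRoute("192.0.2.1", metric=10, link_up=False)
after_failure = select_default_route([primary_down, backup])
```

Note that the sketch only reacts to the local link state, which mirrors the limitation above: a failure further inside the ISP’s network would leave `link_up` true and the route in place.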

Provider-Independent Addressing with Border Gateway Protocol (BGP): This approach is generally considered the most robust and is the common standard for large organizations. The network obtains its own block of IP addresses and Autonomous System Number (ASN). These resources may be available from an IP and ASN marketplace like IPv4.Global. (A public ASN is a prerequisite when multihoming to different ISPs.) These are then advertised to all connected ISPs using BGP. In the simplest BGP configuration, the ISPs advertise default routes to the clients, so that if there is a failure, the routes are withdrawn and traffic automatically fails over to another link.

The key advantage here is full control over IP addressing, which enables seamless failover between ISPs (the same IP addresses are used regardless of the active link). The ISP links may be connected to one or two routers; using two routers removes the router as a single point of failure, but requires additional first-hop redundancy in case one of those devices fails. That might be redundant default gateways on local hosts, or having the two routers share a virtual IP address using HSRP (Hot Standby Router Protocol) or VRRP (Virtual Router Redundancy Protocol). Implementing BGP demands significant technical expertise and careful configuration, as mistakes can lead to routing issues like loops or blackholes. BGP-capable routers are essential for this setup. Even in scenarios involving multiple physical links to a single ISP, BGP is often the best choice because it was designed for routing between distinct organizations.

Multihoming and Receiving Routing Tables

This approach provides some level of load balancing and path optimization. In the simplest form, one or both ISPs send routing information for all of their directly connected customers. The client also keeps a default route to each ISP, with different metrics. Traffic to ISP A’s customers will use the link to ISP A, traffic to ISP B’s customers will use the link to ISP B, and traffic destined for the rest of the internet will use the preferred default route. The split is not perfectly even, but the load is distributed across both links.
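The forwarding decision in this partial-routes setup comes down to a longest-prefix match: a specific customer route wins over the default. The following Python sketch illustrates that idea only. The prefixes come from documentation ranges, the `ISP-A`/`ISP-B` labels and the `build_table` and `lookup` helpers are hypothetical, and only the preferred (lower-metric) default route is installed in the table.

```python
import ipaddress


def build_table(entries):
    """Turn (prefix, link) pairs into (network, link) pairs."""
    return [(ipaddress.ip_network(prefix), link) for prefix, link in entries]


def lookup(table, destination):
    """Return (link, matched_prefix) using longest-prefix match, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, link) for net, link in table if addr in net]
    if not matches:
        return None
    net, link = max(matches, key=lambda m: m[0].prefixlen)
    return link, str(net)


# Hypothetical routes: one customer prefix learned from each ISP,
# plus the preferred default route pointing at ISP A.
TABLE = build_table([
    ("203.0.113.0/24", "ISP-A"),
    ("198.51.100.0/24", "ISP-B"),
    ("0.0.0.0/0", "ISP-A"),
])

to_a_customer = lookup(TABLE, "203.0.113.10")
to_b_customer = lookup(TABLE, "198.51.100.7")
to_elsewhere = lookup(TABLE, "192.0.2.55")
```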

Note that with two links in use, the combined traffic can exceed 100% of the capacity of either link alone. If one link then fails, the surviving link is offered more traffic than it can carry, causing congestion and performance problems. This may be an acceptable trade-off between cost and performance.
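The capacity point can be checked with simple arithmetic. The sketch below, with made-up link sizes and loads and a hypothetical helper name, computes how heavily the surviving links would be loaded after the worst single-link failure; a result above 1.0 means the remaining capacity is oversubscribed.

```python
def worst_case_failover_utilization(capacities_mbps, loads_mbps):
    """Utilization of the surviving links after the worst single-link failure.

    Assumes all traffic from the failed link shifts onto the survivors.
    """
    total_capacity = sum(capacities_mbps)
    total_load = sum(loads_mbps)
    worst = 0.0
    for failed_capacity in capacities_mbps:
        surviving = total_capacity - failed_capacity
        if surviving == 0:
            return float("inf")  # no capacity left at all
        worst = max(worst, total_load / surviving)
    return worst


# Two hypothetical 1000 Mbps links carrying 700 and 600 Mbps.
busy = worst_case_failover_utilization([1000, 1000], [700, 600])

# The same links carrying 400 and 500 Mbps leave failover headroom.
safe = worst_case_failover_utilization([1000, 1000], [400, 500])
```

In the first case the survivor would be asked to carry 1300 Mbps on a 1000 Mbps link; in the second, 900 Mbps fits.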

A more complicated approach has both ISPs send full routing tables. The client’s router then has to choose between the available paths for every network on the internet. This requires more powerful routers, capable of holding that many routes in memory and of keeping up with constant route updates. The benefit is that traffic to any destination on the internet will take the shortest path as calculated by BGP, which often (though not always, since BGP compares AS-path length rather than latency) provides the best available RTT.
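To make the path choice concrete, here is a deliberately simplified Python sketch of BGP best-path selection. It compares only local preference, then AS-path length, then a neighbor-name tiebreak; the real decision process has more steps (origin, MED, eBGP versus iBGP, and others). The `Path` type, neighbor labels, and AS numbers (from the range reserved for documentation) are assumptions for illustration.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Path:
    prefix: str
    neighbor: str               # hypothetical label for the advertising ISP
    as_path: Tuple[int, ...]    # AS numbers traversed to reach the prefix
    local_pref: int = 100       # higher is preferred


def best_path(paths):
    """Simplified selection: highest local_pref, then shortest AS path,
    then lowest neighbor label as a stand-in for the final tiebreakers."""
    return min(paths, key=lambda p: (-p.local_pref, len(p.as_path), p.neighbor))


via_a = Path("203.0.113.0/24", "ISP-A", (64496, 64499, 64500))
via_b = Path("203.0.113.0/24", "ISP-B", (64497, 64500))

# With equal local preference, the shorter AS path through ISP B wins.
default_choice = best_path([via_a, via_b])

# Raising local preference on the ISP A path overrides AS-path length.
policy_choice = best_path([replace(via_a, local_pref=200), via_b])
```

The second selection shows why routing policy matters: an operator can prefer a longer path for cost or capacity reasons, and the router will follow that preference for every matching prefix.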

Multihoming with Multiple Network Interfaces (within a single ISP): While not strictly “multihoming” in the sense of multiple ISPs, an organization can have multiple physical connections to a single ISP, sometimes through different access points or circuits.

This approach offers protection against physical disruptions such as failing or faulty cables and other equipment loss. It can also mitigate risks associated with other routing outages within the ISP’s network and ensures connectivity during planned or unplanned maintenance activities by the single provider.

However, the fundamental limitation of this method is the reliance on a single ISP. This setup offers no protection against ISP-wide outages, network management issues affecting the entire provider, or peering disputes that compromise the sole provider’s infrastructure.

SD-WAN (Software-Defined Wide Area Network) based Multihoming: SD-WAN solutions often incorporate multihoming capabilities as a core feature. They can intelligently manage traffic across multiple ISP links, providing dynamic path selection, application-aware routing, and automated failover. An SD-WAN can abstract the underlying network complexity, which makes multihoming easier to deploy and manage compared to traditional methods.

Operational Details and Challenges

Implementing and managing multihoming, particularly with BGP, introduces several operational considerations and potential challenges:

BGP Configuration Complexity: BGP is a powerful but intricate protocol. Proper configuration requires a deep understanding of attributes like AS-Path, Local Preference, MED (Multi-Exit Discriminator), and communities to ensure optimal inbound and outbound traffic routing.
Autonomous System Number (ASN) and Provider-Independent Address Space Acquisition: For BGP-based multihoming, an enterprise needs its own ASN and provider-independent (PI) IP address block. This involves applying to a Regional Internet Registry (RIR) and justifying the need, which can be a bureaucratic process. Since most RIRs have exhausted their free IPv4 pools, there may be a multi-year waiting list, or no way to obtain IPv4 addresses from them directly. An IP address broker can help.
Routing Policy and Traffic Engineering: Enterprises need to define clear routing policies to dictate how traffic enters and exits their network. This involves making decisions about which ISP is preferred for certain destinations, how to handle failover, and how to balance load. This may require complex traffic engineering techniques.
Inbound vs. Outbound Traffic Control: It’s often easier to control outbound traffic (which path your network takes to reach the internet) than inbound traffic (how the internet reaches your network). While BGP attributes can influence inbound routing, ultimate control lies with external networks.
Security Considerations: Multihoming introduces additional attack vectors. Proper access control lists (ACLs), BGP security best practices (e.g., BGPsec, RPKI for route origin validation), and robust firewalling are needed. Best practices are well documented by MANRS (Mutually Agreed Norms for Routing Security).
Monitoring and Troubleshooting: With multiple ISP links, monitoring network performance, identifying bottlenecks, and troubleshooting connectivity issues become more complex. Comprehensive network monitoring tools and accurate reverse DNS records for network interfaces are crucial.
ISP Coordination: Effective multihoming requires close coordination with all connected ISPs regarding routing policies, peering arrangements, and troubleshooting.
Cost: While multihoming offers long-term benefits, it does come with increased costs for multiple ISP circuits, potentially more sophisticated routing hardware, and the expertise required for implementation and management.
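Of the practices listed above, RPKI route origin validation is compact enough to illustrate. The Python sketch below follows the valid / invalid / not-found classification described in RFC 6811. The ROA entries, prefixes, and AS numbers are made up (documentation ranges), and the `validate_origin` helper is hypothetical; production routers obtain validated ROA data from an RPKI validator rather than a hard-coded list.

```python
import ipaddress


def validate_origin(prefix, origin_asn, roas):
    """Classify a route announcement against a list of ROA entries.

    Each ROA entry is (roa_prefix, max_length, asn).
    Returns "valid", "invalid", or "not-found".
    """
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_length, roa_asn in roas:
        roa_net = ipaddress.ip_network(roa_prefix)
        # A ROA covers the route if the route lies within the ROA prefix.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if roa_asn == origin_asn and route.prefixlen <= max_length:
            return "valid"
    # Covered by some ROA but matched by none: invalid. Otherwise unknown.
    return "invalid" if covered else "not-found"


# Hypothetical ROA: AS 64500 may originate 203.0.113.0/24, no longer than /24.
ROAS = [("203.0.113.0/24", 24, 64500)]

ok = validate_origin("203.0.113.0/24", 64500, ROAS)
wrong_origin = validate_origin("203.0.113.0/24", 64501, ROAS)
too_specific = validate_origin("203.0.113.128/25", 64500, ROAS)
unknown = validate_origin("198.51.100.0/24", 64500, ROAS)
```

The `too_specific` case is worth noting for multihomed networks: announcing more-specific prefixes for traffic engineering will fail validation unless the ROA’s maximum length allows them.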

The Role of IPAM and DDI Tools in Multihoming

Given the complexities of managing IP addresses and DNS in a multihomed environment, IP Address Management (IPAM) and integrated DDI (DNS, DHCP, IPAM) tools become indispensable. These solutions serve as the central nervous system for managing the foundational elements of a multihomed network, providing the intelligence, automation, and control for optimal performance and reliability.

Centralized IP Address Management: In a multihomed setup with provider-independent addressing, an enterprise will have its own public IP address blocks. IPAM provides a centralized, accurate database of all assigned and available IP addresses, preventing conflicts and ensuring efficient use. This centralized management is critical for planning subnet allocation, tracking assignments to devices and services, and maintaining a proper record for compliance and auditing purposes. Increasingly, IPAM serves as a Network Source of Truth (NSoT) for sustainable network automation, providing the authoritative data repository that reflects the intended state of the network.
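The conflict-prevention role of IPAM can be pictured with a small Python sketch that hands out the first subnet of a requested size that does not overlap any existing assignment. It is only an illustration of the idea, not how any particular IPAM product works; the block, the assignments, and the `next_free_subnet` helper are hypothetical.

```python
import ipaddress


def next_free_subnet(block, assigned, new_prefix):
    """Return the first subnet of length new_prefix inside block that
    overlaps none of the assigned subnets, or None if the block is full."""
    parent = ipaddress.ip_network(block)
    used = [ipaddress.ip_network(a) for a in assigned]
    for candidate in parent.subnets(new_prefix=new_prefix):
        if not any(candidate.overlaps(u) for u in used):
            return str(candidate)
    return None


# Hypothetical PI block with two existing assignments.
BLOCK = "203.0.113.0/24"
ASSIGNED = ["203.0.113.0/26", "203.0.113.64/28"]

allocation = next_free_subnet(BLOCK, ASSIGNED, 28)
exhausted = next_free_subnet(BLOCK, [BLOCK], 28)
```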

Integrated DNS and DHCP (DDI): DDI solutions integrate DNS and DHCP functionalities with IPAM, creating a cohesive management platform.

DNS plays a vital role in directing traffic in a multihomed environment. In non-BGP multihoming scenarios, if an ISP link experiences an outage, DNS records for servers hosted on the network must be updated to reflect the IP addresses that remain reachable. DDI tools automate these DNS updates, reducing manual errors and speeding failover times. For BGP-based multihoming, DNS records can stay unchanged, because the same public IP addresses resolve correctly irrespective of which ISP is currently carrying the traffic.

DHCP, while less directly impacted by the multihoming of public internet connectivity, is crucial for assigning IP addresses to internal devices, ensuring consistent and reliable IP address assignment throughout the internal network.
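The DNS update described above for non-BGP multihoming amounts to publishing only the addresses whose ISP link is currently up. The Python sketch below shows that selection step alone, under stated assumptions: the ISP labels, addresses (documentation ranges), and the `records_to_publish` helper are hypothetical, and pushing the result to a DNS server is left to whatever DDI tooling is in use.

```python
def records_to_publish(addresses_by_isp, link_status):
    """Return the sorted addresses whose ISP link is reported as up.

    addresses_by_isp maps an ISP label to the provider-assigned address
    of a service; link_status maps an ISP label to True (up) or False.
    """
    return sorted(
        addr
        for isp, addr in addresses_by_isp.items()
        if link_status.get(isp, False)
    )


# Hypothetical web server reachable through an address from each ISP.
WEB_SERVER = {"ISP-A": "192.0.2.10", "ISP-B": "198.51.100.10"}

both_up = records_to_publish(WEB_SERVER, {"ISP-A": True, "ISP-B": True})
a_down = records_to_publish(WEB_SERVER, {"ISP-A": False, "ISP-B": True})
```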

Network Visibility and Control: DDI tools offer comprehensive visibility into IP address use, DNS queries, and DHCP leases. This visibility is fundamental for troubleshooting connectivity issues, especially in a multihomed environment, identifying potential bottlenecks, and monitoring compliance with network policies.

DDI’s role in a Network Source of Truth (NSoT) is critically important here. DDI platforms contain rich, authoritative network data, making them an ideal foundation for an NSoT. Such an NSoT represents the intended network configuration, so it supports proactive planning, consistency, and control, and is crucial for reducing configuration drift.

Automation and Orchestration: Advanced DDI software, like ProVision, can integrate with other network management systems and orchestration platforms. This automates tasks such as IP address allocation, DNS record creation, and even some routing adjustments in response to network events. This level of automation is important in managing the dynamic nature of multihomed networks, and is essential in SD-WAN deployments.

Reporting and Auditing: DDI tools deliver thorough reports on IP address utilization, DNS changes, and DHCP activity. All are essential for capacity planning, security audits, and compliance requirements.

In essence, IPAM and DDI tools provide the intelligence, automation, and control needed to handle the complexities of multiple ISPs while maintaining performance and reliability. In hybrid and multi-cloud environments, DDI becomes even more important, because multihomed networks extend beyond on-premises infrastructure. There, DDI provides consistent IPAM, DNS, and DHCP services across disparate cloud environments, which is crucial for multihomed enterprises with a growing cloud presence.

Peering Coordination

Although peering is not covered in depth here, some tools (such as ProVision’s Peering Manager) provide visibility into which routes are available from potential peers and simplify the management of BGP sessions.

Conclusion

Multihoming is a sophisticated but increasingly necessary strategy for networks to flourish in an internet-dependent world. While it necessarily includes operational complexities, the benefits of greater reliability, better performance, and improved control outweigh the challenges. By carefully considering the different types of multihoming, planning the operational details, and leveraging the power of IPAM and DDI tools, network administrators can build a resilient and high-performance network infrastructure.

To plan and execute a multihoming strategy, consider the following:

Needs Assessment: Develop a thorough understanding of a particular organization’s operational continuity, performance, and cost management requirements.
Expertise: Recognize that BGP and advanced multihoming configurations demand specialized skills. Networks should either cultivate this expertise in-house or engage with external partners to provide it.
DDI Implementation: Deploy a robust DDI solution. The right system will act as the “source of truth” for critical network data, making efficient, effective management available.
Automation: Leverage automation tools and practices for routing policy, security enforcement, and monitoring. These are required to manage the complexity of multihomed environments.
Security Planning: Integrate security measures, including RPKI for route validation, advanced firewalls, and DDoS mitigation strategies.
Monitoring and Optimization: Implement monitoring solutions to proactively detect issues, confirm performance benchmarks, and ensure the ongoing operation of the multihomed network.
