Cloud HA Topology

This diagram shows an example of a vEOS Router Cloud HA implementation.

Figure 1. Cloud high availability network topology with vEOS router instances

In the diagram above, a virtual network is a collection of resources that are in the same cloud region. Within this virtual network, the resources, including the vEOS Routers, are deployed into two cloud high availability zones (Availability Zones for AWS and Fault Domains for Azure) for fault tolerance.

Note: For ease of discussion, the rest of this section uses availability zone 1 and availability zone 2 to refer to the high availability design in the different clouds.

Within each availability zone, the hosts/VMs and vEOS interfaces are connected to their corresponding subnets when the network is operating normally. Each subnet is associated with a route table within the cloud infrastructure. Static routes are configured in the cloud route tables so that traffic from the hosts/VMs is routed to the vEOS Router in the corresponding availability zone as the gateway or next-hop to reach certain destinations. For example, configure a default route (0.0.0.0/0) in the cloud route table with the vEOS Router's cloud interface ID or IP as the next-hop (which of the two is used depends on the cloud). The routing policy or protocol on the vEOS Routers, such as BGP, is user configurable based on the user's network design.
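
As a rough illustration of how a default route in a cloud route table steers host traffic toward a vEOS Router, the Python sketch below performs a longest-prefix-match lookup. The 10.1.0.0/16 range, the "local" target, and the interface ID "eni-veos1" are hypothetical values chosen for this example; real cloud route tables are managed through the provider's console or API, and their exact matching behavior may differ.

```python
import ipaddress

# Hypothetical cloud route table for Subnet 1: prefix -> next-hop target.
# "local" stands for delivery inside the virtual network; "eni-veos1" is a
# made-up interface ID standing in for the vEOS Router's cloud interface.
ROUTE_TABLE_1 = {
    "10.1.0.0/16": "local",
    "0.0.0.0/0": "eni-veos1",
}


def lookup(route_table, destination):
    """Return the next hop of the most specific route matching destination."""
    dest = ipaddress.ip_address(destination)
    matches = [
        ipaddress.ip_network(prefix)
        for prefix in route_table
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    # The longest prefix (largest prefix length) wins.
    best = max(matches, key=lambda net: net.prefixlen)
    return route_table[str(best)]
```

With this table, a destination inside the virtual network resolves to the local target, while any other destination falls through to the default route and is handed to the vEOS Router's interface.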

The two vEOS Routers in the diagram above are configured with the Cloud HA feature as HA peers. Cloud HA on the vEOS Routers establishes a BFD peering session between the two devices through Ethernet or tunnel interfaces.

When the active vEOS Router detects BFD connectivity loss, the existing routes in the backup route table in the cloud are updated through the cloud-specific API to use the active vEOS Router as the next-hop. For example, if vEOS 2 detects BFD connectivity loss with its peer, vEOS 2 updates the routes in Route Table 1 so that traffic from hosts in Subnet 1 and Subnet 2 that previously used vEOS 1 as the next-hop is forwarded to the next-hop ID or IP owned by vEOS 2. Traffic from the hosts in availability zone 1 is first forwarded to the corresponding subnet gateways in the cloud. The subnet gateways then forward the traffic toward the new next-hop interface ID or IP that exists on vEOS 2. When vEOS 2 receives the traffic, it forwards the traffic according to its own routing table.
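
The takeover step described above can be sketched in Python using an in-memory model of the cloud route tables. This is only an illustration: the table names, the interface IDs ("veos1-intf", "veos2-intf"), and the take_over function are hypothetical, and a real deployment would make the change through the cloud provider's API rather than by editing a dictionary.

```python
# Hypothetical in-memory model of the cloud route tables in the diagram.
# Each table maps a destination prefix to a next-hop interface ID.
route_tables = {
    "route-table-1": {"0.0.0.0/0": "veos1-intf"},  # availability zone 1
    "route-table-2": {"0.0.0.0/0": "veos2-intf"},  # availability zone 2
}


def take_over(route_tables, peer_table, peer_next_hop, local_next_hop):
    """On BFD loss, repoint the peer's routes in peer_table to the local router.

    Only existing routes whose next hop is the failed peer are rewritten;
    no routes are added or removed. Returns the prefixes that were updated.
    """
    updated = []
    for prefix, next_hop in route_tables[peer_table].items():
        if next_hop == peer_next_hop:
            route_tables[peer_table][prefix] = local_next_hop
            updated.append(prefix)
    return updated


# vEOS 2 detects BFD loss with vEOS 1 and takes over Route Table 1.
changed = take_over(route_tables, "route-table-1", "veos1-intf", "veos2-intf")
```

After the call, the default route in the modeled Route Table 1 points at the vEOS 2 interface, while Route Table 2 is left untouched.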

What about traffic going toward the hosts in availability zone 1 while connectivity to vEOS 1 is down? When connectivity to vEOS 1 is down, the hosts behind Subnet 1 and Subnet 2 become unreachable from the rest of the network, because their routes are withdrawn by routing protocols such as BGP. Since Subnet 1 and Subnet 2 are not directly connected to vEOS 2, a "backup" routing strategy for the two subnets on vEOS 2 should be considered as part of your network design. A typical design uses static routes for the subnets connected to the peer vEOS Router and points them toward the cloud subnet gateways of the active vEOS Router, with a high administrative distance value (least preferred). For example, a static route for peer subnet 10.1.1.0/24 would be configured on the active vEOS Router as ip route 10.1.1.0/24 10.2.1.1 255, where 10.2.1.1 is the gateway/next-hop for one of its Ethernet interfaces. The static routes are redistributed or advertised only when the original routes with a better administrative distance are withdrawn or removed by the dynamic routing protocol (such as BGP).
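
To show why a static route with a high administrative distance stays dormant until the preferred route disappears, the Python sketch below selects the best route for a prefix by lowest distance. The Route type, the "peer-learned" next-hop label, and the distance of 200 for the BGP route are illustrative assumptions; only the 10.1.1.0/24 prefix, the 10.2.1.1 next-hop, and the distance of 255 come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    distance: int  # administrative distance; lower is preferred
    source: str


def best_route(candidates, prefix):
    """Pick the candidate for prefix with the lowest administrative distance."""
    matching = [r for r in candidates if r.prefix == prefix]
    return min(matching, key=lambda r: r.distance) if matching else None


# Distance 200 for the BGP route is an illustrative value; 255 is the
# backup static route from the example above.
bgp = Route("10.1.1.0/24", "peer-learned", 200, "bgp")
backup = Route("10.1.1.0/24", "10.2.1.1", 255, "static")

normal = best_route([bgp, backup], "10.1.1.0/24")  # BGP route still present
failover = best_route([backup], "10.1.1.0/24")     # BGP route withdrawn
```

While the BGP route is present it wins the comparison; once it is withdrawn, the static route toward 10.2.1.1 is the only candidate left and becomes the selected route.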

When the BFD peering session returns to the UP state upon recovery, each active vEOS Router restores its locally controlled route table entries (per user configuration) to point to itself as the primary gateway again.
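
The failover and recovery cycle can be summarized as a reaction to BFD state transitions. The Python sketch below is a simplified model, not the Cloud HA implementation: the on_bfd_change function and the action names "take_over" and "restore" are hypothetical labels for the route table updates described in this section.

```python
def on_bfd_change(previous_up, current_up):
    """Map a BFD session state transition to a Cloud HA action.

    Returns "take_over" when the session goes down, "restore" when it
    comes back up, and None when the state did not change.
    """
    if previous_up and not current_up:
        return "take_over"
    if not previous_up and current_up:
        return "restore"
    return None


# Replay a hypothetical sequence of BFD states: up, down, down, up.
states = [True, False, False, True]
actions = [on_bfd_change(prev, cur) for prev, cur in zip(states, states[1:])]
```

Acting only on transitions, rather than on every state report, keeps the model from issuing repeated route table updates while the session remains down.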