MP-BGP EVPN VXLAN ARP Suppression

ARP suppression is a feature for MP-BGP EVPN that reduces ARP flooding on VXLAN networks. Flooding impacts network performance, so it should be kept to a minimum.

IPv4 uses ARP to map an IP address to a MAC address. When a host wants to talk to another host on the same subnet, it sends an ARP request to figure out the remote host’s MAC address. An ARP request is broadcast traffic, so it falls in the category of broadcast, unknown-unicast, and multicast traffic (BUM traffic).




When using VXLAN, the hosts are unaware that they are connected to an overlay network. Their ARP requests are treated as multi-destination traffic and flooded to all VTEPs within the L2 VNI, using either static ingress replication or a multicast underlay. The ARP reply is a unicast packet. Here is a visualization of ARP on a VXLAN network:

Figure: ARP request and reply on a regular VXLAN network

This flooding behavior is inefficient and should be reduced to a minimum.

IPv6 has something similar to ARP: it uses the Neighbor Discovery Protocol (ND) to map IPv6 addresses to MAC addresses. The IPv6 equivalent of an ARP request is the neighbor solicitation (NS), which is sent as multicast to the target’s solicited-node multicast address. The destination replies with a neighbor advertisement (NA), similar to an IPv4 ARP reply. ARP suppression also supports IPv6 ND.
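To make the multicast part concrete: a neighbor solicitation is normally addressed to the target’s solicited-node multicast address, which is the prefix ff02::1:ff00:0/104 plus the low-order 24 bits of the target address. Here is a minimal sketch using Python’s standard ipaddress module. The function name and the example address are illustrative and not part of the lab topology.

```python
import ipaddress

def solicited_node_multicast(target: str) -> ipaddress.IPv6Address:
    """Return the solicited-node multicast address for an IPv6 target."""
    addr = ipaddress.IPv6Address(target)
    low24 = int(addr) & 0xFFFFFF  # low-order 24 bits of the target address
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)

# Documentation-range address, used only as an example.
print(solicited_node_multicast("2001:db8::12:3456"))  # ff02::1:ff12:3456
```

Only hosts whose addresses share those last 24 bits listen to the group, which is why an NS disturbs fewer hosts than an ARP broadcast on the local segment.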

So, how does ARP suppression work? There are two pieces to understand:

  • Creating and maintaining the ARP suppression cache table
  • Dealing with ARP requests from hosts

Let’s take a closer look at both.

ARP Suppression Cache Table

Once you enable ARP suppression, each VTEP creates and maintains an ARP suppression cache table. In this table, it stores the IP-MAC bindings of the different hosts in a VNI.

There are two ways to add MAC and IP bindings to the cache table:

  • Local
  • Remote

Let’s take a closer look at both options.

Local

VTEPs learn from the ARP requests from downstream hosts.

Many hosts are “chatty” and generate some traffic when they connect to the network. They might send a Gratuitous ARP (GARP) or Reverse ARP (RARP) when they connect. Otherwise, they might immediately send an ARP request for their default gateway, usually the VTEP.

When a host immediately generates traffic, the VTEP can quickly learn the host’s MAC and IP address and add them to the ARP suppression cache table.

There are exceptions, though. Some hosts might not generate any traffic until someone looks for them. We call these silent hosts. The only way to discover their MAC and IP address is for another host to send an ARP request for them. When the silent host answers with an ARP reply, the VTEP can learn its MAC and IP address. Examples of silent hosts are printers or security devices. They can be connected for hours or days without initiating any traffic.

Figure: VTEP learning a local host’s MAC and IP address from ARP
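Local learning can be sketched in a few lines of Python. This is a simplified model, not how NX-OS implements it: the class and field names are hypothetical, and the only real values are S1’s MAC and IP address and the VNI from the lab in this lesson.

```python
from dataclasses import dataclass

ARP_REQUEST, ARP_REPLY = 1, 2

@dataclass(frozen=True)
class ArpPacket:
    opcode: int
    sender_mac: str
    sender_ip: str
    target_ip: str

class SuppressionCache:
    """Hypothetical per-VNI table of IP -> (MAC, source) bindings."""

    def __init__(self):
        self.entries = {}  # {vni: {ip: (mac, "local" or "remote")}}

    def learn_local(self, vni: int, pkt: ArpPacket) -> None:
        # Requests, replies, and GARPs all carry the sender's own binding,
        # so any ARP packet from a downstream host is enough to learn it.
        self.entries.setdefault(vni, {})[pkt.sender_ip] = (pkt.sender_mac, "local")

cache = SuppressionCache()
# S1 announces itself with a GARP: sender IP and target IP are the same.
garp = ArpPacket(ARP_REQUEST, "0050.c253.4001", "172.16.12.1", "172.16.12.1")
cache.learn_local(10010, garp)
print(cache.entries)
```

A silent host never triggers learn_local on its own. Its binding only shows up once it answers somebody else’s ARP request.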

Remote

The second option for learning MAC and IP bindings and installing them in the cache table is to learn them from remote VTEPs. When a VTEP learns a host’s MAC and IP address, it installs the binding in its own cache table and also creates an MP-BGP EVPN route type 2 (MAC/IP advertisement), which is advertised to the other VTEPs.

Figure: VTEP learning a remote host’s MAC and IP address through MP-BGP EVPN
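The remote path can be modeled the same way. The sketch below assumes a heavily simplified route type 2 represented as a Python dictionary. Real EVPN routes carry more attributes (route distinguisher, route targets, labels), and both function names are hypothetical.

```python
def make_type2_route(vni, mac, ip, vtep_ip):
    """Describe a locally learned binding as a simplified MAC/IP advertisement."""
    return {"route_type": 2, "vni": vni, "mac": mac, "ip": ip, "next_hop": vtep_ip}

def install_remote(cache, route):
    """Install a binding received from another VTEP; return True if cached."""
    if route["route_type"] != 2 or route["ip"] is None:
        # A MAC-only route has no IP address, so there is nothing to
        # answer ARP requests with.
        return False
    vni_table = cache.setdefault(route["vni"], {})
    vni_table[route["ip"]] = (route["mac"], "remote", route["next_hop"])
    return True

# LEAF1 (2.2.2.2) learned S1 locally and advertises it; LEAF2 installs it.
leaf2_cache = {}
route = make_type2_route(10010, "0050.c253.4001", "172.16.12.1", "2.2.2.2")
install_remote(leaf2_cache, route)
print(leaf2_cache)
```

The point of the check in install_remote is that only routes carrying both a MAC and an IP address are useful for ARP suppression.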

Respond to ARP Requests

Now that you know how the VTEPs fill the cache table, the second part is understanding how they deal with ARP requests from hosts.

Cache table without match

When a host in a VNI sends an ARP request for a host in the same VNI, the VTEP intercepts the request and checks the cache table. When there is no match, the switch floods the ARP request to the other VTEPs.

Figure: ARP request and reply when the cache table is empty

Cache table contains match

When there is a match in the cache table, the VTEP suppresses the ARP request so it won’t be flooded throughout the fabric.

Instead, the VTEP creates an ARP reply on behalf of the destination host and sends it to the host that sent the ARP request. The VTEP acts like an ARP proxy. There is a difference, though: with proxy ARP, the router answers with its own MAC address, while here the ARP reply contains the destination host’s MAC address, as if the destination host had sent it.

Figure: ARP request and reply when the cache table is filled

Not flooding that ARP request to other VTEPs saves bandwidth on the underlay network and some CPU cycles on the hosts because they don’t have to process unnecessary ARP requests.
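Both cases, with and without a match, fit in one small decision function. This is a sketch under the assumption that the cache is a plain dictionary of IP to MAC per VNI. The function name and the returned dictionary layout are hypothetical, and the addresses are the ones from the lab in this lesson.

```python
def handle_arp_request(cache, vni, sender_mac, sender_ip, target_ip):
    """Return what a VTEP would do with an ARP request from a local host."""
    target_mac = cache.get(vni, {}).get(target_ip)
    if target_mac is None:
        # No binding known: fall back to flooding within the L2 VNI.
        return {"action": "flood"}
    # Binding known: answer locally. The reply carries the target host's
    # MAC address, not the VTEP's, so it looks like the host replied.
    return {
        "action": "reply",
        "opcode": 2,
        "sender_mac": target_mac,
        "sender_ip": target_ip,
        "target_mac": sender_mac,
        "target_ip": sender_ip,
    }

cache = {10010: {"172.16.12.2": "0050.c253.5001"}}
print(handle_arp_request(cache, 10010, "0050.c253.4001", "172.16.12.1", "172.16.12.2"))
print(handle_arp_request(cache, 10010, "0050.c253.4001", "172.16.12.1", "172.16.12.3"))
```

The first call is answered locally, and the second one is flooded because 172.16.12.3 is not in the cache.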

Issues

ARP suppression might sound like a good idea. Who doesn’t want less flooding and a more efficient network? It is also easy to configure. However, there are some issues.

On a typical Ethernet network, ARP requests aren’t restricted and are flooded whenever needed. ARP suppression changes this behavior. Some applications might use ARP as a keepalive mechanism. With ARP suppression enabled, these keepalives don’t make it end-to-end, and the application will think something is wrong. You might argue that (mis)using ARP for keepalives isn’t the best idea, but the reality is that we’ll have all kinds of applications running on our networks.

Issues related to inactive hosts also exist because of the mismatch between the MAC address aging time and the ARP aging time (for example, 5 minutes and 4 hours, which are the defaults on many platforms). This requires a more in-depth explanation outside the scope of this lesson.

ARP suppression is enabled or disabled by default, depending on the vendor. Do not enable this feature without fully understanding the possible complications.

Configuration

Let’s take a look at ARP suppression in action. We’ll do a before-and-after comparison so you can see the difference. I’m using a topology with a single spine switch and two leaf switches. We use an L2 VNI. It is the same topology we used in the MP-BGP EVPN L2 VNI lesson.

Figure: MP-BGP EVPN ARP suppression topology with an L2 VNI

We have two leaf switches connected to a single spine switch. The two hosts are Ubuntu Docker containers. We’ll use these to generate some ARP and ICMP traffic. We use a single L2 VNI so that the hosts can communicate directly in the same subnet. I’m using Cisco NX-OS 9000v 10.3(1) on all switches.




Configurations

Want to take a look for yourself? Here, you will find the startup configuration of each device.

SPINE1

hostname SPINE1

nv overlay evpn
feature ospf
feature bgp
feature pim

ip pim rp-address 1.1.1.1 group-list 224.0.0.0/4
ip pim ssm range 232.0.0.0/8

interface Ethernet1/1
  no switchport
  mac-address 0050.c253.1001
  ip address 192.168.12.1/24
  ip ospf network point-to-point
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode
  no shutdown

interface Ethernet1/2
  no switchport
  mac-address 0050.c253.1002
  ip address 192.168.13.1/24
  ip ospf network point-to-point
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode
  no shutdown

interface loopback0
  ip address 1.1.1.1/32
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode
icam monitor scale

router ospf 1
router bgp 123
  log-neighbor-changes
  neighbor 2.2.2.2
    remote-as 123
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
      route-reflector-client
  neighbor 3.3.3.3
    remote-as 123
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
      route-reflector-client

LEAF1

hostname LEAF1

nv overlay evpn
feature ospf
feature bgp
feature pim
feature vn-segment-vlan-based
feature nv overlay

ip pim rp-address 1.1.1.1 group-list 224.0.0.0/4
ip pim ssm range 232.0.0.0/8

vlan 10
  vn-segment 10010

interface nve1
  no shutdown
  host-reachability protocol bgp
  source-interface loopback0
  member vni 10010
    mcast-group 239.1.1.1

interface Ethernet1/1
  no switchport
  mac-address 0050.c253.2001
  ip address 192.168.12.2/24
  ip ospf network point-to-point
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode
  no shutdown

interface Ethernet1/2
  switchport access vlan 10

interface loopback0
  ip address 2.2.2.2/32
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode

router ospf 1
router bgp 123
  log-neighbor-changes
  neighbor 1.1.1.1
    remote-as 123
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended

LEAF2

hostname LEAF2

nv overlay evpn
feature ospf
feature bgp
feature pim
feature vn-segment-vlan-based
feature nv overlay

ip pim rp-address 1.1.1.1 group-list 224.0.0.0/4
ip pim ssm range 232.0.0.0/8

vlan 10
  vn-segment 10010

interface nve1
  no shutdown
  host-reachability protocol bgp
  source-interface loopback0
  member vni 10010
    mcast-group 239.1.1.1

interface Ethernet1/1
  no switchport
  mac-address 0050.c253.3001
  ip address 192.168.13.3/24
  ip ospf network point-to-point
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode
  no shutdown

interface Ethernet1/2
  switchport access vlan 10

interface loopback0
  ip address 3.3.3.3/32
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode

router ospf 1
router bgp 123
  log-neighbor-changes
  neighbor 1.1.1.1
    remote-as 123
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended

Without ARP Suppression

Let’s start with the default behavior where ARP suppression is disabled. I’ll send a ping from S1 to S2:

root@S2:# ping 172.16.12.2 -c 5
PING 172.16.12.2 (172.16.12.2) 56(84) bytes of data.
64 bytes from 172.16.12.2: icmp_seq=1 ttl=64 time=5.52 ms
64 bytes from 172.16.12.2: icmp_seq=2 ttl=64 time=18.7 ms
64 bytes from 172.16.12.2: icmp_seq=3 ttl=64 time=24.8 ms
64 bytes from 172.16.12.2: icmp_seq=4 ttl=64 time=7.16 ms
64 bytes from 172.16.12.2: icmp_seq=5 ttl=64 time=16.6 ms

The ARP request looks like this:

Frame 8: 110 bytes on wire (880 bits), 110 bytes captured (880 bits)
Ethernet II, Src: OrionTechnol_53:10:02 (00:50:c2:53:10:02), Dst: IPv4mcast_01:01:01 (01:00:5e:01:01:01)
Internet Protocol Version 4, Src: 2.2.2.2, Dst: 239.1.1.1
User Datagram Protocol, Src Port: 63096, Dst Port: 4789
Virtual eXtensible Local Area Network
Ethernet II, Src: OrionTechnol_53:40:01 (00:50:c2:53:40:01), Dst: Broadcast (ff:ff:ff:ff:ff:ff)
Address Resolution Protocol (request)
    Hardware type: Ethernet (1)
    Protocol type: IPv4 (0x0800)
    Hardware size: 6
    Protocol size: 4
    Opcode: request (1)
    Sender MAC address: OrionTechnol_53:40:01 (00:50:c2:53:40:01)
    Sender IP address: 172.16.12.1
    Target MAC address: Xerox_00:00:00 (00:00:00:00:00:00)
    Target IP address: 172.16.12.2

This is a broadcast that is flooded to all VTEPs in the VNI. Here is the ARP reply:

Frame 9: 110 bytes on wire (880 bits), 110 bytes captured (880 bits)
Ethernet II, Src: OrionTechnol_53:30:01 (00:50:c2:53:30:01), Dst: OrionTechnol_53:10:02 (00:50:c2:53:10:02)
Internet Protocol Version 4, Src: 3.3.3.3, Dst: 2.2.2.2
User Datagram Protocol, Src Port: 58649, Dst Port: 4789
Virtual eXtensible Local Area Network
Ethernet II, Src: OrionTechnol_53:50:01 (00:50:c2:53:50:01), Dst: OrionTechnol_53:40:01 (00:50:c2:53:40:01)
Address Resolution Protocol (reply)
    Hardware type: Ethernet (1)
    Protocol type: IPv4 (0x0800)
    Hardware size: 6
    Protocol size: 4
    Opcode: reply (2)
    Sender MAC address: OrionTechnol_53:50:01 (00:50:c2:53:50:01)
    Sender IP address: 172.16.12.2
    Target MAC address: OrionTechnol_53:40:01 (00:50:c2:53:40:01)
    Target IP address: 172.16.12.1

This is a unicast packet from S2 to S1.
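One detail in the ARP request capture is worth a closer look: the outer destination IP address is the multicast group 239.1.1.1, and the outer destination MAC address is 01:00:5e:01:01:01. That MAC address follows from the standard IPv4 multicast mapping, where the low-order 23 bits of the group address are copied into the prefix 01:00:5e. Here is a short sketch of that mapping; the function name is illustrative.

```python
import ipaddress

def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet multicast MAC address."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF          # keep the low-order 23 bits
    mac = 0x01005E000000 | low23          # fixed 01:00:5e prefix
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))

print(multicast_mac("239.1.1.1"))  # 01:00:5e:01:01:01
```

The ARP reply, in contrast, has unicast addresses in both the outer and the inner headers.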

VXLAN MP-BGP EVPN ARP with Multicast Underlay

Here is what LEAF1 has advertised in MP-BGP EVPN:

LEAF1# show bgp l2vpn evpn neighbors 1.1.1.1 advertised-routes

Peer 1.1.1.1 routes for address family L2VPN EVPN:
BGP table version is 5, Local Router ID is 2.2.2.2
Status: s-suppressed, x-deleted, S-stale, d-dampened, h-history, *-valid, >-best
Path type: i-internal, e-external, c-confed, l-local, a-aggregate, r-redist, I-injected
Origin codes: i - IGP, e - EGP, ? - incomplete, | - multipath, & - backup, 2 - best2

   Network            Next Hop            Metric     LocPrf     Weight Path
Route Distinguisher: 2.2.2.2:32777    (L2VNI 10010)
*>l[2]:[0]:[0]:[48]:[0050.c253.4001]:[0]:[0.0.0.0]/216
                      2.2.2.2                           100      32768 i

Route Distinguisher: 3.3.3.3:32777

Above, we see the MAC address of S1 but no IP address. You see the same thing on LEAF2 for S2:

LEAF2# show bgp l2vpn evpn neighbors 1.1.1.1 advertised-routes

Peer 1.1.1.1 routes for address family L2VPN EVPN:
BGP table version is 5, Local Router ID is 3.3.3.3
Status: s-suppressed, x-deleted, S-stale, d-dampened, h-history, *-valid, >-best
Path type: i-internal, e-external, c-confed, l-local, a-aggregate, r-redist, I-injected
Origin codes: i - IGP, e - EGP, ? - incomplete, | - multipath, & - backup, 2 - best2

   Network            Next Hop            Metric     LocPrf     Weight Path
Route Distinguisher: 2.2.2.2:32777

Route Distinguisher: 3.3.3.3:32777    (L2VNI 10010)
*>l[2]:[0]:[0]:[48]:[0050.c253.5001]:[0]:[0.0.0.0]/216
                      3.3.3.3                           100      32768 i

We only see the MAC address. This is as expected for our L2 VNI.
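The route key in these outputs packs several fields into one string. Here is a rough sketch that pulls them apart and shows why we say there is no IP address: the IP length field is 0. The field names (ESI, Ethernet tag, and so on) are my reading of the NX-OS display layout and should be treated as assumptions, and parse_type2 is a hypothetical helper, not an existing tool.

```python
import re

# Assumed layout: [type]:[ESI]:[Ethernet tag]:[MAC length]:[MAC]:[IP length]:[IP]/prefix length
ROUTE_RE = re.compile(
    r"\[(?P<type>\d+)\]:\[(?P<esi>[^\]]+)\]:\[(?P<eth_tag>\d+)\]:"
    r"\[(?P<mac_len>\d+)\]:\[(?P<mac>[0-9a-f.]+)\]:"
    r"\[(?P<ip_len>\d+)\]:\[(?P<ip>[^\]]+)\]/(?P<prefix_len>\d+)"
)

def parse_type2(line):
    """Extract the MAC and IP fields from a route type 2 line, or return None."""
    match = ROUTE_RE.search(line)
    if match is None or match["type"] != "2":
        return None
    ip_len = int(match["ip_len"])
    return {
        "mac": match["mac"],
        "mac_len": int(match["mac_len"]),
        # An IP length of 0 means the route carries a MAC address only.
        "ip": match["ip"] if ip_len else None,
        "ip_len": ip_len,
    }

# Line copied from the LEAF1 output in this lesson.
print(parse_type2("*>l[2]:[0]:[0]:[48]:[0050.c253.4001]:[0]:[0.0.0.0]/216"))
```

For the LEAF1 route, the parsed MAC address is 0050.c253.4001 and the IP field comes back empty, which matches what we read from the output by eye.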

With ARP Suppression

Now, let’s see how ARP suppression works.

First, we need to carve the TCAM (Ternary Content Addressable Memory); otherwise, you can’t enable ARP suppression. TCAM is a special memory type for storing data that requires fast lookups, such as access lists. The TCAM size is limited, so you need to decide which features you need and how much memory you assign to them. TCAM carving means we reallocate TCAM resources for specific features or requirements.

Assigning resources is done with slices. A slice is the unit of allocation, and it holds 256 or 512 entries.

Whether you need TCAM carving or not depends on the platform. You need to do it on the NX-OS 9000v. Otherwise, you get an error when you try to enable ARP suppression.
