VXLAN Flood and Learn Multicast Data Plane

In the VXLAN static ingress replication lesson, we manually configured the remote VTEPs on each VTEP. This works, but it doesn’t scale: every new VTEP has to be added by hand on all the others. An alternative is to use multicast in the underlay network. When we use multicast, each VTEP maps a VNI to a multicast group.

Broadcast, unknown-unicast, and multicast (BUM) traffic is forwarded to the configured multicast group. VTEPs that listen to the multicast group de-encapsulate all traffic they receive on that group.
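
To make this forwarding decision concrete, here is a minimal Python sketch of how a flood-and-learn VTEP could pick the outer destination IP address. The `Vtep` class, its method names, and the host MAC address are made up for illustration. VNI 10010 and group 239.1.1.1 are the values configured later in this lesson. This is a toy model, not how NX-OS implements it.

```python
from dataclasses import dataclass, field


def is_group_mac(mac: str) -> bool:
    """True for broadcast and multicast MACs (I/G bit of the first octet is set)."""
    return (int(mac.split(":")[0], 16) & 1) == 1


@dataclass
class Vtep:
    """Toy flood-and-learn VTEP; a hypothetical model, not NX-OS internals."""
    vni_to_group: dict                              # VNI -> underlay multicast group
    mac_table: dict = field(default_factory=dict)   # (VNI, MAC) -> remote VTEP IP

    def learn(self, vni: int, src_mac: str, remote_vtep: str) -> None:
        # Data plane learning: remember which remote VTEP a source MAC sits behind.
        self.mac_table[(vni, src_mac)] = remote_vtep

    def outer_destination(self, vni: int, dst_mac: str) -> str:
        # Broadcast and multicast frames always go to the VNI's multicast group.
        if is_group_mac(dst_mac):
            return self.vni_to_group[vni]
        # Known unicast goes straight to the remote VTEP; unknown unicast is flooded.
        return self.mac_table.get((vni, dst_mac), self.vni_to_group[vni])


leaf1 = Vtep(vni_to_group={10010: "239.1.1.1"})
host2 = "00:50:00:00:00:02"  # made-up MAC for the host behind LEAF2

print(leaf1.outer_destination(10010, "ff:ff:ff:ff:ff:ff"))  # broadcast, e.g. an ARP request
print(leaf1.outer_destination(10010, host2))                # unknown unicast
leaf1.learn(10010, host2, "3.3.3.3")                        # e.g. learned from the ARP reply
print(leaf1.outer_destination(10010, host2))                # known unicast
```

The first two lookups return the multicast group. Once the MAC address has been learned, the third one returns the unicast address of the remote VTEP. That is the same split between multicast and unicast packets we look for in the packet capture.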

In this lesson, I’ll explain how to configure multicast in the underlay network and examine a packet capture between two hosts. You’ll see which packets are destined to the multicast group and which packets are transmitted with unicast.

Configuration

This is the topology we’ll use:

[Figure: VXLAN topology with one spine switch and two leaf switches, Cisco NX-OS]

Instead of directly connecting the two leaf switches, we’ll add a spine switch in the middle. This spine switch will be our Rendezvous Point (RP) for multicast. The hosts are simple Ubuntu containers. The only thing they need to do is generate some ICMP traffic to test reachability.

I’m using Cisco NX-OS 9000v version 9.3(9) on the switches. This topology requires ~6 vCPUs and ~25GB of RAM.

I tried this lab using the NX-OS 9000v 10.x images but ran into an issue where multicast traffic sometimes got dropped on the leaf switches. This issue didn’t happen with the 9.x images.

Underlay Network

Let’s start with the underlay network. We need to configure IP addresses, routing, and multicast.

IP

Let’s start with the IP addresses. I’ll keep it simple and assign an IP address to every routed switch interface. I’m also using static MAC addresses so that debugging and Wireshark captures are easier to read. Each switch has a loopback interface.

SPINE1

The spine switch needs a loopback that we can use as the RP.

SPINE1(config)# interface Ethernet 1/1
SPINE1(config-if)# no switchport
SPINE1(config-if)# mac-address 0050.c253.1001
SPINE1(config-if)# ip address 192.168.12.1/24
SPINE1(config)# interface Ethernet 1/2
SPINE1(config-if)# no switchport
SPINE1(config-if)# mac-address 0050.c253.1002
SPINE1(config-if)# ip address 192.168.13.1/24
SPINE1(config)# interface Loopback 0
SPINE1(config-if)# ip address 1.1.1.1/32

LEAF1

The leaf switches require a loopback interface, which we can use for the NVE interface.

LEAF1(config)# interface Ethernet 1/1
LEAF1(config-if)# no switchport
LEAF1(config-if)# mac-address 0050.c253.2001
LEAF1(config-if)# ip address 192.168.12.2/24
LEAF1(config-if)# exit
LEAF1(config)# interface Loopback 0
LEAF1(config-if)# ip address 2.2.2.2/32

LEAF2
LEAF2(config)# interface Ethernet 1/1
LEAF2(config-if)# no switchport
LEAF2(config-if)# mac-address 0050.c253.3001
LEAF2(config-if)# ip address 192.168.13.3/24
LEAF2(config)# interface Loopback 0
LEAF2(config-if)# ip address 3.3.3.3/32
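
As a quick sanity check of this addressing plan, the following Python sketch verifies that both ends of each link sit in the same subnet and that the loopbacks are unique. The addresses are copied from the configuration above. The link labels and the `same_subnet` helper are only illustrative names.

```python
import ipaddress

# Interface addresses copied from the configuration above.
links = {
    "SPINE1 Eth1/1 <-> LEAF1 Eth1/1": ("192.168.12.1/24", "192.168.12.2/24"),
    "SPINE1 Eth1/2 <-> LEAF2 Eth1/1": ("192.168.13.1/24", "192.168.13.3/24"),
}
loopbacks = {"SPINE1": "1.1.1.1/32", "LEAF1": "2.2.2.2/32", "LEAF2": "3.3.3.3/32"}


def same_subnet(a: str, b: str) -> bool:
    """True when both interface addresses sit in the same IP network."""
    return ipaddress.ip_interface(a).network == ipaddress.ip_interface(b).network


for name, (a, b) in links.items():
    print(name, "OK" if same_subnet(a, b) else "MISMATCH")

# Loopbacks must be unique because they identify the RP and the VTEPs.
unique = len(set(loopbacks.values())) == len(loopbacks)
print("loopbacks unique:", unique)
```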

OSPF

We’ll configure OSPF to advertise all interfaces.

SPINE1, LEAF1 & LEAF2
(config)# feature ospf
(config)# router ospf 1

SPINE1
SPINE1(config)# interface Ethernet 1/1 - 2
SPINE1(config-if-range)# ip ospf network point-to-point 
SPINE1(config-if-range)# ip router ospf 1 area 0.0.0.0

SPINE1(config)# interface Loopback 0
SPINE1(config-if)# ip router ospf 1 area 0.0.0.0

LEAF switches

LEAF1 & LEAF2
(config-router)# interface Ethernet 1/1
(config-if)# ip ospf network point-to-point 
(config-if)# ip router ospf 1 area 0.0.0.0

(config)# interface Loopback 0
(config-if)# ip router ospf 1 area 0.0.0.0

That takes care of our underlay network routing.

Multicast

There are three things to configure:

  • Enable the PIM feature.
  • Enable PIM sparse mode on the interfaces.
  • Configure the loopback interface of SPINE1 as the static RP.

SPINE1

Let’s enable the PIM feature:

SPINE1(config)# feature pim

Enable PIM on all interfaces (including the loopback):

SPINE1(config)# interface Ethernet 1/1 - 2
SPINE1(config-if-range)# ip pim sparse-mode

SPINE1(config)# interface Loopback 0
SPINE1(config-if)# ip pim sparse-mode

And we’ll make SPINE1 the RP:

SPINE1(config)# ip pim rp-address 1.1.1.1

LEAF switches

The configuration of the leaf switches is similar:

LEAF1 & LEAF2
(config)# feature pim
(config)# interface Ethernet 1/1
(config-if)# ip pim sparse-mode 

(config)# interface Loopback 0
(config-if)# ip pim sparse-mode

(config)# ip pim rp-address 1.1.1.1

This completes the underlay network configuration.

Overlay Network

Let’s configure the overlay network. The configuration is similar to what we did in the VXLAN Static Ingress Replication lesson.

VNI

Let’s configure a VNI and map a VLAN to it. We’ll configure this on both leaf switches:

LEAF1 & LEAF2
(config)# feature vn-segment-vlan-based 
(config)# vlan 10
(config-vlan)# vn-segment 10010

(config)# interface Ethernet 1/2
(config-if)# switchport access vlan 10

That’s all we need for the VNI.

NVE

Let’s enable the VXLAN feature:

LEAF1 & LEAF2
(config)# feature nv overlay

Under the NVE interface configuration, we’ll do something different than what we did with static ingress replication:

LEAF1 & LEAF2
(config)# interface nve1
(config-if-nve)# no shutdown
(config-if-nve)# source-interface loopback0

(config-if-nve)# member vni 10010
(config-if-nve-vni)# mcast-group 239.1.1.1

Here we map VNI 10010 to a multicast group (239.1.1.1) that is used for BUM traffic. This completes our configuration.
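
To show what the VNI looks like on the wire, here is a small Python sketch that builds and parses the 8-byte VXLAN header for VNI 10010, following the layout in RFC 7348. The function names are my own. The switch does this in hardware, so treat the sketch as an aid for reading packet captures.

```python
import struct

VXLAN_UDP_PORT = 4789   # IANA-assigned VXLAN destination port
VNI_VALID_FLAG = 0x08   # "I" flag: the VNI field is valid


def build_vxlan_header(vni: int) -> bytes:
    """Return the 8-byte VXLAN header (RFC 7348 layout) for a 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + 24 reserved bits.
    # Word 2: VNI (24 bits) + 8 reserved bits.
    return struct.pack("!II", VNI_VALID_FLAG << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from an 8-byte VXLAN header."""
    _, second_word = struct.unpack("!II", header[:8])
    return second_word >> 8


header = build_vxlan_header(10010)
print(header.hex())        # 0800000000271a00
print(parse_vni(header))   # 10010
```

In a capture, this header follows the outer UDP header (destination port 4789). For BUM traffic, the outer destination IP address is 239.1.1.1.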

Verification

Let’s verify our work.

Underlay Network

Let’s make sure the underlay network is configured correctly.

OSPF

We see two OSPF neighbors on the spine switch:

SPINE1# show ip ospf neighbors
 OSPF Process ID 1 VRF default
 Total number of neighbors: 2
 Neighbor ID     Pri State            Up Time  Address         Interface
 2.2.2.2           1 FULL/ -          00:06:19 192.168.12.2    Eth1/1 
 3.3.3.3           1 FULL/ -          00:06:26 192.168.13.3    Eth1/2 

The leaf switches can reach each other’s loopback interfaces:

LEAF1# ping 3.3.3.3 source 2.2.2.2
PING 3.3.3.3 (3.3.3.3) from 2.2.2.2: 56 data bytes
64 bytes from 3.3.3.3: icmp_seq=0 ttl=253 time=4.016 ms
64 bytes from 3.3.3.3: icmp_seq=1 ttl=253 time=2.676 ms
64 bytes from 3.3.3.3: icmp_seq=2 ttl=253 time=2.321 ms
64 bytes from 3.3.3.3: icmp_seq=3 ttl=253 time=2.325 ms
64 bytes from 3.3.3.3: icmp_seq=4 ttl=253 time=3.029 ms

Multicast

The spine switch sees two PIM neighbors:

SPINE1# show ip pim neighbor
PIM Neighbor Status for VRF "default"
Neighbor        Interface            Uptime    Expires   DR       Bidir-  BFD    ECMP Redirect
                                                         Priority Capable State     Capable
192.168.12.2    Ethernet1/1          00:17:43  00:01:26  1        yes     n/a     no
192.168.13.3    Ethernet1/2          00:17:43  00:01:31  1        yes     n/a     no

And the leaf switches see SPINE1 as the RP:

LEAF1# show ip pim rp
PIM RP Status Information for VRF "default"
BSR disabled
Auto-RP disabled
BSR RP Candidate policy: None
BSR RP policy: None
Auto-RP Announce policy: None
Auto-RP Discovery policy: None

RP: 1.1.1.1, (0), 
 uptime: 00:57:59   priority: 255, 
 RP-source: (local),  
 group ranges:
 224.0.0.0/4

LEAF2# show ip pim rp
PIM RP Status Information for VRF "default"
BSR disabled
Auto-RP disabled
BSR RP Candidate policy: None
BSR RP policy: None
Auto-RP Announce policy: None
Auto-RP Discovery policy: None

RP: 1.1.1.1, (0), 
 uptime: 00:58:08   priority: 255, 
 RP-source: (local),  
 group ranges:
 224.0.0.0/4

The underlay network seems to work correctly.
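
The RP output lists the group range 224.0.0.0/4, which covers the group 239.1.1.1 that we use for the VNI. The Python sketch below checks this and also derives the Ethernet destination MAC address that a multicast packet for this group carries (the standard IPv4 multicast mapping from RFC 1112), which is handy when filtering in Wireshark. The `multicast_mac` helper is an illustrative name.

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet destination MAC (RFC 1112)."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    low23 = int(addr) & 0x7FFFFF          # only the low 23 bits are copied
    mac = 0x01005E000000 | low23          # fixed prefix 01:00:5e, next bit is 0
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


group = "239.1.1.1"
rp_range = ipaddress.ip_network("224.0.0.0/4")   # group range from "show ip pim rp"
print(ipaddress.ip_address(group) in rp_range)   # True
print(multicast_mac(group))                      # 01:00:5e:01:01:01
```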

NVE Interfaces

Let’s check the NVE interface:

LEAF1(config-if-nve)# show nve vni
Codes: CP - Control Plane        DP - Data Plane          
       UC - Unconfigured         SA - Suppress ARP        
       S-ND - Suppress ND        
       SU - Suppress Unknown Unicast 
       Xconn - Crossconnect      
       MS-IR - Multisite Ingress Replication 
       HYB - Hybrid IRB mode
    
Interface VNI      Multicast-group   State Mode Type [BD/VRF]      Flags
--------- -------- ----------------- ----- ---- ------------------ -----
nve1      10010    239.1.1.1         Up    DP   L2 [10]   

This tells us that we use VNI 10010 and multicast group 239.1.1.1. The state is up, and we use data plane learning. We see the same thing on LEAF2:

LEAF2(config-if-nve)# show nve vni
Codes: CP - Control Plane        DP - Data Plane          
       UC - Unconfigured         SA - Suppress ARP        
       S-ND - Suppress ND        
       SU - Suppress Unknown Unicast 
       Xconn - Crossconnect      
       MS-IR - Multisite Ingress Replication 
       HYB - Hybrid IRB mode
    
Interface VNI      Multicast-group   State Mode Type [BD/VRF]      Flags
--------- -------- ----------------- ----- ---- ------------------ -----
nve1      10010    239.1.1.1         Up    DP   L2 [10]

The two leaf switches don’t know about each other yet:

LEAF1(config-if-nve)# show nve peer
LEAF2(config-if-nve)# show nve peer

This makes sense because there hasn’t been any traffic between hosts yet.

Multicast

Here’s what the multicast routing table looks like on SPINE1:

SPINE1# show ip mroute 239.1.1.1
IP Multicast Routing Table for VRF "default"

(*, 239.1.1.1/32), uptime: 00:01:40, pim ip 
  Incoming interface: loopback0, RPF nbr: 1.1.1.1
  Outgoing interface list: (count: 2)
    Ethernet1/2, uptime: 00:01:00, pim
    Ethernet1/1, uptime: 00:01:40, pim


(2.2.2.2/32, 239.1.1.1/32), uptime: 00:01:29, pim mrib ip 
  Incoming interface: Ethernet1/1, RPF nbr: 192.168.12.2, internal
  Outgoing interface list: (count: 1)
    Ethernet1/2, uptime: 00:01:00, pim


(3.3.3.3/32, 239.1.1.1/32), uptime: 00:00:32, pim mrib ip 
  Incoming interface: Ethernet1/2, RPF nbr: 192.168.13.3, internal
  Outgoing interface list: (count: 1)
    Ethernet1/1, uptime: 00:00:32, pim

We see two sources for multicast group 239.1.1.1:

  • 2.2.2.2 (LEAF1)
  • 3.3.3.3 (LEAF2)
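
To read these entries, it helps to think of the lookup the spine performs: a specific (S, G) entry wins over the shared (*, G) entry, and a packet is only replicated when it arrives on the expected incoming interface. The Python sketch below models that with the entries transcribed from the SPINE1 output above. The class and function names are hypothetical, and the RPF check is heavily simplified compared to real PIM sparse mode behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MrouteEntry:
    source: str       # "*" for the shared tree
    group: str
    incoming: str     # expected incoming (RPF) interface
    outgoing: tuple   # outgoing interface list


# Entries transcribed from "show ip mroute 239.1.1.1" on SPINE1.
SPINE1_MROUTES = [
    MrouteEntry("*", "239.1.1.1", "loopback0", ("Ethernet1/2", "Ethernet1/1")),
    MrouteEntry("2.2.2.2", "239.1.1.1", "Ethernet1/1", ("Ethernet1/2",)),
    MrouteEntry("3.3.3.3", "239.1.1.1", "Ethernet1/2", ("Ethernet1/1",)),
]


def lookup(source, group, table=SPINE1_MROUTES):
    """Prefer the specific (S, G) entry; fall back to (*, G)."""
    fallback = None
    for entry in table:
        if entry.group != group:
            continue
        if entry.source == source:
            return entry
        if entry.source == "*":
            fallback = entry
    return fallback


def replicate(source, group, arrived_on):
    """Return the interfaces a packet is copied to, after a simplified RPF check."""
    entry = lookup(source, group)
    if entry is None or entry.incoming != arrived_on:
        return ()
    return entry.outgoing


print(replicate("2.2.2.2", "239.1.1.1", "Ethernet1/1"))  # ('Ethernet1/2',)
print(replicate("3.3.3.3", "239.1.1.1", "Ethernet1/2"))  # ('Ethernet1/1',)
```

BUM traffic from LEAF1 (2.2.2.2) enters on Ethernet1/1 and is sent out of Ethernet1/2 towards LEAF2, and the other way around for LEAF2.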

Here are the multicast routing tables of the leaf switches:

LEAF1# show ip mroute 239.1.1.1
IP Multicast Routing Table for VRF "default"

(*, 239.1.1.1/32), uptime: 00:15:18, nve ip pim 
  Incoming interface: Ethernet1/1, RPF nbr: 192.168.12.1
  Outgoing interface list: (count: 1)
    nve1, uptime: 00:15:18, nve


(2.2.2.2/32, 239.1.1.1/32), uptime: 00:15:18, nve mrib ip pim 
  Incoming interface: loopback0, RPF nbr: 2.2.2.2
  Outgoing interface list: (count: 1)
    Ethernet1/1, uptime: 00:14:26, pim

LEAF1 has a (*, G) entry and an (S, G) entry with its own loopback address (2.2.2.2) as the source. We see a similar output on LEAF2:

LEAF2# show ip mroute 239.1.1.1
IP Multicast Routing Table for VRF "default"

(*, 239.1.1.1/32), uptime: 00:15:09, nve ip pim 
  Incoming interface: Ethernet1/1, RPF nbr: 192.168.13.1
  Outgoing interface list: (count: 1)
    nve1, uptime: 00:15:09, nve


(3.3.3.3/32, 239.1.1.1/32), uptime: 00:15:09, nve mrib ip pim 
  Incoming interface: loopback0, RPF nbr: 3.3.3.3
  Outgoing interface list: (count: 1)
    Ethernet1/1, uptime: 00:14:14, pim

Traffic between hosts

This is what you came for. Let’s figure out if we can send a ping from S1 to S2:

Forum Replies

  1. Hello Amit

    Strictly speaking, you don’t need to enable PIM sparse mode on the loopbacks of the devices you are using. However, in the specific topology, Rene has chosen to use the loopback interface of the spine switch as the address of the RP. This is typically best practice as you can see from this lesson here. So for this particular case, if you don’t enable PIM sparse mode on the loopback, it cannot participate as the RP of the topology.

    I hope this has been helpful!

    Laz
