CEF Polarization

CEF (Cisco Express Forwarding) polarization is an issue where the CEF hash algorithm selects certain paths in the network and leaves redundant paths unused. This is best explained with an example. Let’s take a look at the following topology:

[Figure: CEF polarization example topology]

Above we have a topology with some routers and redundant links between R2, R3, R4, and R5. If we run an IGP like OSPF or EIGRP on this network and the metric is the same for all these links, the routers will use ECMP (Equal Cost MultiPath) routing. In theory, all interfaces in the network will be used:

[Figure: CEF polarization IGP load balancing]

Our two hosts on the left side are sending 100 packets in total. R1 will load balance these so that R2 and R3 each receive 50 packets. R2 and R3 will use both interfaces, sending 25 packets on each interface. Finally, R4 and R5 each have 50 packets to deliver to H3.

This is not how it really works though. Cisco routers and switches use CEF, which is responsible for forwarding traffic. CEF uses a hashing algorithm to decide which packets get sent on which interface. In its most basic form, CEF only considers the source and destination IP address to decide which interface will be used to forward traffic. When that is the case, the following situation can occur:

[Figure: CEF polarization load balancing]

Above, H1 and H2 are sending 50 packets each. R1 runs its CEF hashing algorithm and decides that:

  • packets from H1 (192.168.1.1) to H3 (192.168.45.3) should be forwarded using the interface that connects to R2.
  • packets from H2 (192.168.1.2) to H3 (192.168.45.3) should be forwarded using the interface that connects to R3.

Our packets from H1 and H2 are now load balanced by R1 between R2 and R3, but when they reach R2 and R3, something interesting happens:

  • R2 receives the packets from H1 and runs the same hashing algorithm as R1. Since it only receives packets from H1 to H3, a single interface is used for all these packets. The link between R2 and R5 will be unused.
  • R3 receives the packets from H2 and runs the same hashing algorithm as R1. Since it only receives packets from H2 to H3, a single interface is used for all these packets. The link between R3 and R4 will be unused.

The process above is called CEF polarization. It is the result of using the same hashing algorithm and the same hash input on all routers: only one path through the network is used for a given flow of packets.
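The effect can be reproduced with a small simulation. The sketch below is a toy model, not Cisco's actual implementation: `toy_hash` is a hypothetical stand-in that XORs the two addresses, and the 100 source addresses are made up for illustration. Every router uses the same function with the same input, so R2 and R3 each end up using a single uplink:

```python
import ipaddress


def toy_hash(src: str, dst: str) -> int:
    """Toy stand-in for a source/destination hash (assumption: plain XOR)."""
    return int(ipaddress.ip_address(src)) ^ int(ipaddress.ip_address(dst))


def pick_link(src: str, dst: str, n_links: int = 2) -> int:
    """Every router uses this same function, with no per-router input."""
    return toy_hash(src, dst) % n_links


# 100 hypothetical hosts in 192.168.1.0/24, all talking to H3.
flows = [(f"192.168.1.{i}", "192.168.45.3") for i in range(1, 101)]

# R1 splits the flows between its two next hops.
to_r2 = [f for f in flows if pick_link(*f) == 0]
to_r3 = [f for f in flows if pick_link(*f) == 1]

# R2 and R3 run the same hash on the flows they receive.
r2_links = {pick_link(*f) for f in to_r2}
r3_links = {pick_link(*f) for f in to_r3}

print(len(to_r2), len(to_r3))  # how R1 divided the flows
print(r2_links, r3_links)      # links actually used by R2 and R3
```

Every flow that R1 hands to R2 has, by construction, the same hash result modulo 2, which is why R2 can never choose its second link in this model.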

CEF polarization can be avoided by using a different hashing algorithm. On modern IOS routers, the following hashing algorithm variations are available:

R1(config)#ip cef load-sharing ?
  algorithm  Per-destination load sharing algorithm selection

R1(config)#ip cef load-sharing algorithm ?
  include-ports  Algorithm that includes layer 4 ports
  original       Original algorithm
  tunnel         Algorithm for use in tunnel only environments
  universal      Algorithm for use in most environments

  • Original: the original variation of CEF only looks at the source and destination IP address. The router performs an XOR on the lower-order bits of the source and destination IP address to decide which interface to use. When the source and destination addresses remain the same, the same interface will be picked.
  • Universal: each router generates a unique 32-bit universal ID that is used as a seed in the hashing algorithm next to the source/destination IP addresses. Since each router has a unique universal ID, each router computes a different hash result for the same flow, so its choice of interface no longer follows the choice made by the previous router.
  • Tunnel: this is an improvement to the universal algorithm meant for tunnel interfaces. With a tunnel interface, the source and destination are often the same, so it’s possible that the same interface is used over and over again.
  • Include-ports (L4 port): this variation includes the layer 4 source and/or destination port numbers in the hashing algorithm.
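To illustrate why a per-router seed helps, here is a minimal sketch. It is not the real CEF algorithm: `seeded_pick` is a hypothetical function that uses a keyed BLAKE2 hash from Python's standard library in place of Cisco's hash, and the two universal IDs are invented values.

```python
import hashlib
import ipaddress


def ip_bytes(addr: str) -> bytes:
    """Return the 4-byte packed form of an IPv4 address."""
    return ipaddress.ip_address(addr).packed


def seeded_pick(src: str, dst: str, universal_id: int, n_links: int = 2) -> int:
    """Pick a link from src/dst plus a per-router seed (illustrative only)."""
    digest = hashlib.blake2b(
        ip_bytes(src) + ip_bytes(dst),
        key=universal_id.to_bytes(4, "big"),  # the router's 32-bit seed
        digest_size=4,
    ).digest()
    return int.from_bytes(digest, "big") % n_links


flows = [(f"192.168.1.{i}", "192.168.45.3") for i in range(1, 101)]

R1_ID, R2_ID = 0x1A2B3C4D, 0x5E6F7081  # hypothetical universal IDs

# Flows that R1 forwards towards R2 (link 0 on R1).
to_r2 = [f for f in flows if seeded_pick(*f, R1_ID) == 0]

# If R2 used the same seed as R1, every flow would land on link 0 again.
same_seed_links = {seeded_pick(*f, R1_ID) for f in to_r2}

# With its own seed, R2 spreads the same flows over both links.
diff_seed_links = {seeded_pick(*f, R2_ID) for f in to_r2}

print(same_seed_links, diff_seed_links)
```

A non-linear hash is used on purpose here: with a purely linear function, a seed would only flip the result for every flow at once, and the flows arriving at R2 would still all choose the same link.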

Only the original CEF variation is prone to CEF polarization. The other options avoid it by including other information besides the source/destination IP address.
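As a rough illustration of what adding information to the hash input does, the sketch below compares an IP-only hash with one that also includes layer 4 ports. The `pick` function, the BLAKE2 hash, and the port numbers are assumptions made for this example and do not reflect how IOS computes its hash.

```python
import hashlib
import ipaddress


def pick(src, dst, sport=None, dport=None, n_links=2):
    """Hash src/dst, optionally with L4 ports, onto one of n_links (toy model)."""
    data = ipaddress.ip_address(src).packed + ipaddress.ip_address(dst).packed
    if sport is not None and dport is not None:
        data += sport.to_bytes(2, "big") + dport.to_bytes(2, "big")
    digest = hashlib.blake2b(data, digest_size=4).digest()
    return int.from_bytes(digest, "big") % n_links


# 50 hypothetical sessions from H1 to H3, differing only in source port.
sessions = [("192.168.1.1", "192.168.45.3", sport, 443)
            for sport in range(50000, 50050)]

# IP addresses only: all sessions share one hash input, hence one link.
ip_only_links = {pick(src, dst) for src, dst, _, _ in sessions}

# With ports included, the sessions are distributed over both links.
with_ports_links = {pick(*s) for s in sessions}

print(ip_only_links, with_ports_links)
```

With addresses alone, a single pair of hosts always maps to one link; once the ports are part of the input, individual sessions between the same hosts can take different links.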

The variations that you can select are different for each platform and IOS version. For example, the Catalyst 6500 switches have similar options with different names:

  • Default: uses the source/destination IP addresses and the universal ID.
  • Full: uses the source/destination IP addresses, source/destination port numbers but no universal ID. Uses an unequal weight for each interface.
  • Simple: uses source/destination IP addresses only.
  • Full simple: uses source/destination IP addresses, source/destination port numbers but no universal ID. The difference with “full” is that the weight for each interface is equal here.

The full, simple and full simple variations are prone to CEF polarization because none of them uses the universal ID.

Another option to deal with CEF polarization is per-packet load balancing. However, this can cause out-of-sequence packets, which might be an issue with real-time traffic like VoIP or video.
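The reordering risk is easy to see in a small model. The sketch below assumes a simple round-robin per-packet scheduler and two paths with different one-way delays; the link names, delays, and send interval are hypothetical values chosen for illustration.

```python
from itertools import cycle

# Hypothetical one-way delay of each equal-cost path, in milliseconds.
link_delay_ms = {"via_R2": 5, "via_R3": 12}
SEND_INTERVAL_MS = 1  # one packet is sent every millisecond

arrivals = []
# Round-robin: packet 0 goes via R2, packet 1 via R3, packet 2 via R2, ...
for seq, link in zip(range(6), cycle(link_delay_ms)):
    arrival_time = seq * SEND_INTERVAL_MS + link_delay_ms[link]
    arrivals.append((arrival_time, seq, link))

# Sequence numbers in the order the receiver sees them.
arrival_order = [seq for _, seq, _ in sorted(arrivals)]
print(arrival_order)  # → [0, 2, 4, 1, 3, 5]
```

Packets of the same flow are sent in order but arrive interleaved, because every second packet takes the slower path. Per-flow (per-destination) hashing avoids this by keeping all packets of a flow on one path.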

Forum Replies

  1. CAM: High speed memory that is primarily used for a switch’s layer 2 lookup information. This information allows the switch to decide which port to send a packet to (a known MAC address) or whether to flood it to all ports (unknown MAC address).

    TCAM: Not all switches have this. Think of this as an extension of CAM. It is used for very rapid decisions on ACLs and Quality of Service. On high-end layer 3 switches, the TCAM can also contain the FIB, so that specialized hardware can make routing decisions without interrupting the central CPU of the switch.

  2. Thanks Rene,

    I have another question. In the section above where you mention the ARP request construct, I think the behavior of switches is different.

    The multilayer switch will check the routing table, notices that 192.168.20 /24 is directly connected and the following will happen:
    The destination MAC address changes from FFF (Multilayer switch Fa0/1 ) to BBB (ComputerB).
    The source MAC address changes from AAA (ComputerA) to GGG (Multilayer switch Fa0/2).===This seems to be Incorrect

    I LAB this up with 3 switches (all real hardware 3550/3560) [ c3560-ips

  3. Hi Andrew!,

    Glad to see you around, and thanks for confirming my doubt.

    I have a few more doubts that I have nowhere else to turn with, and I hope you will enlighten me…

    My core switch has ip cef turned on by default.

    q1) I have been reading up a bit on Cisco about IP CEF and it seems that to enable/disable IP CEF, you have to do it at the ingress interface, as the decision (e.g. load balancing) is made there. Is that right?

    Assuming I have “no ip cef” and only wish to turn on ip cef on certain interfaces and ->
    q2) if I want to do packet load balancing, should “ip

    1. The ARP table is stored in RAM, and not in either CAM or TCAM

    2. Here is the Cisco definition of Epoch and how it is used:

    The term "epoch" refers to a period of time. A new epoch for a Cisco Express Forwarding table begins when a table rebuild is initiated. The time after this instant is in an epoch different from the time before, and the different epochs are numbered between 0 and 255. Through the use of epochs, the software can distinguish between old and new forwarding information in the same database structure and can retain the old Cisco Express Forwar