adrift on a cosmic ocean

Writings on various topics (mostly technical) from Oliver Hookins and Angela Collins. We have lived in Berlin since 2009, have two kids, and have far too little time to really justify having a blog.

Dangers of spanning tree interoperation

Posted by Oliver on the 12th of September, 2010 in category Tech
Tagged with: bpdu, cisco, hp, mstp, pvrst, pvst, rstp, spanning tree, stp

A couple of months ago we ran into a minor issue at work when running Cisco and HP switches together. Naturally we have redundant links in place for all networking equipment, so spanning tree of some sort must be used. In theory, any equipment can interoperate over spanning tree, but the reality is less utopian:

  • Cisco natively supports its proprietary protocols PVST, PVST+, PVRST and PVRST+. It also supports MSTP, and all modes will drop down to an STP compatibility mode after detecting standard STP peers on the line.
  • HP supports MSTP, RSTP and STP. It will drop down to RSTP or STP after detecting peers which use those protocols on the line.

I haven't written about it, but MSTP is a massive waste of time with current implementations. I list experience with MSTP on my CV, and during one interview a potential employer asked me about it; we both agreed that it was a huge nightmare, which confirmed my own experience. Basically, the administrative overheads outweigh any bandwidth gains you can potentially make, and these overheads scale up quickly. Unless your environment never changes (right down to your list of used VLANs), I can't recommend it. So that takes MSTP out of the equation.

Cisco will drop down to STP even when configured to use one of its per-VLAN implementations (which I think are actually quite good), but it still conveys per-VLAN information in the extended system ID field of the BPDU. Most switches which aren't looking for this information will pass it on and treat it as simply additional path weighting.
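To make the extended system ID concrete, here is a minimal sketch of how the 16-bit bridge priority field is split under the extended system ID scheme: the top 4 bits carry the configured priority (in steps of 4096) and the low 12 bits carry the VLAN ID. The function names are hypothetical and only illustrate the bit layout; this is not code from any switch.

```python
# Illustrative only: the 16-bit bridge priority field with an extended system ID.
# Top 4 bits = configured priority (multiples of 4096), low 12 bits = VLAN ID.
PRIORITY_STEP = 4096
PRIORITY_MASK = 0xF000
EXT_ID_MASK = 0x0FFF


def pack_priority(priority: int, vlan_id: int) -> int:
    """Combine a configured priority and a VLAN ID into one 16-bit field."""
    if priority % PRIORITY_STEP != 0 or not 0 <= priority <= 61440:
        raise ValueError("priority must be a multiple of 4096 in 0..61440")
    if not 0 <= vlan_id <= 4095:
        raise ValueError("VLAN ID must fit in 12 bits")
    return priority | vlan_id


def unpack_priority(field: int) -> tuple[int, int]:
    """Split the 16-bit field back into (priority, vlan_id)."""
    return field & PRIORITY_MASK, field & EXT_ID_MASK


# A Cisco primary root (24576) advertising for VLAN 100 and VLAN 200.
vlan100_field = pack_priority(24576, 100)
vlan200_field = pack_priority(24576, 200)
```

A switch that reads the whole 16 bits as a plain priority sees two different values for the same bridge, one per VLAN, which is the situation the rest of this post deals with.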

[Figure: BPDU flow diagram]

As you can see from this rough diagram, we end up in a bit of a pickle when the upstream links from the non-Cisco switches carry mismatched VLANs (while the non-Cisco switches can talk to each other on all of these VLANs). The HPs blindly pass on BPDU information which carries the extended ID in the priority field, and when it reaches the original switch, that switch finds that the priority is higher (lower value) than the one it had already set for itself as the root. To give an idea of what happens, I'll describe the path weighting which results from the above diagram if we are using 1Gbps links (usually assigned a link cost of 4):

  1. Cisco1 is the primary root, which in Cisco terms means a priority of 24576. It sends the BPDU to HP1, adding the extended ID of 100 for the VLAN.
  2. HP1 receives the BPDU over its VLAN100 link. The priority of 24576 and the extended ID of 100 are added (24676), plus the link cost of 4, which results in 24680. HP1 sends a BPDU over to HP2, and at this point, since they are speaking either MSTP (presumably as part of the CST) or RSTP/STP, VLANs are immaterial.
  3. HP2 receives the BPDU, adds the link cost of 4 which brings us to the total of 24684. HP2 sends a BPDU out its VLAN200 link back to Cisco1.
  4. Cisco1 receives the BPDU on a port it expects to be used only for VLAN200. The path cost at this point is 24688, but Cisco1's own value for VLAN200 is 24576 + 200 = 24776, which is higher than the path cost for the BPDU which originated on the other port. Therefore the BPDU that was originally transmitted for VLAN100 has made it back into VLAN200, and Cisco1 is led to believe that VLAN200 has a better root bridge reachable via this path. It elects this port as a root port, even though the new mystery root bridge on this port is this switch itself!
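The arithmetic in the steps above can be checked with a short sketch. It follows this post's simplified weighting, where priority, extended ID and link costs are summed into one number; real spanning tree implementations compare the root bridge ID and the root path cost as separate fields. The function names are hypothetical.

```python
# Simplified model from the steps above: one number built from
# priority + extended ID (VLAN) + accumulated link costs.
ROOT_PRIORITY = 24576  # Cisco "primary root" priority
LINK_COST = 4          # cost assumed here for a 1Gbps link


def looped_value(priority: int, vlan_id: int, hops: int,
                 link_cost: int = LINK_COST) -> int:
    """Value seen after a BPDU for vlan_id has crossed `hops` links."""
    return priority + vlan_id + hops * link_cost


def own_value(priority: int, vlan_id: int) -> int:
    """Value the root holds for itself on a given VLAN (no link cost)."""
    return priority + vlan_id


# Cisco1 -> HP1 -> HP2 -> Cisco1 is three links.
at_hp1 = looped_value(ROOT_PRIORITY, 100, hops=1)
at_hp2 = looped_value(ROOT_PRIORITY, 100, hops=2)
back_at_cisco1 = looped_value(ROOT_PRIORITY, 100, hops=3)
cisco1_vlan200 = own_value(ROOT_PRIORITY, 200)

# Lower wins, so the looped VLAN100 BPDU looks better than Cisco1's own VLAN200 value.
looped_bpdu_wins = back_at_cisco1 < cisco1_vlan200
```

In this model the looped BPDU keeps winning until the accumulated link cost exceeds the difference between the two VLAN IDs, which is 100 here, or 25 links at a cost of 4.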

Fortunately, when I discovered this problem we were not actually suffering from any loss of links or network problems (that I could see, anyhow), but the way the network was set up well and truly put spanning tree's behaviour into the realm of "undefined". You should not do this! Even so, it demonstrates that you can quite easily mess up your network when the equipment can't even standardise on a version of spanning tree. As much as the Cisco documentation will tell you that the switches interoperate by dropping down to STP, they still maintain per-VLAN spanning trees, so the behaviour is still very different.

© 2010-2018 Oliver Hookins and Angela Collins