4-R&S/Multiple Spanning Tree Protocol (MST).

This post continues the previous three posts, 1-R&S/Virtual LAN (VLAN) and Spanning Tree Protocol (STP), 2-R&S/Spanning Tree Protocol (STP) Part 2, and 3-R&S/Spanning Tree Protocol (STP), RSTP Part 3. Here I will talk about the last protocol defined to avoid loops in a Layer 2 switched network, the Multiple Spanning Tree (MST). I will explain its operation, configuration and verification commands from the CLI point of view, so that we understand why we might use MST in our Layer 2 switched network.

Multiple Spanning Tree (MST) IEEE 802.1s:

As mentioned in the PVST+ explanation, the switches within the STP domain create STP instances based on the number of VLANs configured in the VLAN database. If the switches have 3 VLANs in the VLAN database, each switch in the STP domain creates 3 STP instances (one STP instance for each VLAN). The following figure shows the Layer 2 switched network that we are talking about:

mst

Our Layer 2 switched network consists of the 3 well-known layers (Core, Distribution and Access). Switch1 is the Core switch and it does not run Layer 2 at all; it only runs Layer 3 between itself and each Distribution switch. From a design point of view, this Core switch does not participate in the STP domain, so the three Distribution switches must be configured to act as the Root bridges within this Layer 2 switched network. When we run PVST+, assume that Switch2 is configured as the Root bridge for VLAN1 and the secondary Root bridge for VLAN2 and VLAN3, Switch3 is configured as the Root bridge for VLAN2 and the secondary Root bridge for VLAN1, and Switch4 is configured as the Root bridge for VLAN3. The links connecting the Distribution switches are Gigabit Ethernet interfaces, so their STP cost is lower (better) than that of the FastEthernet links connecting the Access switches to the Distribution switches. The following figures show the final loop-free Layer 2 topology from the point of view of each of the three VLANs:

VLAN1 Loop-free Layer 2 topology:

mst-1

VLAN2 Loop-free Layer 2 topology:

mst-2

VLAN3 Loop-free Layer 2 topology:

mst-3

So with this design, PVST+ is a perfect fit for our Layer 2 switched network: each switch runs a single STP instance for each of the 3 VLANs, and we utilize all the Layer 2 links between the Access and Distribution switches. Some ports carry the frames that belong to VLAN1 while blocking the frames that belong to VLAN2 and VLAN3, other ports carry the frames that belong to VLAN2 while blocking the frames that belong to VLAN1 and VLAN3, and so on. With this configuration and design, we make use of the existing Layer 2 links as much as possible, without wasting the BW of the Layer 2 links or the CPU processing cycles and memory of the participating switches.

What if we have more than 3 VLANs in this design? Maybe this switched network has 4 or 5 VLANs, or 50, 100, 200 VLANs, and so on. Maybe you also need to upgrade your network by adding another Core switch, another 2 Distribution switches and another 4 or more Access switches. If you have 4 or 5 VLANs, you can still run PVST+ and still make good use of your BW, CPU processing cycles and memory. But assume that you configured 100, 200 or 500 VLANs in your network: would you still run PVST+ on the same Layer 2 switched network? 🙂 I think you need to think more about that solution, because in this situation every switch runs 500 STP instances, which consumes many CPU processing cycles and a lot of memory and will affect the performance of your network.

So what is the solution? Can you go back to the legacy Common Spanning Tree (CST)? If you use CST again, the Layer 2 links connecting the Access switches to the Distribution switches are underutilized, because only one Distribution switch is responsible for forwarding the frames of all the VLANs. This results in congestion on the uplinks toward that switch, and the control plane messages exchanged between the switches may be dropped because of this congestion, which leads to instability in the network. The solution here is MST 🙂 But how? And what does MST mean?

What is MST?

As the name implies, with multiple spanning tree the switch creates multiple STP instances, but not one per VLAN. If the switch has 100 VLANs in its VLAN database, it does not create 100 STP instances; instead it creates one or more STP instances based on your design and configuration. Let’s explain what MST means for the previous situation. We said that we have 500 VLANs in the VLAN database of each switch in this Layer 2 switched network, and in the previous design Switch2 is the Root bridge for VLAN1, Switch3 is the Root bridge for VLAN2 and Switch4 is the Root bridge for VLAN3. For the remaining VLANs we can follow a round-robin design: Switch2 as the Root bridge for VLAN4, Switch3 as the Root bridge for VLAN5, Switch4 as the Root bridge for VLAN6, Switch2 as the Root bridge for VLAN7, and so on. If we use PVST+ in this situation, Switch2 is the Root bridge for VLANs 1, 4, 7, 10, 13, 16, …, Switch3 is the Root bridge for VLANs 2, 5, 8, 11, 14, 17, …, and Switch4 is the Root bridge for VLANs 3, 6, 9, 12, 15, 18, …. Because of this design, the loop-free topologies of all the VLANs that have Switch2 as their Root bridge are exactly the same, and the same applies to Switch3 and Switch4. So why do we run 500 STP instances?
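To make the point concrete, here is a minimal Python sketch of the round-robin design described above. The switch names come from the design in this post; the function names are mine and purely illustrative, and the sketch assumes that VLANs sharing a Root bridge also share the same loop-free topology, as stated above.

```python
from collections import defaultdict

# The three Distribution switches from the design above.
ROOTS = ["Switch2", "Switch3", "Switch4"]


def root_for_vlan(vlan_id):
    """Round-robin Root bridge assignment: VLAN1 -> Switch2, VLAN2 -> Switch3, ..."""
    return ROOTS[(vlan_id - 1) % len(ROOTS)]


def group_vlans_by_root(vlan_ids):
    """Group the VLANs that share the same Root bridge."""
    groups = defaultdict(list)
    for vlan_id in vlan_ids:
        groups[root_for_vlan(vlan_id)].append(vlan_id)
    return dict(groups)


groups = group_vlans_by_root(range(1, 501))
pvst_instances = sum(len(vlans) for vlans in groups.values())  # one instance per VLAN
distinct_topologies = len(groups)                              # one per Root bridge

print(pvst_instances, distinct_topologies)  # 500 3
print(groups["Switch2"][:6])                # [1, 4, 7, 10, 13, 16]
```

PVST+ would compute 500 trees here, although only 3 different trees exist.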

The solution here is to still run one or more STP instances on all the switches, but not one per VLAN. We need to run far fewer than 500 STP instances so that we don’t consume as many CPU processing cycles and as much memory, and at the same time we want to use the BW of all the existing Layer 2 links instead of congesting only a few of them (as in the case of CST). So how does MST solve this issue? Simply, MST is designed to create one or more STP instances and it depends on the concept of VLAN(s)-to-Instance mapping: it can map all the VLANs to one MST instance, or it can map some of the VLANs to one MST instance and the remaining VLANs to other MST instances.

Let’s apply the MST concept to our situation, in which we have 3 Distribution switches and 500 VLANs in the VLAN database. We can configure 3 MST instances and map the first 167 VLANs to the first MST instance, the second 167 VLANs to the second MST instance and the last 166 VLANs to the third MST instance. This means that VLANs 1 to 167 are mapped to a single MST instance (MST instance no. 1), VLANs 168 to 334 are mapped to a single MST instance (MST instance no. 2) and VLANs 335 to 500 are mapped to a single MST instance (MST instance no. 3). From this concept we can deduce the benefits of running MST:

  1. It depends on RSTP for its operation, which means that it supports fast convergence after the different types of failure, as mentioned before in the RSTP post.
  2. It keeps the same concept as PVST+, in which some ports carry the frames that belong to a certain range of VLANs and block the frames that belong to other VLANs. You can also change the STP settings/parameters per MST instance (not per VLAN), so that you can tune your Layer 2 switched network based on your design.
  3. It creates one or more MST instances and can map all the VLANs to one MST instance, or some of the VLANs to one MST instance and the remaining VLANs to other MST instances, so it doesn’t consume as many CPU processing cycles and as much memory as PVST+.
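The 167/167/166 split used above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the function name `split_vlans` is my own and is not part of any switch software.

```python
def split_vlans(first, last, instances):
    """Split VLANs first..last into contiguous blocks, one block per MST instance.

    When the number of VLANs does not divide evenly, the earlier
    instances receive one extra VLAN each.
    """
    total = last - first + 1
    base, extra = divmod(total, instances)
    mapping = {}
    start = first
    for instance in range(1, instances + 1):
        size = base + (1 if instance <= extra else 0)
        mapping[instance] = range(start, start + size)
        start += size
    return mapping


mapping = split_vlans(1, 500, 3)
for instance, vlans in mapping.items():
    print(f"MST instance {instance}: VLANs {vlans[0]}-{vlans[-1]} ({len(vlans)} VLANs)")
```

With 500 VLANs and 3 instances this gives VLANs 1-167, 168-334 and 335-500, which are the same ranges used in the rest of this post.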

As mentioned before, MST depends on RSTP for its operation, but MST adds a new extension to the RST/MST BPDU to carry its own information, such as the MST instance 0 information, the MST instance 1 information, and so on. The following list explains only the newly added fields/components under this “MST extension” of the RST/MST BPDU format:

1-MST Config ID format selector: this field is always set to 0 and is reserved for future use.
2-MST Config name: this field carries the configuration name of this MST region.
3-MST Config revision: this field carries the revision number of the MST configuration. All the MST switches within the same MST region must have the same revision number.
4-MST Config digest: this field carries a hash of the VLANs-to-instance mapping table. Every MST switch within the same MST region must calculate the same MST config digest value, which proves that those MST switches have the same VLANs-to-instance mapping without sending the whole mapping table in the BPDU.
5-CIST Internal Root path cost: this field indicates the cost of the path inside the MST region to reach the CIST Regional Root bridge (the IST Root bridge).
6-CIST Bridge ID: this field carries the Bridge ID of the switch that transmits this MST BPDU, and it consists of (CIST bridge priority + CIST extended system ID + CIST bridge base MAC address). The Bridge IDs of the CIST Root bridge and of the CIST Regional Root bridge (again, it is the same as the IST Root bridge) are carried in the RST part of the BPDU, so the transmitting switch identifies itself in this field of the MST extension.
7-CIST Remaining hops: this field indicates how many more hops this MST BPDU may travel inside the MST region. It plays the same role as the Message Age and tells whether this MST BPDU is still valid or not: each switch within the MST region decrements the CIST Remaining hops value by one when it receives the BPDU and then sends it toward the downstream switches. If a switch receives an MST BPDU, decrements the value and it becomes zero, it considers the MST BPDU invalid and discards it.
8-MSTID # field: this field consists of multiple fields that describe the MST instance number # as follows:
a-MSTI flags: this field carries the Spanning Tree flags for MST instance number #. It is the same as the flags field of the BPDU explained in the RSTP post.
b-MSTID # Root Bridge ID: this field indicates the Root Bridge ID for MST instance number #.
c-Internal Root Path cost: this field indicates the cost of the path to reach the Root bridge of MST instance number #.
d-Bridge Identifier Priority: this field indicates the Bridge priority of the switch that sends this BPDU for MST instance number #. The value is expressed in units of 4096, because the Bridge priority must be a multiple of 4096: the lower 12 bits of the priority field are used by the Extended system ID (the VLAN ID in PVST+, the MST instance number in MST) to give each instance a unique Bridge ID. This means that if you open the MST BPDU using Wireshark and find that the value equals 2, the Bridge priority equals 2 x 4096 = 8192.
e-Port Identifier Priority: this field indicates the priority of the sender port. The field is 8 bits long and only its upper 4 bits carry the priority, so if you open the MST BPDU using Wireshark and find that the value equals 4, the actual value of the field is 40 in Hex (not 4), and 40 in Hex equals 64 in decimal, so the Port priority equals 64.
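The arithmetic behind the last two fields can be sketched in Python as follows. The helper names are hypothetical, and the sketch assumes (as described above) that the priority is carried in the upper 4 bits of each 8-bit field, which is the value Wireshark displays.

```python
def nibble_from_octet(octet):
    """Return the upper 4 bits of an 8-bit field (assumed to carry the priority)."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must fit in 8 bits")
    return octet >> 4


def bridge_priority(nibble):
    """Bridge priority is expressed in steps of 4096 (lower 12 bits = extended system ID)."""
    if not 0 <= nibble <= 0xF:
        raise ValueError("nibble must fit in 4 bits")
    return nibble * 4096


def port_priority(nibble):
    """Port priority: the displayed nibble is the upper half of the octet, e.g. 4 -> 0x40."""
    if not 0 <= nibble <= 0xF:
        raise ValueError("nibble must fit in 4 bits")
    return nibble << 4


print(bridge_priority(2))  # 8192
print(port_priority(4))    # 64
print(bridge_priority(nibble_from_octet(0x20)))  # 8192
```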

How does MST work?

An MST topology consists of many switches that participate in the MST domain. All the switches that run MST and have the same MST configuration belong to one “MST Region”: they are configured with the same MST region name, the same MST revision number and the same VLAN(s)-to-MST instance mappings (and therefore the same MST instances). As mentioned before, an MST switch defines one or more MST instances and either maps all the VLANs to one MST instance or divides the VLANs and maps each range of VLANs to a different MST instance.

Once we enable MST on the switch, the system defines an MST instance called the Internal Spanning Tree (IST), which is MST instance no. 0, and by default all the VLANs in the VLAN database are mapped to it. This way the default configuration of MST still protects your network from loops: the switches participating in the MST region elect one of them to act as the Root bridge for the IST.

The IST also has another purpose: it is used to interoperate with the non-MST switches that are connected to our MST region but don’t belong to it, so that your network is still protected from loops that involve other non-MST domains. Once the IST determines which links toward the outside should be blocked and which should be forwarding, this state is inherited by all the VLANs, both the VLANs mapped to the IST and the VLANs mapped to the other MST instances. This means that if such a boundary port is in the Forwarding state, it is Forwarding for all the VLANs, not just for the VLANs mapped to the IST.

The entire MST region appears as one large single switch from the point of view of the switches outside the MST region, thanks to the IST which handles the interaction with the outside.

What about the interaction between different MST regions? What happens when we have multiple redundant links between two different MST regions? The answer is the CIST (Common and Internal Spanning Tree). The Common Spanning Tree (CST) is the standard spanning tree understood by every switch that runs Spanning Tree, so it is the common language of all the STP-capable switches. For this reason, when two MST regions are connected with multiple redundant links, they fall back to the CST between them, because each region has its own MST configuration that the other region doesn’t share. Each MST region also has its own IST instance, which helps it interoperate with the outside world. So if two MST regions are connected to each other via multiple redundant links, they use both the IST and the CST: the switches within the same MST region use the IST internally, while the MST regions communicate with each other using the CST.

Each MST region determines which switch becomes the Root bridge for the IST, which is also called the “CIST Regional Root bridge”. It represents the region from the point of view of the other regions within this Layer 2 switched network (consisting of multiple MST regions). Initially, this CIST Regional Root bridge is the bridge with the lowest Bridge ID in the region, where the Bridge ID consists of (Bridge IST priority + instance number, which equals 0 + Bridge base MAC address). Once every region elects its CIST Regional Root bridge, the regions co-operate with each other to elect the Root bridge of all these regions, which is called the “CIST Root Bridge”: the switch with the lowest Bridge ID among all the regions.

The switch that is elected as the CIST Root Bridge is also the CIST Regional Root Bridge in its own region. A region that doesn’t contain the CIST Root bridge chooses another switch as its IST Root bridge (again, the IST Root bridge is the same as the CIST Regional Root bridge under another name; during the explanation I will say IST Root bridge). This new IST Root bridge is one of the switches that are connected to the other MST region, not necessarily the switch with the lowest IST Bridge ID within the region. The following figure shows which switches I am talking about:

cst1

The CIST Root bridge is located in MST Region 2, so the IST Root bridge of Region 1 must be either Switch2 or Switch5, as they are the boundary switches of MST Region 1. The IST Root bridge of such a region is elected based on the lowest external root path cost to reach the CIST Root bridge. The external root path cost is the sum of the costs of the links that connect the MST regions with each other; the costs of the links inside each MST region are not counted. For example, if the cost of the link connecting Switch2 to MST region 2 is 10, the external root path cost from Switch2 to the CIST Root bridge is 10, regardless of the link costs inside the MST regions. If the external root path cost to reach the CIST Root bridge is the same from both Switch2 and Switch5, the switch with the lowest Bridge ID is chosen as the IST Root bridge.

Let’s see the following figure, which shows the Layer 2 switched network, to understand how the CST is used when we have multiple redundant links between two MST regions:

cst-1

In this topology, we have two MST regions (MST region 1 and MST region 2). For simplification, each switch uses the same Bridge ID for the IST and for all the other MST instances:

1-Bridge ID of Switch2: 32768.aabb.cc00.0300
2-Bridge ID of Switch5: 32768.aabb.cc00.0c00
3-Bridge ID of Switch3: 32768.aabb.cc00.0200
4-Bridge ID of Switch4: 4096.aabb.cc00.0700
5-Bridge ID of Switch6: 32768.aabb.cc00.0d00
6-Bridge ID of Switch7: 32768.aabb.cc00.0b00
7-Bridge ID of Switch8: 32768.aabb.cc00.0e00

The following steps explain the inter-operation between the MST regions using the CST and the IST:

1-All the switches within the same MST region start to exchange their RST/MST BPDUs with each other (as mentioned before, MST depends on RSTP for its operation, so it uses the RST BPDU format), so that they can learn which switch within the MST region is elected as the Root bridge for the different configured MST instances.

cst

RST/MST BPDU originated by Switch2 and sent toward Switch5:

mst bpdu1

RST/MST BPDU originated by Switch4 and sent toward Switch3:

mst bpdu2

I will not show all the BPDUs exchanged between all the switches, just one example from each MST region.
2-Based on the Bridge IDs listed above, Switch2 has a better Bridge ID than Switch5 from the IST (MST instance number 0) point of view within MST region 1, so it is elected as the IST Root bridge for MST region 1. Switch4 has a better Bridge ID than Switch3, Switch6, Switch7 and Switch8, so it is elected as the IST Root bridge for MST region 2. Each Root bridge then floods its own superior RST/MST BPDU within its region.

cst2

Superior RST/MST BPDU originated by Switch2 and sent toward Switch5:

mst bpdu3

Superior RST/MST BPDU originated by Switch4 and sent toward the other switches within MST region 2:

mst bpdu2
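The election in step 2 is just a comparison of Bridge IDs, and it can be sketched in Python. The Bridge IDs and the region membership are the ones given in this post; the helper names are mine, and the sketch assumes that a Bridge ID is compared as (priority, MAC address) with the lower value winning.

```python
BRIDGE_IDS = {
    "Switch2": "32768.aabb.cc00.0300",
    "Switch5": "32768.aabb.cc00.0c00",
    "Switch3": "32768.aabb.cc00.0200",
    "Switch4": "4096.aabb.cc00.0700",
    "Switch6": "32768.aabb.cc00.0d00",
    "Switch7": "32768.aabb.cc00.0b00",
    "Switch8": "32768.aabb.cc00.0e00",
}

REGIONS = {
    "MST region 1": ["Switch2", "Switch5"],
    "MST region 2": ["Switch3", "Switch4", "Switch6", "Switch7", "Switch8"],
}


def bridge_id_key(bridge_id):
    """Turn 'priority.aabb.cc00.0300' into a comparable (priority, mac) tuple."""
    priority, mac = bridge_id.split(".", 1)
    return int(priority), int(mac.replace(".", ""), 16)


def elect_root(switches):
    """The switch with the lowest Bridge ID wins the election."""
    return min(switches, key=lambda name: bridge_id_key(BRIDGE_IDS[name]))


ist_roots = {region: elect_root(members) for region, members in REGIONS.items()}
print(ist_roots)               # {'MST region 1': 'Switch2', 'MST region 2': 'Switch4'}
print(elect_root(BRIDGE_IDS))  # Switch4
```

The last line runs the same comparison over all the switches of both regions, which is the CIST Root bridge election explained in the next steps.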

3-The MST switches located at the boundary of each MST region (i.e. the switches connected to the other MST region) send the superior MST BPDU of the IST Root bridge of their region. So Switch2 sends the superior MST BPDU of the IST Root bridge of MST region 1 to Switch3 out of Eth1/2, and Switch5 sends it to Switch3 out of Eth1/0 and to Switch6 out of Eth0/1. At the same time, Switch3 sends the superior MST BPDU of the IST Root bridge of MST region 2 to Switch2 out of Eth1/2 and to Switch5 out of Eth1/0, and Switch6 sends it to Switch5 out of Eth0/1.

cst3

Superior RST/MST BPDU originated by Switch2 and sent toward Switch3 (Switch5 sends the same BPDU to both Switch3 and Switch6):

mst bpdu4

Superior RST/MST BPDU originated by Switch3 and sent toward Switch2 (Switch3 and Switch6 send the same BPDU to Switch5):

mst bpdu5

4-The superior MST BPDU of the IST Root bridge sent by MST region 2 is better than the one sent by MST region 1, because the IST Root bridge of MST region 2 is Switch4, which has the lowest Bridge ID in the whole Layer 2 switched topology. For this reason Switch4 becomes the CIST Root bridge, and MST region 1 stops sending the superior MST BPDU of its own IST Root bridge as it lost the CIST Root bridge election.

cst4

The final superior RST/MST BPDU sent by Switch3 to Switch2 and Switch5, and by Switch6 to Switch5:

mst bpdu5

5-As mentioned before, the Bridge IDs of the switches are the same for the IST and all the other MST instances for simplification, so we can determine which ports are Root, Designated and non-Designated. Once Switch4 becomes the CIST Root bridge, we can determine the port roles within MST region 2. Ports Eth0/0, Eth0/1 and Eth1/1 of Switch4 become Designated, as Switch4 is the CIST Root bridge, the IST Root bridge and the Root bridge for all the other MST instances (for simplification). For Switch3, port Eth0/1 becomes the Root port, while ports Eth0/3 and Eth1/1 become Designated ports as the Bridge ID of Switch3 is better than those of Switch6 and Switch7 within MST region 2. Ports Eth1/0 and Eth1/2 of Switch3 become Designated as well: Switch4 is the CIST Root bridge, so MST region 2 won the CIST Root bridge election and sends the superior MST BPDU for the CST. For Switch6, port Eth1/1 becomes the Root port, port Eth0/3 becomes Alternate, and ports Eth0/0 and Eth0/1 become Designated ports. For Switch7, port Eth0/0 becomes the Root port, while port Eth1/1 becomes Alternate. For Switch8, port Eth0/0 becomes the Root port.

cst5

6-MST Region 1 needs to elect a new IST Root bridge: the bridge closest to the CIST Root bridge, which means the bridge with the lowest cost to reach it. In this topology, the link between Switch2 and Switch3 has a cost of 100, the link between Switch5 and Switch3 has a cost of 10, and the link between Switch5 and Switch6 has a cost of 1, so Switch5 has the lowest cost to reach the CIST Root bridge. For this reason Switch5 is elected as the new IST Root bridge for MST region 1, and it starts to flood the superior MST BPDU that it received from Switch6 on its port Eth0/1. Port Eth0/1 therefore becomes the Root port from the IST point of view. It plays the same role for the other MST instances, but there it has another name, “Master port”. Port Eth1/0 of Switch5 becomes an Alternate port, and port Eth1/2 of Switch2 becomes an Alternate port too, as this switch is no longer the IST Root bridge for MST region 1. The following figures show the loop-free Layer 2 switched network for the entire topology, as well as the logical representation of this network, i.e. how each MST region looks from the point of view of the other MST region.

cst6

logical

The previous figure shows that “Switch-MST-1” is the one huge logical switch that represents the entire MST Region 1 when seen from outside MST region 1, and “Switch-MST-2” is the one huge logical switch that represents the entire MST Region 2 when seen from outside MST region 2.
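The choice of the new IST Root bridge in step 6 can be sketched in Python as well. The ports, link costs and Bridge IDs are the ones used in this post; the class and function names are mine. The sketch assumes that the external root path cost of each boundary port is simply the cost of its inter-region link, because the CIST Root bridge is inside MST region 2 and the internal costs of that region are not counted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundaryLink:
    switch: str    # boundary switch in MST region 1
    port: str      # its port toward MST region 2
    neighbor: str  # switch in MST region 2 at the other end
    cost: int      # cost of the inter-region link


# (priority, MAC) of the two boundary switches of MST region 1.
BRIDGE_IDS = {
    "Switch2": (32768, 0xAABBCC000300),
    "Switch5": (32768, 0xAABBCC000C00),
}

LINKS = [
    BoundaryLink("Switch2", "Eth1/2", "Switch3", 100),
    BoundaryLink("Switch5", "Eth1/0", "Switch3", 10),
    BoundaryLink("Switch5", "Eth0/1", "Switch6", 1),
]


def elect_ist_root(links):
    """Lowest external root path cost wins; the lowest Bridge ID breaks a tie."""
    best = min(links, key=lambda link: (link.cost, BRIDGE_IDS[link.switch]))
    return best.switch, best.port


print(elect_ist_root(LINKS))  # ('Switch5', 'Eth0/1')

# Hypothetical tie: if every link had the same cost, the lower Bridge ID would decide.
tied = [BoundaryLink(l.switch, l.port, l.neighbor, 10) for l in LINKS]
print(elect_ist_root(tied))   # ('Switch2', 'Eth1/2')
```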

In the end, the two MST regions inter-operate with each other using both the IST and the CST, which together are what we call the “CIST”, and the result is a loop-free topology for a Layer 2 switched network that consists of multiple MST regions.

How does MST interoperate with non-MST switches (PVST/PVST+ switches) from the BPDU exchange point of view?

The MST switches generate only one MST BPDU that carries the information about all the defined MST instances. This single MST BPDU is sent over the Native VLAN, untagged, and it has no PVID TLV like the one defined for the PVST/PVST+ BPDU. What happens if we have an MST switch connected to a PVST/PVST+ switch? The PVST/PVST+ switch doesn’t understand the MST BPDU, hence it can’t understand the MST extension that carries the information about the VLANs-to-Instance mapping of this MST region, so there is no common language between the two switches. In this case the MST switch uses normal PVST/PVST+ BPDUs on that link, so that the PVST/PVST+ switch can learn the Root bridge information for each VLAN and both switches agree on who the Root bridge is for each VLAN. The MST switch replicates the information about the CIST Root bridge in every one of these BPDUs. For example, if 4 VLANs are allowed on the trunk link connecting the MST switch with the PVST/PVST+ switch, the MST switch generates 4 BPDUs (one for each VLAN). The following figures show the 4 BPDUs for VLANs 1, 2, 3 and 4:

Configuration BPDU generated by MST boundary switch for VLAN1:

pvst+1

Configuration BPDU generated by MST boundary switch for VLAN2:

pvst+2

Configuration BPDU generated by MST boundary switch for VLAN3:

pvst+3

Configuration BPDU generated by MST boundary switch for VLAN4:

pvst+4
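The replication described above can be pictured with a small Python sketch. It is only a model of the idea, not of the real BPDU encoding: the `SimulatedBpdu` class is hypothetical, and the CIST Root Bridge ID used here is just an illustrative value (the Bridge ID of Switch4 from the earlier topology).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulatedBpdu:
    vlan: int            # VLAN that this BPDU is generated for
    root_bridge_id: str  # CIST Root bridge information copied into every BPDU


def bpdus_for_trunk(cist_root_bridge_id, allowed_vlans):
    """One BPDU per VLAN allowed on the trunk, all carrying the same CIST Root bridge."""
    return [SimulatedBpdu(vlan, cist_root_bridge_id) for vlan in sorted(allowed_vlans)]


bpdus = bpdus_for_trunk("4096.aabb.cc00.0700", [1, 2, 3, 4])
for bpdu in bpdus:
    print(f"VLAN {bpdu.vlan}: Root bridge {bpdu.root_bridge_id}")
```

The number of BPDUs grows with the number of VLANs allowed on the trunk, while the Root bridge information in all of them stays the same.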

MST Basic Configurations:

You can configure the switch with MST mode using the following command:

mst config1

You can configure the MST configuration parameters using the following commands:

mst config2

1-“spanning-tree mst configuration” command is used to enter the MST configuration mode to be able to define the MST configuration parameters.
2-“name MST-1” command is used to define the name of the MST region, and this name is “MST-1”.
3-“revision 100” command is used to define the revision number of the MST configuration. It must match on all the MST switches within the same MST region, which gives another way to be sure that the MST configuration is matched among all the MST switches that should be within the same MST region.
4-“instance 1 vlan 1-167” command is used to define which VLANs are mapped to which MST instance; here VLANs 1 to 167 are mapped to MST instance 1.
5-“instance 2 vlan 168-334” command maps VLANs 168 to 334 to MST instance 2.
6-“instance 3 vlan 335-500” command maps VLANs 335 to 500 to MST instance 3.
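If you need the same region configuration on many switches, a small script can generate the command lines listed above. The following Python sketch only builds text from the commands quoted in this post; the function name is mine, the leading-space indentation is cosmetic, and the exact syntax may differ between platforms, so check the output against your switch before pasting it.

```python
def render_mst_config(name, revision, instance_vlans):
    """Build the MST configuration commands as text.

    instance_vlans maps an MST instance number to a (first_vlan, last_vlan) tuple.
    """
    lines = [
        "spanning-tree mst configuration",
        f" name {name}",
        f" revision {revision}",
    ]
    for instance, (first, last) in sorted(instance_vlans.items()):
        lines.append(f" instance {instance} vlan {first}-{last}")
    return "\n".join(lines)


config_text = render_mst_config(
    "MST-1", 100, {1: (1, 167), 2: (168, 334), 3: (335, 500)}
)
print(config_text)
```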

All the switches that should be within the same MST region must have the same MST configuration: the same MST name, the same revision number and the same VLANs-to-Instance mapping.
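The following Python sketch shows how two switches could be compared on these three items, using a digest of the mapping table in the way the MST Config digest field works. Treat it as an approximation under stated assumptions: I assume the digest is an HMAC-MD5 over a table of 4096 two-byte entries indexed by VLAN ID, and the key below is the signature key as I recall it from IEEE 802.1Q, so verify both against the standard before comparing the output with a real switch. All the function names are mine.

```python
import hashlib
import hmac

# Assumption: digest signature key as recalled from IEEE 802.1Q; verify before relying on it.
DIGEST_KEY = bytes.fromhex("13AC06A62E47FD51F95D2BA243CD0346")


def vlan_table(instance_ranges):
    """Expand {instance: (first_vlan, last_vlan)} into {vlan: instance}."""
    return {
        vlan: instance
        for instance, (first, last) in instance_ranges.items()
        for vlan in range(first, last + 1)
    }


def config_digest(vlan_to_instance):
    """HMAC-MD5 over the mapping table; unmapped VLANs stay in instance 0 (the IST)."""
    table = bytearray(4096 * 2)
    for vlan, instance in vlan_to_instance.items():
        if not 1 <= vlan <= 4094:
            raise ValueError(f"invalid VLAN ID {vlan}")
        table[2 * vlan:2 * vlan + 2] = instance.to_bytes(2, "big")
    return hmac.new(DIGEST_KEY, bytes(table), hashlib.md5).digest()


def same_region(a, b):
    """Two switches share a region only if name, revision and digest all match."""
    return (
        a["name"] == b["name"]
        and a["revision"] == b["revision"]
        and hmac.compare_digest(config_digest(a["vlans"]), config_digest(b["vlans"]))
    )


ranges = {1: (1, 167), 2: (168, 334), 3: (335, 500)}
switch_a = {"name": "MST-1", "revision": 100, "vlans": vlan_table(ranges)}
switch_b = dict(switch_a)
# Hypothetical misconfiguration: VLAN 167 is mapped to instance 2 instead of 1.
switch_c = {
    "name": "MST-1",
    "revision": 100,
    "vlans": vlan_table({1: (1, 166), 2: (167, 334), 3: (335, 500)}),
}

print(same_region(switch_a, switch_b))  # True
print(same_region(switch_a, switch_c))  # False
```

A single VLAN mapped differently is enough to change the digest, so switch_c would end up in a different MST region even though its name and revision number match.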

You can change the Bridge priority per MST instance using the following command:

mst config3

Here we changed the bridge priority for MST instance 1 for Switch4 to be 4096.

You can verify that the switch is running MST using the following command:

mst config4

You can verify the MST configuration on the switch using the following command:

mst config5

 

Hope that the post is helpful.

Regards

Mostafa Hamza
