Ok… I have a confession to make. I’ve worked for an MPLS L3VPN provider for about 5 years now. For most of that time, I did not have a real grasp on what MPLS was or how it really worked. When I first got to digging around and working tickets I basically just thought to myself “oh, it’s just all different VRF’s. The VRF’s separate all the routing tables for each customer. That’s MPLS.” I undersold it quite a bit. But that’s the thing…. I got by with knowing just that. I guess that’s a wonderful thing about an MPLS backbone. You configure it once and for the most part, it just works! I want to go over a few of the fundamentals of MPLS and how they specifically apply to MPLS VPN’s.
Purpose/General Concepts of MPLS:
A good description I’ve heard about MPLS is that it’s “a way to achieve Layer 3 routing at Layer 2 speeds.” Update: I’ve realized with modern hardware, this statement is no longer relevant 😉 MPLS is essentially a different mechanism for determining how routers actually forward packets. Instead of doing a deep inspection of the IP header, MPLS routers can make all or most forwarding decisions based on a 4 byte header that is inserted between the L2 and L3 headers. This is known as an MPLS header, and it carries the label. There are instances where you will have more than one MPLS label encapsulating the L3 header, but we will get to that later.
CEF (Cisco Express Forwarding) plays a big part in MPLS. After a router has decided the best routes to put into its routing table (RIB), it takes that a step further with CEF and puts those routes into its forwarding table (FIB). MPLS goes one step beyond that: it keeps track of all the label bindings for those destination prefixes and places them into something called the LFIB, or Label Forwarding Information Base.
Label Distribution Protocol (LDP) is the primary way that routers exchange the labels they have allocated for destination prefixes. LDP discovers neighbors with hello messages sent over UDP port 646 and then forms the actual adjacency over a unicast TCP session on port 646. Routers that apply, remove, or change MPLS labels are referred to as LSR’s (Label Switch Routers). Those three operations are often referred to as “push, pop, and swap.” By default, a router will automatically generate an MPLS label for each prefix in its routing table and advertise it via LDP when an adjacency is formed. When the routers are advertising their locally generated labels, they are essentially telling the other routers “if you have packets destined for a specific prefix, send the packets to me using this label.” This will occur across an entire LSP, or Label Switched Path.
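To make the “send the packets to me using this label” idea a bit more concrete, here is a minimal Python sketch of how advertised labels could end up as LFIB entries. This is a toy model and not how LDP is actually implemented; the router names, the prefix, and the label values are all made up for illustration.

```python
from itertools import count


class Router:
    """Toy LSR: allocates a local label per prefix and builds an LFIB."""

    def __init__(self, name, first_label):
        self.name = name
        self._labels = count(first_label)  # hypothetical starting label value
        self.local_bindings = {}    # prefix -> label this router allocated
        self.remote_bindings = {}   # (neighbor name, prefix) -> neighbor's label
        self.lfib = {}              # incoming label -> (outgoing label, next hop)

    def allocate(self, prefix):
        """Allocate (once) a locally significant label for a prefix."""
        if prefix not in self.local_bindings:
            self.local_bindings[prefix] = next(self._labels)
        return self.local_bindings[prefix]

    def advertise_to(self, neighbor):
        """Tell a neighbor: 'for these prefixes, send to me with this label'."""
        for prefix, label in self.local_bindings.items():
            neighbor.remote_bindings[(self.name, prefix)] = label

    def install(self, prefix, next_hop):
        """Pair our own label with the label learned from the next hop."""
        in_label = self.allocate(prefix)
        out_label = self.remote_bindings[(next_hop.name, prefix)]
        self.lfib[in_label] = (out_label, next_hop.name)


p1 = Router("P1", first_label=100)
p2 = Router("P2", first_label=200)
prefix = "192.0.2.1/32"

p2.allocate(prefix)       # P2 generates a label for the prefix
p2.advertise_to(p1)       # ...and advertises it to P1
p1.install(prefix, p2)    # P1 will swap its own label for P2's label

print(p1.lfib)  # {100: (200, 'P2')}
```

The thing to notice is that labels are only locally significant: P1 and P2 use different label values for the same prefix, and the LFIB is what maps one to the other.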
An Edge LSR is a router that handles both labeled and unlabeled packets. There are times that these routers will not want traffic to come in with a particular MPLS label. For example, if a router were to receive a packet with an MPLS label and it knew that the packet would be forwarded out an interface that is not enabled for MPLS, it would have to inspect the L3 header for more information anyway. Therefore it would prefer to receive the packet as unlabeled to reduce unnecessary processing time. In order for the router to inform other routers to do so, it will advertise an MPLS label of 3 for those prefixes. MPLS label 3 is reserved and known as the “implicit null” label. The resulting behavior, where the second-to-last router removes the label before forwarding the packet, is referred to as Penultimate Hop Popping (PHP).
As I said before, there will be times when multiple MPLS labels are applied to a single packet. This is referred to as the MPLS Label Stack and is very important when it comes to MPLS VPNs. Depending on the situation, the router will most likely make its forwarding decision based on the outermost label, or the one at the “top of the stack.” The router will know that a label is the “bottom of the stack” if it has its S bit set to 1. This lets the router know that the next thing to follow will be the Layer 3 header.
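Here is a small Python sketch of the decision the upstream router makes when it sees the label its next hop advertised. The function names and the label values other than 3 are my own, purely for illustration.

```python
IMPLICIT_NULL = 3  # reserved label value: "pop the top label before sending to me"


def forward_action(advertised_label):
    """Decide what to do with the top label, given the label the next hop advertised."""
    if advertised_label == IMPLICIT_NULL:
        return ("pop", None)
    return ("swap", advertised_label)


def apply_action(stack, advertised_label):
    """Apply the action to a label stack (top of stack is stack[0])."""
    action, label = forward_action(advertised_label)
    rest = stack[1:]
    return rest if action == "pop" else [label] + rest


print(apply_action([100], 200))  # [200]  normal swap
print(apply_action([100], 3))    # []     penultimate hop pops; packet leaves unlabeled
```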
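As a rough illustration of the stack and the S bit, here is a Python sketch that encodes and decodes a label stack. Each 4 byte entry is laid out as a 20 bit label, 3 bit traffic class, 1 bit S flag, and 8 bit TTL. The label values, TTL, and payload below are hypothetical.

```python
import struct


def encode_stack(labels, ttl=64, tc=0):
    """Encode labels (top of stack first) as consecutive 4 byte MPLS headers."""
    out = b""
    for i, label in enumerate(labels):
        s = 1 if i == len(labels) - 1 else 0  # S bit is set only on the bottom entry
        word = (label << 12) | (tc << 9) | (s << 8) | ttl
        out += struct.pack("!I", word)
    return out


def decode_stack(data):
    """Read 4 byte headers until one has S=1; return (labels, remaining payload)."""
    labels, offset = [], 0
    while True:
        (word,) = struct.unpack_from("!I", data, offset)
        offset += 4
        labels.append(word >> 12)
        if (word >> 8) & 1:  # bottom of stack: the L3 header follows
            return labels, data[offset:]


frame = encode_stack([200, 500]) + b"IP-packet"
labels, payload = decode_stack(frame)
print(labels, payload)  # [200, 500] b'IP-packet'
```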
So why would you need multiple labels? The MPLS label stack is a critical piece of traffic forwarding for MPLS VPNs. Pairing MPLS with other technologies such as VRF’s and Multi-Protocol BGP is a powerful tool for SP’s to deliver WAN services to customers. Let’s go over these topics and how they can work with MPLS.
In order to adhere to the “VPN” part, there needs to be some kind of segmentation between customers. This is where you will use a VRF, or Virtual Routing and Forwarding instance. When you create a VRF, the router creates a new virtual instance of the routing table. In addition, only interfaces that are specifically assigned to that particular VRF belong to that specific routing table. What you have created is essentially a VPN. The VRF contains its own control plane instances, and the data plane is separated as well, because forwarding is based on that VRF’s routing table. One of the most important things here is that you can have overlapping address space between VRF’s since they are all within their own virtual routing table!
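If it helps to see it, here is a toy Python model of per-VRF routing tables. The interface names, VRF names, and next hops are invented for the example; the point is only that the ingress interface picks the table, so the same prefix can exist in two VRF’s without conflict.

```python
import ipaddress


class PE:
    """Toy router with one routing table per VRF."""

    def __init__(self):
        self.vrf_tables = {}     # VRF name -> {network: next hop}
        self.interface_vrf = {}  # interface name -> VRF name

    def add_route(self, vrf, prefix, next_hop):
        table = self.vrf_tables.setdefault(vrf, {})
        table[ipaddress.ip_network(prefix)] = next_hop

    def lookup(self, in_interface, dst):
        """Longest-prefix match inside the VRF bound to the ingress interface."""
        table = self.vrf_tables[self.interface_vrf[in_interface]]
        addr = ipaddress.ip_address(dst)
        matches = [net for net in table if addr in net]
        if not matches:
            return None
        return table[max(matches, key=lambda net: net.prefixlen)]


pe = PE()
pe.interface_vrf.update({"Gi0/1": "CUST_A", "Gi0/2": "CUST_B"})
pe.add_route("CUST_A", "10.1.1.0/24", "192.168.1.2")
pe.add_route("CUST_B", "10.1.1.0/24", "192.168.2.2")  # same prefix, different VRF

print(pe.lookup("Gi0/1", "10.1.1.10"))  # 192.168.1.2
print(pe.lookup("Gi0/2", "10.1.1.10"))  # 192.168.2.2
```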
Now that you have interfaces segmented into different VRF’s, how do you scale this? If you’re a glutton for punishment and have a large network, you can do something referred to as “VRF Lite.”
VRF Lite is essentially the barebones approach to using VRF’s. In order to maintain traffic segmentation through the network, VRF’s must be locally created on every device in the path if you want correct functionality. This is obviously not the preferred approach because of its issues with scalability and overhead.
In order to efficiently exchange routes, but still keep them within the VRF’s you have defined, you can use Multi-Protocol BGP.
Multi-Protocol BGP is a number of extensions built on top of BGP that allow you to do more than just normal exchange of NLRI’s.
Let’s take a look at one of the issues you will obviously see with our previous example of multiple VRF’s on a single router: overlapping address space. Normal behavior of BGP consists of only advertising the BEST route for a specific prefix to its neighbors. If you have the same prefix in multiple routing tables… how does the router know which one to advertise? You overcome this issue by defining something with the VRF known as the Route-Distinguisher.
The purpose of the Route-Distinguisher is to do just that… DISTINGUISH the route. It is an 8 byte value that is added to the beginning of the NLRI prefix. It must be unique per VRF. The format is two values, separated by a colon. Often, the format you see used is ASN:nn or IP-address:nn.
However, this value is truly arbitrary and can be whatever you’d like it to be.
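For the curious, here is a Python sketch of how those two common formats fit into 8 bytes: a 2 byte type field followed by 6 bytes of value (type 0 for ASN:nn, type 1 for IP-address:nn, per RFC 4364). The RD values are hypothetical, and the sketch does not handle the 4 byte ASN variant.

```python
import ipaddress
import struct


def encode_rd(rd):
    """Encode 'ASN:nn' (type 0) or 'IP-address:nn' (type 1) as an 8 byte RD."""
    admin, assigned = rd.rsplit(":", 1)
    if "." in admin:
        # Type 1: 4 byte IPv4 address + 2 byte assigned number
        ip = ipaddress.IPv4Address(admin).packed
        return struct.pack("!H4sH", 1, ip, int(assigned))
    # Type 0: 2 byte ASN + 4 byte assigned number
    return struct.pack("!HHI", 0, int(admin), int(assigned))


print(encode_rd("65000:100").hex())    # 0000fde800000064
print(encode_rd("192.0.2.1:5").hex())  # 0001c00002010005
```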
The combination of an NLRI prefix and its RD is known as a VPNv4 route. This is what gives the router the ability to advertise the same prefix multiple times to its BGP neighbors. Each VPNv4 route is truly unique. For example, if multiple customers are using the 10.1.1.0/24 prefix, the router will advertise one VPNv4 route per customer, each with that customer’s RD prepended to the prefix.
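A quick Python sketch of the idea, using made-up VRF names and RD values (the RD:prefix string form is just for display):

```python
def vpnv4(rd, prefix):
    """A VPNv4 route is identified by the RD plus the IPv4 prefix."""
    return f"{rd}:{prefix}"


# Hypothetical RDs for two customers that both use 10.1.1.0/24
vrf_rds = {"CUST_A": "65000:1", "CUST_B": "65000:2"}

plain_ipv4 = {"10.1.1.0/24" for _ in vrf_rds}
vpnv4_routes = {vpnv4(rd, "10.1.1.0/24") for rd in vrf_rds.values()}

print(len(plain_ipv4))       # 1  (the prefixes collide)
print(sorted(vpnv4_routes))  # ['65000:1:10.1.1.0/24', '65000:2:10.1.1.0/24']
```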
Note: The RD only makes the route unique so that it can advertise each prefix to its neighbors. The RD does NOT determine how the receiving router will inject those routes into the corresponding VRFs. That is actually done via the route-target.
The route-target is a 64 bit (8 byte) extended community value that is attached to each VPNv4 route in BGP. This is how the VRF’s both IMPORT and EXPORT routes. The RT follows the same format as the RD (ASN:nn or IP-address:nn). A VPNv4 speaking router only accepts VPNv4 routes whose route-target is imported by at least one of its local VRF’s. Route reflectors are the exception to this rule, as they are responsible for sending all VPNv4 routes to all routers within that AS.
You can define specifically within each VRF what route-targets they would like to import and export. They can have multiple import and export statements if necessary. You can also apply filters to these imports and exports if necessary via a route-map.
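Here is a minimal Python sketch of the import logic, assuming a simplified route object that carries a set of RT’s. The class names, VRF names, and RT values are all hypothetical, and route-map filtering is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Vrf:
    name: str
    import_rts: set
    export_rts: set
    routes: dict = field(default_factory=dict)  # prefix -> RD it was learned with


@dataclass(frozen=True)
class Vpnv4Route:
    rd: str
    prefix: str
    route_targets: frozenset


def export_route(vrf, rd, prefix):
    """Exporting attaches the VRF's export RTs to the VPNv4 route."""
    return Vpnv4Route(rd, prefix, frozenset(vrf.export_rts))


def receive(local_vrfs, route):
    """Install the route into every VRF importing one of its RTs.

    Returns False when no local VRF wants the route, i.e. it is dropped.
    """
    kept = False
    for vrf in local_vrfs:
        if vrf.import_rts & route.route_targets:
            vrf.routes[route.prefix] = route.rd
            kept = True
    return kept


remote_a = Vrf("CUST_A", import_rts={"65000:1"}, export_rts={"65000:1"})
local_a = Vrf("CUST_A", import_rts={"65000:1"}, export_rts={"65000:1"})
local_b = Vrf("CUST_B", import_rts={"65000:2"}, export_rts={"65000:2"})

route = export_route(remote_a, rd="65000:1", prefix="10.1.1.0/24")
print(receive([local_a, local_b], route))  # True
print(local_a.routes, local_b.routes)      # {'10.1.1.0/24': '65000:1'} {}
```

Notice that the RD and RT happen to match here, which is common in practice, but it is the RT and not the RD that decides which VRF the route lands in.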
As with normal behavior of BGP, routers must exchange capabilities in the OPEN message to determine what each BGP speaker supports. Normally, IPv4 route exchange is enabled by default (you can disable this behavior, which is necessary in some cases). In order for routers to exchange VPNv4 routes, or IPv6/VPNv6 routes for that matter, that specific neighbor must be enabled to do so under the appropriate address-family in BGP. Both routers must be in agreement on their capabilities for them to send/receive these specific routes.
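The “both routers must agree” part boils down to an intersection of what each side advertised. Here is a tiny Python sketch using the (AFI, SAFI) pairs for each address family; the variable and function names are mine, and real capability negotiation carries more than just address families.

```python
# (AFI, SAFI) pairs as carried in the multiprotocol capability
IPV4_UNICAST = (1, 1)
IPV6_UNICAST = (2, 1)
VPNV4_UNICAST = (1, 128)
VPNV6_UNICAST = (2, 128)


def negotiated(local_caps, remote_caps):
    """Only address families advertised by BOTH peers can be exchanged."""
    return set(local_caps) & set(remote_caps)


pe1 = {IPV4_UNICAST, VPNV4_UNICAST}
pe2 = {VPNV4_UNICAST, VPNV6_UNICAST}

print(negotiated(pe1, pe2))  # {(1, 128)}  only VPNv4 routes will be exchanged
```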
When delivering an L3VPN for a customer, every router plays a specific role on the end to end path. There are P Routers, PE Routers, and CE Routers.
- PE Routers
  - Provider Edge
  - Need to be VRF aware
  - Import/Export routes in and out of specific VRFs
  - Peer MP-BGP with other PE routers
  - Peer BGP or IGP with CE routers
- CE Routers
  - Customer Edge
  - Do not need to be VRF aware, but can be in some cases
  - Responsible for taking in customer routes and sending them to PE routers
- P Routers
  - Provider Routers
  - These are what will be seen as your “backbone” MPLS routers. Speed is key here.
  - Not VRF aware, do not even need to run BGP
  - Only need routes for loopbacks of PE routers that are peering MP-BGP
  - Can be done with an IGP like EIGRP or OSPF
Where does MPLS fit into this?
The topics I just went over with respect to MPLS VPNs all had to do with route exchange. So how is the traffic actually forwarded? Why do we need MPLS to form the VPN?
The need for labels comes in when you further dissect the logic that the router will use to forward the traffic. Sure, you may have set up all the routers to exchange the appropriate VPN routes, but you did that across GLOBAL interfaces (or at least interfaces that are not associated with the “customer” VRFs). If a packet were to come in on an interface that is not associated with a VRF, why would that router ever consult the VRF’s specific routing table? How would it know to do so?
I previously mentioned the fact that you may have more than one MPLS label on a particular IP packet. The MPLS label stack is not only how the router forwards VRF traffic across a global interface, it also lets the receiving router know which VRF the traffic is destined for. I’ve heard these labels given a few different terms, but I’ll refer to them as the Transport Label and the MPLS Label (you will often see the latter called the VPN label).
- Transport Label:
  - Generated by LDP across MPLS backbone
  - Outermost label or top of label stack
  - Used to forward traffic between PE routers
- MPLS Label:
  - Generated by BGP
  - Innermost label, or bottom of stack
  - Used to get traffic into the appropriate VRF
So… when a router is forwarding traffic that is within an MPLS VPN, it must perform 2 label lookups, one for the MPLS label and one for the Transport label. Let’s look at the traffic flow:
- A packet comes in on a VRF enabled interface. The destination address in the IP header is inspected and compared to that VRF’s routing table for the best match.
- A match is found in a route learned from BGP. This route is flagged as requiring MPLS and includes the MPLS label that was learned via BGP from the other PE router. This is your MPLS label.
- Now, the router needs to find the label to use for the next hop address, which is another PE router. The router will then consult the MPLS forwarding table, or LFIB, to get the label for the next hop router. This is your Transport label.
- The labels are stacked onto the packet and encapsulated in the L2 frame to be forwarded to the P router.
- The P routers label switch based on the outermost label, or the Transport label. Before the last P router sends the traffic to the appropriate PE router, it pops off the transport label, leaving only the MPLS label. As we discussed before, the PE will not need the transport label, because it is the router that generated it.
- The PE router receives the packet and sees only the MPLS label. The router refers to its LFIB and sees that this particular label is an aggregate label, which tells it to consult the RIB for that particular VRF to make its next forwarding decision.
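To tie the traffic flow above together, here is a simplified Python walk-through of a packet crossing the backbone. Everything in it is hypothetical (label values, VRF name, interface name), the VRF lookup is an exact match instead of a real longest-prefix match, and an outgoing label of None stands in for “pop” at the penultimate hop.

```python
def ingress_pe(packet, vpn_label, transport_label):
    """Push the BGP-learned label, then the LDP-learned label on top of it."""
    return {**packet, "labels": [transport_label, vpn_label]}


def p_router(packet, lfib):
    """Act on the top label only; an out label of None means pop (PHP)."""
    top, *rest = packet["labels"]
    out = lfib[top]
    return {**packet, "labels": rest if out is None else [out, *rest]}


def egress_pe(packet, label_to_vrf, vrf_tables):
    """The one remaining label selects the VRF, then an IP lookup is done there."""
    (vpn_label,) = packet["labels"]
    vrf = label_to_vrf[vpn_label]
    return vrf, vrf_tables[vrf][packet["dst"]]


pkt = {"dst": "10.1.1.10"}
pkt = ingress_pe(pkt, vpn_label=500, transport_label=200)
print(pkt["labels"])  # [200, 500]

pkt = p_router(pkt, {200: 201})   # first P router swaps the transport label
print(pkt["labels"])  # [201, 500]

pkt = p_router(pkt, {201: None})  # last P router pops it
print(pkt["labels"])  # [500]

result = egress_pe(pkt, {500: "CUST_A"}, {"CUST_A": {"10.1.1.10": "Gi0/1"}})
print(result)  # ('CUST_A', 'Gi0/1')
```

The P routers never look past the top label, which is why they do not need to know anything about the customer VRF’s.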
As I said, MPLS VPN’s are a powerful tool in a Service Provider toolset in order to provide scalable WAN connectivity for multiple customers. There are a number of other technologies and caveats that come into play with MPLS VPN’s, but this post was just to cover some of the fundamentals.