Hey Guys! OneSeventeen asked a seemingly simple question about how to route a subnet through a new Layer 3 core switch he got. He got some great advice on his actual technical question over here: Original Thread. Upon reflection, though, it occurred to me that this is a question I have answered many times before.
Specifically, I came to realize that not many people understand the long-term effects of subnet layout in the event you need to move to a multi-routed internal network. For some this will be old information, for others this may be extraneous information, but I hope this post will give at least a few people enough information to help them either better prepare for their organization’s future, or at least understand when they need to call in additional help.
I’ll admit right up front that there will be errors in this document. I have spent more than a few hours putting together my thoughts, but certainly not enough for this to be error free, much less considered an authoritative post on this subject.
Additionally, I know some of my terminology here might not be entirely standard. Even I don’t agree with everything I have written here, but I am trying to make this both digestible and approachable.
Finally, I am a people - and an occasionally messy one. I make mistakes. I am certain I have done so within. If you see something that needs correction - please say so!
###With that said… Let’s go!
Before we get started - we need to make sure we can communicate effectively. To that end, here is some terminology you will see often in my post:
There are 7 layers in the OSI model. We are going to be discussing Layers 2 and 3 for the purposes of this conversation.
- Layer 1 is the physical layer - this is where your wires are, your cable plant.
- Layer 2 is the data link layer - this is the layer that switching happens within. Two devices which sit on the same Layer 2 data link could theoretically get information between one another.
- Layer 3 is the network layer - this is the layer that routing and networking as is commonly considered happens within. Two devices which sit on the same Layer 3 network can communicate with one another. Routers connect multiple L3 networks together in such a way as to be able to move information between them.
- Each successive layer builds on and depends upon the layer under it.
Network - Context specific term. If talking about Layer 1, it means the structured cabling system. If Layer 2 it means the switches, circuits and vlans. If layer 3 it means subnets and broadcast domains.
Subnet - A layer 3 network of specific size and scope. Important to note that subnets can roll up or condense, that is to say one large subnet may include multiple smaller subnets: more on this later.
vLan - An isolated layer 2 network, potentially running over the same Layer 1 network as other layer 2 isolated vLans/networks
Netmasks & CIDRs & /xx - All ways to express the size and scope of a subnet.
L1 & L2 & L3 - Short hand for Layer 1/2/3
ACL - Access Control List. For the purposes of this post, I will use ACL to refer to controlling access in a broad-spectrum sense, e.g. I will use an ACL to block this subnet from ever talking to that subnet, no ifs, ands, or buts. ACLs are computationally easy and have relatively minor overhead.
GAC - Granular Access Control. Think of this like a fine-grained ACL. I don’t want everything on this subnet to be accessible to everything on that subnet, but I do want services X, Y, and Z to be accessible. GACs are computationally intensive and have relatively high overhead.
TCF - Traffic Control Functions, Abstract. General bucket that includes everything from Intrusion Prevention Services to bandwidth shaping and limiting.
NAT - Network Address Translation, A function provided by certain gateways which allow them to rewrite one IP address space into another. While used in a wide variety of roles, in this post we will be restricting our usage to its most common use as that of a service that allows many interfaces on a private network to use relatively few interfaces on a public network for the purposes of providing access to that network. (eg: using single public IP address to serve the internet connectivity needs of 100 private network hosts.)
RA - Route Advertisements, A function provided by most routers that allows them to communicate to other routers what L3 subnets they have access to in a standardized fashion.
Route - A specific path between two or more subnets.
Routing - The act of exchanging information between 2 or more subnets
Router - A device or service which uses individual Routes to accomplish Routing. While a router typically does not look at the content of the information it is relaying, it is possible to use ACLs in specific cases.
Firewall - A device or service which may or may not be configured to act as a router, but does provide Granular Access Control, as well as additional network TCF.
Gateway - Whether a router or firewall, this is the relay point to get information in or out of a specific subnet
AIO-FR - All-In-One Firewall Router, a commonly used device providing a full range of ACL, GAC, TCF, NAT, RA, & Routing services. You may know it as a SonicWall, an ASA, or even a TP-Link.
Termination - The ending point of a network. This might be “nowhere” if we don’t want a network to be able to communicate with other networks, but it will typically be the gateway of that subnet.
Interface - Context specific term. In layer 2 context it refers to the physical port under discussion. (eg: the 3rd port of the second switch in that stack) In layer 3 context it refers to the IP address(es) of a specific system as they are exposed to a specific L3 network.
Contrary to popular belief, most people who “do” networking didn’t go to school for it, rather they fulfilled a need to accomplish a goal. Thus, rather than starting with a long introduction to the finer points of network design theory, they started by picking up what they had available and making it work.
Typically this looks like a collection of switches on top of which rests a single L3 network. These switches eventually connect back to an AIO-FR device which acts as the gateway to a larger world (typically the internet).
###So you are knocking all of us who didn’t go to networking school?
Not at all! This is how I myself started - but unless you pick up further training and knowledge, as your environment scales out you will quickly find yourself with a few opportunities that you are ill-equipped to address.
Some of these potential opportunities include running out of available IP addresses, needing to reduce your broadcast domain for performance reasons, multicast scope limiting, or evolving access control and isolation requirements.
Indeed it is, but somewhat amusingly, regardless of what your opportunity is, the answer typically involves creating more L3 subnets, usually (but not always) on top of L2 networks which are dedicated to holding that L3 subnet.
But creating more L3 subnets instantly puts you in a position where you will need to manage the interactions between those networks…
#Terminating L3 subnets
As a consequence of the way most people learned networking, the most common way to see an L3 subnet terminated is by adding a corresponding L3 interface onto the existing AIO-FR. Doing this has a variety of consequences though, both positive and negative.
- It is easy - Typically there will be no changes to any existing configurations
- It is inexpensive - Whether you are using a virtualized L2 network or a new physical L2 network, by keeping your AIO-FR the same and preserving existing configurations, the cost to implement will be relatively minimal.
- Centralization - Whether we want to talk about Policies, Layout Conceptualization, Troubleshooting or Management - With everything happening in one place, it’s easier to know what is going on.
- Scaling Constraints - No matter how large the AIO-FR, there will always be a point at which it will no longer be sized to handle the aggregated traffic flow of an entire network.
- Bandwidth Limitation - By functioning as the sole router in your organization, using an AIO-FR as the termination point effectively throttles the entire network to the aggregate performance of your AIO-FR, whether that is the hardware capacity itself, or the capacity of the physical interfaces it uses to connect to the network.
###Ok, that doesn’t sound great, so what other ways are there?
Well, I’m glad you asked!
Conceptually, there are lots of ways to connect L3 subnets together. But for the sake of this post, let’s distill it down to the most common ways, which conveniently sit within two separate classes.
- Out-Of-Plane Router - This is routing which happens outside of a switch. You may commonly know this as using a SonicWall, a Fortinet, or an ASA that is attached to physical network ports somewhere to enable connectivity to your various networks. This type of router is typically focused on features, providing routing functions within or alongside a device which these days typically also provides firewall and NAT functionality, thus providing a complete ACL/GAC/NAT/RA/TCF package. Essentially the AIO-FR devices I have mentioned above.
- In-Plane Router - This is routing which happens inside of a switch itself. You may commonly know this as a “Layer 3” Switch, and it is relatively common in advanced offerings from the likes of Cisco, HPE(Aruba), Brocade, Netgear, and many more. This type of router is typically focused on speed, often times eschewing traditional AIO-FR features such as NAT, GAC and TCF.
So, to summarize: using an Out-Of-Plane router is almost always needed due to demands for GAC/NAT/TCF features. However, In-Plane routing, by virtue of being built into the switch fabric itself, can bring higher performance to the table, both in terms of bandwidth and reduced latency.
###So you are saying there isn’t a single answer?
###So which do I choose?
When you are relatively small, there is nothing wrong with using a well-specified AIO-FR as the sole gateway for your various networks. Feel free to add some more L3 subnets and L2 vLans - all is well.
But as you begin to grow to the point where the physical connections prove to be a network bottleneck, or you are going to needlessly oversize an AIO-FR just so you can shuttle information around your internal network - STOP - it’s time for a new approach.
###How do I know it is time for a new approach?
Honestly, it is different for everyone. I have seen organizations as small as 400~500 devices need to use both In-Plane and Out-Of-Plane routing. Conversely I have seen organizations of 1500+ devices happily sitting on a single AIO-FR, with room to grow. It all depends on your unique circumstances.
The important thing is to know your network, treat reports of poor performance with respect and investigate. Employ at least a rudimentary level of reporting so you can visualize at some level what is happening on your network. Think through the impacts of new applications and services. Never be scared to ask for a 2nd (or 3rd) opinion.
###Well that makes sense, Thankfully we aren’t yet at a place where I need to worry about…
Hold up. We aren’t done yet. Yes - while it is true you may not be in a position where you need to worry about leveraging both In-Plane and Out-Of-Plane routing side by side, decisions that you make today will directly affect the ease with which you will be able to transition to this hybrid routing model down the road should the need arise.
Remember way up at the top of the page under terms where I said that one subnet can include within itself multiple smaller subnets?
Well, that might be the single most important thing to learn in this whole post. If you don’t currently understand how a subnet can include multiple smaller subnets, go stare at this chart for a while and come back and ask some questions. For the purposes of this post, this concept is crucial. Subnetting Chart
Using this chart you can see that a single Class B /16 network could either have 65534 hosts on it (please don’t!), or it could have 256 separate /24 networks, with each of those networks containing 254 hosts.
More importantly, you could have a single /16 network which itself contained 8 separate /19 networks. You could then take 7 of those 8 /19 networks and further break those into both /22 and /24 networks. Then you could take the remaining /19, break it up into /24s and then further break those /24s into /29s and /30s. Even after being entirely broken up, we could still refer to this whole grouping as a single /16.
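If you would rather poke at this than stare at a chart, here is a minimal sketch using Python's standard `ipaddress` module. The `10.10.0.0/16` block is just an address range I picked for illustration; any /16 behaves the same way.

```python
import ipaddress

# Hypothetical example block; any private /16 would behave the same way.
campus = ipaddress.ip_network("10.10.0.0/16")

# Usable hosts if the /16 is left as one flat subnet
# (network and broadcast addresses excluded).
flat_hosts = campus.num_addresses - 2

# The same /16 carved into /24s, and alternatively into /19s.
as_24s = list(campus.subnets(new_prefix=24))
as_19s = list(campus.subnets(new_prefix=19))

# Break the last /19 down further: into /24s, then one of those /24s into /30s.
last_19 = as_19s[-1]
small_24 = list(last_19.subnets(new_prefix=24))[0]
p2p_links = list(small_24.subnets(new_prefix=30))

print(flat_hosts)                                # hosts in one flat /16
print(len(as_24s), as_24s[0].num_addresses - 2)  # count of /24s, hosts in each
print(len(as_19s), last_19)                      # count of /19s, the last one
print(len(p2p_links), p2p_links[0])              # count of /30s, the first one
print(p2p_links[0].subnet_of(campus))            # tiny /30 is still inside the /16
```

The last line is the part that matters: no matter how finely the block is chopped up, every piece still answers to the original /16.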
To visualize this in another way, take a look at this chart which shows you how you can break a single /24 down into anything from 2 separate /25 subnets with 126 hosts each to 64 separate /30 subnets with 2 hosts each. Or any combination of /25~/30 subnets you like, so long as no two subnet ranges overlap. /24 subnet chart
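The same breakdown can be reproduced with a few lines of arithmetic. This is only a sketch: `192.168.10.0/24` and the `breakdown` helper are names I made up for the example, and the "usable hosts" figure assumes the usual convention of reserving the network and broadcast addresses.

```python
import ipaddress

# Hypothetical /24; the arithmetic is identical for any /24.
block = ipaddress.ip_network("192.168.10.0/24")

def breakdown(network, longest_prefix=30):
    """Return (prefix, subnet_count, usable_hosts_per_subnet) for each split."""
    rows = []
    for prefix in range(network.prefixlen + 1, longest_prefix + 1):
        count = 2 ** (prefix - network.prefixlen)
        usable = 2 ** (32 - prefix) - 2  # minus network and broadcast addresses
        rows.append((prefix, count, usable))
    return rows

rows = breakdown(block)
for prefix, count, usable in rows:
    print(f"/{prefix}: {count:>2} subnets x {usable:>3} usable hosts")

# Mixed sizes are fine as long as no two ranges overlap.
mixed = [ipaddress.ip_network(n) for n in
         ("192.168.10.0/25", "192.168.10.128/26", "192.168.10.192/27")]
no_overlap = not any(a.overlaps(b)
                     for i, a in enumerate(mixed)
                     for b in mixed[i + 1:])
print(no_overlap)
```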
###Ok enough already, my head hurts, I get it, subnets can be broken into smaller subnets while still existing themselves - why is this important?
It’s important because of routes, ACLs, and GACs
In order to easily leverage multiple routers in a network, you need to be able to use relatively condensed subnets that in turn refer to relatively large cross sections of your network.
You can’t have (or rather, you really don’t want to have) a /22 untrusted guest network slap in the middle of a /19 worth of other trusted networks. In order to route this appropriately you will now have to create dozens (or more) individual Routes, ACLs and GACs where you could have just had two or three. And that is just for a simple oversight of putting an untrusted /22 inside of an otherwise trusted /19.
If you have multiple of such scenarios, and begin multiplying that sort of issue out, you can quickly come to a situation where you could have an absolutely unmanageable amount of routes and ACL/GAC policies where you otherwise may have only needed a handful.
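To put a rough number on it, here is a small sketch of what happens to a single "all trusted networks" prefix once a guest /22 is dropped into the middle of it. The addresses are invented for the example. Note that this shows the best case, where everything left over still lines up on clean boundaries; every route, ACL, and GAC that touches the trusted space then has to be repeated for each resulting prefix.

```python
import ipaddress

# Hypothetical addressing: a /19 of trusted networks and a /22 guest network.
trusted_block = ipaddress.ip_network("10.10.0.0/19")
guest_inside = ipaddress.ip_network("10.10.12.0/22")  # carved from the middle

# Clean layout: the guest network lives elsewhere, so one prefix
# describes everything trusted.
clean_rules = [trusted_block]

# Messy layout: everything in the /19 except the guest /22
# has to be described separately.
messy_rules = sorted(trusted_block.address_exclude(guest_inside))

print(len(clean_rules), [str(n) for n in clean_rules])
print(len(messy_rules), [str(n) for n in messy_rules])
```

One prefix becomes three here, and that is with a single misplaced subnet and nothing else out of line.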
###So you need to plan your subnets?
Yes! Right away. Ideally from the first network you create, but if not from the first then certainly by the second.
There are a variety of ways to do this, but for churches I like something that looks like this:
Typically I will consider where to terminate what type of network according to the types of routes/ACLs/GACs they might need and then group accordingly.
Example: networks which are “trusted” might all terminate in a core switch as their L3 gateway to give them quick access onto other trusted networks. Any isolation I may need between these trusted subnets can be done using the (relatively) simple ACL functionality present in the core switch. In contrast however, “untrusted” vlans, or vlans with narrow GAC/traffic monitoring needs, will be terminated directly into an Out-Of-Plane Router / Firewall interface so I can have much finer access control back into the rest of the network.
In practice this might mean something like this:
- I allocate each campus a /16 to be further broken down into trusted networks.
- I take another /16 and break it into 8 separate /19 networks, allocating one of those /19 to each campus to break down into their untrusted networks.
- I then take another /16 and break it into 16 separate /20 networks, one of which will be given to each campus for its management networks
- Finally I set aside another /16, broken into 32 separate /21 networks, each campus getting a /21 to further break down into routing and backhaul interface subnets.
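As a rough sketch, the allocation above could be generated like this. The post only specifies the sizes, so every address range here (`10.<n>.0.0/16` for trusted, `10.200` through `10.202` for the shared pools) is an assumption of mine, as is the `campus_plan` helper.

```python
import ipaddress

# Hypothetical parent blocks; only the prefix sizes come from the plan above.
SHARED_POOLS = {
    "untrusted":  ("10.200.0.0/16", 19),  # 8 slices
    "management": ("10.201.0.0/16", 20),  # 16 slices
    "backhaul":   ("10.202.0.0/16", 21),  # 32 slices
}

def campus_plan(campus_index):
    """Return the four blocks for one campus (0-based index).

    The untrusted pool only holds 8 /19s, so this sketch supports
    campus_index 0 through 7.
    """
    plan = {"trusted": ipaddress.ip_network(f"10.{campus_index}.0.0/16")}
    for role, (pool, prefix) in SHARED_POOLS.items():
        slices = list(ipaddress.ip_network(pool).subnets(new_prefix=prefix))
        plan[role] = slices[campus_index]
    return plan

plans = [campus_plan(i) for i in range(3)]
for i, plan in enumerate(plans):
    print(i, {role: str(net) for role, net in plan.items()})
```

Because each role lives in its own parent /16, a single prefix per role covers every campus at once.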
By constructing my subnet map like this, I can do some really cool things.
- I can build a simple rule to block the trusted /16s from my untrusted network /19s and know that I have actually “gotten everything”.
- I can route my management traffic over alternate routes or circuits than my production traffic, or build failover policies that correctly prioritize such access.
- With a handful of routes (either manual, or via RAs) I can route between campuses’ trusted networks (presuming I have connectivity). I can even (simply) ACL the routes to stop trusted networks from crossing those routes needlessly.
- I can take a look at traceroute reports and know that things are being routed correctly just by looking at the overriding subnet blocks that the interfaces are contained within.
- Put simply, I can more easily, clearly, and securely deliver and support the high performance networks that facilitate the success of the organizations I serve.
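Here is a small sketch of why that works, assuming a layout along the lines described above. The role blocks, the sample hop addresses, and the `role_of` / `allowed` helpers are all hypothetical; a real firewall or switch would express the same idea in its own rule syntax.

```python
import ipaddress

# Hypothetical role blocks, chosen only for illustration.
ROLE_BLOCKS = {
    "trusted":    ipaddress.ip_network("10.0.0.0/13"),   # campus /16s 10.0 to 10.7
    "untrusted":  ipaddress.ip_network("10.200.0.0/16"),
    "management": ipaddress.ip_network("10.201.0.0/16"),
    "backhaul":   ipaddress.ip_network("10.202.0.0/16"),
}

def role_of(address):
    """Name the role block an address falls in, or 'unknown'."""
    ip = ipaddress.ip_address(address)
    for role, block in ROLE_BLOCKS.items():
        if ip in block:
            return role
    return "unknown"

def allowed(src, dst):
    """One broad rule: untrusted sources never reach trusted destinations."""
    return not (role_of(src) == "untrusted" and role_of(dst) == "trusted")

# Reading a traceroute: each hop's role is obvious from its block alone.
hops = ["10.1.20.1", "10.202.8.2", "10.202.16.1", "10.2.20.15"]
hop_roles = [role_of(h) for h in hops]
print(hop_roles)

print(allowed("10.200.33.7", "10.1.20.15"))  # guest to trusted
print(allowed("10.1.20.15", "10.2.20.15"))   # trusted to trusted
```

One comparison against one block per role is all it takes, which is exactly what a simple ACL is good at.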
###Really, my brain is melting here.
I know. I really appreciate you hanging on this long. I also know that you are going to be having some thoughts right now. While I can’t pretend to know everything in your head, there is a good chance that most of them distill down into a single phrase: I am never going to be big enough to have to worry about this.
Not only is that completely understandable, it may even be true. However - and this is a big ask - are you in a place where you can say with certainty that you will never be in a position where you could in some way benefit from being built on a subnetting foundation that looks something like this? While it is certainly possible to come back later and overhaul the subnetting of an organization, it is something you really want to avoid. (I should know, I have had to do it)
I want to be clear - it is not as if I think every corner coffee shop needs to start off with a routing template and subnetting plan that would allow it to seamlessly grow to become Starbucks. My heart in writing this is to give you enough understanding of an area I find many people lacking insight in so that you may know when to take action, or identify when you need further help.
In wrapping up I just wanted to say thanks for sticking through this, and I hope it helps someone learn in a condensed time frame concepts which often elude people for years. As I said up top, I am sure I have missed, overlooked or plain messed up some things along the way. Feel free to ask questions or give (loving) correction.
Thanks for reading to the end! (: -Karl P