Small Enterprise Network Design Questions

Hi all,

I’m hoping to get some help and feedback on how to best design (redesign?) my enterprise’s network. I’m not aware of all the technologies available in our field - some I’m aware of but don’t know well enough to design an enterprise network with them. I’m a recent college grad with my CCNA. I started recently at this organization and the network design seems off. Don’t get me wrong, it’s been working this way for years but I think we can do better. Where I struggle is wrapping my head around what to do in an attempt to fix it. I’ll do my best to explain the current state and end goals clearly. Any thoughts/comments/feedback/suggestions/etc are much appreciated.

Current state:

  • Static routing everywhere, with one exception: if a branch office ISP goes down, the VPN goes down and the appropriate devices remove that VPN route from their routing tables
  • Full mesh, site to site VPN (primary)
  • Hub and spoke VPN (backup)
  • Two ISPs in branch offices
  • Here’s a diagram of the current state: https://imgur.com/zQfxvbo It’s not pretty but it gets the job done for now. I put some small firewall symbols on some routers because we use our firewalls for routers in some places

IP Addressing Scheme:

  • All /24 subnets
  • .0 - .69 located in HQ
  • .128 - .133 located in Colo
  • .135 - .136 located in Colo
  • .138 - .139 located in Colo
  • .151 - .153 located in Colo
  • Anything >= .70 excluding mentioned colo subnets are branch offices
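To get a feel for how well this scheme summarizes, here is a rough Python sketch using the standard `ipaddress` module. It assumes the `.0 - .69` style ranges are third-octet values of /24s inside 192.168.0.0/16 (the block mentioned elsewhere in this thread); the helper name `site_subnets` is made up for illustration.

```python
import ipaddress

# Assumption: the ranges in the list above are third octets of /24 subnets
# carved out of 192.168.0.0/16.
def site_subnets(third_octets):
    """Build the /24 networks for an iterable of third-octet values."""
    return [ipaddress.ip_network(f"192.168.{o}.0/24") for o in third_octets]

hq = site_subnets(range(0, 70))
colo = site_subnets(
    list(range(128, 134)) + [135, 136] + [138, 139] + list(range(151, 154))
)

# collapse_addresses merges adjacent networks into the fewest covering prefixes,
# which is roughly what a summary route advertisement would need.
hq_summary = [str(n) for n in ipaddress.collapse_addresses(hq)]
colo_summary = [str(n) for n in ipaddress.collapse_addresses(colo)]

print("HQ summaries:  ", hq_summary)
print("Colo summaries:", colo_summary)
```

Under those assumptions HQ needs three summary prefixes and the colo needs seven, because the ranges do not start and end on power-of-two boundaries.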

Notes:

  • Site to Site VPN was implemented 5 years ago by current Sr. Net Eng specifically for VoIP traffic. This improved VoIP quality immensely according to him
  • No CoS or QoS used
  • DMZ/PCI at HQ and Colo
  • Currently working on BGP for HQ. Two routers with VRRP and iBGP between them, eBGP with the two ISPs, then a FHRP - the “usual” BGP setup
  • HQ services ALL DNS/DHCP requests
  • HQ is where 98% of resources live
  • Also working on separating sensitive/Datacenter subnets from the rest of the enterprise. We’d likely do this with a new core for routing, then connect said core to the current switch fabric and implement ECMP routing
  • Here’s a diagram of some initial thoughts on topology changes to accommodate for all the things I’m asking about in this post: https://imgur.com/tKLrPtN
  • Currently use Fortinet firewalls. They’re almost 5 years old now so in the near future we’ll be evaluating a different solution

End Goals:

  • Have a logically laid out IP addressing scheme (I don’t think our current scheme is that great)
  • Interested in dynamic routing but not sure how to implement, specifically because of branch offices
  • Implement North-South firewalling
  • Branch offices need to have seamless failover (if primary ISP fails, backup connection kicks in and routes properly)

Questions:

  • What’s the best way to implement a dynamic routing protocol, whether it’s OSPF, iBGP, etc in the enterprise?
  • Is there a need for a full mesh and hub and spoke if SD WAN is implemented properly?
  • How would SD WAN be implemented properly?
  • To achieve logical, simple routing, we may need to re-IP some subnets?
  • Where is the best place to terminate the MetroEthernet?
  • Is the network not as bad as I think it is? Should we keep doing what we’re doing with only minor changes?

I’m sure I’ve forgotten things that would help you all respond but hoping that questions will come up and I’ll be able to edit the post to include more info. What I really want to get out of this post is to understand how dynamic routing can work in our environment. I mentioned all the other stuff just to make everyone aware of some other initiatives. In the end, it all needs to work together - which is where I’m struggling. Thanks for any help - it’s much appreciated.

Edit 00: Oh my gosh this formatting is horrendous. I apologize, trying to fix it currently.

Edit 01: I SUCK at Reddit formatting. Also adding IP addressing - I forgot to put it in and realized it’d be helpful for some of the questions I have.

Edit 02: I figured out that there is a new way to format on Reddit. It looks somewhat acceptable now. Sorry about that.

What exactly are we trying to solve through your proposed enhancements?

I’m in a very similar situation as you - HQ, colo, and 65 sites, using Fortinet. I am the only person with “network” in the title. I’ve done as much cross-training as I can, but most people here lack the fundamental knowledge to apply anything I’ve built.

Static routing everywhere with the exception being if a branch office ISP goes down, the VPN goes down and appropriate devices remove that VPN route out of its routing table

This is what I’ve just finished developing. I avoided any dynamic routing because there’s no chance anyone else here would be able to do any troubleshooting on it whatsoever, and I like being able to take a vacation. Do you have FortiManager? You need to be using VPN Manager (in FortiManager), and understand it well. It’s a steep learning curve (and I’m happy to help), but it’s worth it. It can build out your static routes for you, and simplify it a lot.

Full mesh, site to site VPN (primary)
Hub and spoke VPN (backup)

Why are these different? This seems like a security problem, a functionality problem, or probably both.

Have a logically laid out IP addressing scheme (I don’t think our current scheme is that great)

If/when you do this, make sure your subnets are logical from a routing standpoint, not a “this looks pretty on a spreadsheet” standpoint. Any time I see someone using all /24 networks, I assume it was built on the latter.

HQ services ALL DNS/DHCP requests

I don’t know your business needs or technical limitations, but this reeks of “This is what we did back at Acme Corp, so I’m doing it here”. I hate seeing centralized DHCP, most of the time it’s a single point of failure. DNS is not always quite as big a problem, but not totally ideal either.

Implement North-South firewalling

I assume you’re saying there is no firewall between your HQ/colo and your branch offices? This is definitely a huge priority, IMO.

Branch offices need to have seamless failover (if primary ISP fails, backup connection kicks in and routes properly)

Is this not happening currently? You have the same setup that I do, and I think the last time I checked, it was failing over within 8 seconds, which is very acceptable to us.

Is there a need for a full mesh and hub and spoke if SD WAN is implemented properly?

I cannot come up with any scenario in which you should have both full mesh and hub and spoke topologies in place. Usually your company needs one, and the other is definitely less than ideal.
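For a sense of scale, here is a quick Python sketch of how many tunnels each topology implies. The function names are made up for illustration, the hub-and-spoke count assumes a single hub, and the site counts are only examples.

```python
def full_mesh_tunnels(sites: int) -> int:
    """Every site peers with every other site: n * (n - 1) / 2 tunnels."""
    return sites * (sites - 1) // 2


def hub_and_spoke_tunnels(sites: int) -> int:
    """One hub, every other site builds a single tunnel to it: n - 1 tunnels."""
    return sites - 1


for n in (10, 20, 65):
    print(f"{n} sites: full mesh = {full_mesh_tunnels(n)}, "
          f"hub and spoke = {hub_and_spoke_tunnels(n)}")
```

Running both topologies side by side means building, monitoring, and troubleshooting the sum of the two counts.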

To achieve logical, simple routing, we may need to re-IP some subnets?

This isn’t a question that we can answer. Sequential /24 subnets everywhere work great as long as you will never, ever run out of IPs at a location. Once you do, you end up with multiple, non-contiguous subnets at the same location, and that becomes a nightmare.

Is the network not as bad as I think it is? Should we keep doing what we’re doing with only minor changes?

Now we’re at the real question.

  1. Is there a current PROBLEM that needs to be solved?
  2. Is there an IMPROVEMENT that will provide benefit to the business?
  3. Is there a FUTURE problem that you can avoid?

When you have clear, comprehensive answers to those three questions, then you can start thinking about what should change and what shouldn’t.

I recently did something similar, moving a 9-branch company from static routing to OSPF. Each location has a Cisco router with an MPLS (TLS) connection back to our administrative office, along with a FortiGate firewall that has a backup cable connection with VPN to each location for full mesh (OSPF runs over each VPN as well). Each of our locations has an IP structure as follows: primary location 1.1.0.0/22, location2: 1.1.2.0/24, location3: 1.1.3.0/24, etc. I can try to help you with any specific questions but I really don’t want to design the network for you lol.

What’s the best way to implement a dynamic routing protocol, whether it’s OSPF, iBGP, etc in the enterprise?

IBGP isn’t really made for enterprise networks. The “I” is kind of a misnomer. I’d use it at the edge for redundancy between ISPs and that’s it. You’d need full mesh between hosts or route-reflectors. Plus AD is so high on IBGP that the same prefix-length learned from almost any other IGP would win out. Massive pain in the ass. OSPF if you’re smart, IS-IS if you’re daring, EIGRP if you’re stubborn and run an all-Cisco shop (and plan to keep it that way). Dynamic routing works best the more it’s used consistently. The more you redistribute into it, the more you’ll hate it.
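To illustrate the two iBGP points (administrative distance and the full-mesh requirement), here is a small Python sketch. The table holds Cisco IOS default administrative distances; other vendors use different values, so check your platform. The function names are hypothetical.

```python
# Cisco IOS default administrative distances (lower wins).
# Assumption: defaults are unchanged; other vendors use different values.
DEFAULT_AD = {
    "connected": 0,
    "static": 1,
    "ebgp": 20,
    "eigrp": 90,
    "ospf": 110,
    "isis": 115,
    "rip": 120,
    "ibgp": 200,
}


def preferred_source(sources):
    """Pick which source wins when the same prefix is learned several ways."""
    return min(sources, key=lambda s: DEFAULT_AD[s])


def ibgp_full_mesh_sessions(routers: int) -> int:
    """Without route reflectors, every iBGP speaker peers with every other."""
    return routers * (routers - 1) // 2


print(preferred_source(["ibgp", "ospf"]))  # the OSPF route is installed
print(ibgp_full_mesh_sessions(10))         # sessions needed for 10 routers
```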

Is there a need for a full mesh and hub and spoke if SD WAN is implemented properly?

A proper SD-WAN solution should safely and easily be the only solution. It should be redundant in itself – multiple diverse transports/ISPs and no single points of failure. HA boxes at the HQ sites at a minimum.

How would SD WAN be implemented properly?

That’s between you and your vendor. Talk to sales reps from a few. Ideally you’ll get a couple lunches out of it and a good idea of what suits your needs. Typically transports/ISPs plug directly into the appliance/VM (or via a switch, or ideally a stack of switches as to minimize SPOFs). Inside of the appliance connects to your WAN Edge segment (or core, if you’re not that cool) over a routed transit interface. Usually run OSPF over this. Some vendors prefer BGP here because it gives more flexibility. There are exceptions.

To achieve logical, simple routing, we may need to re-IP some subnets?

Maybe. Depends on how your current schema is working for you. Is it scalable for the next few years or are you going to break schema? If you are going to re-IP, do it with a plan in place, and plan for the future. RFC 1918 addresses are cheap. There are 273 /16s you can work with there (256 in 10.0.0.0/8, 16 in 172.16.0.0/12, and 192.168.0.0/16 itself). For a lot of companies, that can easily mean a /16 for every site without problems. Or a couple /16s for remotes broken up into /20s or smaller blocks.
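As a sanity check on the RFC 1918 math, here is a short Python sketch with the standard `ipaddress` module. It counts the /16s in the three private blocks and then carves one /16 into /20s. The 10.20.0.0/16 site block is a made-up example, not anything from this network.

```python
import ipaddress

RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def count_subnets(block, new_prefix):
    """Number of new_prefix-sized subnets that fit inside block."""
    return 2 ** (new_prefix - block.prefixlen)


per_block = {str(b): count_subnets(b, 16) for b in RFC1918}
total_16s = sum(per_block.values())
print(per_block, "total:", total_16s)

# Hypothetical example: one /16 reserved for remote sites, split into /20s.
site_block = ipaddress.ip_network("10.20.0.0/16")
site_20s = list(site_block.subnets(new_prefix=20))
print(len(site_20s), "x /20, first:", site_20s[0], "last:", site_20s[-1])
```

Each /20 holds sixteen /24s, which leaves a branch plenty of room to grow without breaking the summary.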

Do not pull that stupid “increment by 10” crap. Multiples of 10s are for humans. Routers aren’t humans. Use summarizable ranges.
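A quick way to see why, sketched in Python with the standard `ipaddress` module. The 10.0.x.0 addresses and the `summarize` helper are made up for illustration. A site given ten /24s starting at .10 needs three routes to describe it, while a site given sixteen /24s starting at .16 needs one.

```python
import ipaddress


def summarize(base: str, first: int, count: int):
    """Collapse `count` consecutive /24s starting at base.first.0 into summaries."""
    nets = [
        ipaddress.ip_network(f"{base}.{first + i}.0/24") for i in range(count)
    ]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]


# "Increment by 10": 10.0.10.0/24 through 10.0.19.0/24
decimal_style = summarize("10.0", 10, 10)

# Power-of-two aligned: 10.0.16.0/24 through 10.0.31.0/24
aligned_style = summarize("10.0", 16, 16)

print("increment by 10:", decimal_style)
print("aligned block:  ", aligned_style)
```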

Where is the best place to terminate the MetroEthernet?

Probably into the SD-WAN appliance or WAN Edge segment.

Is the network not as bad as I think it is? Should we keep doing what we’re doing with only minor changes?

I’ve seen worse. Remember the trade-offs. A good project makes your boss look awesome. A bad project makes you look like shit.

Have you considered Cisco DMVPN? Dynamic Multipoint VPN.

With DMVPN, each site would join a “cloud”, so to speak, back to HQ. That’s the M part (Multipoint). You could have a single subnet for all VPNs.

The D part (Dynamic) is that with your spokes, you don’t need any static public IP addressing. You just have the head-end HQ router IPs mapped in the config, and the spokes will find their way. Not only that, but spoke-to-spoke traffic will dynamically generate its own connection and will “full mesh” without you having to specifically configure them.

This works best if you have a PKI infrastructure and can issue certificates to each router from the same CA in order to maintain the trust chain.

From there, you can run your routing protocol of choice, OSPF, EIGRP, I guess even RIP… and keep everything connected and reachable, and relatively simple to manage once they come up.

I’m not a network engineer, so bear with me…

The diagram that’s part of your Reddit post, even a VPN tunnel goes through the cloud. The way it’s setup looks like you have your own private lines?

Why do some things go around the firewall? Usually most, if not all traffic goes through the firewall as it leaves a site unless you’re doing MPLS? I just don’t get why barely anything actually goes through the firewall. Isn’t that the point of one?

The diagram is probably the most confusing part, lol.

With dual ISPs already at each remote site, this is an ideal play for an SD-WAN like Velocloud, especially since you don’t have a bunch of existing routing infrastructure to rip/replace (not to be confused with RIP).

Sounds like one of the first jobs would be to implement an actual subnetting strategy. All the sites are stuffed into a /24?

This may not answer any of your questions but will help with the communication.

Hey man, are your primary VPNs traversing the internet? Also, how are you protecting traffic going to and from your ISPs? I don’t see any firewalls in place there.

Hi ChiefElite,

Are you still working on your design? I noticed that you mentioned VoIP. In my experience, SD-WAN with proper ISP diversity is critical for good voice quality. I find QoS useful as well, even though my SD-WAN technology uses Internet links, where it is harder to control end-to-end link quality. I have not seen much on security, which you may want to consider. Here is a tech tip on “remote office” DMZ that you might find useful: https://networkdavid.com/2018/11/19/dual-internet-design-part-4-branch-office-dmz/

Good question. My post is way too cluttered and unclear - also something I struggle with. Ha. What I really want to get out of this post is how to implement dynamic routing within the enterprise.

First of all - a serious thank you for your time to respond to this extent.

Static routing everywhere with the exception being if a branch office ISP goes down, the VPN goes down and appropriate devices remove that VPN route out of its routing table
This is what I’ve just finished developing. I avoided any dynamic routing because there’s no chance anyone else here would be able to do any troubleshooting on it whatsoever, and I like being able to take a vacation. Do you have FortiManager? You need to be using VPN Manager (in FortiManager), and understand it well. It’s a steep learning curve (and I’m happy to help), but it’s worth it. It can build out your static routes for you, and simplify it a lot.

I feel for you. Job security though. We have FortiManager and I totally agree with you, it makes adding a new branch office SUPER easy.

Full mesh, site to site VPN (primary) Hub and spoke VPN (backup)
Why are these different? This seems like a security problem, a functionality problem, or probably both.

When a primary ISP in a branch office went down, all traffic would route over the backup ISP through our colo, then back down to HQ. This worked when HQ was a member of the hub and spoke, but now that we have the MetroEthernet, HQ isn’t a member of the hub and spoke, which breaks the routing. Now we have to add a static route on the HQ core switch for that branch subnet, pointing to the colo-side interface of the MetroEthernet. That static route then prevents the branch from reaching specific subnets (DMZ, PCI) that hang off of our HQ firewall.

Have a logically laid out IP addressing scheme (I don’t think our current scheme is that great)
If/when you do this, make sure your subnets are logical from a routing standpoint, not a “this looks pretty on a spreadsheet” standpoint. Any time I see someone using all /24 networks, I assume it was built on the latter.

Yes, this is what I’m trying to achieve. I need to draft out my plan on paper, think through it more, then post it here for feedback. It should be pretty simple actually. We use 192.168.0.0/16. Basically just break up that into three /18’s and we’d be good (I think…). We’d have plenty leftover too. But, like I said, need to put more thought into it and understand our needs etc. Honestly, don’t know what “built on the latter” means - like thrown together?
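A rough Python sketch of that split, using the standard `ipaddress` module. It assumes the current /24s are numbered by third octet as in the addressing list, and the helper `home_block` is made up for illustration.

```python
import ipaddress

enterprise = ipaddress.ip_network("192.168.0.0/16")

# A /16 splits into four /18s, each holding 64 /24s.
quarters = list(enterprise.subnets(new_prefix=18))
print([str(q) for q in quarters])


def home_block(third_octet: int) -> str:
    """Return the /18 that contains 192.168.<third_octet>.0/24."""
    net = ipaddress.ip_network(f"192.168.{third_octet}.0/24")
    return next(str(q) for q in quarters if net.subnet_of(q))


# Assumption: HQ currently uses third octets 0-69, colo uses 128-153.
for octet in (0, 63, 64, 69, 70, 128, 153):
    print(f"192.168.{octet}.0/24 ->", home_block(octet))
```

One thing this shows: HQ’s current .0 - .69 range is 70 subnets, so it spills past the first /18 into the second one, which it would share with the branches starting at .70. Some re-IP at HQ (or a different split) would be needed for the /18 boundaries to line up with sites.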

HQ services ALL DNS/DHCP requests
I don’t know your business needs or technical limitations, but this reeks of “This is what we did back at Acme Corp, so I’m doing it here”. I hate seeing centralized DHCP, most of the time it’s a single point of failure. DNS is not always quite as big a problem, but not totally ideal either.

We have a backup/replica DNS/DHCP server in our colo, but it’s passive.

Implement North-South firewalling
I assume you’re saying there is no firewall between your HQ/colo and your branch offices? This is definitely a huge priority, IMO.

North-South as in firewall off the datacenter from clients. Basically any traffic between the datacenter and anything outside of the datacenter. There are firewalls in each location. Sorry for my poor explanation skills.

Branch offices need to have seamless failover (if primary ISP fails, backup connection kicks in and routes properly)
Is this not happening currently? You have the same setup that I do, and I think the last time I checked, it was failing over within 8 seconds, which is very acceptable to us.

The routing is screwed up when we add the static route on our HQ core switch over the MetroEthernet so traffic can get back to the hub of the hub and spoke network.

Is there a need for a full mesh and hub and spoke if SD WAN is implemented properly?
I cannot come up with any scenario in which you should have both full mesh and hub and spoke topologies in place. Usually your company needs one, and the other is definitely less than ideal.

It is strange - I agree. The “grey beard” Sr. Net Eng designed it before my time here. I hope I can come up with a far superior design to present to the team.

To achieve logical, simple routing, we may need to re-IP some subnets?
This isn’t a question that we can answer. Sequential /24 subnets everywhere work great as long as you will never, ever run out of IPs at a location. Once you do, you end up with multiple, non-contiguous subnets at the same location, and that becomes a nightmare.

They do work great when they’re sequential like you mentioned, which, if I’m not mistaken, is our issue. Haha. The way we’ve allocated the /24’s is silly.

The three questions at the end are on point. Thank you for that.

Again, just want to say thank you for taking the time to share your thoughts. It’s a huge help to me and I do appreciate it.

Right on! I think where we differ is that each branch just has two internet circuits - direct internet access - we don’t have the luxury of MPLS. So I struggle with how OSPF will work. We’ll need VPNs and then we’ll just overlay OSPF?

Why do you use 1.x ip space instead of actual private ip space? I guess you are ok with not being able to route to that for some reason?

OSPF if you’re smart, IS-IS if you’re daring, EIGRP if you’re stubborn and run an all-Cisco shop (and plan to keep it that way). Dynamic routing works best the more it’s used consistently. The more you redistribute into it, the more you’ll hate it.

Thank you for this! 99.9% sure we’ll stick with OSPF.

Thanks for taking the time to post this!

I’ve heard of DMVPN but that’s it - didn’t know what it was. You make this sound VERY attractive my friend…wow. I’m going to look into this. Thank you so much for your response.

The diagram is probably the most confusing part, lol.

Crap I knew I should’ve spent more time on the diagram. I suck at diagramming.

The diagram that’s part of your Reddit post, even a VPN tunnel goes through the cloud. The way it’s setup looks like you have your own private lines?

You’re correct that a VPN tunnel isn’t physically directly connected (I rephrased your “goes through the cloud”). I drew them as directly connected with a label, assuming people would understand that. I think it’s best practice to identify VPNs like that on diagrams - it’s highly possible I’m mistaken though.

Why do some things go around the firewall? Usually most, if not all traffic goes through the firewall as it leaves a site unless you’re doing MPLS? I just don’t get why barely anything actually goes through the firewall. Isn’t that the point of one?

I think I understand your confusion. So at the HQ, we terminate the internet circuits in our HQ core switch on a non routed VLAN in order to allow us to have a HA firewall. Does that make sense?

Thank you for the vendor suggestion. I know nothing about them but I can assure you I will soon. Ha.