- The organization is planning to move resources to cloud or devise a networking strategy for their existing cloud infrastructure to harness value from cloud.
- The right topology needs to be selected to deploy network level isolation, design the cloud for management efficiencies and provide access to shared services on cloud.
- A perennial challenge for infrastructure on cloud is planning for governance vs flexibility which is often overlooked.
Our Advice
Critical Insight
Don’t wait until the necessity arises to evaluate your networking in the cloud. Get ahead of the curve and choose the topology that optimizes benefits and supports organizational needs in the present and the future.
Impact and Result
- Define organizational needs and understand the pros and cons of cloud network topologies to strategize for the networking design.
- Consider the layered complexities of addressing the governance vs. flexibility spectrum for your domains when designing your networks.
Considerations for a Hub and Spoke Model When Deploying Infrastructure in the Cloud
Don't revolve around a legacy design; choose a network design that evolves with the organization.
Analyst Perspective
Cloud adoption among organizations increases gradually across both the number of services used and the amount those services are used. However, network builders tend to overlook the vulnerabilities of network topologies, which leads to complications down the road, especially since the structures of cloud network topologies are not all of the same quality. A network design that suits current needs may not be the best solution for the future state of the organization.
Even if on-prem network strategies were retained for ease of migration, it is important to evaluate and identify the cloud network topology that can not only elevate the performance of your infrastructure in the cloud, but also that can make it easier to manage and provision resources.
An "as the need arises" strategy will not work efficiently since changing network designs will change the way data travels within your network, which will then need to be adopted to existing application architectures. This becomes more complicated as the number of services hosted in the cloud grows.
Keep a network strategy in place early on and start designing your infrastructure accordingly. This gives you more control over your networks and eliminates the need for huge changes to your infrastructure down the road.
Nitin Mukesh
Senior Research Analyst, Infrastructure and Operations
Info-Tech Research Group
Executive Summary
Your Challenge
The organization is planning to move resources to the cloud or devise a networking strategy for their existing cloud infrastructure to harness value from the cloud.
The right topology needs to be selected to deploy network level isolation, design the cloud for management efficiencies, and provide access to shared services in the cloud.
A perennial challenge for infrastructure in the cloud is planning for governance vs. flexibility, which is often overlooked.
Common Obstacles
The choice of migration method may result in retaining existing networking patterns and only making changes when the need arises.
Networking in the cloud is still new, and organizations new to the cloud may not be aware of the cloud network designs they can consider for their business needs.
Info-Tech's Approach
Define organizational needs and understand the pros and cons of cloud network topologies to strategize for the networking design.
Consider the layered complexities of addressing the governance vs. flexibility spectrum for your domains when designing your networks.
Insight Summary
Don't wait until the necessity arises to evaluate your networking in the cloud. Get ahead of the curve and choose the topology that optimizes benefits and supports organizational needs in the present and future.
Your challenge
Selecting the right topology: Many organizations migrate to the cloud retaining a mesh networking topology from their on-prem design, or they choose to implement the mesh design leveraging peering technologies in the cloud without a strategy in place for when business needs change. While there may be many network topologies for on-prem infrastructure, the network design team may not be aware of the best approach in cloud platforms for their requirements, or a cloud networking strategy may even go overlooked during the migration.
Finding the right cloud networking infrastructure for:
- Management efficiencies
- Network-level isolation of resources
- Access to shared services
Deciding between governance and flexibility in networking design: In the hub and spoke model, if a domain is in the hub, the greater the governance over it, and if it sits in the spoke, the higher the flexibility. Having a strategy for the most important domains is key. For example, some security belongs in the hub and some security belongs in the spoke. The tradeoff here is if it sits completely in the spoke, you give it a lot of freedom, but it becomes harder to standardize across the organization.
Mesh network topology
A mesh is a design where virtual private clouds (VPCs) are connected to each other individually creating a mesh network. The network traffic is fast and can be redirected since the nodes in the network are interconnected. There is no hierarchical relationship between the networks, and any two networks can connect with each other directly.
In the cloud, this design can be implemented by setting up peering connections between any two VPCs. These VPCs can also be set up to communicate with each other internally through the cloud service provider's network without having to route the traffic via the internet.
While this topology offers high redundancy, the number of connections grows tremendously as more networks are added, making it harder to scale a network using a mesh topology.
Mesh Network on AWS
Source: AWS, 2018
Constraints
The disadvantages of peering VPCs into a mesh quickly arise with:
- Transitive connections: Transitive connections are not supported in the cloud, unlike with on-prem networking. This means that if there are two networks that need to communicate, a single peering link can be set up between them. However, if there are more than two networks and they all need to communicate, they should all be connected to each other with separate individual connections.
- Cost of operation: The lack of transitive routing requires many connections to be set up, which adds up to a more expensive topology to operate as the number of networks grows. Cloud providers also usually limit the number of peering networks that can be set up, and this limit can be hit with as few as 100 networks.
- Management: Mesh tends to be very complicated to set up, owing to the large number of different peering links that need to be established. While this may be manageable for small organizations with small operations, for larger organizations with robust cybersecurity practices that require multiple VPCs to be deployed and interconnected for communications, mesh opens you up to multiple points of failure.
- Redundancy: With multiple points of failure already being a major drawback of this design, you also cannot have more than one peered connection between any two networks at the same time. This makes designing your networking systems for redundancy that much more challenging.
Number of virtual networks | 10 | 20 | 50 | 100 |
Peering links required [(n-1)*n]/2 |
45 | 190 | 1225 | 4950 |
Proportional relationship of virtual networks to required peering links in a mesh topology
Case study
INDUSTRY: Blockchain
SOURCE: Microsoft
An organization with four members wants to deploy a blockchain in the cloud, with each member running their own virtual network. With only four members on the team, a mesh network can be created in the cloud with each of their networks being connected to each other, adding up to a total of 12 peering connections (four members with three connections each). While the members may all be using different cloud accounts, setting up connections between them will still be possible.
The organization wants to expand to 15 members within the next year, with each new member being connected with their separate virtual networks. Once grown, the organization will have a total of 210 peering connections since each of the virtual networks will then need 14 peering connections. While this may still be possible to deploy, the number of connections makes it harder to manage and would be that much more difficult to deploy if the organization grows to even 30 or 40 members. The new scale of virtual connections calls for an alternative networking strategy that cloud providers offer – the hub and spoke topology.
Source: Microsoft, 2017
Hub and spoke network topology
In hub and spoke network design, each network is connected to a central network that facilitates intercommunication between the networks. The central network, also called the hub, can be used by multiple workloads/servers/services for hosting services and for managing external connectivity. Other networks connected to the hub through network peering are called spokes and host workloads.
Communications between the workloads/servers/services on spokes pass in or out of the hub where they are inspected and routed. The spokes can also be centrally managed from the hub with IT rules and processes.
A hub and spoke design enable a larger number of virtual networks to be interconnected as each network only needs one peered connection (to the hub) to be able to communicate with any other network in the system.
Hub and Spoke Network on AWS
What hub and spoke networks do better
- Ease of connectivity: Hub and spoke decreases the liabilities of scale that come from a growing business by providing a consistent connection that can be scaled easily. As more networks are added to an organization, each will only need to be connected once – to the hub. The number of connections is considerably lower than in a mesh topology and makes it easier to maintain and manage.
- Business agility and scalability: It is easier to increase the number of networks than in mesh, making it easier to grow your business into new channels with less time, investment, and risk.
- Data collection: With a hub and spoke design, all data flows through the hub – depending on the design, this includes all ingress and egress to and from the system. This makes it an excellent central network to collect all business data.
- Network-level isolation: Hub and spoke enables separation of workloads and tiers into different networks. This is particularly useful to ensure an issue affecting a network or a workload does not affect the rest.
- Network changes: Changes to a separated network are much easier to carry out knowing the changes made will not affect all the other connected networks. This reduces work-hours significantly when systems or applications need to be altered.
- Compliance: Compliance requirements such as SOC 1 and SOC 2 require separate environments for production, development, and testing, which can be done in a hub and spoke model without having to re-create security controls for all networks.
Hub and spoke constraints
While there are plenty of benefits to using this topology, there are still a few notable disadvantages with the design.
Point-to-point peering
The total number of total peered connections required might be lower than mesh, but the cost of running independent projects is cheaper on mesh as point-to-point data transfers are cheaper.
Global access speeds with a monolithic design
With global organizations, implementing a single monolithic hub network for network ingress and egress will slow down access to cloud services that users will require. A distributed network will ramp up the speeds for its users to access these services.
Costs for a resilient design
Connectivity between the spokes can fail if the hub site dies or faces major disruptions. While there are redundancy plans for cloud networks, it will be an additional cost to plan and build an environment for it.
Leverage the hub and spoke strategy for:
Providing access to shared services: Hub and spoke can be used to give workloads that are deployed on different networks access to shared services by placing the shared service in the hub. For example, DNS servers can be placed in the hub network, and production or host networks can be connected to the hub to access it, or if the central network is set up to host Active Directory services, then servers in other networks can act as spokes and have full access to the central VPC to send requests. This is also a great way to separate workloads that do not need to communicate with each other but all need access to the same services.
Adding new locations: An expanding organization that needs to add additional global or domestic locations can leverage hub and spoke to connect new network locations to the main system without the need for multiple connections.
Cost savings: Apart from having fewer connections than mesh that can save costs in the cloud, hub and spoke can also be used to centralize services such as DNS and NAT to be managed in one location rather than having to individually deploy in each network. This can bring down management efforts and costs considerably.
Centralized security: Enterprises can deploy a center of excellence on the hub for security, and the spokes connected to it can leverage a higher level of security and increase resilience. It will also be easier to control and manage network policies and networking resources from the hub.
Network management: Since each spoke is peered only once to the hub, detecting connectivity problems or other network issues is made simpler in hub and spoke than on mesh. A network manager deployed on the cloud can give access to network problems faster than on other topologies.
Hub and spoke – mesh hybrid
The advantages of using a hub and spoke model far exceed those of using a mesh topology in the cloud and go to show why most organizations ultimately end up using the hub and spoke as their networking strategy.
However, organizations, especially large ones, are complex entities, and choosing only one model may not serve all business needs. In such cases, a hybrid approach may be the best strategy. The following slides will demonstrate the advantages and use cases for mesh, however limited they might be.
Where it can be useful:
An organization can have multiple network topologies where system X is a mesh and system Y is a hub and spoke. A shared system Z can be a part of both systems depending on the needs.
An organization can have multiple networks interconnected in a mesh and some of the networks in the mesh can be a hub for a hub-spoke network. For example, a business unit that works on data analysis can deploy their services in a spoke that is connected to a central hub that can host shared services such as Active Directory or NAT. The central hub can then be connected to a regional on-prem network where data and other shared services can be hosted.
Hub and spoke – mesh hybrid network on AWS
Why mesh can still be useful
Benefits Of Mesh |
Use Cases For Mesh |
---|---|
Security: Setting up a peering connection between two VPCs comes with the benefit of improving security since the connection can be private between the networks and can isolate public traffic from the internet. The traffic between the networks never has to leave the cloud provider's network, which helps reduce a class of risks. Reduced network costs: Since the peered networks communicate internally through the cloud's internal networks, the data transfer costs are typically cheaper than over the public internet. Communication speed: Improved network latency is a key benefit from using mesh because the peered traffic does not have to go over the public internet but rather the internal network. The network traffic between the connections can also be quickly redirected as needed. Higher flexibility for backend services: Mesh networks can be desirable for back-end services if egress traffic needs to be blocked to the public internet from the deployed services/servers. This also helps avoid having to set up public IP or network address translation (NAT) configurations. |
Connecting two or more networks for full access to resources: For example, consider an organization that has separate networks for each department, which don't all need to communicate with each other. Here, a peering network can be set up only between the networks that need to communicate with full or partial access to each other such as finance to HR or accounting to IT. Specific security or compliance need: Mesh or VPC peering can also come in handy to serve specific security needs or logging needs that require using a network to connect to other networks directly and in private. For example, global organizations that face regulatory requirements of storing or transferring data domestically with private connections. Systems with very few networks that do not need internet access: Workloads deployed in networks that need to communicate with each other but do not require internet access or network address translation (NAT) can be connected using mesh especially when there are security reasons to keep them from being connected to the main system, e.g. backend services such as testing environments, labs, or sandboxes can leverage this design. |
Designing for governance vs. flexibility in hub and spoke
Governance and flexibility in managing resources in the cloud are inversely proportional: The higher the governance, the less freedom you have to innovate.
The complexities of designing an organization's networks grow with the organization as it becomes global and takes on more services and lines of business. Organizations that choose to deploy the hub and spoke model face a dilemma in choosing between governance and flexibility for their networks. Organizations need to find that sweet spot to find the right balance between how much they want to govern their systems, mainly for security- and cost-monitoring, and how much flexibility they want to provide for innovation and other operations, since the two usually tend to have an inverse relationship.
This decision in hub and spoke usually means that the domains chosen for higher governance must be placed in the hub network, and the domains that need more flexibility in a spoke. The key variables in the following slide will help determine the placement of the domain and will depend entirely on the organization's context.
The two networking patterns in the cloud have layered complexities that need to be systematically addressed.
Designing for governance vs. flexibility in hub and spoke
If a network has more flexibility in all or most of these domains, it may be a good candidate for a spoke-heavy design; otherwise, it may be better designed in a hub-centric pattern.
- Function: The function the domain network is assigned to and the autonomy the function needs to be successful. For example, software R&D usually requires high flexibility to be successful.
- Regulations: The extent of independence from both internal and external regulatory constraints the domain has. For example, a treasury reporting domain typically has high internal and external regulations to adhere to.
- Human resources: The freedom a domain has to hire and manage its resources to perform its function. For example, production facilities in a huge organization have the freedom to manage their own resources.
- Operations: The freedom a domain has to control its operations and manage its own spending to perform its functions. For example, governments usually have different departments and agencies, each with its own budget to perform its functions.
- Technology: The independence and the ability a domain has to manage its selection and implementation of technology resources in the cloud. For example, you may not want a software testing team to have complete autonomy to deploy resources.