Scale-Out Private Cloud Infrastructure
ACME Inc. is building a large, fully redundant private infrastructure-as-a-service (IaaS) cloud using standardized single-rack building blocks. They plan to use several geographically dispersed data centers, each with one or more standard infrastructure racks.
This document summarizes design challenges sent by readers of the ipSpace.net blog and discussed in numerous ExpertExpress engagements. It’s based on real-life queries and network designs but does not represent an actual customer network. The complete document is available as a downloadable PDF to ipSpace.net subscribers. You can also buy a digital book with all ExpertExpress case studies.
A standard infrastructure rack in each data center will have:
- Two ToR switches providing intra-rack connectivity and access to the corporate backbone;
- Dozens of high-end servers, each server capable of running between 50 and 100 virtual machines;
- Storage elements, either a storage array, server-based storage nodes, or distributed storage (example: VMware VSAN, Nutanix, Ceph…).
Figure 1: Standard cloud infrastructure rack
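As a rough sizing illustration, the rack description above can be turned into a VM-capacity range. The sketch below makes one loud assumption: the server count of 24 is a hypothetical stand-in for "dozens", which the text does not quantify. The `RackSpec` name is also invented for this example. Only the two ToR switches and the 50 to 100 VMs per server come from the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RackSpec:
    """Hypothetical model of one standard infrastructure rack."""
    tor_switches: int = 2          # from the text: two ToR switches per rack
    servers: int = 24              # ASSUMPTION: "dozens" is not quantified
    min_vms_per_server: int = 50   # from the text
    max_vms_per_server: int = 100  # from the text

    def vm_capacity(self) -> Tuple[int, int]:
        """Return the (low, high) estimate of VMs one rack could host."""
        return (self.servers * self.min_vms_per_server,
                self.servers * self.max_vms_per_server)


rack = RackSpec()
low, high = rack.vm_capacity()
print(f"One rack: roughly {low} to {high} VMs")
```

Under that assumed server count, a single rack would host somewhere between 1200 and 2400 virtual machines, which gives a feel for how many workloads share one failure domain.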
Racks in smaller data centers (example: colocation) connect straight to the WAN backbone, racks in data centers co-resident with a significant user community connect to WAN edge routers, and racks in larger scale-out data centers connect to WAN edge routers or to an internal data center backbone.
Figure 2: Planned WAN connectivity
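The three attachment options described above can be written down as a small lookup table. This is only an illustrative sketch: the `SiteType` enum and the `uplink_options` helper are hypothetical names, and the mapping simply restates the paragraph rather than adding any new design rule.

```python
from enum import Enum
from typing import Dict, List


class SiteType(Enum):
    """Hypothetical labels for the three data center types in the text."""
    SMALL = "small data center (example: colocation)"
    USER_CORESIDENT = "data center co-resident with a significant user community"
    SCALE_OUT = "larger scale-out data center"


# Where racks attach, per site type, as described in the paragraph above.
UPLINK_OPTIONS: Dict[SiteType, List[str]] = {
    SiteType.SMALL: ["WAN backbone"],
    SiteType.USER_CORESIDENT: ["WAN edge routers"],
    SiteType.SCALE_OUT: ["WAN edge routers", "internal data center backbone"],
}


def uplink_options(site: SiteType) -> List[str]:
    """Return a copy of the attachment options for the given site type."""
    return list(UPLINK_OPTIONS[site])


for site in SiteType:
    print(f"{site.value}: {' or '.join(uplink_options(site))}")
```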
The cloud infrastructure design should:
- Guarantee full redundancy;
- Minimize failure domain size: a failure domain should not span more than a single infrastructure rack, making each rack an independent availability zone;
- Enable unlimited workload mobility.
Cloud Infrastructure Failure Domains
A typical cloud infrastructure has numerous components, including:
- Compute and storage elements;
- Physical and virtual network connectivity within the cloud infrastructure;
- Network connectivity with the outside world;
- Virtualization management system;
- Cloud orchestration system;
- Common network services (DHCP, DNS);
- Application-level services (example: authentication service, database service, backup service) and associated management and orchestration systems.
Figure 3: Cloud infrastructure components
ACME Inc. wants each infrastructure rack to be an independent failure domain. Each infrastructure rack must therefore have totally independent infrastructure and should not rely on critical services, management or orchestration systems running in other racks.
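One way to reason about this requirement is to list, for every rack, where each critical service it relies on actually runs, and then flag anything hosted in a different rack. The sketch below is a minimal illustration under stated assumptions: the rack names, service names, and the `cross_rack_dependencies` helper are all hypothetical, and the inventory is hand-written rather than pulled from any real management system.

```python
from typing import Dict, List, Tuple

# ASSUMPTION: hypothetical inventory. For each rack, map every critical
# service it depends on to the rack that hosts that service.
DEPENDENCIES: Dict[str, Dict[str, str]] = {
    "rack-a": {
        "virtualization-management": "rack-a",
        "cloud-orchestration": "rack-a",
        "dns": "rack-a",
        "dhcp": "rack-a",
    },
    "rack-b": {
        "virtualization-management": "rack-a",  # shared with rack-a
        "cloud-orchestration": "rack-b",
        "dns": "rack-b",
        "dhcp": "rack-b",
    },
}


def cross_rack_dependencies(
    deps: Dict[str, Dict[str, str]]
) -> List[Tuple[str, str, str]]:
    """Return (rack, service, hosting_rack) for each service hosted elsewhere."""
    return [
        (rack, service, host)
        for rack, services in sorted(deps.items())
        for service, host in sorted(services.items())
        if host != rack
    ]


for rack, service, host in cross_rack_dependencies(DEPENDENCIES):
    print(f"{rack} depends on {service} running in {host}")
```

In this made-up inventory, rack-b relies on a virtualization management system in rack-a, so the two racks would not be independent failure domains. An empty result would indicate that no rack depends on a critical service hosted elsewhere.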
The rest of this case study discusses these topics:
- Impact of shared management or orchestration systems
- Impact of long-distance virtual subnets and layer-2 connectivity requirements
- Workload mobility considerations, including hot and cold VM mobility
- Impact of static IP addressing
- Disaster recovery and workload migration scenarios
Get the complete document
The complete case study, including design and deployment guidelines and sample configuration snippets, is available to ipSpace.net subscribers. Select the Case studies tab after logging into the webinar management system.