Large organizations often have data centers located in different geographic regions. Distributed data centers allow these organizations to locate their applications close to their customers and employees, comply with data residency requirements, and provide disaster recovery for their critical business applications. The adoption of public cloud is making it even easier for organizations of all sizes to distribute their workloads across multiple regions. For example, AWS now spans 18 geographic regions around the world.
We’re excited to introduce PCE Supercluster to provide full visibility, centralized and federated management, and consistent enforcement of microsegmentation policies across multi-region infrastructure – at very large scale. This post explores the key requirements for securing multi-region infrastructure and why we designed PCE Supercluster with a federated architecture.
Requirements for a multi-region microsegmentation solution
If you have a globally distributed infrastructure, there are several key requirements for a microsegmentation solution. It is important to consider these requirements upfront, even if your initial microsegmentation deployment is limited to a single location.
- Resiliency: The microsegmentation solution must continue to operate and secure the infrastructure in the event of a data center failure or network outage between regions.
- Scalability: The microsegmentation solution must scale with the number of workloads in each data center and the total number of workloads worldwide.
- Manageability: The microsegmentation solution must be manageable by both global and regional security and application teams.
- Bandwidth Efficiency: Network bandwidth between regions is expensive, so the solution must not consume large amounts of bandwidth.
Architectures for a multi-region microsegmentation solution
Illumio’s Policy Compute Engine (PCE) is a software-based controller that is responsible for orchestrating microsegmentation policy across workloads and other enforcement points in the infrastructure. The PCE also collects telemetry data from the infrastructure, such as network flow information and information about the processes running on the workloads.
There are several possible approaches to architect the PCE – or any software-based microsegmentation solution – to secure workloads located in different geographic regions.
Here's a breakdown of the various architecture approaches and how they map to the requirements described above.
Centralized Architecture – the controller resides in a single location.
- Resiliency: A centralized architecture creates a single point of failure and provides limited resiliency.
- Scalability: A centralized architecture can scale both vertically and horizontally to support the total number of workloads worldwide.
- Manageability: A centralized architecture makes it easy for global security and application teams to configure and apply microsegmentation policy across the entire infrastructure. Role-Based Access Control (RBAC) can be used to provide regional teams with limited access to view and modify policy for just the applications in their region.
- Bandwidth Efficiency: A centralized architecture uses more bandwidth because all network flow data and other telemetry must be sent back to the controller. Bandwidth consumption increases with the number of workloads per region and the number of connections between these workloads.
Distributed Architecture – places a controller in each data center and the controllers are completely independent from one another.
- Resiliency: A distributed architecture is highly resilient. The failure of a controller in one region does not affect the other regions.
- Scalability: A distributed architecture can scale with the number of workloads in each data center and the total number of workloads worldwide by deploying more controllers.
- Manageability: A distributed architecture allows regional teams to create local policies, but this architecture creates challenges for enforcing global policies since these must be manually replicated to each region. In addition, there is no way to visualize all applications in one place and see cross-region dependencies.
- Bandwidth Efficiency: A distributed architecture is more bandwidth efficient as all data stays local to the region.
Federated Architecture – places a controller in each data center and the controllers communicate with each other to share information about the organization's security policy and the workloads that are being secured.
- Resiliency: A federated architecture is highly resilient. The failure of a controller in one region does not affect the other regions.
- Scalability: A federated architecture can scale with the number of workloads in each data center and the total number of workloads worldwide by deploying more controllers.
- Manageability: A federated architecture makes it easy for global security and application teams to configure and apply microsegmentation policy across the entire infrastructure. RBAC can be used to provide regional teams with limited access to view and modify policy for just the applications in their region.
- Bandwidth Efficiency: A federated architecture is more bandwidth-efficient provided that only the minimal amount of information is shared between the controllers for the system to function.
The following table summarizes the three architectures for microsegmenting multi-region infrastructure:
| Requirement | Centralized | Distributed | Federated |
| --- | --- | --- | --- |
| Resiliency | - | + | + |
| Scalability | + | + | + |
| Manageability | + | - | + |
| Bandwidth Efficiency | - | + | + |
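To make the comparison concrete, here is a minimal Python sketch that encodes the table's scores and filters the architectures by the requirements you care about. The names `SCORES` and `architectures_meeting` are made up for this illustration and are not part of any Illumio product or API.

```python
# Scores transcribed from the comparison table: True means "+", False means "-".
# SCORES and architectures_meeting are hypothetical names used only for illustration.
SCORES = {
    "Centralized": {
        "Resiliency": False,
        "Scalability": True,
        "Manageability": True,
        "Bandwidth Efficiency": False,
    },
    "Distributed": {
        "Resiliency": True,
        "Scalability": True,
        "Manageability": False,
        "Bandwidth Efficiency": True,
    },
    "Federated": {
        "Resiliency": True,
        "Scalability": True,
        "Manageability": True,
        "Bandwidth Efficiency": True,
    },
}

ALL_REQUIREMENTS = [
    "Resiliency",
    "Scalability",
    "Manageability",
    "Bandwidth Efficiency",
]


def architectures_meeting(requirements):
    """Return the architectures that score "+" on every listed requirement."""
    return [
        name
        for name, scores in SCORES.items()
        if all(scores[req] for req in requirements)
    ]


print(architectures_meeting(ALL_REQUIREMENTS))
print(architectures_meeting(["Resiliency", "Bandwidth Efficiency"]))
```

Only the federated architecture passes when all four requirements are required together, which is the conclusion the table supports.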
Introducing PCE Supercluster: multi-region microsegmentation done right
Given the clear advantages of the federated approach, PCE Supercluster has been designed with a federated architecture. In a Supercluster, global security policy is managed from a designated leader PCE. Illumio’s robust RBAC capabilities are supported on Supercluster, allowing global and regional teams to access the leader PCE in a least-privilege fashion. The policy is then automatically replicated to the other PCEs, which translate the label-based policies into instructions used to program host firewalls on workloads and other enforcement points in the infrastructure. This design ensures the global policy will be continuously enforced, even if a region becomes isolated from the rest of the Supercluster.
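As a rough illustration of what "translating label-based policy into firewall instructions" can mean, the sketch below has a regional controller turn one replicated rule into per-workload allow entries for the workloads in its own region. This is a simplified assumption, not Illumio's actual data model or algorithm: the `Workload` and `Rule` types, the label strings, and `compile_inbound_rules` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    # Hypothetical workload record; the fields are assumptions for this sketch.
    name: str
    ip: str
    region: str
    labels: frozenset


@dataclass(frozen=True)
class Rule:
    # Hypothetical label-based rule: consumers may reach providers on a port.
    consumer_label: str
    provider_label: str
    port: int


def compile_inbound_rules(rules, workloads, region):
    """Compile label-based rules into allow entries for one region's workloads.

    Returns {provider name: sorted list of (source_ip, port)}. Consumers are
    matched across all regions, because a provider may be reached from a
    workload in another data center.
    """
    compiled = {}
    for rule in rules:
        consumer_ips = [w.ip for w in workloads if rule.consumer_label in w.labels]
        for provider in workloads:
            if provider.region != region or rule.provider_label not in provider.labels:
                continue
            entries = compiled.setdefault(provider.name, set())
            entries.update(
                (ip, rule.port) for ip in consumer_ips if ip != provider.ip
            )
    return {name: sorted(entries) for name, entries in compiled.items()}


workloads = [
    Workload("web-us", "10.0.0.1", "us", frozenset({"role:web"})),
    Workload("db-us", "10.0.0.2", "us", frozenset({"role:db"})),
    Workload("web-eu", "10.1.0.1", "eu", frozenset({"role:web"})),
]
global_policy = [Rule("role:web", "role:db", 5432)]

print(compile_inbound_rules(global_policy, workloads, "us"))
print(compile_inbound_rules(global_policy, workloads, "eu"))
```

In this toy model each region compiles the rules from its own copy of the policy and workload data, so it can keep enforcing the last replicated policy even when it cannot reach the leader.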
Illumio recognized early on that visibility is key to microsegmentation because you can’t secure what you can’t see. Supercluster provides a full live application dependency map (Illumination) on the leader to visualize intra- and inter-region application dependencies and policy coverage. Real-time visibility into high-value systems, and into the authorized connections and flows between applications, is a critical first step to designing the organization’s micro-perimeters and creating microsegmentation policies that will not break applications.
A single PCE can already be deployed as a multi-node cluster to support tens of thousands of workloads. By enabling multiple PCEs to be joined together, Supercluster adds another level of scale to support the largest organizations in the world.
We spent a good deal of energy designing Supercluster to minimize bandwidth consumption between PCEs. Only the minimum amount of workload data necessary to compute policy is replicated between regions. In addition, network flow data is preprocessed in-region by each PCE, and only the minimum information needed to draw the live application dependency map is replicated across the network.
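One way to picture in-region preprocessing is as aggregation: many raw flow records collapse into a handful of application-to-application edges, and only those edges need to cross the network. The sketch below is an assumption about the general technique, not a description of how the PCE actually processes flows. The record fields (`src_app`, `dst_app`, `port`) and the function `summarize_flows` are hypothetical.

```python
from collections import Counter


def summarize_flows(flows):
    """Collapse raw flow records into one count per (src_app, dst_app, port) edge."""
    summary = Counter()
    for flow in flows:
        edge = (flow["src_app"], flow["dst_app"], flow["port"])
        summary[edge] += 1
    return dict(summary)


# Hypothetical raw flow records observed within a single region.
raw_flows = [
    {"src_app": "web", "dst_app": "db", "port": 5432, "src_ip": "10.0.0.1"},
    {"src_app": "web", "dst_app": "db", "port": 5432, "src_ip": "10.0.0.3"},
    {"src_app": "web", "dst_app": "db", "port": 5432, "src_ip": "10.0.0.1"},
    {"src_app": "web", "dst_app": "cache", "port": 6379, "src_ip": "10.0.0.3"},
]

edges = summarize_flows(raw_flows)
print(f"{len(raw_flows)} raw records reduced to {len(edges)} edges")
```

The dependency map needs to know that an edge exists, not every individual connection, so the replicated payload grows with the number of distinct edges instead of the number of flows.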
With PCE Supercluster, organizations can:
- Gain real-time visibility into their globally distributed data center environment.
- Confidently design micro-perimeters and create microsegmentation policies that support inter-regional traffic and enforce microsegmentation at significant scale – without breaking applications.
- Accomplish their microsegmentation objectives and, at the same time, realize network bandwidth efficiencies and support disaster recovery and high availability.