Illumio Blog
November 7, 2018

It's a Bird, It's a Plane, It's...Supercluster!

Tim Bardzil

Large organizations often have data centers located in different geographic regions. Distributed data centers allow these organizations to locate their applications close to their customers and employees, comply with data residency requirements, and provide disaster recovery for their critical business applications. The adoption of public cloud is making it even easier for organizations of all sizes to distribute their workloads across multiple regions. For example, AWS now spans 18 geographic regions around the world.   

We’re excited to introduce PCE Supercluster to provide full visibility, centralized and federated management, and consistent enforcement of micro-segmentation policies across multi-region infrastructure – at very large scale. This post explores the key requirements for securing multi-region infrastructure and why we designed PCE Supercluster with a federated architecture. 

Requirements for a Multi-Region Micro-Segmentation Solution

If you have a globally distributed infrastructure, there are several key requirements for a micro-segmentation solution. It is important to consider these requirements upfront, even if your initial micro-segmentation deployment is limited to a single location.

  • Resiliency: The micro-segmentation solution must continue to operate and secure the infrastructure in the event of a data center failure or network outage between regions.
  • Scalability: The micro-segmentation solution must scale with the number of workloads in each data center and the total number of workloads worldwide.
  • Manageability: The micro-segmentation solution must be manageable by both global and regional security and application teams.
  • Bandwidth Efficiency: Network bandwidth between regions is expensive, so the solution must not consume large amounts of bandwidth.


Architectures for a Multi-Region Micro-Segmentation Solution

Illumio’s Policy Compute Engine (PCE) is a software-based controller that is responsible for orchestrating micro-segmentation policy across workloads and other enforcement points in the infrastructure. The PCE also collects telemetry data from the infrastructure, such as network flow information and information about the processes running on the workloads.

There are several possible approaches to architect the PCE – or any software-based micro-segmentation solution – to secure workloads located in different geographic regions.

Here's a breakdown of the various architecture approaches and how they map to the requirements described above.

Centralized Architecture – the controller resides in a single location.

  • Resiliency: A centralized architecture creates a single point of failure and provides limited resiliency.
  • Scalability: A centralized architecture can scale both vertically and horizontally to support the total number of workloads worldwide.
  • Manageability: A centralized architecture makes it easy for global security and application teams to configure and apply micro-segmentation policy across the entire infrastructure. Role-Based Access Control (RBAC) can be used to provide regional teams with limited access to view and modify policy for just the applications in their region. 
  • Bandwidth Efficiency: A centralized architecture uses more bandwidth because all network flow data and other telemetry must be sent back to the controller. Bandwidth use increases with the number of workloads per region and the number of connections between these workloads.

Distributed Architecture – places a controller in each data center and the controllers are completely independent from one another.

  • Resiliency: A distributed architecture is highly resilient. The failure of a controller in one region does not affect the other regions.
  • Scalability: A distributed architecture can scale with the number of workloads in each data center and the total number of workloads worldwide by deploying more controllers.
  • Manageability: A distributed architecture allows regional teams to create local policies, but this architecture creates challenges for enforcing global policies since these must be manually replicated to each region. In addition, there is no way to visualize all applications in one place and see cross-region dependencies.
  • Bandwidth Efficiency: A distributed architecture is more bandwidth efficient as all data stays local to the region.


Federated Architecture – places a controller in each data center and the controllers communicate with each other to share information about the organization's security policy and the workloads that are being secured.

  • Resiliency: A federated architecture is highly resilient. The failure of a controller in one region does not affect the other regions.
  • Scalability: A federated architecture can scale with the number of workloads in each data center and the total number of workloads worldwide by deploying more controllers.
  • Manageability: A federated architecture makes it easy for global security and application teams to configure and apply micro-segmentation policy across the entire infrastructure. RBAC can be used to provide regional teams with limited access to view and modify policy for just the applications in their region.
  • Bandwidth Efficiency: A federated architecture is more bandwidth-efficient provided that only the minimal amount of information is shared between the controllers for the system to function.


The following table summarizes the three architectures for micro-segmenting multi-region infrastructure:

                      Centralized  Distributed  Federated
Resiliency            -            +            +
Scalability           +            +            +
Manageability         +            -            +
Bandwidth Efficiency  -            +            +
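
The same comparison can be written down as data. The short Python sketch below is only an illustration: it copies the ratings from the summary table ("+" as True, "-" as False) and lists the architectures that meet every requirement. The names RATINGS and meets_all are made up for this example and are not part of any Illumio product.

```python
# Ratings copied from the summary table: True means "+", False means "-".
RATINGS = {
    "Centralized": {
        "Resiliency": False,
        "Scalability": True,
        "Manageability": True,
        "Bandwidth Efficiency": False,
    },
    "Distributed": {
        "Resiliency": True,
        "Scalability": True,
        "Manageability": False,
        "Bandwidth Efficiency": True,
    },
    "Federated": {
        "Resiliency": True,
        "Scalability": True,
        "Manageability": True,
        "Bandwidth Efficiency": True,
    },
}


def meets_all(ratings):
    """Return the architectures rated "+" on every requirement."""
    return [name for name, scores in ratings.items() if all(scores.values())]


print(meets_all(RATINGS))  # ['Federated']
```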

Introducing PCE Supercluster: Multi-Region Micro-Segmentation Done Right

Given the clear advantages, PCE Supercluster has been designed with a federated architecture. In a Supercluster, global security policy is managed from a designated leader PCE. Illumio’s robust RBAC capabilities are supported on Supercluster, allowing global and regional teams to access the leader PCE in a least-privilege fashion. The policy is then automatically replicated to the other PCEs, which translate the label-based policies into instructions used to program host firewalls on workloads and other enforcement points in the infrastructure. This design ensures the global policy will be continuously enforced, even if a region becomes isolated from the rest of the Supercluster.
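
To make the leader-and-replica idea concrete, here is a minimal Python sketch of the pattern described above. It is a toy model written for this post, not the PCE implementation: the classes Rule, MemberPCE, and LeaderPCE, the reachable flag, and the sample IP addresses and labels are all hypothetical. It shows a leader copying label-based policy to regional members, each member translating labels into per-workload allow entries, and an isolated member continuing to enforce the last policy it received.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    """Label-based rule: workloads labeled `consumer` may reach `provider` on `port`."""
    consumer: str
    provider: str
    port: int


@dataclass
class MemberPCE:
    """Hypothetical regional controller holding a replicated copy of the policy."""
    region: str
    workloads: dict  # workload IP -> label (assumed shape, for illustration)
    policy: list = field(default_factory=list)  # last copy received from the leader
    reachable: bool = True

    def firewall_rules(self):
        """Translate label-based rules into (source IP, destination IP, port) entries."""
        rules = []
        for rule in self.policy:
            for src, src_label in self.workloads.items():
                for dst, dst_label in self.workloads.items():
                    if src_label == rule.consumer and dst_label == rule.provider:
                        rules.append((src, dst, rule.port))
        return sorted(rules)


class LeaderPCE:
    """Hypothetical leader: the single place where global policy is edited."""

    def __init__(self, members):
        self.members = members
        self.policy = []

    def publish(self, policy):
        """Store the global policy and copy it to every member that is reachable."""
        self.policy = list(policy)
        for member in self.members:
            if member.reachable:
                member.policy = list(self.policy)


eu = MemberPCE("eu", {"10.1.0.5": "web", "10.1.0.9": "db"})
us = MemberPCE("us", {"10.2.0.5": "web", "10.2.0.9": "db"})
leader = LeaderPCE([eu, us])
leader.publish([Rule("web", "db", 5432)])

# The EU region becomes isolated, so a later policy change does not reach it.
eu.reachable = False
leader.publish([Rule("web", "db", 5432), Rule("web", "db", 6432)])

print(eu.firewall_rules())  # EU still enforces the policy from the first publish
print(us.firewall_rules())  # US enforces the updated policy
```

In this sketch the isolated member simply keeps its last known copy, which is one simple way to model "continuously enforced"; how a real member catches up after reconnecting is outside the scope of the example.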

Illumio recognized early on that visibility is key to micro-segmentation because you can’t secure what you can’t see. Supercluster provides a full live application dependency map (Illumination) on the leader to visualize intra- and inter-region application dependencies and policy coverage. Real-time visibility into high-value systems and the authorized connections and flows across these applications is a critical first step to designing the organization’s micro-perimeters and creating micro-segmentation policies that will not break applications.

Supercluster adds a level of scale to support the largest organizations in the world.

A single PCE can already be deployed as a multi-node cluster to support tens of thousands of workloads. By enabling multiple PCEs to be joined together, Supercluster adds another level of scale to support the largest organizations in the world.

We spent a good deal of energy designing Supercluster to minimize bandwidth consumption between PCEs. Only the minimum amount of workload data necessary to compute policy is replicated between regions. In addition, network flow data is preprocessed in region by each PCE and only the minimum information needed to draw the live application dependency map is replicated across the network.
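
As a rough illustration of why in-region preprocessing saves bandwidth, the Python sketch below collapses raw flow records into one entry per (source label, destination label, port) edge, which is the kind of summary a dependency map needs. The post does not describe the actual preprocessing, so treat this as an assumption: the function summarize_flows, the record layout, and the sample addresses are all hypothetical.

```python
from collections import Counter


def summarize_flows(flows, label_of):
    """Collapse raw flow records into counts per (source label, destination label, port)."""
    edges = Counter()
    for src_ip, dst_ip, port in flows:
        edges[(label_of[src_ip], label_of[dst_ip], port)] += 1
    return dict(edges)


# Hypothetical workload labels and raw flow records observed in one region.
label_of = {"10.1.0.5": "web", "10.1.0.6": "web", "10.1.0.9": "db"}
raw_flows = [
    ("10.1.0.5", "10.1.0.9", 5432),
    ("10.1.0.6", "10.1.0.9", 5432),
    ("10.1.0.5", "10.1.0.9", 5432),
    ("10.1.0.5", "10.1.0.6", 443),
]

summary = summarize_flows(raw_flows, label_of)
print(len(raw_flows), "raw records ->", len(summary), "edges")  # 4 raw records -> 2 edges
```

Only the summary would need to cross the network in this model; the raw records stay in the region where they were collected.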

With PCE Supercluster, organizations can:

  • Gain real-time visibility into their globally distributed data center environment.
  • Confidently design micro-perimeters and create micro-segmentation policies that support inter-regional traffic, and enforce micro-segmentation at significant scale without breaking applications.
  • Accomplish their micro-segmentation objectives and, at the same time, realize network bandwidth efficiencies and support disaster recovery and high availability.

See it in action. Watch our live broadcast featuring a PCE Supercluster demo today, Wednesday, November 7 from 1:00-3:00pm PST. 


Topics: cybersecurity
