Of the three basic resources in any data center or cloud – compute, storage, network – the network has been the slowest to evolve into the modern world of resource virtualization and abstraction. This is largely by design. It can be argued that the network fabric is the most critical resource in any data center or cloud architecture. Without a network, compute and storage are unreachable islands. The network enables access and allows communication between all compute and storage resources, amongst themselves and out to end users of these resources. Without the network underlying any architecture, all cloud conversations are meaningless. No matter how far you abstract cloud conversations around compute resources, from bare-metal to serverless, if there is an IP packet anywhere in the picture, the network is critical.
This is common sense, and networking has its own forms of resource virtualization meant to solve specific networking problems. Still, it is mentioned here for the simple fact that security in data centers and cloud deployments has traditionally been implemented in the network. In order to block or enable network traffic in transit between cloud resources, a firewall is deployed somewhere in the network fabric. Endpoint software may be deployed on compute resources, but these are usually signature-based tools that look for known malware or bad behavior, and they generally inspect traffic rather than block or allow it. Most workloads have some kind of built-in firewalling capability, such as iptables in Linux, but orchestrating these tools at scale is often difficult, so they frequently go unused. So, network security and traffic enforcement are traditionally done with network firewalls.
Security is often defined in a different language
Since firewalls are usually managed by networking teams, security policy is most often defined using terms that are familiar to networking teams. Firewalls have existed for decades, and how they are configured has changed minimally over the years. Policy rules are traditionally written using IP addresses, TCP/UDP ports, VLANs, and zones. Firewalls are not usually designed to look deeper into the data payload of packets to inspect what content or apps are contained, as they want to avoid becoming a network traffic bottleneck.
There are so-called next-generation firewalls (NGFW) that do have the capability to inspect packets much deeper at wire speed and can define policy against what is actually contained in a packet’s data payload, and not just its network headers. But because old habits die hard, the reality is that often these firewalls are configured the old way, with the next-generation options left unused. The result is a device that uses networking terminology to define network security, which is not how users of resources hosted in a data center or cloud perceive those resources. Users often don’t know, or don’t care, what network segment a resource is hosted on. They are concerned with the resource itself.
Network policy should reflect how users perceive the resources being protected
When a user or developer reports a problem, such as not being able to reach a resource that is hosted in a data center or cloud, they will usually refer to the specific workload or application that is unreachable. They generally won’t report that a specific IP address is not reachable over a specific port. However, the networking or security operations teams will request this information. And here’s where the issue arises: The problem being reported is in a different language than the devices that are enforcing network policy. Application speak usually doesn’t equal networking speak.
One important detail in the quest to automate as many resources as possible in cloud architectures is the ability to define network policy in the same terms users use for the resources being protected. If a firewall is enforcing policy between applications X, Y, and Z, administrators should be able to define policy specific to those applications, and not specific to the network segments they are hosted on.
This is especially relevant in modern public cloud-hosted architectures, such as microservices, in which IP addresses are ephemeral. Workloads and applications are often migrated dynamically across different network segments, so an IP address is no longer a reliable way to identify any specific workload for the lifecycle of that resource. Having to modify a firewall each time an IP address changes does not scale.
The result is that, very often, firewalls are simply not deployed into modern cloud architectures. Instead, they are relegated to sitting at the perimeter of a cloud fabric, enforcing only North-South traffic, blind to the majority of East-West traffic.
Define security using metadata, not IPs
Most modern SDN controllers can create what amounts to a local database of workload IPs and metadata that is applied to each workload. For example, if five production workloads are SQL servers, and another five workloads are development SQL servers, the controller will build a local record which lists those servers in two categories, with the first five workload IPs assigned to a metadata tag of “SQL-Prod” and the second five workload IPs assigned to a metadata tag of “SQL-Dev.” The controller will monitor those workloads, and if any workload changes its IP for any reason, or if it is spun down, the controller will update its local records of metadata-to-IP mappings.
This way, the controller can track the full lifecycle of a workload based on the metadata assigned to it, regardless of which IP address it currently holds. Packet forwarding to and from each workload is still performed using IP lookups against its currently assigned IP address. But the workload’s identity is maintained by its assigned metadata, agnostic to which network segment it is on.
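The metadata-to-IP bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LabelRegistry` class, its method names, and the workload names and IP addresses are all hypothetical and do not correspond to any particular SDN controller's API.

```python
class LabelRegistry:
    """Toy metadata-to-IP store (hypothetical, not a real controller API)."""

    def __init__(self):
        self._tag_by_workload = {}  # workload name -> metadata tag
        self._ip_by_workload = {}   # workload name -> currently assigned IP

    def register(self, workload, tag, ip):
        """Record a workload, its tag, and its current IP."""
        self._tag_by_workload[workload] = tag
        self._ip_by_workload[workload] = ip

    def update_ip(self, workload, new_ip):
        """The IP changes; the tag (the workload's identity) does not."""
        self._ip_by_workload[workload] = new_ip

    def remove(self, workload):
        """Forget a workload that has been spun down."""
        self._tag_by_workload.pop(workload, None)
        self._ip_by_workload.pop(workload, None)

    def ips_for(self, tag):
        """Return the current IPs of every workload carrying this tag."""
        return sorted(
            ip
            for workload, ip in self._ip_by_workload.items()
            if self._tag_by_workload[workload] == tag
        )


# Example data (made up for illustration).
registry = LabelRegistry()
registry.register("sql-prod-1", "SQL-Prod", "10.10.0.11")
registry.register("sql-dev-1", "SQL-Dev", "192.168.10.21")

# The workload migrates to another segment; its tag is untouched.
registry.update_ip("sql-prod-1", "10.20.0.11")
print(registry.ips_for("SQL-Prod"))
```

The point of the design is that anything keyed on the tag keeps working across the IP change; only the forwarding-level lookup sees a new address.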
Identifying a workload using metadata allows the identity of that workload to be fully abstracted away from any networking details – this is how modern security needs to be defined. A policy that reads “No SQL servers in dev can initiate contact with SQL servers in prod” is much closer to how users perceive these resources than one that reads “192.168.10.0/24 TCP 1024-2000 10.10.0.0/16 permit.” Metadata terms are much more human-readable than networking terms.
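To make the contrast concrete, here is a minimal sketch of a policy check written against tags rather than addresses. The `Rule` type, the `evaluate` function, the default-deny choice, and the sample IPs are assumptions made for illustration; they are not taken from any product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    src_tag: str
    dst_tag: str
    action: str  # "allow" or "deny"


def evaluate(src_ip, dst_ip, tag_by_ip, rules, default="deny"):
    """Resolve both IPs to tags, then return the first matching rule's action."""
    src_tag = tag_by_ip.get(src_ip)
    dst_tag = tag_by_ip.get(dst_ip)
    for rule in rules:
        if rule.src_tag == src_tag and rule.dst_tag == dst_tag:
            return rule.action
    return default  # unknown or unmatched traffic falls through to the default


# "No SQL servers in dev can initiate contact with SQL servers in prod."
rules = [
    Rule("SQL-Dev", "SQL-Prod", "deny"),
    Rule("SQL-Prod", "SQL-Prod", "allow"),
]

# Hypothetical current tag assignments.
tag_by_ip = {
    "192.168.10.21": "SQL-Dev",
    "10.10.0.11": "SQL-Prod",
    "10.10.0.12": "SQL-Prod",
}

print(evaluate("192.168.10.21", "10.10.0.11", tag_by_ip, rules))
print(evaluate("10.10.0.11", "10.10.0.12", tag_by_ip, rules))
```

If a dev server later receives a different address, only `tag_by_ip` changes; the `rules` list, which is the part a human reads and maintains, stays the same.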
Metadata used to identify resources is usually referred to as “tags” or “labels,” and the concept is not limited to SDN controllers. With Illumio ASP, you can assign a label to each workload, and each label has four “dimensions” to it: Role, Application, Environment, and Location. Each workload can be assigned a label that identifies it against one or all of these dimensions, and policy can then be defined using those labels. So, an Illumio ruleset does not refer to ports or IPs; it refers to labels.
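One way to picture four-dimensional labels is as a small record in which any dimension may be left unset, with a selector matching every workload that agrees on the dimensions it does specify. The sketch below is an assumption-laden illustration: the `Labels` type, the `matches` function, and the sample values are hypothetical and are not Illumio's data model or API. Only the four dimension names come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Labels:
    role: Optional[str] = None
    app: Optional[str] = None
    env: Optional[str] = None
    loc: Optional[str] = None


def matches(selector: Labels, workload: Labels) -> bool:
    """True if the workload agrees with every dimension the selector sets."""
    for dimension in ("role", "app", "env", "loc"):
        wanted = getattr(selector, dimension)
        if wanted is not None and wanted != getattr(workload, dimension):
            return False
    return True


# Hypothetical workloads.
prod_db = Labels(role="database", app="billing", env="prod", loc="us-east")
dev_db = Labels(role="database", app="billing", env="dev", loc="us-east")

# A selector that names only two of the four dimensions.
all_prod_databases = Labels(role="database", env="prod")

print(matches(all_prod_databases, prod_db))
print(matches(all_prod_databases, dev_db))
```

Leaving a dimension unset acts as a wildcard, which is what lets a single selector cover a group of workloads without listing them individually.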
The value of Illumio labels
The concept of labels may seem like a minor detail, but it bears emphasizing. Using labels, you can define policy in the same terms users use for the resources being protected. This is a significant change from how network security is traditionally defined. For decades, network security has been defined around networking constructs: IPs, VLANs, and ports. Firewalls view security through the lens of these networking constructs, and if any of these constructs change, the firewall configuration needs to be modified.
But if policy is defined using labels, and these labels result in the workload’s built-in firewall capabilities being configured to enforce this definition in the background, the policy now matches how resources are actually used. Network security is now defined using natural language, not networking language. And this natural language policy is defined once, remaining unchanged even as workloads migrate across network fabrics, are spun down or up, or scaled up to large deployments.
Security should not be a roadblock to workload scalability requirements. Using natural language to define policy – using labels – enables workload evolution without security slowing down the DevOps process.
So I’m using labels: now what?
Even when networking teams become accustomed to defining policy using more human-readable, natural language labels, the mindset behind the policy definition is often still network-centric. While the labels refer to workloads and applications, networking teams still think of trust boundaries as network boundaries. But as more and more companies adopt a Zero Trust mindset, one important requirement is to push the trust boundaries away from the network and directly to the resources the labels refer to. If you have five SQL workloads, each of these workloads is its own trust boundary. The trust boundary is not any common network segment they all may be sharing.
Illumio deploys agents, known as VENs, on every monitored workload, and these agents translate the label-based policy into rules on the built-in firewall on each workload. So, the first step in the life of a packet, at its birth, is policy. Another way to think about Zero Trust is, “no packet shall reach the network forwarding plane until it has been inspected.” With Illumio, by the time a packet reaches the network forwarding plane, security has already been applied.
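The general idea of turning label-based policy into host firewall rules can be sketched as follows. This is not how the VEN is implemented, which the text does not describe; the function name, the shape of the policy, and the sample data are all hypothetical. The sketch only builds `iptables` command strings for the OUTPUT chain and does not execute them.

```python
def render_output_rules(local_tag, allowed_pairs, ips_by_tag):
    """Build iptables command strings for one workload's outbound traffic.

    allowed_pairs: iterable of (source_tag, destination_tag) that may talk.
    ips_by_tag:    current tag -> list of IPs mapping.
    """
    commands = []
    for src_tag, dst_tag in allowed_pairs:
        if src_tag != local_tag:
            continue  # rule is about some other workload, skip it
        for ip in sorted(ips_by_tag.get(dst_tag, [])):
            commands.append(f"iptables -A OUTPUT -d {ip} -j ACCEPT")
    # Anything not explicitly allowed is dropped before it leaves the host.
    commands.append("iptables -P OUTPUT DROP")
    return commands


# Hypothetical policy: production web servers may reach production SQL.
allowed_pairs = [("Web-Prod", "SQL-Prod")]
ips_by_tag = {"SQL-Prod": ["10.10.0.12", "10.10.0.11"]}

for command in render_output_rules("Web-Prod", allowed_pairs, ips_by_tag):
    print(command)
```

When a labeled workload's IP changes, an agent following this pattern would re-render the rules from the same label policy, so the human-readable policy never has to be edited.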
This is especially important when trying to address the problem of lateral movement, which allows malicious actors or malware to traverse a network undetected. When discussing security guidelines with remote users, for example, the need for security is usually recognized, but a common response is “I have nothing to hide,” which is used as a justification for not bothering to secure a workload. While that user may not have anything to hide, someone else might. Malware often breaches a workload for the specific purpose of hopping from that workload to other workloads, with the destination being a more valuable resource. This is lateral movement, using one workload as a launching pad for another workload.
If a trust boundary is a network segment, and malware breaches one of several workloads on that network segment, it can move laterally between workloads that share a segment without the network firewall noticing a thing. Lateral movement needs to be prevented at every single workload, not in the network fabric.
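A small reachability model shows the difference between the two trust boundaries. The workload names, segments, and allowed flows below are invented for illustration, and the model assumes that a compromised workload can attack anything it is permitted to connect to.

```python
from collections import deque


def reachable(start, can_connect, workloads):
    """Breadth-first search over every workload an attacker can hop to."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for other in workloads:
            if other not in seen and can_connect(current, other):
                seen.add(other)
                queue.append(other)
    return seen - {start}


# Hypothetical environment: three workloads share seg-a, one sits in seg-b.
segment_of = {"web-1": "seg-a", "sql-1": "seg-a", "sql-2": "seg-a", "hr-1": "seg-b"}

# Hypothetical per-workload policy: only web-1 -> sql-1 is permitted.
allowed_flows = {("web-1", "sql-1")}


def segment_trust(src, dst):
    """Network-segment trust boundary: same segment means free movement."""
    return segment_of[src] == segment_of[dst]


def workload_trust(src, dst):
    """Per-workload trust boundary: only explicitly allowed flows pass."""
    return (src, dst) in allowed_flows


print(sorted(reachable("web-1", segment_trust, segment_of)))
print(sorted(reachable("web-1", workload_trust, segment_of)))
```

Under segment-level trust, a breach of `web-1` exposes every workload in its segment; with per-workload enforcement, the blast radius shrinks to the flows that were explicitly allowed.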
Labels are useful for making policy easier to understand and to keep the ultimate goal in focus: push the security solution out to what you are trying to secure. Don’t rely on one layer of a cloud architecture to secure a different layer. Your trust boundaries are wherever your workloads live. Zero Trust means every workload is a segment, and if you secure every workload, you will dramatically reduce the risk of lateral movement.
To learn more about Illumio ASP and how we think about labeling, check out: