One way of looking at Zero Trust is as a strategy that replaces excessive, risky implicit trust with a less risky model of explicit trust.
This state of implicit trust is classically explained with a firewall being the demarcation point between a trusted network (inside the firewall) and an untrusted network (outside the firewall). The job of the firewall is to keep the bad actors on the untrusted side of that boundary.
But this standard approach yields a “soft and chewy” inside behind the firewall because everything inside that trusted zone is implicitly trusted.
The risks of implicit trust
It is a very simple security model, and it's how we've been doing things for years, and how a lot of people are still doing them today. But once a bad actor gets inside, this model makes their life easy.
There are also factors making the risk worse. First, as a company scales, the trusted zone grows with it, making it more likely that something untrustworthy is inside.
Second, independent of scale, as environments grow more complex (as they naturally tend to over time), the probability of something getting inside increases. So, scale and complexity make the risks associated with the implicit trust model grow continuously over time.
Zero Trust initiatives require knowing each system’s identity
Now, what’s the challenge with moving from implicit trust to explicit trust (or, going on a Zero Trust journey)?
Obviously, a segmentation policy applied to servers, endpoints, and devices is the way to accomplish that transformation to explicit trust. But before you can write an accurate security policy, one that tightens controls enough to reduce risk but isn't so fragile that it breaks applications, you need to know what's in that environment. Things are already on your network, and you must know what they are before you can apply segmentation and access control decisions.
The question you need to answer is: What is the identity of each system on your network?
For a long time, I had a limited definition of the word identity as it was being used in the Identity and Access Management (IAM) space. IAM is a technology that focuses primarily on authenticating users to prove they are who they say they are. The definition of identity in that context was limiting my understanding because I equated "username" plus "secret" with identity. And I'll be honest, I'm really kicking myself for having such a limited perspective on identity – it's way more complex than that.
For example, in the non-tech world, my identity is not just my name. My name is a label that is used to reference me, but I’m also a CTO, employee of Illumio, father, brother, husband, and son. I grew up in Ohio, live in California, and enjoy rye whiskey, hiking, and cooking. I seek truth and believe in the value of trusting others, I listen, I started a podcast on CTOs, and I love software and startups.
The point is that my identity is much more than my name – it's a multidimensional set of attributes, many of them shared with others (some might overlap with you, the person reading this), whose unique combination describes me.
Using artificial intelligence and machine learning for identity
Just like people, IT systems have an identity, and they have a purpose. Systems can be production or non-production. A system could be a front-end or a back-end, or it could be an HVAC controller, printer, or blood pressure machine. Systems could be in a critical care wing of the hospital, or they could be in the basement. Systems could be in charge of tens of billions of dollars of transfers, or they could be holding the live game state for thousands of Roblox players.
For individual system identity
Artificial intelligence/machine learning (AI/ML) solutions are ideally suited for this kind of problem that has multidimensional input and requires multi-dimensional output values that make up system identity. This includes both the behavior of a system’s peers (who they are talking to) and deep packet inspection that sees exactly what is happening at the application level (what they are saying).
And while some attributes are binary, like prod vs. non-prod, others, like business value, fall on a continuum. AI/ML techniques can produce both types of values and attach confidence to their predictions, helping to provide suggestions across that identity space.
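As a rough sketch of the idea (every feature name, threshold, and weight below is hypothetical, not a description of any real product's model), an identity-inference function can take multidimensional observations about a system, both who it talks to and what deep packet inspection sees it saying, and return identity attributes, each paired with a confidence score. Note how some outputs are binary and others are continuous:

```python
# Hypothetical sketch: infer identity attributes for a system from
# behavioral signals (peers) and content signals (DPI), returning each
# prediction with a confidence score. All thresholds are illustrative.

from dataclasses import dataclass

@dataclass
class Observation:
    peer_count: int    # how many distinct peers the system talks to
    talks_to_db: bool  # DPI saw database traffic
    serves_http: bool  # DPI saw inbound HTTP
    env_tag: str       # inventory label, e.g. "prod" or "staging"

def infer_identity(obs: Observation) -> dict:
    """Return identity attributes, each as a (value, confidence) pair."""
    identity = {}

    # Binary attribute: prod vs. non-prod, taken from the inventory tag.
    is_prod = obs.env_tag == "prod"
    identity["environment"] = ("prod" if is_prod else "non-prod", 0.95)

    # Role: combine behavioral and content signals.
    if obs.serves_http and obs.talks_to_db:
        identity["role"] = ("app-tier", 0.8)
    elif obs.serves_http:
        identity["role"] = ("front-end", 0.7)
    elif obs.talks_to_db:
        identity["role"] = ("back-end", 0.7)
    else:
        identity["role"] = ("unknown", 0.3)

    # Continuous attribute: a rough business-value score in [0, 1],
    # scaled by how connected the system is and whether it is prod.
    connectivity = min(obs.peer_count / 100.0, 1.0)
    score = round(0.5 * connectivity + (0.5 if is_prod else 0.0), 2)
    identity["business_value"] = (score, 0.6)

    return identity

print(infer_identity(Observation(peer_count=40, talks_to_db=True,
                                 serves_http=True, env_tag="prod")))
```

A real system would learn these mappings from data rather than hard-code them, but the shape of the output, a multidimensional identity with per-attribute confidence, is the point.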
Illumio's first use of ML in the Illumio Core solution was in this area. Core services detection uses both heuristics and ML techniques, leveraging features like peer relationships. This improves the productivity of the segmentation system's operator through a recommendation workflow for those core services.
For application group identity
And to go a step further, identity is not just about an individual system. Just like us humans are individuals, we are also part of groups. I have a family, and my family has a group identity, a shared set of characteristics that make that group unique. That set of characteristics is different from another group.
This applies to IT systems as well. Servers might be back-end or front-end, but the sum of a set of individual servers makes up an application. And that application as a whole often needs to be treated as a single unit – just like a group of individuals in a family. For example, a production application may have a twin instance in the staging environment that is used for testing; the two instances can be similar in many dimensions as a group and should be treated as such, even if the underlying individual components are totally different.
When implementing segmentation in their networks, customers often want to create a ringfence around an application, so knowing the identity of the application and all its members is critical. ML clustering algorithms and other approaches are valuable here.
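To make the clustering idea concrete (the similarity metric and threshold here are illustrative, not how any particular product groups workloads), a simple pass over peer-connection data can propose candidate application groups: servers that share most of their peers likely belong to the same application ringfence.

```python
# Hypothetical sketch: group servers into candidate applications by peer
# overlap (Jaccard similarity of their connection sets). Real systems use
# richer features and ML clustering; this shows only the core idea.

def jaccard(a: set, b: set) -> float:
    """Similarity of two peer sets: shared peers / total peers."""
    return len(a & b) / len(a | b) if (a | b) else 0.0

def cluster_by_peers(peers: dict, threshold: float = 0.5) -> list:
    """Greedy single-link clustering: join a cluster if similar enough
    to any existing member, otherwise start a new cluster."""
    clusters = []
    for server, server_peers in peers.items():
        placed = False
        for cluster in clusters:
            if any(jaccard(server_peers, peers[m]) >= threshold for m in cluster):
                cluster.add(server)
                placed = True
                break
        if not placed:
            clusters.append({server})
    return clusters

# Toy inventory: which peers each server talks to.
peers = {
    "web-1": {"lb", "app-1", "app-2"},
    "web-2": {"lb", "app-1", "app-2"},
    "app-1": {"db-1", "cache"},
    "app-2": {"db-1", "cache"},
    "hvac-1": {"bms-controller"},
}

for group in cluster_by_peers(peers):
    print(sorted(group))  # web tier, app tier, and the HVAC box end up apart
```

Each resulting cluster is a candidate ringfence boundary; an operator would then confirm or adjust the suggested membership before policy is applied.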
For system identity changes over time
The third aspect of the problem is time: Identities don't remain fixed, at either the individual system level or the group level.
A classic problem is that systems are critical to some business function on the day they are built, but over time, priorities change, people leave, things happen, and that same system which was critical is no longer relevant. As its purpose changes, its identity changes with it. Maybe it's no longer maintained with the most up-to-date patches, which makes it riskier and a juicier target for an attacker wanting a persistent foothold in the environment (see the Sony Pictures hack). Or maybe it served a business-critical function, but a new generation of applications takes on new customers, and the user base of this application is shrinking.
Identities morph over time, and segmentation policy needs to keep up with those changes. ML/AI algorithms are not just for point-in-time analysis; they need to run continuously, understand changes to the environment, and make recommendations that keep policy in sync.
Security policies become fragile if they are not adapted when the identity and purpose of the system changes. Continuously asking if your policies are complete and correct, and providing predictive feedback on the places where risks or cracks will appear, will help operators keep things safe.
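One minimal way to picture that continuous feedback loop (the snapshot schema and drift rules below are hypothetical) is a periodic comparison of identity snapshots: for each system, diff the attributes inferred today against a baseline, and surface every change as a prompt to revisit the associated policy.

```python
# Hypothetical sketch: compare periodic identity snapshots per system and
# flag drift, so stale segmentation policy can be reviewed. The snapshot
# format ({system: {attribute: value}}) is illustrative.

def detect_drift(baseline: dict, current: dict) -> list:
    """Return human-readable alerts for identity changes between snapshots."""
    alerts = []
    for system, old in baseline.items():
        new = current.get(system)
        if new is None:
            alerts.append(f"{system}: disappeared - retire its policy rules?")
            continue
        for attr, old_value in old.items():
            new_value = new.get(attr)
            if new_value != old_value:
                alerts.append(f"{system}: {attr} changed {old_value!r} -> {new_value!r}")
    return alerts

baseline = {
    "pay-api": {"role": "back-end", "criticality": "high"},
    "legacy-rpt": {"role": "reporting", "criticality": "high"},
}
current = {
    "pay-api": {"role": "back-end", "criticality": "high"},
    # Usage shrank; the reporting box is no longer business-critical.
    "legacy-rpt": {"role": "reporting", "criticality": "low"},
}

for alert in detect_drift(baseline, current):
    print(alert)
```

Each alert is a place where the current policy was written against an identity that no longer holds, exactly the kind of crack the operator should be shown before it becomes a risk.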
Where AI and ML fit into Zero Trust Segmentation
So, the sweet spot for AI/ML in Zero Trust Segmentation systems is to:
- Provide suggestions for multi-dimensional identity for systems
- Provide a higher level of grouping, membership, and group identity
- Continuously track changes to the identity over time to inform the completeness and correctness of a Zero Trust Segmentation policy
Innovations in AI and ML can serve as powerful tools to those valuable security humans who we entrust every day with the task of securing our critical data and defending against attack. Let these AI- and ML-powered segmentation systems become a force multiplier in that never-ending battle.
Want to learn more about Illumio Zero Trust Segmentation? Contact us today.