Imagine proposing a host-based agent solution to a team of seasoned IT administrators. You will get some obvious questions right away:
- How much CPU and memory does the agent consume?
- Does it use a custom kernel module?
- What if that crashes the kernel?
- Is it in the data path processing packets?
- How much disk space does it use?
- How intrusive is it to my system?
- Does it affect my standard system configuration?
- How does it impact the throughput and connection rate?
These questions arise because agents running inside an operating system are notorious for causing problems in these areas, plus a whole lot more. Organizations try to keep their OS images as pristine as possible. Having third-party software inside core images increases risk and creates troubleshooting overhead.
Agents that process data packets in-line frequently degrade an application’s throughput and connection rate as they scan files or network traffic for malware. These scanning processes are dreaded for consuming CPU cycles and memory, so it’s understandable that many large enterprises have strict policies against installing third-party agents on server images.
We built the Illumio Virtual Enforcement Node (VEN)—a software agent that runs inside the application workloads to help enforce security—with all of these concerns in mind. The result is a lean agent that delivers security by using proven, pre-existing mechanisms inside an operating system while minimizing its footprint.
Rethinking security for today’s dynamic data center and cloud
When the Illumio team sat down to develop a new security approach for today’s dynamic data center and cloud environments, we had three goals:
- Build an architecture that delivers security enforcement anywhere. Anywhere means any hypervisor, bare-metal server, or even a Linux container. Anywhere means a private data center; a public cloud like AWS, Azure, or Rackspace; or a mix of all of them. Anywhere means starting in a private data center and moving partially or completely into a public cloud—without having to rewrite security policies. The only way to deliver policy enforcement that is completely agnostic to the networking and virtualization infrastructure is to deliver it inside the compute unit for applications—the workload.
- Feed the security system with a deeper understanding of an application’s context so it can make intelligent policy decisions and adapt to changes in real time. For example, if the web tier of an application auto-scales to serve increased demand, the security system should be able to extend the security policy to the scaled-up web tier without any human intervention. The security system should also give the IT administrator enough context to make informed policy decisions. This level of understanding is only possible if the security system can see inside the application, at the workload level.
- Build a security system that moves at the same speed applications move, in the continuous-delivery DevOps model. As applications move swiftly through the development life cycle, the security system should keep up without introducing delay, including right before the application moves into production. The most natural way to align the two speeds is to have the security live within the application, inside the workload.
We looked at the problem from many angles, and the more we looked, the more it made sense to move security inside the application workloads.
With that decision, we created the Illumio VEN, a software agent that runs inside the application workloads.
How the VEN stays lean
Our approach to the VEN started with a desire to understand the common objections related to running an agent inside the application workloads. We started with the philosophy that the VEN is more like an antenna than an agent. Antennas only transmit and receive signals; the device that transmits the signal performs the most intelligent functions and the device that receives the signal knows exactly what to do with the received signal.
So, we created the Policy Compute Engine (PCE), a centralized controller that computes and then transmits a unique security policy to each of the VENs. The VEN receives this policy, programs pre-existing OS-native mechanisms in the workload with the received policy, and then lets those mechanisms enforce that policy. This approach allows the VEN to stay lean as it only transmits the context to the PCE and delivers the received policy to the workload.
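The split between controller and agent can be illustrated with a minimal Python sketch. This is not Illumio's implementation: the `Workload` and `Rule` types, the role labels, and the choice of iptables as the OS-native mechanism are all assumptions made for illustration. The controller-side function resolves role-based rules into a per-workload allow list, and the agent-side function only translates that list into firewall arguments, which it builds but does not execute.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    role: str  # hypothetical label such as "web" or "db"
    ip: str


@dataclass(frozen=True)
class Rule:
    consumer_role: str  # role that is allowed to connect
    provider_role: str  # role that accepts the connection
    port: int
    proto: str = "tcp"


def compute_policy(target, workloads, rules):
    """Controller side: resolve role-based rules into a sorted list of
    (source_ip, proto, port) tuples for a single target workload."""
    allowed = []
    for rule in rules:
        if rule.provider_role != target.role:
            continue
        for peer in workloads:
            if peer.role == rule.consumer_role and peer.name != target.name:
                allowed.append((peer.ip, rule.proto, rule.port))
    return sorted(allowed)


def render_iptables(policy):
    """Agent side: turn the received policy into iptables argument lists.
    Nothing is executed here. Simplified: a real ruleset would also
    handle loopback and already-established connections."""
    cmds = [
        ["iptables", "-A", "INPUT", "-p", proto, "-s", ip,
         "--dport", str(port), "-j", "ACCEPT"]
        for ip, proto, port in policy
    ]
    cmds.append(["iptables", "-A", "INPUT", "-j", "DROP"])  # default deny
    return cmds


if __name__ == "__main__":
    db = Workload("db1", "db", "10.0.0.9")
    fleet = [Workload("web1", "web", "10.0.0.1"), db]
    rules = [Rule("web", "db", 5432)]
    for cmd in render_iptables(compute_policy(db, fleet, rules)):
        print(" ".join(cmd))
```

In this sketch, adding another `web` workload to the fleet and recomputing produces one more allow entry for the database workload without anyone editing the rule, which mirrors the auto-scaling goal described earlier. All of the decision logic sits on the controller side, so the agent stays small.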
Customer feedback from the very beginning
We were fortunate to have our early customers advise us on what was acceptable in their environments and what was not. Based on their ongoing advice, we applied the following architectural principles to ensure that the Illumio VEN stayed as lean as possible while delivering visibility and enforcement anywhere, with context and speed:
- Use pre-existing mechanisms in the OS and, more importantly, in the kernel of the OS. The VEN does not enforce policy by processing data packets in-line. Instead, it enforces policy that it receives from the PCE by programming the mechanisms that have existed in operating systems for decades and are trusted by the industry at large.
- No custom kernel modules. Instead of using custom kernel modules, the VEN runs as a number of user-level processes to ensure it poses no threats to the kernel.
- Behave like a guest inside the workload. The VEN operates inside the workload with a philosophy that it is a guest inside the system and the resources on the workload belong to the application. In the spirit of being a helpful guest, the VEN avoids modifying any system configuration files and takes special care to minimize any disruption to the application during installation and while enforcing the security policy.
- Minimize footprint. All components of the VEN are designed to minimize its footprint on CPU, RAM, and disk. The software binaries are kept small, periodic polls are kept to a minimum, disk usage is minimized by putting careful caps on the stored data, and the algorithms used for collecting and uploading visibility information are highly optimized to minimize the CPU and memory usage by the VEN.
- Visibility modes. The VEN has various visibility modes so administrators can fine-tune the balance between their visibility requirements and the VEN’s resource usage at various stages of their application and policy life cycle.
The decisions above—and a lot of automated testing—are how we ensure that the VEN remains mighty yet lean.
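The idea of putting careful caps on stored data can be sketched generically in Python. The source does not describe how the VEN actually bounds its storage, so the `CappedFlowLog` class and its record format below are hypothetical. The sketch shows one common technique: a fixed-size buffer that drops the oldest records once a limit is reached, so memory or disk use cannot grow without bound between uploads.

```python
from collections import deque


class CappedFlowLog:
    """Holds at most max_records entries; the oldest are dropped first."""

    def __init__(self, max_records):
        self._records = deque(maxlen=max_records)
        self.dropped = 0  # count of records evicted because of the cap

    def add(self, record):
        # A full deque with maxlen silently evicts its oldest item on
        # append, so count the eviction before it happens.
        if len(self._records) == self._records.maxlen:
            self.dropped += 1
        self._records.append(record)

    def drain(self):
        """Return buffered records for upload and empty the buffer."""
        batch = list(self._records)
        self._records.clear()
        return batch


if __name__ == "__main__":
    log = CappedFlowLog(max_records=3)
    for i in range(5):
        log.add({"flow": i})
    print(log.drain(), "dropped:", log.dropped)
```

Counting evictions lets the agent report that visibility data was lost rather than failing silently, which is one way to balance visibility against a fixed resource budget.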