A Global 50 financial services business began running containers for 600 applications in a Kubernetes production environment. Relying on native Kubernetes Network Policies and the multitenant network plugin, the security team believed its network environment was secure and self-contained. In practice, however, the team had no visibility into network flows and was reluctant to change policies in production for fear of causing a containerized application outage.
In this multi-tenant environment, the DevOps team needed to open communications between some client namespaces and applications running on virtual machines and bare-metal servers (batch processing, Jenkins replicas, build systems, databases, etc.). This was a huge challenge for two main reasons:
• Limited egress policy capabilities by default in Kubernetes
• Intermediate firewalls were managed by a different team (security)
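The first limitation is worth illustrating: a native Kubernetes NetworkPolicy can only describe egress to destinations outside the cluster as CIDR blocks, not as DNS names or service identities, so every external VM or bare-metal endpoint must be pinned to a known IP. A minimal sketch (the namespace name, address, and port are hypothetical, not taken from the case study):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-to-db
  namespace: team-a          # hypothetical tenant namespace
spec:
  podSelector: {}            # applies to all pods in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 10.20.30.40/32   # hypothetical database VM address
      ports:
        - protocol: TCP
          port: 5432
```

If the database VM is re-addressed, or the application is redeployed, the hard-coded CIDR silently goes stale, which is exactly the maintenance problem described below.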
The container cluster was separated from the rest of the data center by perimeter firewalls, which controlled east-west traffic and prevented lateral movement between environments in case of a breach. The security team did not want to indiscriminately open a large set of IPs and ports on those firewalls.
Instead, they mandated a statically assigned IP per application/namespace, which made it hard to work with agility in dynamic Kubernetes environments.
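One way such a mandate can be met, sketched here assuming a Calico-based network plugin (the case study does not name the plugin, so this is an illustration, not the team's actual configuration), is to pin each tenant namespace to a dedicated, pre-allocated IP pool that the firewall team can reference:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-a               # hypothetical tenant namespace
  annotations:
    # Calico annotation: all pods in this namespace draw addresses
    # from the named pool, so the firewall can match a fixed CIDR.
    cni.projectcalico.org/ipv4pools: '["team-a-pool"]'
```

Even with per-namespace pools, the firewall rules themselves still live outside Kubernetes and must be updated through a separate change process, which is where the friction described next comes from.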
In the container world, applications come and go; everything is dynamic, based on IP pools and names. When a containerized application was decommissioned or simply re-deployed, no one tore down the old firewall policies, leaving a potential hole in the perimeter security. Static IPs worked around the issue temporarily, but the kludge made firewall configuration cumbersome and error-prone, and it ultimately wouldn't scale with a fast-growing container deployment.
Using the firewall workaround with static IP addresses made it complicated to securely control east-west traffic between containerized and non-containerized services because of the change management process and the risk of an outage in the production environment. Even though the business required these communications to be opened quickly, it took weeks to deploy fully functional new applications.
After several months of trials, application developers rejected this approach. Instead of speeding up delivery, it still took two weeks to get applications out the door, so developers circumvented security in order to ship, potentially introducing issues in production.
Where containers should have been an easy and fast way to deliver new apps, deploying on the cluster became an operational nightmare – and a significant risk to the business – due to an overburdened network security architecture.