Firewalls have been a foundational component of enterprise cybersecurity strategy for a very long time. Because of this critical role, they have gone through massive feature additions and enhancements over the years. I want to focus on one particular feature that dates back to 1994, when Check Point Software Technologies introduced it: stateful inspection.
I am revisiting stateful inspection because of the proliferation of firewalls whose traffic inspection falls somewhere between stateless and stateful protocol inspection. I will explain these terms shortly, but first I want to draw your attention to the array of solutions that have popped up and blurred the line between stateful and stateless inspection. This blurring needs to be understood when selecting a firewall for your environment.
In this blog post, I’ll first explain the technical details of stateful and stateless firewalls. The second step in our journey is to understand how today’s firewalls can be categorized based on this feature.
Stateful vs. stateless inspection
What is a stateful inspection?
A stateful inspection, aka dynamic packet filtering, is the capability of a firewall to filter packets based on the STATE and CONTEXT of network connections. Let’s dive a little deeper to understand what “state” and “context” mean for a network connection.
Let's use TCP-based communication between two endpoints as a way to understand the state of a connection. In TCP, four of the nine assignable control bits (SYN, ACK, RST, FIN) are used to control the state of the connection. Below is a simplified explanation of state tracking in firewalls. In real life, a TCP endpoint can transition between 11 states depending on whether it is acting as client or server, firewalls can apply policy based on that connection state, and a firewall also has to allow for leftover, retransmitted, or delayed packets that arrive after connection termination.
- When a client application initiates a connection using the three-way handshake, the TCP stack sets the SYN flag to indicate the start of the connection. The firewall uses this flag to mark a NEW connection.
- The server replies by sending a SYN + ACK, at which point the firewall has seen packets from both sides and promotes its internal connection state to ESTABLISHED, although from the TCP perspective the connection is not fully established until the client replies with an ACK.
- Similarly, when a firewall sees an RST or FIN+ACK packet, it marks the connection state for deletion, and any future packets for this connection will be dropped.
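The three steps above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not how any particular firewall implements tracking: the names `ConnState`, `CLOSING`, and `track` are hypothetical, the connection key is assumed to be the same for both directions, and the packet that triggers teardown is assumed to be let through while later packets are dropped.

```python
from enum import Enum


class ConnState(Enum):
    NEW = "NEW"
    ESTABLISHED = "ESTABLISHED"
    CLOSING = "CLOSING"  # marked for deletion (hypothetical name)


def track(table, key, flags):
    """Update the tracking table for one packet and return 'pass' or 'drop'.

    table: dict mapping a direction-independent connection key to ConnState
    flags: set of TCP flag names carried by the packet, e.g. {"SYN", "ACK"}
    """
    state = table.get(key)
    if state is None:
        # No entry yet: only a bare SYN may open a NEW connection.
        if flags == {"SYN"}:
            table[key] = ConnState.NEW
            return "pass"
        return "drop"
    if state is ConnState.CLOSING:
        # Connection was marked for deletion: drop anything that follows.
        return "drop"
    if "RST" in flags or {"FIN", "ACK"} <= flags:
        # Assumption: the teardown packet itself is allowed through.
        table[key] = ConnState.CLOSING
        return "pass"
    if state is ConnState.NEW and flags == {"SYN", "ACK"}:
        # Packets seen from both sides: promote to ESTABLISHED.
        table[key] = ConnState.ESTABLISHED
    return "pass"
```

A real tracker would follow the full TCP state machine and tolerate retransmitted or delayed packets after teardown, which this sketch deliberately ignores.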
Not all networking protocols have a state like TCP. An excellent example is UDP, which is very commonly used and is stateless in nature. Applications using this protocol either maintain state in application logic or work without it. A few popular applications using UDP are DNS, TFTP, SNMP, RIP, and DHCP. Today's stateful firewalls create a “pseudo state” for these protocols. For example, when a firewall sees an outgoing packet such as a DNS request, it creates an entry using the IP address and port of the source and destination. It then uses this connection data, along with connection timeout data, to allow the incoming packet such as the DNS reply.
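The pseudo state described above can be sketched as a small table keyed by addresses and ports. This is a hedged illustration only: `PseudoStateTable`, its method names, and the 30-second default timeout are assumptions made for the example, not values taken from any product.

```python
class PseudoStateTable:
    """Tracks outbound UDP requests so matching replies can be allowed."""

    def __init__(self, timeout=30.0):
        self.timeout = timeout  # assumed idle timeout in seconds
        self.entries = {}       # (src_ip, src_port, dst_ip, dst_port) -> last seen

    def note_outbound(self, src_ip, src_port, dst_ip, dst_port, now):
        # Record (or refresh) the entry created by an outgoing packet.
        self.entries[(src_ip, src_port, dst_ip, dst_port)] = now

    def allow_inbound(self, src_ip, src_port, dst_ip, dst_port, now):
        # A reply reverses the source and destination of the request.
        key = (dst_ip, dst_port, src_ip, src_port)
        last_seen = self.entries.get(key)
        if last_seen is None:
            return False  # nobody asked for this packet
        if now - last_seen > self.timeout:
            del self.entries[key]  # entry expired
            return False
        self.entries[key] = now
        return True
```

With a DNS request noted from `10.0.0.5:53000` to `8.8.8.8:53`, a reply from `8.8.8.8:53` is allowed, while a packet from any other address or one arriving after the timeout is not.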
The context of a connection includes the metadata associated with packets such as:
- IP address and port of source and destination endpoints
- Last packet received time for handling idle connections
- Packet length
- Layer 4 TCP sequence numbers and flags
- Layer 3 data related to fragmentation and reassembly to identify session for the fragmented packet, etc.
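To make the list above concrete, here is one way such per-connection context could be held in memory. It is a sketch under assumptions: the `FlowContext` record, its field names, and the `is_idle` helper are hypothetical and only mirror the items listed above.

```python
from dataclasses import dataclass, field


@dataclass
class FlowContext:
    # IP address and port of source and destination endpoints
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    # Last packet received time, used for handling idle connections
    last_seen: float = 0.0
    # Length of the most recent packet
    last_packet_length: int = 0
    # Layer 4 TCP sequence number and flags of the most recent packet
    tcp_seq: int = 0
    tcp_flags: frozenset = frozenset()
    # Layer 3 fragment bookkeeping: IP identification -> fragment offsets seen
    pending_fragments: dict = field(default_factory=dict)

    def update(self, now, length, seq, flags):
        """Refresh the context from a newly observed packet."""
        self.last_seen = now
        self.last_packet_length = length
        self.tcp_seq = seq
        self.tcp_flags = frozenset(flags)

    def is_idle(self, now, idle_timeout):
        """True when no packet has been seen for longer than idle_timeout."""
        return now - self.last_seen > idle_timeout
```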
Categorizing firewalls based on stateful vs. stateless inspection
Now that you understand what kind of data a firewall might store, let's look at the various types of firewalls in the market.
Stateless firewalls – Inner workings, uses, and pitfalls
Let's refer to Figure 1 to help understand the inner workings of a stateless firewall. A stateless firewall applies the security policy to inbound or outbound traffic (1 in Fig. 1) by comparing the protocol headers of the packet, from OSI layer 2 to layer 4, against the policy table (2). The policy action (4.a & 4.b) to ALLOW, DENY, or RESET the packet can be arrived at solely by examining the packet in question and comparing it with the policy table.
Figure 1: Flow diagram showing policy decisions for a stateless firewall
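A stateless lookup of this kind can be sketched as a first-match walk over a static rule list. The rules, the `DENY` default, and the packet dictionary layout below are hypothetical and chosen only for illustration; the point is that the decision depends on nothing but the current packet and the table.

```python
from ipaddress import ip_address, ip_network

# Hypothetical policy table: first matching rule wins.
RULES = [
    {"src": "10.0.0.0/8", "dst": "0.0.0.0/0", "proto": "tcp", "dport": 443, "action": "ALLOW"},
    {"src": "0.0.0.0/0", "dst": "10.0.0.0/8", "proto": "tcp", "dport": 23, "action": "RESET"},
]
DEFAULT_ACTION = "DENY"  # assumed default when no rule matches


def stateless_decide(packet, rules=RULES):
    """Return the policy action for a single packet, with no memory of earlier packets."""
    for rule in rules:
        if (ip_address(packet["src"]) in ip_network(rule["src"])
                and ip_address(packet["dst"]) in ip_network(rule["dst"])
                and packet["proto"] == rule["proto"]
                and packet["dport"] == rule["dport"]):
            return rule["action"]
    return DEFAULT_ACTION
```

Note that with these rules an outbound HTTPS request from `10.1.2.3` is allowed, but the server's reply to the client's ephemeral port falls through to the default action, because nothing in the table remembers the request.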
Let’s talk about the pros and cons of a stateless firewall.
First, the pros:
- The policy lookup is performed on static packet data and a policy table, so the CPU and memory resources required for the lookup are low. This makes it well suited to static policy lookup devices like routers and switches, using features like Access Control Lists (ACLs).
- As a result, stateless firewalling is less resource intensive and is mostly implemented in hardware; the additional processing adds little to no latency or, in simple terms, allows line-rate processing of packets.
For the cons:
- A stateless firewall bases its decision only on low-fidelity data, the headers of the packet in hand, which provides limited filtering capability.
- Configuring and managing the ACLs on these devices is error-prone at small scale and almost impossible at a large scale.
Let me explain the challenges of configuring and managing ACLs at small and large scale. First, let's take the case of small-scale deployment.
- Stateless firewalls are unidirectional in nature because they make policy decisions by inspecting the content of the current packet, irrespective of the flow the packet may belong to. To write an accurate policy for a bidirectional communication protocol like TCP, both sides of the connection need to be whitelisted. Writing two rules to police one connection becomes problematic.
- Now, think of a protocol like FTP, where there are two sets of connections for each transaction and the data connection uses a port that is unknown in advance and negotiated at connection time. There is no way a stateless firewall can whitelist that. This inability to write policies for all applications creates a big hole in security.
- Another use case is an internal host originating a connection to the external internet. How do you create a policy using ACLs to allow all the reply traffic? This tool was not built for finer policy controls and is not of much use to a micro-segmentation framework, where policy is very fine-grained and directional in nature.
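The reply-traffic problem in the last bullet can be illustrated with a pair of hand-written static rules. This is a sketch under assumptions: the `10.0.0.0/8` internal range, the port numbers, and the `static_acl_allows` helper are hypothetical. Because the reverse rule cannot know which client port a given connection used, it has to admit any ephemeral port, and so it also admits packets nobody asked for.

```python
from ipaddress import ip_address, ip_network

INTERNAL = ip_network("10.0.0.0/8")  # hypothetical internal address range


def static_acl_allows(pkt):
    """Two unidirectional rules meant to cover outbound HTTPS and its replies."""
    src_internal = ip_address(pkt["src"]) in INTERNAL
    dst_internal = ip_address(pkt["dst"]) in INTERNAL
    # Rule 1: internal hosts may reach external servers on port 443.
    if src_internal and not dst_internal and pkt["dport"] == 443:
        return True
    # Rule 2: hand-written reverse rule for replies. It cannot name the
    # client's port in advance, so it allows every ephemeral port.
    if dst_internal and not src_internal and pkt["sport"] == 443 and pkt["dport"] >= 1024:
        return True
    return False
```

Here a legitimate reply from an external server is allowed, but so is an unsolicited packet from `203.0.113.7` with source port 443 aimed at an internal database on port 3306, since the rule only sees static header fields.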
Let's move on to the large-scale problem now.
- For a moment, let's imagine that you have a magic wand to overcome all of the problems described in the previous paragraph. Even in that case, resources like TCAM (ternary content-addressable memory) on switch or router hardware are so limited that, at scale, you cannot write a policy for all your applications because the TCAM space is exhausted.
- Even for a non-hardware-based implementation, the number of ACLs creates many problems. For example, a product I used to work on would take thirty minutes or more to parse and build an efficient lookup table (Policy Table in Fig. 1 above) on an extensively configured ACL system. And this process would trigger every time a single rule was added or deleted. Consider having that much downtime and impact on your critical business for such mundane day-to-day tasks.
Reflexive Firewalls aka Reflexive ACLs
A reflexive ACL, aka IP-Session-Filtering ACL, is a mechanism to whitelist return traffic dynamically. Most of the policy decision workflow is similar to that of a stateless firewall, except for the mechanism that identifies a new flow and adds an automated, dynamic stateless ACL entry. Let's follow the life of a packet using the workflow diagram below.
Figure 2: Flow diagram showing policy decisions for a reflexive ACL
When a reflexive ACL detects a new outbound IP connection (6 in Fig. 2), it adds a dynamic ACL entry (7) by reversing the source and destination IP addresses and ports. The return traffic is then validated against this new dynamic ACL entry. Similarly, the reflexive firewall removes the dynamic ACL entry when it detects FIN packets from both sides, an RST packet, or an eventual timeout. This way, once the session finishes or gets terminated, any future spurious packets will get dropped.
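The add-and-remove lifecycle described above can be sketched as follows. This is a minimal illustration, not a vendor implementation: the `ReflexiveAcl` class, its method names, and the 60-second timeout are assumptions, and real devices may handle teardown timing differently.

```python
class ReflexiveAcl:
    """Dynamic return-traffic entries keyed by the reversed address/port tuple."""

    def __init__(self, timeout=60.0):
        self.timeout = timeout  # assumed idle timeout in seconds
        self.dynamic = {}       # reversed tuple -> {"last_seen": t, "fins": set()}

    def outbound(self, pkt, now):
        # Reverse source and destination to describe the expected return traffic.
        reverse = (pkt["dst"], pkt["dport"], pkt["src"], pkt["sport"])
        entry = self.dynamic.setdefault(reverse, {"last_seen": now, "fins": set()})
        entry["last_seen"] = now
        self._note_teardown(reverse, "out", pkt.get("flags", set()))

    def inbound_allowed(self, pkt, now):
        key = (pkt["src"], pkt["sport"], pkt["dst"], pkt["dport"])
        entry = self.dynamic.get(key)
        if entry is None:
            return False  # no dynamic entry: not return traffic
        if now - entry["last_seen"] > self.timeout:
            del self.dynamic[key]  # eventual timeout
            return False
        entry["last_seen"] = now
        self._note_teardown(key, "in", pkt.get("flags", set()))
        return True

    def _note_teardown(self, key, side, flags):
        entry = self.dynamic[key]
        if "FIN" in flags:
            entry["fins"].add(side)
        # Remove on RST, or once FINs have been seen from both sides.
        if "RST" in flags or entry["fins"] == {"out", "in"}:
            del self.dynamic[key]
```

Note that every decision still uses only header fields of the current packet plus the dynamic entry; there is no sequence-number or reassembly context, which is what separates this from a stateful firewall.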
So what are the benefits of a reflexive firewall?
The one and only benefit of a reflexive firewall over a stateless firewall is its ability to automatically whitelist return traffic. This helps avoid writing the reverse ACL rule manually.
Cons of a reflexive firewall?
Reflexive ACLs still act entirely on static information within the packet. I bring this up because, although they are a step up from standard ACLs in terms of writing rules for reverse traffic, a reflexive ACL is straightforward to circumvent; a reflexive firewall suffers from the same deficiencies as a stateless firewall. One way to test that would be to fragment the packet so that the information the reflexive ACL acts on gets split across multiple packets. The reflexive ACL then cannot decide whether to allow or drop an individual packet. A stateful firewall, on the other hand, is capable of reassembling the fragments split across multiple packets and then basing its decision on STATE + CONTEXT + packet data for the whole session.
The other drawback of reflexive ACLs is that they work with only certain kinds of applications. For example, FTP, a very common application used to transfer files over the network, works by dynamically negotiating the data ports to be used for the transfer over a separate control plane connection. Since reflexive ACLs are static, they can whitelist only bidirectional connections between two hosts using the same five-tuple. Therefore, they cannot support applications like FTP.
Stateful firewalls – Inner workings, uses, and pitfalls
A stateful firewall acts on the STATE and CONTEXT of a connection for applying the firewall policy. To understand the inner workings of a stateful firewall, let’s refer to the flow diagram below.
Figure 3: Flow diagram showing policy decisions for a stateful firewall
- When a packet arrives at the firewall (1 in Fig. 3), the firewall does a flow lookup using the five-tuple (source IP, source port, destination IP, destination port, protocol) in a table called the flow or connection table to find a match (2). Careful readers will notice the difference from a stateless firewall, where the five-tuple lookup is performed on the policy table rather than the flow table. The flow table and associated tables, not shown in the figure, hold the STATE + CONTEXT of all previously seen flows.
- If an entry is found, the packet goes through the fast path, aka data plane processing. Simple fast-path processing involves rate checks, layer 3 IP sanitation checks to avoid fragmentation and reassembly based attacks, and layer 4 sanitation checks to prevent attacks like spoofing and DoS. If the firewall can do layer 7 tests, the packet also goes through additional filters called Application Layer Gateways (ALGs). If all the checks go smoothly, the packet is forwarded to its next hop (3.b).
- If the flow lookup results in a miss (3.a), the packet is assumed to belong to a new connection and needs to go through additional policy checks. This path is called the slow path, aka control plane processing. In the control plane path, the firewall not only checks everything it does in the fast path, but also decides whether this new connection is allowed by firewall policy.
- So the firewall will do policy lookup by using the STATE + CONTEXT of the connection (5).
- If there is a policy match and an action such as ALLOW, DENY, or RESET is specified for that policy, the appropriate action is taken at this point (8.a or 8.b). Advanced stateful firewalls also let you configure what kind of content inspection needs to be performed. This is taken into consideration, and the firewall creates an entry in the flow table (9) so that subsequent packets for that connection can be processed faster, avoiding control plane processing.
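The fast-path and slow-path split described in the steps above can be sketched like this. It is a simplified illustration under assumptions: `policy_allows` stands in for a real policy lookup with a single hypothetical rule, the sanitation checks and ALGs are omitted, and the returned strings exist only to show which path a packet took.

```python
from ipaddress import ip_address, ip_network

INTERNAL = ip_network("10.0.0.0/8")  # hypothetical internal address range


def policy_allows(pkt):
    """Hypothetical policy: internal hosts may open TCP connections to port 443."""
    return (pkt["proto"] == "tcp"
            and pkt["dport"] == 443
            and ip_address(pkt["src"]) in INTERNAL)


def process(pkt, flow_table):
    """Look up the flow table first; fall back to the policy only on a miss."""
    key = (pkt["src"], pkt["sport"], pkt["dst"], pkt["dport"], pkt["proto"])
    reverse = (pkt["dst"], pkt["dport"], pkt["src"], pkt["sport"], pkt["proto"])
    flow = flow_table.get(key)
    if flow is None:
        flow = flow_table.get(reverse)  # return traffic of a known flow
    if flow is not None:
        # Fast path: the flow is known (sanity checks omitted in this sketch).
        flow["packets"] += 1
        return "fast-path: forward"
    # Slow path: new connection, so consult the policy.
    if not policy_allows(pkt):
        return "slow-path: deny"
    # Create the flow entry so later packets skip the policy lookup.
    flow_table[key] = {"packets": 1}
    return "slow-path: allow"
```

Only the first packet of an allowed connection pays for the policy lookup; the reply and all later packets match the single flow entry, which is also why one rule is enough for a bidirectional connection.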
Having seen how a stateful firewall works, does it solve all the problems associated with the stateless firewall?
First, let’s look at the cons of a stateful firewall:
- Stateful firewalls do additional checks to provide more security, and those checks need more processing power in terms of CPU cycles and memory. Stateful firewall architects and developers have thought about this problem, and most of the latest firewalls overcome or reduce it with state-of-the-art algorithmic design that separates control and data plane processing, thus achieving performance close to that of a stateless firewall. But I will give the stateless firewall a win here compared to the stateful firewall.
- A stateful firewall can have a larger attack surface than a stateless firewall due to its larger code base. A simple Google search provides numerous examples. Over time, though, certain types of firewalls, like those built into operating systems such as Linux and Windows, have been battle-tested and today serve as enterprise-grade stateful firewalls.
Now let’s talk about the pros of stateful firewalls:
- A stateful firewall provides full protocol inspection considering the STATE + CONTEXT of the flow, thereby eliminating additional attack surface. A subsequent blog post will have a demo to show this. Stay tuned.
- A stateful firewall acts as a building block for more advanced application layer firewalls or gateways.
- A stateful firewall understands the network flow and can identify data packets of a flow, thereby enabling simple rule writing for bidirectional connections or pseudo state networking protocols.
- Since a stateful firewall can look deeper into packet payloads, it can understand complex protocols that negotiate communication port and protocol at runtime and apply firewall policies accordingly. Think of protocols like FTP, P2P protocols, etc.
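As an illustration of the last point, here is how a firewall helper might learn the negotiated data port from an FTP control connection. In active-mode FTP the client sends `PORT h1,h2,h3,h4,p1,p2`, where the data port is `p1 * 256 + p2`. The function name `expected_data_endpoint` is hypothetical, and the sketch handles only the `PORT` command, not passive mode.

```python
import re

# Matches an active-mode FTP PORT command as seen on the control connection.
PORT_RE = re.compile(r"^PORT (\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\s*$")


def expected_data_endpoint(control_line):
    """Return (ip, port) announced by a PORT command, or None if not a PORT command."""
    match = PORT_RE.match(control_line)
    if match is None:
        return None
    h1, h2, h3, h4, p1, p2 = (int(group) for group in match.groups())
    if any(value > 255 for value in (h1, h2, h3, h4, p1, p2)):
        return None  # each field must fit in one byte
    # The port is sent as two bytes: high byte first, then low byte.
    return (f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2)
```

A stateful firewall that reads this from the payload can add a temporary flow entry for the announced endpoint, so the data connection is allowed without a broad static rule.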
I started the post with the idea of introducing the concept of protocol statefulness and the things that make a firewall stateful. Later, I introduced various firewalls based on how they behave when policing a stateful protocol, along with the security and performance implications of those firewalls. There is no one firewall that solves all the problems, and therefore each of those firewalls has a place in a defense-in-depth strategy: a stateless firewall can help in places where coarse-grained policing is adequate, and a stateful firewall where finer and deeper policy controls are required. I’ve also simplified the technical terms and functions to convey the broader design principles of these categories of firewall. You may find myriad possibilities and combinations of technical implementations in the real world.
Now that you’re equipped with the technical understanding of statefulness, my next blog post will discuss why stateful firewalling is important for security segmentation/micro-segmentation and why you should make sure your segmentation vendor does it.
For any further questions, please reach out to me. Please also feel free to look at Illumio Labs where we publish more technical content, demos, and open source code to help improve data center and cloud security.