What makes the difference between a system that sustains 10+ Gbps and one that stalls at 1 Gbps? The answer lies in the details of the network stack – from kernel vs. user space separation to memory copying, interrupts, and zero-copy techniques. Understanding these mechanisms is critical when performance, scalability, and security must be balanced at terabit scale.
Network processing is one of the most performance-critical aspects of every operating system, whether it acts as a client, server, router, or firewall.
- Do you know how your firewall processes network packets?
In the early days, a typical network speed was 14.4 kbps (kilobits per second) in 1993, then 512 kbps (2000, with the introduction of ADSL), 2 Mbps (megabits per second) in 2005, escalating to 22 Mbps (2008), and reaching the 500 Mbps milestone (2015, with the introduction of fibre-optic broadband).
We are now at the stage where big companies are asking for a processing speed of terabits per second. The network processing stack is under constant development to increase packet processing power while still being secure.
Why is it then important to know about the network processing stack?
A misconfigured or poorly implemented software or security solution introduces packet latency and impacts business continuity. Instead of achieving 10+ Gbit/s packet processing, the solution struggles to handle gigabit speeds.
Let’s define two essential terms: Kernel Space and User Space.
Foundation of system security: Separation between Kernel and User Space
The separation between kernel space and user space is a fundamental architectural decision in operating systems. It directly impacts network packet processing performance and data protection.
Kernel Space operates with unrestricted hardware access and maintains system control, including:
Physical memory management and virtual address translation
Hardware interrupt handling and device driver operations
Network stack processing, including protocol implementation
Security policy enforcement through firewall and filtering mechanisms
User Space is a protected execution environment where:
Applications execute inside restricted memory regions
Memory access is mediated through system calls to the kernel
Network access occurs through socket abstractions
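In practice, every user-space network operation is a request to the kernel. A minimal Python sketch (UDP over loopback, with made-up payloads) that makes each kernel crossing explicit:

```python
import socket

# User-space code never touches the NIC or kernel buffers directly; every
# call below is a system call asking the kernel to act on our behalf.
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 0))                 # kernel assigns a free port
port = rx.getsockname()[1]

tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
tx.sendto(b"hello", ("127.0.0.1", port))  # kernel builds and sends the datagram

data, addr = rx.recvfrom(1024)            # kernel copies the payload back to us
tx.close()
rx.close()
```

The application only ever sees the socket abstraction; the kernel owns the hardware on both sides of each call.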
Analogy – House separation:
Your house is separated into different areas: the entrance, living room, kitchen, and bedrooms or office rooms. The entrance represents the kernel space, and the other rooms represent different applications in user space.
Operating systems separate the memory into two sections (over-simplified): kernel space and user space. Applications in user space cannot access the kernel memory directly.
But with the use of special drivers, applications can access the packet buffer directly and gain some control over the physical network interface. This is not the standard approach on typical systems.
For an application to process packets from the wire, packets inside kernel memory must be copied over to user-space memory. This copy costs the CPU time and memory bandwidth, which limits packet throughput.
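The difference between copying and referencing a buffer can be shown in a few lines of Python: `bytes()` duplicates the data, the way each kernel-to-user copy does per packet, while `memoryview()` only points at it. The fake packet below is purely illustrative.

```python
# Copying vs. referencing a buffer.
pkt = bytearray(b"\x45" + b"\x00" * 1499)   # fake 1500-byte IPv4 packet

copy = bytes(pkt)        # full 1500-byte copy: separate memory
view = memoryview(pkt)   # zero-copy reference: same memory

pkt[0] = 0x46            # mutate the original buffer
# copy[0] is still 0x45 (independent memory); view[0] sees 0x46
```

Doing the `bytes(pkt)` step for millions of packets per second is exactly the cost that zero-copy designs remove.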
Analogy – Moving boxes
A big package sits on your porch (kernel); carrying it through the hallways to the right room (user space) is laborious. When multiple packages arrive, you quickly tire of carrying them around the house.
There is a method called zero-copy access. Zero-copy mechanisms provide user-space applications with direct access to packet buffers, eliminating expensive memory copy operations.
Using a special driver, a predefined memory segment can be allocated as a network buffer and made directly accessible to the user-space application. Packets arriving on the interface are placed into this allocated ring buffer.
This method saves the CPU millions of cycles and can significantly increase throughput, depending on the use case.
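The shared ring-buffer idea can be sketched with an anonymous `mmap` in Python. The slot layout and helper names here are illustrative, not a real driver API:

```python
import mmap

# A miniature "packet ring": one shared buffer, read in place.
SLOT_SIZE = 64
NUM_SLOTS = 4

ring = mmap.mmap(-1, SLOT_SIZE * NUM_SLOTS)  # anonymous shared mapping

def write_packet(slot: int, payload: bytes) -> None:
    """Place a packet into a slot, as a NIC's DMA engine would."""
    ring.seek(slot * SLOT_SIZE)
    ring.write(payload.ljust(SLOT_SIZE, b"\x00"))

def read_packet(slot: int) -> memoryview:
    """Return a view of the slot's bytes -- no copy is made."""
    return memoryview(ring)[slot * SLOT_SIZE:(slot + 1) * SLOT_SIZE]

write_packet(0, b"ping")
pkt = read_packet(0)   # the reader sees the producer's memory directly
```

The key property is that `read_packet` never duplicates the payload; consumer and producer share the same pages, which is what real zero-copy frameworks arrange between the NIC and user space.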
Analogy – Leaving the boxes at the entrance
Instead of moving the boxes to each specific room, the person the package is addressed to comes to the entrance and handles it there.
When packets arrive on the interface, the NIC controller raises an interrupt to the CPU, and the kernel processes the packet through the network stack.
Analogy - Interrupt Model
Interrupt Generation (IRQ): The doorbell rings at your door
Context Switch: You’re cooking dinner when the doorbell rings, so you have to stop what you are doing and head to the door.
Interrupt Service Routine (ISR): You peek through the peephole (or use the doorbell camera) to see who it is – delivery man, neighbor, or stranger – and quickly decide how to handle it. Let’s say it is a delivery man.
SoftIRQ Processing: You unlock the door, sign for the package, and carry it inside. You look at the name on the package to see who in your house it is for.
Application Wakeup: You then call out “Honey, your Amazon package is here!”.
In a `top` snapshot of such a system, all 8 CPU cores are busy with the software interrupt queue (si), and the process responsible is ksoftirqd. This is caused by a backlog of packets inside the packet buffers, and it starves the applications running in user space.
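On Linux, the per-core softirq counters live in `/proc/softirqs`; a rapidly growing `NET_RX` row is the receive-side pressure described above. A small parser for that format (run against a captured sample here, rather than reading the live file):

```python
# Parse the NET_RX row of /proc/softirqs to get per-CPU receive softirq
# counts. The sample text below mimics the file's layout; the numbers
# are made up for illustration.
sample = """\
                    CPU0       CPU1
          HI:          3          0
      NET_RX:    1045871     998213
      NET_TX:       2021       1874
"""

def net_rx_counts(text: str) -> list[int]:
    """Return the NET_RX counter for each CPU column."""
    for line in text.splitlines():
        parts = line.split()
        if parts and parts[0] == "NET_RX:":
            return [int(x) for x in parts[1:]]
    return []

counts = net_rx_counts(sample)
```

Sampling this twice a second apart and diffing the counts gives the softirq rate per core, which makes a ksoftirqd-bound system easy to spot.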
This is easy to induce by flooding the recipient server with many small, single-flow packets.
Some examples:

```shell
hping3 -S -p 80 --rand-source --flood target_ip
nmap -sS --max-rate 10000 -p 80 target_ip
```
Analogy – Constant interruption
Imagine if 50 delivery people showed up at once, all ringing the doorbell:
You'd constantly be interrupted from your main tasks
You'd spend all your time just answering the door
You'd never get back to cooking dinner or watching TV
This is exactly what happens during a DoS attack - the CPU spends all its time in interrupt handlers!
This ingress "device processing" stage is one place where we can implement controls to handle high traffic volumes (e.g. DoS), preventing congestion. It involves two steps: (1) traffic classification and (2) traffic shaping.
What is traffic classification?
Traffic classification is the process of categorizing network packets into different classes or flows based on various properties such as:
Layer 3/4 headers: Source/destination IP addresses, protocol type (TCP/UDP/ICMP), port numbers
Layer 2 information: VLAN tags, MAC addresses, interface identifiers
QoS markings: DSCP (Differentiated Services Code Point), Traffic Class, IP precedence
Flow characteristics: Packet size, inter-arrival times, connection state
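A toy classifier over these properties might look like the following, assuming packets are represented as plain dicts; a real stack classifies parsed headers in kernel structures, and the field names and classes here are illustrative:

```python
# Classify a packet into a traffic class from its L3/L4 headers and
# QoS markings (all fields and class names are illustrative).
def classify(pkt: dict) -> str:
    if pkt.get("proto") == "ICMP":
        return "control"
    if pkt.get("dscp", 0) == 46:        # DSCP EF (Expedited Forwarding)
        return "realtime"
    if pkt.get("dport") in (22, 443):   # interactive / TLS traffic
        return "priority"
    return "best-effort"
```

Once every packet carries a class label, the shaping stage described next can treat each class differently.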
What is traffic shaping?
Traffic shaping controls the flow of network traffic by:
Rate limiting: Enforcing maximum bandwidth consumption per flow or class
Burst control: Smoothing traffic spikes using token bucket or leaky bucket algorithms
Queue management: Organizing packets in prioritized queues with different service levels
Congestion avoidance: Proactively dropping packets before queues overflow
Delay management: Adding controlled delays to enforce traffic contracts
The goal of this stage - “Device Processing” - is to optimize network utilization while ensuring critical traffic receives priority and guaranteed bandwidth.
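The token bucket mentioned above can be sketched in a few lines. This is the bare algorithm only, not a kernel qdisc implementation; time is passed in explicitly to keep it deterministic:

```python
class TokenBucket:
    """Minimal token bucket: permits bursts up to `capacity` and refills
    at `rate` tokens per second. One token = one packet in this sketch."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity   # start full: an initial burst is allowed
        self.last = 0.0

    def allow(self, now: float, size: float = 1.0) -> bool:
        # Refill based on elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= size:
            self.tokens -= size
            return True
        return False             # over the limit: shape (queue or drop)

tb = TokenBucket(rate=10, capacity=5)          # 10 pkt/s, burst of 5
burst = [tb.allow(now=0.0) for _ in range(6)]  # 6 packets at the same instant
```

The first five packets pass as a burst; the sixth is rejected until the bucket refills, which is exactly the smoothing behavior described under burst control.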
Analogy – Different types of visitors (packet types)
Emergency services (high-priority packets):
Always answer immediately
Regular mail (normal packets): Handle in turn, at a normal pace
Prank-callers (junk packets):
Quickly dismiss at the door
The policy engine is an in-kernel checkpoint where packets are matched against rules in a defined order to decide what to permit, drop, modify, or hand off to a userspace verdict path, while maintaining connection state and consistent flow handling.
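First-match rule evaluation, the core of such an engine, can be sketched as follows; the rule table, field names, and actions are illustrative, not any particular firewall's syntax:

```python
# First-match policy table: rules are checked in order, and the first
# rule whose fields all match decides the verdict.
RULES = [
    ({"proto": "TCP", "dport": 22}, "ACCEPT"),
    ({"proto": "TCP", "dport": 80}, "ACCEPT"),
    ({"proto": "ICMP"},             "DROP"),
]

def verdict(pkt: dict) -> str:
    """Return the action of the first matching rule, else the default."""
    for match, action in RULES:
        if all(pkt.get(k) == v for k, v in match.items()):
            return action
    return "DROP"   # default policy: drop anything unmatched
```

A real engine additionally consults connection state, so that reply packets of an accepted flow match without re-evaluating the full table.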
Analogy – House rules
You have rules in your house and standards for what you allow inside. You won’t bring a box with insects inside the house; you will reject it from entering. Or if the package does not contain the intended content, you may return it or throw it away.
You also keep track of who is ordering a lot of packages. When your partner orders a series of packages, you know, once again, that they are for her.
After filtering and policy checks, every network packet is subject to a routing decision. One of the elementary decisions is to choose between the following:
Local delivery: Packets destined for a local address. Packets are delivered to a listening application’s socket.
Forwarding: Packets destined for another system. Transit packets are sent out another interface, possibly after traversing further policy, shaping, and queuing.
These routing decisions are made by looking up the system’s routing tables (Forwarding Information Base/FIB). This table maps destination addresses to either local handling or a next-hop gateway and output interface.
Routing lookup and policy
Routing lookups perform longest-prefix-match queries against one or more routing tables. But destination-based lookup isn’t the only mechanism; there is also support for:
Policy-based routing: Allows decisions based on more than just the destination IP (e.g., source IP, input interface, or marks/tags)
Virtual routing/VRFs: Segregated routing domains on one system
Multipath routing: Load balancing over multiple next hops
Note: Policy engine enforcement points can exist both before and after the routing decision.
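Longest prefix match itself is simple to sketch with Python's `ipaddress` module; the FIB entries, next hops, and interface names below are made up for illustration:

```python
import ipaddress

# A toy FIB mapping prefixes to (next_hop, interface); "local" marks
# local delivery. Real FIBs use trie structures for speed.
FIB = {
    "192.168.1.0/24": ("local", "eth0"),
    "10.0.0.0/8":     ("10.0.0.1", "eth1"),
    "0.0.0.0/0":      ("203.0.113.1", "eth2"),   # default route
}

def lookup(dst: str):
    """Longest prefix match: of all matching routes, pick the most specific."""
    addr = ipaddress.ip_address(dst)
    best_net, best_target = None, None
    for prefix, target in FIB.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best_net is None
                            or net.prefixlen > best_net.prefixlen):
            best_net, best_target = net, target
    return best_target
```

A destination inside 192.168.1.0/24 resolves to local delivery, 10.x addresses go out eth1, and everything else falls through to the default route, mirroring the local-vs-forwarding decision above.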
Analogy – Known recipient
You know everyone living in your house, so when a package arrives you know who it is for. You also know your neighbors quite well, and it isn’t unusual for their packages to be delivered to your house first; when that happens, you kindly place the package at your neighbor’s doorstep.
Once a packet has traversed the entire kernel’s network stack, the final journey is from the kernel into the user application. This handover happens via sockets.
A socket is an abstraction provided by the kernel to simplify network communication. An application will create a socket that will either connect to another system or listen for incoming connections.
When the kernel signals the application that data is ready (after SoftIRQ processing), the application reads from the socket’s receive buffer and processes the data according to its program logic. The application will often send a reply through the same socket, passing it back to the kernel.
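The read-process-reply cycle can be shown with a connected socket pair; the uppercasing "processing" step here is just a stand-in for real application logic:

```python
import socket

# Two connected endpoints: one plays the client, one the application.
client, server = socket.socketpair()

client.send(b"ping")          # request lands in the server socket's buffer
request = server.recv(1024)   # application reads from its socket
server.send(request.upper())  # ...processes the data and replies
reply = client.recv(1024)     # reply travels back through the kernel

client.close()
server.close()
```

Each `recv` blocks until the kernel has data queued on that socket, which is the application-wakeup step of the interrupt flow described earlier.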
Analogy – Giving a package
When you have gotten a package inside the entrance, you call out the recipient’s name and start carrying the package. You know which room you need to go to. When you enter the room, you find space to put the package. The owner of the package will then open it and do with it as the owner pleases.
Core processing stages:
Physical layer – NIC receives packets and places them into ring buffers
Driver layer – OS-specific network driver processing with early hooks
Optional “Fast-Path Bypass” - High-performance zero-copy mechanisms
Reinjection – Mechanisms to return processed packets to kernel stack
Device processing - Traffic control, shaping, and queue management
Optional monitoring tap – Passive packet capture for analytics
Policy Engine – Security rules, firewall, NAT, and access control
Routing decision – Determine packet destination (local vs forwarding)
Socket delivery – Hand the packet to a listening or established socket
Application processing – Application receives packets from the socket
Analogy summary: Home delivery
The network stack has been represented as a well-organized house receiving packages (packets) at the front door (physical interface).
The entrance (Kernel Space): Every package must first stop at the entrance, handled by you. You have a full overview of who lives in the house.
Doorbell (Interrupts): When you get a package, you are always notified by the doorbell. But when many deliveries arrive simultaneously, you become constantly interrupted and distracted from important work.
Different visitors (Packet types): You know, by looking at the packages, which are important, and which are not. Based on that you may prioritize one of them before the other.
Moving boxes inside (Memory copy): You must lug each package from the entrance through narrow hallways (kernel-user space copy), which takes effort and time, especially when the house is busy.
Alternative: Handling at the door (Zero-Copy): The right person comes to the entrance to pick up their package directly, saving time and energy and avoiding congestion in the hallways.
House Rules (Security Policies): You have standards in the house: reject boxes with insects, report unexpected deliveries, accept packages addressed to known residents, and forward neighbors’ parcels as needed.
Passing it along (Socket): After accepting the package, it will end up with its intended recipient (the application) who opens it and gets the contents.