Guarding multi-core designs
Cisco estimates that IP traffic is expanding at a compound annual growth rate (CAGR) of 25 per cent, which amounts to roughly a doubling every three years. Security threats to data in transit are growing alongside this bandwidth expansion. Threats include address spoofing, passive monitoring (or 'eavesdropping'), data integrity attacks, and sophisticated man-in-the-middle attacks. These threats are driving the industry to encrypt an ever-increasing percentage of communications using security protocols such as MACsec, IPsec and SSL/TLS.
Adding to the pressure from various market forces are technology trends that put increasing demands on packet processing. Virtual private network (VPN) communications must be protected by encryption with a security protocol, and the increasing use of mobile offload to WiFi is also driving a rise in encrypted packet traffic. For example, LTE-to-WiFi offload using the Evolved Packet Data Gateway (ePDG) architecture (figure 1) relies on the IPsec security protocol to protect otherwise exposed communications.
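The relationship between the 25 per cent growth rate and the three-year doubling can be checked with a short calculation. The sketch below is illustrative only; the function names are hypothetical and the only input taken from the text is the 25 per cent CAGR.

```python
import math

def growth_factor(cagr: float, years: float) -> float:
    """Total growth multiple after compounding `cagr` for `years` years."""
    return (1 + cagr) ** years

def doubling_time_years(cagr: float) -> float:
    """Years needed for traffic to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + cagr)

if __name__ == "__main__":
    cagr = 0.25  # the 25 per cent figure quoted above
    print(f"Growth over 3 years: x{growth_factor(cagr, 3):.2f}")
    print(f"Doubling time: {doubling_time_years(cagr):.2f} years")
```

At 25 per cent a year, traffic grows by a factor of a little under two in three years, so the doubling time is slightly longer than three years.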
Figure 1: Mobile Data Offload with ePDG.
The result is that specifications for security protocol throughput have moved from 5Gbit/s to 10Gbit/s and now to 40+Gbit/s for IPsec and SSL and to over 100Gbit/s for MACsec in just a few years. Silicon designs that implement protocol processing are challenged to keep pace.
Three design approaches for security protocol processing
Networking silicon design teams have three possible approaches to address requirements for security protocol processing:
A software-only approach, executing on a networking processor's primary CPU
Cryptographic-specific processing in hardware IP
Full security protocol processing in hardware IP
With the software-only approach, all security protocol processing executes on the primary CPU. Software "stacks" for protocol processing can be integrated into system software without affecting the hardware design. However, these stacks are resource intensive, executing complex mathematical algorithms for decryption and encryption of data, as well as implementing extensive data movement routines for each packet payload (figure 2). A software-only approach runs into a bottleneck because the compute-intensive and data-movement operations quickly overload the CPU. Multi-core CPUs can push the bottleneck out, but even on a multi-core CPU throughput is typically limited to less than 2Gbit/s, well below today's networking requirements.
Figure 2: This flowchart of security protocol processing shows the operations in a software-only implementation. The software stack, running on the CPU, classifies a packet and then executes the cryptographic processing for the encryption and hash steps before routing the packet to its destination.
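The software-only flow (classify, encrypt, hash, then forward) can be sketched in a few lines to show how much per-packet work lands on the CPU. This is a minimal illustration under stated assumptions, not an IPsec, MACsec or TLS implementation: the packet layout (a one-byte header flag followed by the payload), the function names, and the eight-byte nonce are all hypothetical, and the cipher is a toy keystream built from SHA-256 standing in for a real algorithm such as AES.

```python
import hashlib
import hmac

def classify(packet: bytes) -> str:
    """Hypothetical classifier: a leading 0x01 byte marks traffic to protect."""
    return "protected" if packet[:1] == b"\x01" else "bypass"

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 over key, nonce and a counter). NOT secure;
    a stand-in for a real cipher so the example runs on the stdlib alone."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        out += block
        counter += 1
    return bytes(out[:length])

def encrypt(key: bytes, nonce: bytes, payload: bytes) -> bytes:
    """XOR the payload with the keystream; applying it twice decrypts."""
    ks = keystream(key, nonce, len(payload))
    return bytes(p ^ k for p, k in zip(payload, ks))

def process_packet(packet: bytes, enc_key: bytes, auth_key: bytes,
                   nonce: bytes) -> bytes:
    """Every step below runs on the CPU in a software-only design."""
    # Step 1: classification
    if classify(packet) == "bypass":
        return packet
    # Step 2: data movement (split header and payload into new buffers)
    header, payload = packet[:1], packet[1:]
    # Step 3: encryption (compute-intensive)
    ciphertext = encrypt(enc_key, nonce, payload)
    # Step 4: hash / integrity tag over everything that will be sent
    tag = hmac.new(auth_key, header + nonce + ciphertext, hashlib.sha256).digest()
    # Step 5: assemble the outgoing packet for forwarding (more data movement)
    return header + nonce + ciphertext + tag

if __name__ == "__main__":
    out = process_packet(b"\x01hello", b"k" * 16, b"a" * 16, b"\x00" * 8)
    print(len(out), "bytes out for a 6-byte packet in")
```

Even in this simplified form, each packet is touched several times: once to classify, once to copy, once per byte for the cipher, and again for the hash. That repeated per-byte work is what overloads a general-purpose CPU as line rates climb.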
A second option, found in many networking silicon designs, uses discrete cryptographic algorithms embedded in hardware IP (intellectual property), integrated as part of a dedicated adjunct processor. The IP offloads encryption and decryption from the CPU, performing math efficiently (figure 3). But the rest of the protocol processing workload remains in software and is still handled by the CPU.
Figure 3: The processing flow using crypto-specific IP. While math-intensive operations are offloaded, the CPU is still responsible for multiple steps and complex data movements.