Linux Kernel Protocol Stack: NAT Performance Optimization with FAST NAT

Posted by 于杨


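Apologies to all readers: this article is written in English, so please bear with any inconvenience.
The code was written at a commercial company for a commercial product and cannot be open-sourced; apologies again.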
 
This presentation will highlight our efforts on optimizing the
Linux TCP/IP stack for providing networking in an
OpenStack environment, as deployed at our industrial customers.
 
 
Our primary goal is to provide a high-quality, high-performance TCP/IP stack.
To achieve this, we first had to identify the performance bottlenecks in
the Linux TCP/IP stack when it is used for networking in OpenStack. We have done a lot of
Linux TCP/IP stack performance tuning, covering the NIC, CPU cache hit rate, spin locks,
memory allocation and other areas. However, our measurements showed that conntrack NAT
consumes too much CPU, for instance in the ipt_do_table function.
Linux conntrack is very good, but it is heavyweight, and many of its functions go unused.
We therefore implemented FAST NAT in the Linux TCP/IP stack.
 
 
We will present our efforts to reduce these performance costs.
First, FAST NAT takes a spin lock on the individual entry instead of on the global connection table, which greatly reduces CPU waiting time,
and user policies are stored in a hash table rather than a list. The connection table and user
policies are per-NUMA-node, which avoids the extra time and latency of CPUs reaching remote memory over QPI.
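
The per-entry locking and per-NUMA tables described above can be sketched roughly as follows. This is a userspace illustration only, not the authors' kernel code (which is not public): every name here (fn_tuple, fn_entry, fn_tables, fn_insert, fn_lookup), the table size, the node count and the hash function are assumptions made for the example. A C11 atomic stands in for a kernel spinlock_t, and the caller passes the NUMA node index explicitly.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define FN_SLOTS     1024u  /* hypothetical table size (power of two) */
#define FN_MAX_NODES 4      /* hypothetical number of NUMA nodes */

struct fn_tuple {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
    uint8_t  proto;          /* 6 = TCP, 17 = UDP */
};

struct fn_entry {
    atomic_int      lock;    /* per-entry spin lock: 0 = free, 1 = held */
    int             in_use;
    struct fn_tuple key;
    uint32_t        nat_addr;
    uint16_t        nat_port;
};

/* One table per NUMA node; static storage starts zeroed (unlocked, empty). */
static struct fn_entry fn_tables[FN_MAX_NODES][FN_SLOTS];

static void fn_lock(struct fn_entry *e)
{
    while (atomic_exchange_explicit(&e->lock, 1, memory_order_acquire))
        ;  /* spin until the previous holder releases this entry */
}

static void fn_unlock(struct fn_entry *e)
{
    atomic_store_explicit(&e->lock, 0, memory_order_release);
}

/* FNV-style mixing over the tuple fields (an arbitrary choice for the sketch). */
static uint32_t fn_hash(const struct fn_tuple *t)
{
    uint32_t w[4] = { t->saddr, t->daddr,
                      ((uint32_t)t->sport << 16) | t->dport, t->proto };
    uint32_t h = 2166136261u;
    for (int i = 0; i < 4; i++) {
        h ^= w[i];
        h *= 16777619u;
    }
    return h & (FN_SLOTS - 1);
}

static int fn_tuple_eq(const struct fn_tuple *a, const struct fn_tuple *b)
{
    return a->saddr == b->saddr && a->daddr == b->daddr &&
           a->sport == b->sport && a->dport == b->dport &&
           a->proto == b->proto;
}

/* Insert or update a mapping in one node's table (node < FN_MAX_NODES).
 * Only the probed entry is locked, never the whole table.
 * Returns 0 on success, -1 if the table is full. */
int fn_insert(unsigned node, const struct fn_tuple *t,
              uint32_t nat_addr, uint16_t nat_port)
{
    uint32_t start = fn_hash(t);
    for (uint32_t i = 0; i < FN_SLOTS; i++) {
        struct fn_entry *e = &fn_tables[node][(start + i) & (FN_SLOTS - 1)];
        fn_lock(e);
        if (!e->in_use || fn_tuple_eq(&e->key, t)) {
            e->key = *t;
            e->nat_addr = nat_addr;
            e->nat_port = nat_port;
            e->in_use = 1;
            fn_unlock(e);
            return 0;
        }
        fn_unlock(e);
    }
    return -1;
}

/* Look up a mapping in one node's table. Returns 1 on hit, 0 on miss. */
int fn_lookup(unsigned node, const struct fn_tuple *t,
              uint32_t *nat_addr, uint16_t *nat_port)
{
    uint32_t start = fn_hash(t);
    for (uint32_t i = 0; i < FN_SLOTS; i++) {
        struct fn_entry *e = &fn_tables[node][(start + i) & (FN_SLOTS - 1)];
        fn_lock(e);
        int used = e->in_use;
        int hit = used && fn_tuple_eq(&e->key, t);
        if (hit) {
            *nat_addr = e->nat_addr;
            *nat_port = e->nat_port;
        }
        fn_unlock(e);
        if (hit)
            return 1;
        if (!used)
            return 0;  /* empty slot ends the probe (the sketch never deletes) */
    }
    return 0;
}
```

Because each node has its own table, a mapping inserted on node 0 is not visible to a lookup on node 1. That is the price of never taking a table-wide lock or touching remote memory on the fast path.
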
Second, FAST NAT does not track TCP state;
it records only a tuple with the connection information needed for NAT forwarding.
This removes many of the checks otherwise performed on each forwarded packet.
An entry in the connection table can be set to expire on
either an absolute or a relative expiration time.
A relative expiration time is extended by each forwarded packet.
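
The two expiration modes can be illustrated with a small sketch. It assumes a plain tick counter supplied by the caller; the names (fn_timer, fn_timer_init, fn_timer_touch, fn_timer_expired) are hypothetical and not taken from the actual implementation.

```c
#include <stdbool.h>
#include <stdint.h>

enum fn_expire_mode {
    FN_EXPIRE_ABSOLUTE,  /* deadline is fixed when the entry is created */
    FN_EXPIRE_RELATIVE   /* deadline is pushed back by each forwarded packet */
};

struct fn_timer {
    enum fn_expire_mode mode;
    uint64_t timeout;    /* lifetime (absolute) or idle timeout (relative), in ticks */
    uint64_t deadline;   /* tick at which the entry expires */
};

/* Arm the timer when the connection entry is created. */
void fn_timer_init(struct fn_timer *t, enum fn_expire_mode mode,
                   uint64_t now, uint64_t timeout)
{
    t->mode = mode;
    t->timeout = timeout;
    t->deadline = now + timeout;
}

/* Call for every forwarded packet: only relative timers are extended. */
void fn_timer_touch(struct fn_timer *t, uint64_t now)
{
    if (t->mode == FN_EXPIRE_RELATIVE)
        t->deadline = now + t->timeout;
}

/* True once the entry should be removed from the connection table. */
bool fn_timer_expired(const struct fn_timer *t, uint64_t now)
{
    return now >= t->deadline;
}
```

With a timeout of 100 ticks set at tick 0 and a packet forwarded at tick 50, an absolute entry still expires at tick 100, while a relative entry lives until tick 150.
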
To reduce lock usage, the connection tables are not synchronized globally. As a result, a TCP stream may be
recorded in only one per-NUMA connection table. If we use an Intel ixgbe NIC in Flow Director ATR mode, the incoming
and outgoing directions of a stream get the same queue index across the multiple queues, and the limitation mentioned above
disappears.
 
Limitations of FAST NAT: only TCP and UDP are supported.
Although some limitations exist, our work has paid off, yielding a 15-20 percent improvement in packets per second (pps).

 
