ip_early_demux 内核中默认是1(开启), 所以在ip_rcv_finish 到 tcp_v4_rcv 中 skb->destructor = sock_edemux;
且很大概率 skb->_skb_refdst = (unsigned long)dst | SKB_DST_NOREF; // #define SKB_DST_NOREF 1UL
对于NOREF的dst,如果要缓存(tcp_prequeue()或sk_add_backlog()), 则要调skb_dst_force(skb);
1 2 3 4 5 6 7 8 |
|
http://blog.chinaunix.net/uid-20662820-id-4935075.html
The routing cache has been suppressed in Linux 3.6 after a 2 years effort by David and the other Linux kernel developers. The global cache has been suppressed and some stored information have been moved to more separate resources like socket.
Metrics were stored in the routing cache entry which has disappeared. So it has been necessary to introduce a separate TCP metrics cache. A netlink interface is available to update/delete/add entry to the cache.
总结起来说就是Linux内核从3.6开始将全局的route cache全部剔除,取而代之的是各个子系统(tcp协议栈)内部的cache,由各个子系统维护。
当内核接收到一个TCP数据包来说,首先需要查找skb对应的路由,然后查找skb对应的socket。David Miller 发现这样做是一种浪费,对于属于同一个socket(只考虑ESTABLISHED情况)的路由是相同的,那么如果能将skb的路由缓存到socket(skb->sk)中,就可以只查找查找一次skb所属的socket,就可以顺便把路由找到了,于是David Miller提交了一个patch ipv4: Early TCP socket demux
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 |
|
然而Davem添加的这个patch是有局限的,因为这个处理对于转发的数据包,增加了一个在查找路由之前查找socket的逻辑,可能导致转发效率的降低。 Alexander Duyck提出增加一个ip_early_demux参数来控制是否启动这个特性。
1 2 3 4 5 6 7 8 9 10 |
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 |
|
ip_early_demux就这样诞生了,目前内核中默认是1(开启),但是如果你的数据流量中60%以上都是转发的,那么请关闭这个特性。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 |
|