static int tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
                          int push_one, gfp_t gfp)
{
    struct tcp_sock *tp = tcp_sk(sk);
    struct sk_buff *skb;
    unsigned int tso_segs, sent_pkts;
    int cwnd_quota;
    int result;
    .............................................
    while ((skb = tcp_send_head(sk))) {
        ..................................................
        /* Note that we only reach the function below if the
         * transmission succeeded.
         */
        if (unlikely(tcp_transmit_skb(sk, skb, 1, gfp)))
            break;
        /* Advance the send_head. This one is sent out.
         * This call will increment packets_out.
         */
        /* The retransmission timer is ultimately started inside this function. */
        tcp_event_new_data_sent(sk, skb);
        tcp_minshall_update(tp, mss_now, skb);
        sent_pkts++;
        if (push_one)
            break;
    }
    ...........................
}
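To make the control flow above easier to follow, here is a minimal user-space sketch of the same loop. Every type and function in it (model_seg, model_sock, model_transmit, model_new_data_sent, model_write_xmit) is hypothetical and only stands in for the kernel's. The rule "arm the timer only when nothing was in flight before" is my assumption about how tcp_event_new_data_sent behaves in kernels of this generation, not something the excerpt shows.

```c
#include <stddef.h>

/* Hypothetical user-space model; none of these types are the kernel's. */
struct model_seg {
    struct model_seg *next;
    int fail;                    /* nonzero: simulate a transmit error */
};

struct model_sock {
    struct model_seg *send_head; /* first segment not yet sent */
    unsigned int packets_out;    /* segments sent but not yet acked */
    int rexmit_timer_armed;      /* stands in for the retransmission timer */
};

/* Stand-in for tcp_transmit_skb(): 0 on success, nonzero on error. */
static int model_transmit(const struct model_seg *seg)
{
    return seg->fail;
}

/* Stand-in for tcp_event_new_data_sent(): advance the send head, count
 * the segment, and (assumption) arm the timer if nothing was in flight. */
static void model_new_data_sent(struct model_sock *sk, struct model_seg *seg)
{
    unsigned int prior_packets = sk->packets_out;

    sk->send_head = seg->next;
    sk->packets_out++;
    if (prior_packets == 0)
        sk->rexmit_timer_armed = 1;
}

/* Mirrors the loop shape of tcp_write_xmit(); returns segments sent. */
static unsigned int model_write_xmit(struct model_sock *sk, int push_one)
{
    struct model_seg *seg;
    unsigned int sent_pkts = 0;

    while ((seg = sk->send_head) != NULL) {
        if (model_transmit(seg))
            break;               /* failed: send head and timer untouched */
        model_new_data_sent(sk, seg);
        sent_pkts++;
        if (push_one)
            break;
    }
    return sent_pkts;
}
```

Calling model_write_xmit on a queue whose second segment fails sends only the first one, leaves send_head pointing at the failed segment, and arms the timer exactly once.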
Slow start
Congestion avoidance
Fast re-transmit
Fast recovery
What follows mainly covers some implementation details of slow start and congestion avoidance.
CWND - Sender side limit
RWND - Receiver side limit
Slow start threshold ( SSTHRESH ) - Used to determine whether slow start is used or congestion avoidance
When starting, probe slowly - IW <= 2 * SMSS
Initial size of SSTHRESH can be arbitrarily high, as high as the RWND
Use slow start when SSTHRESH > CWND. Else, use Congestion avoidance
Slow start - CWND is increased by an amount less than or equal to the SMSS for every ACK
Congestion avoidance - CWND += SMSS*SMSS/CWND
When loss is detected - SSTHRESH = max( FlightSize/2, 2*SMSS )
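The rules listed above can be sketched in a few lines of C. This is an illustrative byte-based model, not the kernel's implementation (the kernel counts in packets and uses pluggable congestion control modules). The names cc_state, cc_on_ack and cc_on_loss are hypothetical. Rounding the congestion avoidance increment up to 1 byte is an addition taken from RFC 5681, and the sketch deliberately leaves CWND unchanged on loss because the notes above only specify SSTHRESH.

```c
#include <stdint.h>

/* Hypothetical byte-based congestion state. */
struct cc_state {
    uint32_t cwnd;      /* congestion window, bytes (must be > 0) */
    uint32_t ssthresh;  /* slow start threshold, bytes */
    uint32_t smss;      /* sender maximum segment size, bytes */
};

/* Called once per ACK that acknowledges new data. */
static void cc_on_ack(struct cc_state *cc)
{
    if (cc->ssthresh > cc->cwnd) {
        /* Slow start: grow by at most SMSS per ACK. */
        cc->cwnd += cc->smss;
    } else {
        /* Congestion avoidance: CWND += SMSS*SMSS/CWND.
         * 64-bit intermediate avoids overflow; round 0 up to 1 byte. */
        uint32_t inc = (uint32_t)(((uint64_t)cc->smss * cc->smss) / cc->cwnd);
        cc->cwnd += inc ? inc : 1;
    }
}

/* Called when loss is detected: SSTHRESH = max(FlightSize/2, 2*SMSS). */
static void cc_on_loss(struct cc_state *cc, uint32_t flight_size)
{
    uint32_t half = flight_size / 2;
    uint32_t min_thresh = 2 * cc->smss;

    cc->ssthresh = half > min_thresh ? half : min_thresh;
}
```

With SMSS = 1000, CWND = 2000 and SSTHRESH = 4000, two ACKs take CWND to 4000 in slow start, and the third ACK adds only 1000*1000/4000 = 250 bytes in congestion avoidance.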
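Header Prediction: for efficiency, the later stages of packet processing are split into a fast path and a slow path. The fast path handles ordinary packets and the slow path handles special ones; header prediction is the check that decides which path a packet takes.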
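1. (tcp_flag_word(th) & TCP_HP_BITS) == tp->pred_flags checks whether the flag bits are in the normal state. tcp_flag_word returns the 32-bit word at index 3 of the tcphdr (the fourth word, which begins with the header length field), and TCP_HP_BITS masks out the PSH flag so that this bit does not affect which path is taken. Overall, pred_flags should equal 0xS?10 << 16 + snd_wnd (pred_flags is updated in tcp_fast_path_check or tcp_fast_path_on).
2. TCP_SKB_CB(skb)->seq == tp->rcv_nxt checks whether the received packet is exactly the one we expect next, i.e. not an out-of-order packet.
3. *ptr != htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) | (TCPOPT_TIMESTAMP << 8) | TCPOLEN_TIMESTAMP): if the packet does not carry a well-formed timestamp option, processing moves to the slow path.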
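The three checks above can be modelled in user space roughly as follows. This is a sketch under several assumptions: the flag word is held in host byte order (the kernel compares in network order), the connection has negotiated timestamps, and the structs hp_sock and hp_segment, the MODEL_* masks and hp_fast_path_ok are all hypothetical names. The reserved-bits mask is my recollection of kernels of this generation rather than something taken from the text.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Host-order view of header word 3: doff(4) reserved(4) flags(8) window(16). */
#define MODEL_FLAG_PSH      0x00080000u
#define MODEL_RESERVED_BITS 0x0F000000u  /* assumption about the kernel mask */
#define MODEL_HP_BITS       (~(MODEL_RESERVED_BITS | MODEL_FLAG_PSH))

#define OPT_NOP          1
#define OPT_TIMESTAMP    8
#define OPTLEN_TIMESTAMP 10

struct hp_sock {
    uint32_t pred_flags;  /* expected (doff << 28) | ACK | window */
    uint32_t rcv_nxt;     /* next sequence number we expect */
};

struct hp_segment {
    uint32_t flag_word;   /* host-order copy of header word 3 */
    uint32_t seq;         /* sequence number of the segment */
    const uint8_t *opts;  /* start of the TCP options, may be NULL */
    size_t opts_len;      /* number of option bytes */
};

/* Returns 1 if the segment qualifies for the fast path, 0 otherwise. */
static int hp_fast_path_ok(const struct hp_sock *tp, const struct hp_segment *seg)
{
    static const uint8_t aligned_ts[4] = {
        OPT_NOP, OPT_NOP, OPT_TIMESTAMP, OPTLEN_TIMESTAMP
    };

    /* 1. Flags, header length and window match the prediction (PSH ignored). */
    if ((seg->flag_word & MODEL_HP_BITS) != tp->pred_flags)
        return 0;
    /* 2. The segment is in order. */
    if (seg->seq != tp->rcv_nxt)
        return 0;
    /* 3. The options start with the aligned timestamp layout. */
    if (seg->opts_len < 4 || memcmp(seg->opts, aligned_ts, 4) != 0)
        return 0;
    return 1;
}
```

For a 32-byte header (S = 8) and a window of 1000, pred_flags in this model is 0x80100000 | 1000. A segment with ACK|PSH set still matches, while one with FIN set does not.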
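Timestamp option handling: the values are read from the ts option in the packet and used to refresh the saw_tstamp, rcv_tsval and rcv_tsecr fields of tp->rx_opt. The ts option occupies three 32-bit words, the last two of which hold tsval and tsecr. (Note that ts_recent is not updated here; it is updated later in tcp_store_ts_recent.)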
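struct tcp_options_received: defined in tcp.h. saw_tstamp indicates whether the timestamp option is valid. ts_recent_stamp is the time at which we last updated ts_recent. ts_recent is the timestamp to echo next, and normally equals the rcv_tsecr carried in the next packet we send. rcv_tsval is the timestamp at which this data left the sender. rcv_tsecr is the echoed timestamp (that is, the tsval of the data this ACK corresponds to, or of the previous ACK this data corresponds to). Both values are recorded from the timestamp option in the packet. Note that the clocks at the two ends do not need to be synchronized, and that when the receiver delays its ACK, the timestamp in the ACK it sends refers to the earliest packet in the group being acknowledged. One detail I have not fully understood: "When an incoming segment belongs to the current window, but arrives out of order (which implies that an earlier segment was lost), the timestamp of the earlier segment is returned as soon as it arrives, rather than the timestamp of the segment that arrived out of order."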
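As an illustration of the parsing step described above, the sketch below reads an aligned timestamp option (NOP, NOP, kind 8, length 10, then two big-endian 32-bit values) from a byte buffer. The struct ts_opts only mirrors the three field names mentioned in the text; it is not the kernel's struct tcp_options_received, and load_be32 and parse_aligned_tstamp are hypothetical helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the fields named in the text. */
struct ts_opts {
    int saw_tstamp;      /* 1 if a valid timestamp option was seen */
    uint32_t rcv_tsval;  /* sender's timestamp from this segment */
    uint32_t rcv_tsecr;  /* timestamp echoed back by the peer */
};

/* Read a big-endian (network order) 32-bit value from unaligned bytes. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Parse the 12-byte aligned layout: NOP NOP TIMESTAMP(8) LEN(10) tsval tsecr.
 * Returns 1 and fills rx on success; returns 0 and clears saw_tstamp otherwise. */
static int parse_aligned_tstamp(const uint8_t *opt, size_t len, struct ts_opts *rx)
{
    rx->saw_tstamp = 0;
    if (len < 12)
        return 0;
    if (opt[0] != 1 || opt[1] != 1 || opt[2] != 8 || opt[3] != 10)
        return 0;

    rx->saw_tstamp = 1;
    rx->rcv_tsval = load_be32(opt + 4);  /* second 32-bit word */
    rx->rcv_tsecr = load_be32(opt + 8);  /* third 32-bit word */
    return 1;
}
```

As in the text, this step only records the received values; deciding when to copy rcv_tsval into ts_recent is a separate, later step.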
static int tcp_ack(struct sock *sk, struct sk_buff *skb, int flag) /linux/net/ipv4/tcp_input.c #2491
// Handles a received ACK; the logic is very complex.
First, an overview of the FLAG values used to describe an incoming ACK:
FLAG_DATA: Incoming frame contained data.
FLAG_WIN_UPDATE: Incoming ACK was a window update
FLAG_DATA_ACKED: This ACK acknowledged new data.
FLAG_RETRANS_DATA_ACKED:Some of which was retransmitted.
FLAG_SYN_ACKED: This ACK acknowledged SYN.
FLAG_DATA_SACKED: New SACK.
FLAG_ECE: ECE in this ACK.
FLAG_DATA_LOST: SACK detected data lossage.
FLAG_SLOWPATH: Do not skip RFC checks for window update.
FLAG_ACKED: (FLAG_DATA_ACKED|FLAG_SYN_ACKED)
FLAG_NOT_DUP: (FLAG_DATA|FLAG_WIN_UPDATE|FLAG_ACKED)
FLAG_CA_ALERT: (FLAG_DATA_SACKED|FLAG_ECE)
FLAG_FORWARD_PROGRESS: (FLAG_ACKED|FLAG_DATA_SACKED)
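The composite flags in the list are plain bitwise ORs of the basic ones, which the following sketch makes concrete. The numeric bit values are what I recall from kernels of this generation and should be treated as an assumption. The two helper functions are hypothetical and exist only to show how such masks are typically tested.

```c
/* Basic flags; the bit values are assumed, not taken from the text. */
#define FLAG_DATA               0x01  /* Incoming frame contained data. */
#define FLAG_WIN_UPDATE         0x02  /* Incoming ACK was a window update. */
#define FLAG_DATA_ACKED         0x04  /* This ACK acknowledged new data. */
#define FLAG_RETRANS_DATA_ACKED 0x08  /* Some of which was retransmitted. */
#define FLAG_SYN_ACKED          0x10  /* This ACK acknowledged SYN. */
#define FLAG_DATA_SACKED        0x20  /* New SACK. */
#define FLAG_ECE                0x40  /* ECE in this ACK. */
#define FLAG_DATA_LOST          0x80  /* SACK detected data loss. */
#define FLAG_SLOWPATH           0x100 /* Do not skip RFC checks for window update. */

/* Composite flags, exactly as combined in the list above. */
#define FLAG_ACKED            (FLAG_DATA_ACKED | FLAG_SYN_ACKED)
#define FLAG_NOT_DUP          (FLAG_DATA | FLAG_WIN_UPDATE | FLAG_ACKED)
#define FLAG_CA_ALERT         (FLAG_DATA_SACKED | FLAG_ECE)
#define FLAG_FORWARD_PROGRESS (FLAG_ACKED | FLAG_DATA_SACKED)

/* Hypothetical helper: the ACK carried nothing that rules out a duplicate. */
static int ack_looks_duplicate(int flag)
{
    return !(flag & FLAG_NOT_DUP);
}

/* Hypothetical helper: the ACK acknowledged or SACKed something new. */
static int ack_made_forward_progress(int flag)
{
    return (flag & FLAG_FORWARD_PROGRESS) != 0;
}
```

Note that an ACK carrying only a new SACK block counts as forward progress yet still looks like a duplicate under FLAG_NOT_DUP, because FLAG_DATA_SACKED is not part of that mask.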