Just a reminder: TCP/IP is both a protocol suite (a set of communication protocols used on the Internet and other networks) and the name of two specific protocols: TCP and IP.
The TCP/IP suite takes its name from its two most widely used protocols: TCP as a transport protocol and IP as a network protocol. They are the rock stars of the suite, so famous that they name the whole set, but the suite contains many more protocols than TCP and IP. For example: ICMP, SSH, TELNET, LDAP, UDP... You get the idea.
We are focusing here on TCP as a transport protocol.
Because the TCP/IP suite uses a layered model in which each layer serves the layer above it, a good way to detect issues in a network is to focus on the transport layer. If there is a problem at the transport layer, you do not need to investigate the layers above yet, which usually requires knowing the detailed specification of protocols such as CIFS or NFS.
So the transport layer is a good place to start looking for issues. If nothing is wrong at the transport layer, we can start troubleshooting the upper layers. If something is wrong at the transport layer, we need to troubleshoot the transport layer or the layers below it.
So far, so good. Let's start.
What are the basic concepts to know for troubleshooting TCP?
We need to know what a TCP segment looks like.
Thanks to Wikipedia, we have this picture of the header of a TCP segment:
The goal of TCP is to provide connection-oriented communication that ensures reliable, ordered and error-checked delivery of a stream of bytes. Therefore, each of the fields in the TCP header is there for a reason.
Let's explain the basic fields.
Source port and destination port: these identify the service the data must be sent to (destination port) and where the data is sent from (source port). Because a single host can offer several services such as http, ftp, telnet, etc., every client connecting to it must use a destination port number to choose which service it wants to use. The services assigned to each port number are registered with IANA.
The sequence number is important for in-order, reliable delivery.
Sequence numbers are 32-bit numbers. A TCP connection can be seen as two communication streams, one from source to destination and the other from destination to source. Each side maintains its own sequence numbers for its direction of the conversation, and both sides use sequence numbers and acknowledgment numbers to keep track of the conversation and to move it forward or back (retransmitting) if required.
The ACK number is a 32-bit number that acknowledges the reception of all prior bytes. It tells the sender up to which byte the data has been received correctly, so the flow of information can proceed smoothly. TCP is a cumulative acknowledgment system: it uses a single number to acknowledge data, and that number is the sequence number of the next byte the receiver expects, that is, one past the last contiguous byte of the stream received successfully.
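To make the cumulative acknowledgment idea concrete, here is a minimal sketch in Python. The function name and the sample sequence numbers are made up for illustration, and the sketch ignores 32-bit wraparound of sequence numbers.

```python
def cumulative_ack(initial_seq, received_segments):
    """Return the ACK number a receiver would send.

    received_segments is a list of (seq, length) tuples for the data
    segments received so far, in any order.
    """
    next_expected = initial_seq
    # Walk the segments in sequence order and advance only while
    # the data is contiguous; stop at the first gap.
    for seq, length in sorted(received_segments):
        if seq > next_expected:
            break  # a gap: bytes before seq are still missing
        next_expected = max(next_expected, seq + length)
    return next_expected


# Bytes 1000-1199 arrived, 1200-1299 are missing, 1300-1399 arrived.
print(cumulative_ack(1000, [(1000, 100), (1100, 100), (1300, 100)]))  # 1200
```

The receiver cannot acknowledge the segment starting at 1300 with this single number until the gap at 1200 is filled.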
TCP flags are 1-bit fields that carry crucial information.
The most important are:
ACK flag: If the ACK flag is set, the ACK number field is relevant and contains acknowledgment information.
SYN flag: It is used during session setup to agree on initial sequence numbers. TCP is connection oriented, so a session setup is required to agree on the sequence numbers, which are chosen randomly to make attacks and spoofing harder.
FIN flag: It is used during a graceful session close to show that the sender has no more data to send.
RST flag: Reset aborts the connection immediately (normally an abnormal session disconnection).
PSH flag: If this flag is set, the data is pushed (forced) out without waiting for buffers to fill. The data is also delivered to the application on the receiving end without buffering.
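As a small illustration of the flags above, the following Python sketch decodes the flag bits of a TCP header into names. The bit positions are the standard ones from the TCP header layout; the function name is just an illustrative choice.

```python
# Bit masks of the classic TCP flags, lowest bit first.
TCP_FLAGS = [
    (0x01, "FIN"),
    (0x02, "SYN"),
    (0x04, "RST"),
    (0x08, "PSH"),
    (0x10, "ACK"),
    (0x20, "URG"),
]


def decode_flags(flag_bits):
    """Return the names of the flags set in the given flag bits."""
    return [name for mask, name in TCP_FLAGS if flag_bits & mask]


print(decode_flags(0x02))  # ['SYN']         first step of session setup
print(decode_flags(0x12))  # ['SYN', 'ACK']  reply to a SYN
print(decode_flags(0x18))  # ['PSH', 'ACK']  typical data segment
```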
Enough about the TCP flags. What else do we have?
Window Size: a very important 2-byte field in the TCP header, also known as the receive window. It is the number of bytes that the sender of the segment can accept at that point in time without exceeding its capacity. In other words, it is the maximum amount of data that the host sending this segment is able to receive at that particular moment.
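Putting the fields together, here is a minimal sketch that unpacks the fixed 20-byte part of a TCP header with Python's struct module. The function name, the dictionary keys and the sample values are illustrative assumptions. The window value is the raw 2-byte field and does not account for a window scale option, if one was negotiated.

```python
import struct


def parse_tcp_header(data):
    """Parse the fixed 20-byte part of a TCP header from raw bytes."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", data[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        # The data offset is the header length in 32-bit words (upper 4 bits).
        "header_len": (offset_flags >> 12) * 4,
        # The lower 8 bits hold the flags (FIN, SYN, RST, PSH, ACK, URG, ECE, CWR).
        "flags": offset_flags & 0xFF,
        "window": window,
    }


# A hand-built sample header: port 49152 -> 80, PSH+ACK set, window 65535.
sample = struct.pack("!HHIIHHHH", 49152, 80, 1000, 2000, (5 << 12) | 0x18, 65535, 0, 0)
header = parse_tcp_header(sample)
print(header["dst_port"], header["seq"], header["ack"], header["window"])  # 80 1000 2000 65535
```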
Hopefully the basic concepts are clearer now, so it is time to look at the situations at the transport layer that can cause problems for the overall communication.
Zero Window means that the receiver cannot cope with the amount of data the sender is transmitting and that its available receive buffer is 0. When this happens there is a delay, because the sender has to wait until some space is available at the destination before it can continue sending.
The filter that can be used in Wireshark to detect a Zero Window condition is: tcp.analysis.zero_window
A high number of zero windows is a clear indication that the receiver cannot keep up with the sender. In those scenarios it is likely necessary to increase the TCP buffers on the receiver, or to reduce the amount of data the sender transmits to the receiver.
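The following toy simulation in Python sketches how a receive window shrinks to zero when the application reads more slowly than the data arrives. The function name, buffer size and per-step byte counts are invented for illustration; a real TCP stack is far more involved.

```python
def advertised_windows(buffer_size, arrivals, reads):
    """Return the window advertised after each step.

    arrivals[i] is how many bytes the sender tries to deliver in step i,
    reads[i] is how many bytes the application reads afterwards.
    """
    used = 0
    windows = []
    for arriving, reading in zip(arrivals, reads):
        # The receiver only accepts what still fits in its buffer.
        used += min(arriving, buffer_size - used)
        # Window advertised in the ACK: free space left in the buffer.
        windows.append(buffer_size - used)
        # The application drains part of the buffer before the next step.
        used -= min(reading, used)
    return windows


# 4000-byte buffer, sender offers 2000 bytes per step, application reads 500.
print(advertised_windows(4000, [2000] * 4, [500] * 4))  # [2000, 500, 0, 0]
```

Once the advertised window reaches 0, the sender has to pause, which is the delay described above.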
Duplicated ACKs, Missing packets and Retransmissions
Duplicated ACKs are normally a symptom of missing packets.
When a receiver gets a packet with a sequence number higher than the expected one, it proceeds as if some data had been dropped. To make the sender aware of the apparently dropped data as quickly as possible, the receiver immediately sends an acknowledgment (ACK) with the ACK number set to the sequence number that seems to be missing. That is a dup ack: the receiver detects a gap in the sequence numbers and is basically telling the sender, "you are sending me a segment with a sequence number higher than the one I am expecting, could you please send me the one I am expecting?"
So the receiver detects a gap in the sequence numbers and generates a duplicate ACK (an ACK whose ACK number equals the sequence number it is expecting, which is lower than the latest sequence number received from the sender). The receiver keeps sending dup acks for each subsequent segment it receives on that connection until the missing segment is successfully received, either because it finally arrives by natural means or because it is retransmitted.
If the sender has fast retransmissions enabled, once it receives 3 dup acks it will resend the missing segment.
A large number of dup acks can be a clear indication of dropped/missing packets or of out-of-order packets (in this case packets are out of order because they have a higher than expected sequence number).
The way to determine whether the dup acks end up in a retransmission is to look at the number Wireshark gives each dup ack: dup ack #1, dup ack #2, and so on.
If fast retransmissions are enabled, then after the sender receives 3 dup ACKs, TCP retransmits that segment without waiting for the retransmission timer to expire.
This is known as fast retransmit because it happens before the retransmission timer would have expired naturally.
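The dup ack and fast retransmit mechanism can be sketched as follows in Python. This is a simplified model under stated assumptions: the function names are hypothetical, segments are fixed (seq, length) pairs, and sequence number wraparound, timers and selective acknowledgments are ignored.

```python
def receiver_acks(initial_seq, arriving_segments):
    """Return the ACK number the receiver sends after each arriving segment."""
    next_expected = initial_seq
    buffered = {}  # out-of-order segments waiting for the gap to be filled
    acks = []
    for seq, length in arriving_segments:
        buffered[seq] = length
        # Deliver everything that is now contiguous.
        while next_expected in buffered:
            next_expected += buffered.pop(next_expected)
        acks.append(next_expected)
    return acks


def fast_retransmit_points(acks, threshold=3):
    """Return the sequence numbers the sender would fast-retransmit."""
    retransmits = []
    last_ack, dup_count = None, 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1  # dup ack #1, #2, #3, ...
            if dup_count == threshold:
                retransmits.append(ack)
        else:
            last_ack, dup_count = ack, 0
    return retransmits


# The segment starting at 1100 is lost; it arrives last, as a retransmission.
arrivals = [(1000, 100), (1200, 100), (1300, 100), (1400, 100), (1100, 100)]
acks = receiver_acks(1000, arrivals)
print(acks)                          # [1100, 1100, 1100, 1100, 1500]
print(fast_retransmit_points(acks))  # [1100]
```

The first ACK for 1100 is a normal one; the next three are dup acks #1 to #3, and the third triggers the retransmission of the segment starting at 1100.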
You can filter duplicate ACKs in Wireshark using the filter: tcp.analysis.duplicate_ack
Bear in mind that duplicate acks can be normal, as they also show up when packets arrive out of order (specifically, with a higher sequence number than expected), so not every duplicate ack ends in a retransmission or indicates an issue. Sometimes, after one or two dup acks, the right segment is received: it just arrived with a delay, there is no packet loss and no need to retransmit, as TCP is prepared for reordering.
The way to detect retransmissions in Wireshark is with the filters:
tcp.analysis.fast_retransmission –> Fast retransmissions: retransmissions triggered by duplicate ACKs (three, per the fast retransmit rule) before the retransmission timer expires.
tcp.analysis.retransmission –> Retransmission triggered by the expiration of the retransmission timer.
Out-of-order packets can also show up at the destination as segments with a sequence number lower than expected.
TCP was designed to deal with out-of-order packets, but because TCP only passes data up to the application once all the received bytes are in order, many out-of-order packets, or out-of-order packets that arrive with a delay, degrade performance.
There is a filter in Wireshark to identify out-of-order packets: tcp.analysis.out_of_order
Wireshark marks a packet as out of order based on the fact that it (a) contains data, (b) does not advance the sequence number value (meaning that it is a packet that has arrived after another one that had a higher sequence number), and (c) arrives within 3 ms of the highest sequence number seen.
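The three conditions above can be sketched as a simple classifier in Python. This is only an approximation of the rule as described here, not Wireshark's actual implementation; the function name, the tuple layout and the sample timestamps are assumptions.

```python
def classify_out_of_order(packets, threshold=0.003):
    """Flag packets that look out of order.

    packets is a list of (timestamp_seconds, seq, length) tuples for one
    direction of a connection, in capture order.
    """
    highest_end = None   # highest sequence number (seq + length) seen so far
    highest_time = None  # capture time of the packet that reached it
    flags = []
    for timestamp, seq, length in packets:
        out_of_order = (
            length > 0                                   # (a) contains data
            and highest_end is not None
            and seq < highest_end                        # (b) does not advance the sequence
            and timestamp - highest_time <= threshold    # (c) arrives within 3 ms
        )
        flags.append(out_of_order)
        if highest_end is None or seq + length > highest_end:
            highest_end, highest_time = seq + length, timestamp
    return flags


packets = [
    (0.000, 1000, 100),
    (0.001, 1200, 100),  # jumps ahead: 1100 has not been seen yet
    (0.002, 1100, 100),  # arrives 1 ms later: out of order
    (0.500, 1100, 100),  # arrives much later: more likely a retransmission
]
print(classify_out_of_order(packets))  # [False, False, True, False]
```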
If reordering is excessive, or applications are overly sensitive to it, performance will be degraded significantly.
As a rule of thumb, around 3% of packets arrive out of order on average. If the proportion is greater than that, it can cause a performance issue and requires further investigation.