Mon Nov 07

TCP and UDP Fundamentals

TCP and UDP are two major protocols that enable devices to communicate over the Internet and other networks. They work on the transport layer of the Internet Protocol Suite, a model that describes how different protocols work together to enable data transmission. TCP stands for Transmission Control Protocol, and UDP stands for User Datagram Protocol. They have different characteristics and use cases, depending on the type and quality of data being transferred.

What are TCP and UDP?

TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) are the protocols that make it possible for two devices to communicate with each other. What matters most is how the two differ.

Reliability

With TCP, data arrives at the destination as it was sent, because the receiver puts the segments back into their correct order. TCP tracks each segment to verify whether it has arrived at the destination. If it has not, the sender sends it again. A retransmitted segment can still be lost, however, and after several failed attempts TCP may abort the connection, so the transfer fails. Even so, TCP is more reliable than UDP. Without these mechanisms, the received data could be incomplete or corrupted, and the segments could be out of order.

UDP, on the other hand, does not even guarantee that the data reaches its destination. Once the data has been transmitted, the sender does not track what happens on the destination side. Because of this, UDP is also known as a fire-and-forget protocol.
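The difference can be sketched with a small simulation. This is not real networking code: the function names and the `drops` script (a list saying which transmission attempts are lost) are made up for illustration, and `max_attempts=3` is an arbitrary choice, not a value taken from TCP.

```python
from typing import Iterable, List, Optional


def send_reliably(segments: List[bytes], drops: Iterable[bool],
                  max_attempts: int = 3) -> Optional[List[bytes]]:
    """TCP-like sender: retry each segment until it gets through.

    `drops` yields True when the next transmission attempt is lost.
    Returns the delivered segments, or None if the transfer is aborted.
    """
    drop_iter = iter(drops)
    delivered: List[bytes] = []
    for segment in segments:
        for _attempt in range(max_attempts):
            lost = next(drop_iter, False)  # script exhausted: nothing is lost
            if not lost:
                delivered.append(segment)  # arrived and was acknowledged
                break
        else:
            return None  # every attempt was lost: abort the connection
    return delivered


def send_unreliably(datagrams: List[bytes], drops: Iterable[bool]) -> List[bytes]:
    """UDP-like sender: one attempt per datagram, losses are never noticed."""
    drop_iter = iter(drops)
    return [d for d in datagrams if not next(drop_iter, False)]


if __name__ == "__main__":
    print(send_reliably([b"a", b"b"], [True, False, True, True, False]))
    print(send_unreliably([b"a", b"b", b"c"], [False, True, False]))
```

With the same kind of loss pattern, the reliable sender either delivers everything or reports failure, while the unreliable sender silently delivers whatever happened to get through.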

Connection

Based on how communication is set up, TCP is connection-oriented while UDP is connectionless. TCP performs a three-way handshake before data transmission and a four-step procedure after it, so a connection is explicitly established and torn down. UDP is connectionless because it uses no handshake or teardown procedure to communicate.
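In code, the difference shows up in the socket calls. The sketch below uses Python's standard `socket` module on the loopback address; the function names are made up for illustration, and the single-threaded layout only works because the payload is small enough to fit in the operating system's buffers.

```python
import socket


def tcp_round_trip(payload: bytes) -> bytes:
    """Send payload over a loopback TCP connection; return what the server read."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(1)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect(server.getsockname())  # the handshake happens here
            conn, _addr = server.accept()
            with conn:
                conn.settimeout(2.0)
                client.sendall(payload)
                client.shutdown(socket.SHUT_WR)  # tell the server we are done
                chunks = []
                while chunk := conn.recv(4096):  # b"" means the peer closed
                    chunks.append(chunk)
                return b"".join(chunks)


def udp_round_trip(payload: bytes) -> bytes:
    """Send payload as a single loopback UDP datagram; return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            # No connect/accept step: the datagram is simply addressed and sent.
            sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(4096)
        return data


if __name__ == "__main__":
    print(tcp_round_trip(b"hello over TCP"))
    print(udp_round_trip(b"hello over UDP"))
```

The TCP version needs `listen`, `connect`, and `accept` before any data moves; the UDP version has no equivalent step.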

Weight

In general, UDP is lighter than TCP because it has no synchronization or acknowledgment, which results in less overhead. TCP is heavier for the opposite reason.
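Part of that weight is visible in the headers themselves. The following sketch adds up the fixed header fields, using the field sizes given in the packet structure section of this article; the dictionary and function names are made up for illustration, and TCP options are left out, so the TCP figure is a minimum.

```python
# Field sizes in bits, as listed in the header descriptions in this article.
TCP_HEADER_BITS = {
    "source port": 16, "destination port": 16,
    "sequence number": 32, "acknowledgement": 32,
    "data offset": 4, "reserved": 3, "flags": 9,
    "window size": 16, "checksum": 16, "urgent pointer": 16,
}
UDP_HEADER_BITS = {
    "source port": 16, "destination port": 16,
    "length": 16, "checksum": 16,
}


def header_bytes(fields: dict) -> int:
    """Total header size in bytes for the given field sizes (in bits)."""
    return sum(fields.values()) // 8


def overhead_fraction(header_size: int, payload_size: int) -> float:
    """Share of the transport-layer unit that is header rather than payload."""
    return header_size / (header_size + payload_size)


if __name__ == "__main__":
    tcp = header_bytes(TCP_HEADER_BITS)  # 20 bytes without options
    udp = header_bytes(UDP_HEADER_BITS)  # 8 bytes
    print(tcp, udp)
    print(round(overhead_fraction(tcp, 100), 3), round(overhead_fraction(udp, 100), 3))
```

For a 100-byte payload, the header accounts for roughly 17% of a minimal TCP segment but only about 7% of a UDP datagram. This counts header bytes only; TCP's acknowledgements and handshake add further traffic on top.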

Transport Packet Structure

Before diving into how both work, let me show you how their packets are structured. Packets are sent as bits, so the size of each field in bits is included to give you an idea of the overall size.

TCP

Source Port | Destination Port
Sequence Number
Acknowledgement
Data offset | Reserved (0 0 0) | NS | CWR | ECE | URG | ACK | PSH | RST | SYN | FIN | Window Size
Checksum | Urgent Pointer
Options | Padding
Data

TCP Segment

Let’s take a look at each function of the TCP header.

  • Source port (16 bits): identifies the sender's port.
  • Destination port (16 bits): identifies the destination port.
  • Sequence number (32 bits): tracks the bytes exchanged between client and server, and increases as data is sent. If SYN is set, this is the initial sequence number (ISN), and the first data byte that follows is numbered ISN + 1. If SYN is not set, it is the sequence number of the first data byte in this segment. During the handshake, the sequence number goes from ISN to ISN + 1 (shown as 0 to 1 by tools that display relative sequence numbers) even though no data is sent; this is known as a phantom byte (ghost byte).
  • Acknowledgement (32 bits): the sequence number of the next byte the sender of this segment expects to receive from the other side. Ignored if ACK is not set.
  • Header length or data offset (4 bits): specifies the size of the TCP header in 32-bit words.
  • Reserved (3 bits): bits reserved for future use.
  • Flags (9 bits)
    • NS (1 bit), CWR (Congestion Window Reduced) (1 bit), ECE (ECN-Echo) (1 bit): used for congestion control.
    • URG (URGENT) (1 bit): indicates that the segment contains urgent data that should be prioritized.
    • ACK (ACKNOWLEDGMENT) (1 bit): acknowledges received data.
    • PSH (PUSH) (1 bit): asks the receiver to push buffered data to the receiving application immediately.
    • RST (RESET) (1 bit): if set, the connection is aborted due to an abnormal condition or error.
    • SYN (SYNCHRONIZE) (1 bit): initiates a connection between client and server.
    • FIN (FINISH) (1 bit): terminates the connection.
  • Window size (16 bits): indicates the number of bytes the sender of the segment is willing to receive (it represents the buffer space allocated for the current connection).
  • Checksum (16 bits): used for error checking.
  • Urgent pointer (16 bits): if URG is set, an offset from the sequence number that indicates the last urgent data byte.
  • Options:
    • TCP Option - Maximum Segment Size: the largest segment the sender of this option is willing to receive on the current connection.
    • TCP Option - Window scale: the number of bits by which the window size field is shifted left to obtain the actual window size.
    • TCP Option - SACK permitted: indicates whether the sender permits SACK. If both sides set it, the receiver can selectively acknowledge the data that did arrive, so only the parts that went missing need to be retransmitted.
  • Padding: zero bits that ensure the TCP header ends on a 32-bit boundary.
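To make the layout concrete, here is a minimal sketch that unpacks the fixed 20-byte part of a TCP header with Python's `struct` module. The function name, the returned dictionary keys, and the sample values (ports, sequence number, window) are made up for illustration; options are not parsed.

```python
import struct

# Flag names from the lowest bit (FIN) to the highest (NS) of the 9 flag bits.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG", "ECE", "CWR", "NS"]


def parse_tcp_header(raw: bytes) -> dict:
    """Parse the fixed 20-byte TCP header (network byte order)."""
    (src, dst, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", raw[:20])
    data_offset = offset_flags >> 12   # top 4 bits, counted in 32-bit words
    flag_bits = offset_flags & 0x01FF  # low 9 bits hold the flags
    flags = [name for bit, name in enumerate(FLAG_NAMES) if flag_bits & (1 << bit)]
    return {
        "source_port": src,
        "destination_port": dst,
        "sequence_number": seq,
        "acknowledgement": ack,
        "header_length_bytes": data_offset * 4,
        "flags": flags,
        "window_size": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }


if __name__ == "__main__":
    # A hand-built SYN segment header: data offset 5 (20 bytes), SYN flag set.
    sample = struct.pack("!HHIIHHHH", 54321, 80, 1000, 0, (5 << 12) | 0x002, 65535, 0, 0)
    print(parse_tcp_header(sample))
```

The data offset is multiplied by 4 because it counts 32-bit words, so the minimum value of 5 corresponds to a 20-byte header.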

UDP

Source Port | Destination Port
Length | Checksum
Data

UDP Datagram

Let’s take a look at each function of the UDP header.

  • Source port (16 bits): identifies the sender's port.
  • Destination port (16 bits): identifies the destination port.
  • Length (16 bits): the length in bytes of the UDP header plus the data.
  • Checksum (16 bits): used for error checking. It is optional when UDP runs over IPv4.
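Because the UDP header is so small, building and parsing one takes only a few lines. The sketch below uses Python's `struct` module; the function names and port numbers are made up for illustration, and the checksum is left at 0, which in UDP over IPv4 means that no checksum was computed.

```python
import struct

# Source port, destination port, length, checksum: four 16-bit fields.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prefix payload with an 8-byte UDP header (checksum left as 0)."""
    length = UDP_HEADER.size + len(payload)  # header plus data, in bytes
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + payload


def parse_udp_datagram(datagram: bytes) -> dict:
    """Split a UDP datagram into its header fields and payload."""
    src, dst, length, checksum = UDP_HEADER.unpack(datagram[:UDP_HEADER.size])
    return {
        "source_port": src,
        "destination_port": dst,
        "length": length,
        "checksum": checksum,
        "payload": datagram[UDP_HEADER.size:length],
    }


if __name__ == "__main__":
    datagram = build_udp_datagram(5000, 53, b"hello")
    print(len(datagram), parse_udp_datagram(datagram))
```

Note that the length field covers the 8-byte header as well as the data, so a 5-byte payload yields a length of 13.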

How does a TCP session work?

When two devices communicate with each other, they are likely to use either TCP or UDP. TCP is a connection-oriented protocol, and a TCP session goes through three stages: connection establishment, data transmission, and connection termination. Here is an overview of how a TCP session works.

Connection establishment (session start)

To establish a connection, TCP requires a three-way handshake. The diagram below shows the exchange; let’s use this page as an example.

Client (state) | Segment | Server (state)
 | | LISTEN
SYN_SENT | SYN → | SYN_RCV
ESTABLISHED | ← SYN-ACK |
 | ACK → | ESTABLISHED

Session start (3-Way Handshake)

SYN - Client to Server

When you open this web page, your PC (the client) requests the page from the web server. To do so, your PC first asks the server to establish a session by sending a SYN (synchronize) segment. At this moment, the server is already in the LISTEN state and the client enters the SYN_SENT state.

SYN ACK - Server to Client

When the web server receives the request, it enters SYN_RCV and responds to your PC with SYN-ACK (synchronize-acknowledge). As the name suggests, the web server acknowledges the request and sends its own synchronization request back. In this state, the server is waiting for the client's ACK.

ACK - Client to Server

The client must respond to the server with an ACK. Once it has sent this acknowledgement, the client is in the ESTABLISHED state. Once the server receives the ACK and also enters the ESTABLISHED state, the session is open and the requested page can be served.
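The three steps can also be traced through the sequence and acknowledgement numbers they carry. The following is a sketch of that bookkeeping only, not a network implementation; the `Segment` type, the function name, and the example ISN values are made up for illustration.

```python
from typing import List, NamedTuple

SEQ_MODULUS = 2 ** 32  # sequence numbers are 32 bits wide and wrap around


class Segment(NamedTuple):
    sender: str
    flags: str
    seq: int
    ack: int


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three handshake segments for the given initial sequence numbers."""
    # 1. Client asks to synchronize, announcing its ISN.
    syn = Segment("client", "SYN", client_isn, 0)
    # 2. Server announces its own ISN and acknowledges the client's SYN,
    #    which consumes one sequence number (the phantom byte).
    syn_ack = Segment("server", "SYN-ACK", server_isn,
                      (client_isn + 1) % SEQ_MODULUS)
    # 3. Client acknowledges the server's SYN in the same way.
    ack = Segment("client", "ACK", (client_isn + 1) % SEQ_MODULUS,
                  (server_isn + 1) % SEQ_MODULUS)
    return [syn, syn_ack, ack]


if __name__ == "__main__":
    for segment in three_way_handshake(client_isn=100, server_isn=300):
        print(segment)
```

With ISNs of 100 and 300, the SYN-ACK carries ack 101 and the final ACK carries seq 101 and ack 301, so each side confirms the other's starting point before any data is sent.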

Data transmission

Client (state) | Segment | Server (state)
ESTABLISHED | ← data transmission → | ESTABLISHED

Session Established

Now that the session is open, the data (this page) is transferred from the server to the client. During transmission, TCP makes sure that the data is reassembled in the correct order so that you, as the receiver, see the page as it was sent and not a broken page.
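Reassembly relies on the sequence number of each segment. Below is a simplified sketch of the idea: segments that arrive early are buffered until the gap before them is filled, and duplicates are ignored. The function name and sample data are made up for illustration, and the sketch ignores sequence number wraparound and partially overlapping segments, which a real TCP implementation has to handle.

```python
from typing import Dict, Iterable, Tuple


def reassemble(segments: Iterable[Tuple[int, bytes]], first_seq: int) -> bytes:
    """Rebuild the byte stream from (sequence number, data) pairs.

    Segments may arrive out of order or more than once.
    """
    pending: Dict[int, bytes] = {}  # early arrivals, keyed by sequence number
    next_seq = first_seq            # sequence number of the next byte we need
    stream = bytearray()
    for seq, data in segments:
        if seq >= next_seq and seq not in pending:
            pending[seq] = data     # anything older is a duplicate: drop it
        # Deliver every buffered segment that is now contiguous.
        while next_seq in pending:
            chunk = pending.pop(next_seq)
            stream += chunk
            next_seq += len(chunk)
    return bytes(stream)


if __name__ == "__main__":
    arrived = [(6, b"world"), (1, b"hello"), (1, b"hello"), (11, b"!")]
    print(reassemble(arrived, first_seq=1))
```

Here the segment starting at sequence number 6 arrives first and waits in the buffer until the segment starting at 1 shows up; the repeated copy of the first segment is discarded.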

Connection termination (session end)

If the client wants to end the current session, TCP goes through a four-way handshake.

Client (state) | Segment | Server (state)
FIN_WAIT_1 (active close) | FIN → | CLOSE_WAIT (passive close)
FIN_WAIT_2 | ← ACK |
 | ← FIN | LAST_ACK
TIME_WAIT | ACK → |
CLOSED | | CLOSED

Session End (4-Way Handshake)

FIN - Client to Server

When the client wants to terminate the session, it first sends a FIN message to the server and enters the FIN_WAIT_1 state. In this state, the client is waiting for an acknowledgement (ACK) from the server. Because the client initiates the close, its side is called the active close.

ACK - Server to Client

When the server receives the FIN, it lets the client know by sending an ACK (acknowledgement) and enters the CLOSE_WAIT state (passive close). When the client receives this ACK, it enters the FIN_WAIT_2 state and waits for the server's FIN message.

FIN - Server to Client

Once the server has finished its own closing work and is ready to close the connection, it sends a FIN message to the client. The server then enters the LAST_ACK state.

ACK - Client to Server

Once the client receives the FIN, it sends an ACK and enters the TIME_WAIT state. If that ACK is lost, the server retransmits its FIN, and the client, still in TIME_WAIT, acknowledges it again. When the server receives the ACK, it enters the CLOSED state; the client enters CLOSED after the TIME_WAIT timer expires. At that point the connection has been closed gracefully.
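The four steps can be summarized as two small state machines, one per side. The sketch below encodes only the transitions described in this section (a client-initiated close with no lost segments); the event names and the `run` helper are made up for illustration, and the other paths a real TCP implementation supports, such as a simultaneous close, are left out.

```python
from typing import Dict, List, Tuple

Transitions = Dict[Tuple[str, str], str]

# Active close: the side that sends the first FIN (the client here).
CLIENT_TRANSITIONS: Transitions = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",  # the final ACK is sent here
    ("TIME_WAIT", "timeout"): "CLOSED",
}

# Passive close: the side that receives the first FIN (the server here).
SERVER_TRANSITIONS: Transitions = {
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",  # the ACK is sent here
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def run(transitions: Transitions, events: List[str],
        state: str = "ESTABLISHED") -> List[str]:
    """Apply events in order and return every state visited."""
    path = [state]
    for event in events:
        state = transitions[(state, event)]  # KeyError on an unexpected event
        path.append(state)
    return path


if __name__ == "__main__":
    print(run(CLIENT_TRANSITIONS, ["send FIN", "recv ACK", "recv FIN", "timeout"]))
    print(run(SERVER_TRANSITIONS, ["recv FIN", "send FIN", "recv ACK"]))
```

Both sides start in ESTABLISHED and end in CLOSED, but only the side that closed first passes through TIME_WAIT.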

Application

Many internet applications use TCP for their connections. As we now know, TCP provides some advantages over UDP, such as reliable, ordered, and error-checked delivery. Examples include the World Wide Web (HTTP), email (SMTP), and file transfer (FTP).

At this point, you might wonder why anyone still uses UDP if it is so unreliable. There are cases where UDP is preferred, such as video streaming and VoIP (Voice over IP). Because UDP does not pause to retransmit lost packets, data keeps arriving at the destination in a timely fashion. For real-time media, this is better than waiting for retransmitted data that would arrive too late to be useful.

Conclusion

In conclusion, TCP and UDP are two fundamental protocols that enable data communication over the Internet and other networks. They have different advantages and disadvantages, depending on the type and quality of data being transferred. TCP is more reliable, ordered, and error-checked, but also heavier and slower. UDP is faster, lighter, and continuous, but also unreliable and unordered. Depending on the application and the desired outcome, one protocol may be more suitable than the other. For example, TCP is used for web browsing, email, and file transfer, while UDP is used for video streaming and VoIP. Understanding the differences between TCP and UDP can help us appreciate how data transmission works and how we can optimize it for various purposes.