The immense influence of the Internet caused its communications protocol to become the global standard. Almost all networks have migrated to TCP/IP.
TCP/IP is a robust technology that was first tested in the early 1980s on the U.S. military's ARPAnet, the world's first packet-switched network. It was created as an open protocol that would enable all types of computers to transmit data to each other via a common communications language. It was also designed to withstand disruption in the event of war. See ARPAnet
TCP/IP is a layered protocol, which means that after an application initiates the communications, the message (data) to be transmitted is passed through a number of software stages, or layers, until it actually moves out onto the wire or into the air if wireless. The data are packaged with a different header at each layer. At the receiving end, the corresponding software at each protocol layer unpackages the data, moving it "back up the stack" to the receiving application. See protocol stack
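The wrapping and unwrapping described above can be illustrated with a toy sketch. The bracketed tags stand in for real headers, and the function names send_down and pass_up are hypothetical; real headers are binary structures, not text labels.

```python
# Toy encapsulation: each layer prepends its own header on the way down
# and strips it again on the way up. The bracketed tags are placeholders.
LAYERS = ["TCP", "IP", "ETH"]  # transport, network, data link (top to bottom)


def send_down(data: bytes) -> bytes:
    """Wrap application data with one header per layer, top to bottom."""
    for layer in LAYERS:
        data = f"[{layer}]".encode() + data
    return data


def pass_up(frame: bytes) -> bytes:
    """Strip the headers in the opposite order, bottom to top."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not frame.startswith(header):
            raise ValueError(f"expected a {layer} header")
        frame = frame[len(header):]
    return frame


wire = send_down(b"hello")
print(wire)           # b'[ETH][IP][TCP]hello'
print(pass_up(wire))  # b'hello'
```

The outermost header belongs to the lowest layer, which is why the receiver removes headers in the reverse order in which the sender added them.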
TCP and IP
TCP/IP is composed of two parts: TCP (Transmission Control Protocol) and IP (Internet Protocol). TCP is a connection-oriented protocol that passes its data to IP, which is connectionless. TCP sets up a connection at both ends and guarantees reliable delivery of the full message sent. TCP tests for errors and requests retransmission if necessary, because IP does not.
An alternative protocol to TCP within the TCP/IP suite is UDP (User Datagram Protocol), which does not guarantee delivery. Like IP, UDP is also connectionless, but very useful for transmitting audio and video that is immediately heard or viewed at the other end. If packets are lost in a UDP transmission (they can be dropped at any router junction due to congestion), there is neither time nor need to retransmit them. A momentary blip in a voice or video transmission is not critical.
Application Layer 7
The top layer of the protocol stack is the Application Layer. It refers to the programs that initiate the communications in the first place. TCP/IP includes several Application Layer protocols for mail, file transfer, remote access, authentication and name resolution. These protocols are embodied in programs that operate at the top layer just like any custom-made or packaged client/server application.
FTP, SMTP, Telnet, DNS and WINS
Some of the most widely known application protocols in the TCP/IP suite are FTP (File Transfer Protocol), SMTP (Simple Mail Transfer Protocol), Telnet, DNS (Domain Name System) and WINS (Windows Internet Name Service). FTP programs are widely used to copy files across the network. All TCP/IP-based mail programs use SMTP to send email. Telnet is a terminal emulation protocol that provides access to a remote host. DNS and WINS allow hosts to be given understandable names, and the DNS and WINS servers turn those names into the IP addresses required by TCP/IP networks.
Other Client/Server Applications
The language and format in a user's proprietary client/server program are not known to TCP/IP. They are known only to the sending and receiving programs that must communicate with each other. The data from all applications, whether a proprietary program or part of the TCP/IP suite (FTP, Telnet, etc.), are "handed down" from the Application Layer in the client to the lower layers in the stack for transport. At the server side, they are "handed up" the stack to the appropriate application for processing. The operation is reversed for data sent back from the server to the client.
All nodes in a TCP/IP network (clients, servers, routers, etc.) are assigned an "IP address," which is written as four numbers separated by dots, such as 193.4.64.1. The first part of the address is the network address, and the second part is the host (station) address; they are also known as the netid and hostid. The network address allows TCP/IP packets to be routed to a different network. Under the original class system, the number of bytes used for the netid and hostid varies by address class, and the first three bits of the first byte determine this ratio (see IP address).
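As a rough illustration of the class system, the sketch below reads the leading bits of the first byte and splits a dotted address into netid and hostid. The function name classify is hypothetical, and the split applies only to the original classes A, B and C.

```python
def classify(address: str):
    """Return (class, netid, hostid) for a classful IPv4 address."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a dotted IPv4 address: {address!r}")

    first = octets[0]
    if first >> 7 == 0b0:        # leading bit 0     -> Class A, 1-byte netid
        cls, net_bytes = "A", 1
    elif first >> 6 == 0b10:     # leading bits 10   -> Class B, 2-byte netid
        cls, net_bytes = "B", 2
    elif first >> 5 == 0b110:    # leading bits 110  -> Class C, 3-byte netid
        cls, net_bytes = "C", 3
    else:
        raise ValueError("Class D and E addresses have no netid/hostid split")

    netid = ".".join(str(o) for o in octets[:net_bytes])
    hostid = ".".join(str(o) for o in octets[net_bytes:])
    return cls, netid, hostid


print(classify("193.4.64.1"))   # ('C', '193.4.64', '1')
```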
Ports and Sockets
A program identifies the program it wishes to communicate with by its socket, which is a combination of (1) the server's IP address and (2) the program's port. If it does not know the IP address, but knows the server by name, it uses a Domain Name System server (DNS server) to turn the name into the IP address. In Windows networks, a Windows Internet Name Service server (WINS server) is used to map NetBIOS names, which are assigned to many Windows machines in small networks, to IP addresses.
The port is a logical number assigned to every application. For FTP, SMTP, HTTP (Web) and other common applications, there are agreed-upon numbers known as "well-known ports." For example, HTTP applications on the Web are on port 80, so a Web server is located by its IP address and port 80. An organization's internal client/server applications are given arbitrary ports for their own purposes.
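A minimal sketch of the socket idea, using Python's standard socket module on the loopback address: the server is reachable only through the pair (IP address, port). The helper serve_once and the uppercase reply are illustrative assumptions, and port 0 simply asks the operating system for any free port rather than a well-known one.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept one client, send back its data in uppercase, then finish."""
    conn, _peer = listener.accept()
    with conn:
        conn.sendall(conn.recv(1024).upper())


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.settimeout(5)             # do not wait forever for a client
listener.bind(("127.0.0.1", 0))    # port 0: let the OS pick a free port
listener.listen(1)
server_address = listener.getsockname()   # the (IP address, port) pair

worker = threading.Thread(target=serve_once, args=(listener,))
worker.start()

# The client locates the server program purely by IP address and port.
with socket.create_connection(server_address, timeout=5) as client:
    client.sendall(b"hello")
    reply = client.recv(1024)

worker.join()
listener.close()
print(server_address, reply)
```

A Web server works the same way, except that it binds to the agreed-upon port 80 so that clients know where to find it in advance.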
OSI Layers 5, 6 & 7 Are All in the Top Layer
OSI Layers 5, 6 and 7 are all included in TCP/IP's Application Layer. For example, OSI Layer 6 (Presentation Layer) is where data conversion (ASCII to EBCDIC, floating point to binary, etc.) and encryption/decryption are performed. OSI Layer 5 is the Session Layer; TCP/IP has no separate session layer, and session functions are handled by the application or by TCP at Layer 4. Thus we jump from Layer 7 down to Layer 4.
From Application to Transport Layer
The application delivers its data to the communications system by passing a stream of data bytes to the transport layer along with the socket of the destination machine. The dotted lines in this diagram are conceptual. DNS and WINS requests go down the stack (in a UDP packet) like everything else in order to go out onto the network.
Transport Layer 4 - TCP & UDP
TCP establishes a connection at both ends, creating a "virtual connection" between the two machines before any data can be transmitted. While the connection is being set up, the two sides negotiate the maximum size of a TCP packet. Although TCP supports packets up to 64KB, in most cases, the size will be based on the underlying network, such as Ethernet, whose frames hold a maximum of 1,518 bytes, of which 1,500 bytes are payload. TCP attaches a header onto the packet that contains the source and destination ports as well as the sequence number of the packet, and it hands it over to IP along with the destination IP address. (A TCP packet is technically a Protocol Data Unit or segment, but is more often called a packet in common parlance.)
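The header fields mentioned above can be sketched with Python's struct module. This packs a minimal 20-byte TCP header without options; the helper names are hypothetical, and the checksum is left at zero for brevity, which a real TCP implementation would never do.

```python
import struct

# Minimal TCP header (no options): source port, destination port,
# sequence number, acknowledgment number, data offset + flags,
# window size, checksum, urgent pointer. "!" means network byte order.
TCP_HEADER = struct.Struct("!HHIIHHHH")

# On Ethernet, 1500 payload bytes minus a 20-byte IP header and a
# 20-byte TCP header leave 1460 bytes of data per segment.
TYPICAL_ETHERNET_MSS = 1500 - 20 - 20


def build_segment(src_port: int, dst_port: int, seq: int, payload: bytes) -> bytes:
    """Prefix the payload with a simplified TCP header (checksum left at 0)."""
    offset_and_flags = (5 << 12) | 0x018   # 5 x 32-bit words; PSH and ACK set
    header = TCP_HEADER.pack(src_port, dst_port, seq, 0,
                             offset_and_flags, 65535, 0, 0)
    return header + payload


def parse_ports_and_seq(segment: bytes):
    """Read back the source port, destination port and sequence number."""
    src_port, dst_port, seq, *_rest = TCP_HEADER.unpack_from(segment)
    return src_port, dst_port, seq


segment = build_segment(49152, 80, 1000, b"GET / HTTP/1.0\r\n\r\n")
print(len(segment), parse_ports_and_seq(segment))
```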
The Sliding Window
TCP uses a sliding window system, which is an adjustable buffer that allows a number of packets to be received before an acknowledgment is sent back. The size of the window can be changed as conditions change, and TCP handles this "flow control" in real time. It also handles the retransmission of packets that are lost or received with errors.
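The windowing idea can be sketched as a simplified simulation in which the sender transmits up to a window's worth of packets and then waits for a single acknowledgment covering all of them. The function name send_with_window is hypothetical; real TCP counts bytes rather than packets, resizes the window dynamically and retransmits on loss, none of which is modeled here.

```python
def send_with_window(packets, window_size: int):
    """Return an event log for a lossless transfer with a fixed window."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")

    log = []
    base = 0        # oldest packet not yet acknowledged
    next_seq = 0    # next packet to transmit
    while base < len(packets):
        # Transmit until the window is full or the data runs out.
        while next_seq < len(packets) and next_seq < base + window_size:
            log.append(f"send {next_seq}")
            next_seq += 1
        # One acknowledgment covers everything received so far,
        # which slides the window forward.
        log.append(f"ack {next_seq - 1}")
        base = next_seq
    return log


for event in send_with_window(["a", "b", "c", "d", "e"], window_size=2):
    print(event)
```

With five packets and a window of two, three acknowledgments are enough, compared with five if every packet had to be acknowledged individually.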
UDP (User Datagram Protocol)
UDP is an alternative to TCP that does not establish a connection, makes no guarantees and provides no flow control or error recovery (its only safeguard is a checksum that lets the receiver discard damaged packets). Either the loss does not matter, as would be the case for real-time audio or video, or the application programs using UDP must themselves include the error detection and recovery that TCP provides.
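For comparison with TCP, here is a minimal sketch of a UDP exchange on the loopback address using Python's standard socket module. No connection is set up; the sender simply addresses a datagram to an IP address and port. The payload and the two-second timeout are arbitrary assumptions.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))    # port 0: let the OS pick a free port
receiver.settimeout(2)             # UDP gives no delivery guarantee

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# No handshake: the datagram is sent straight to (IP address, port).
sender.sendto(b"audio-frame-1", receiver.getsockname())

try:
    data, source = receiver.recvfrom(2048)
except socket.timeout:
    # A lost datagram is not retransmitted; the application decides
    # whether that matters.
    data, source = None, None

sender.close()
receiver.close()
print(data, source)
```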
From Transport to Network Layer
TCP and UDP hand over their packets to IP along with the IP address of the destination node. The packet size is typically based on the frame size of the underlying data link layer such as Ethernet or Token Ring.
Network (Internet) Layer 3 - IP
The IP protocol accepts the packets from TCP or UDP and prepares them for the Data Link Layer below by turning the IP addresses into physical station addresses (MAC addresses) and fragmenting the packets (if necessary) into the required frame size. IP uses the ARP (Address Resolution Protocol) to obtain the MAC address, unless (1) the address has already been ARP'd and is in the cache or (2) there is a predefined configuration file that contains the addresses. An ARP request is broadcast onto the network, and the machine with that IP address responds with its MAC address. If the target machine is in a different network or subnetwork than the source machine, IP supplies the target address of the default gateway, which is the router that can direct the packet to the appropriate network.
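The decision described above (deliver directly or hand off to the default gateway, then look up the MAC address) can be sketched with Python's ipaddress module. The addresses, the ARP_CACHE dictionary and the function name next_hop are all made up for illustration; a real stack would broadcast an ARP request on a cache miss instead of returning None.

```python
import ipaddress

# Hypothetical ARP cache: IP address -> MAC address already learned.
ARP_CACHE = {
    "192.168.1.20": "00:11:22:33:44:55",
    "192.168.1.1": "66:77:88:99:aa:bb",   # the default gateway (router)
}


def next_hop(source_interface: str, target: str, gateway: str, arp_cache: dict):
    """Return (IP address to deliver to, its MAC address or None)."""
    interface = ipaddress.ip_interface(source_interface)
    target_ip = ipaddress.ip_address(target)

    # Same network: deliver directly. Different network: use the gateway.
    if target_ip in interface.network:
        hop = str(target_ip)
    else:
        hop = str(ipaddress.ip_address(gateway))

    # None means the address is not cached and an ARP request is needed.
    return hop, arp_cache.get(hop)


print(next_hop("192.168.1.10/24", "192.168.1.20", "192.168.1.1", ARP_CACHE))
print(next_hop("192.168.1.10/24", "8.8.8.8", "192.168.1.1", ARP_CACHE))
```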
IP outputs packets called "datagrams," and each datagram is prefixed with an IP header that contains source and destination IP addresses. If IP has to fragment the packet further, it creates multiple datagrams that share an identification number and carry fragment offsets so that they can be reassembled in order by IP on the other end. IP hands over each datagram to the data link layer below along with the MAC address (Ethernet address) of the target station or router.
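Fragmentation and reassembly can be sketched as follows. In IPv4, the fragments of one datagram share an identification value and each carries an offset counted in 8-byte units plus a "more fragments" flag; the dictionaries and function names below are illustrative stand-ins for real IP headers.

```python
def fragment(payload: bytes, max_data: int, ident: int):
    """Split a payload into fragments of at most max_data bytes each."""
    # Offsets are counted in 8-byte units, so every fragment except the
    # last must carry a multiple of 8 bytes.
    if max_data <= 0 or max_data % 8:
        raise ValueError("max_data must be a positive multiple of 8")

    fragments = []
    for start in range(0, len(payload), max_data):
        fragments.append({
            "id": ident,                              # shared by all fragments
            "offset": start // 8,                     # position in 8-byte units
            "more": start + max_data < len(payload),  # False on the last one
            "data": payload[start:start + max_data],
        })
    return fragments


def reassemble(fragments) -> bytes:
    """Rebuild the payload, even if the fragments arrived out of order."""
    ordered = sorted(fragments, key=lambda frag: frag["offset"])
    return b"".join(frag["data"] for frag in ordered)


payload = bytes(range(100))
pieces = fragment(payload, max_data=40, ident=7)
print([(p["offset"], p["more"], len(p["data"])) for p in pieces])
print(reassemble(list(reversed(pieces))) == payload)
```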
IP supports a very useful feature called "multicast," which allows one message to be delivered to multiple recipients. That means one IP data stream can travel a long, circuitous route before it is fanned out to all the target stations by the last router.
From Network to Data Link Layer
IP datagrams are handed over to Ethernet, Token Ring, ATM or some other data link protocol responsible for moving the data across the wire. The dotted lines in the diagram are conceptual. ARP requests go down the stack like everything else in order to go out onto the network.
IP Is the Routing Mechanism
In a large enterprise or on the Internet, the IP protocol is used to route the packets from network to network. Routers contain routing tables that move the datagrams to the next "hop," which is either the destination network or another router. Datagrams can traverse several routers within an enterprise and dozens of routers over the Internet.
Routers that span different types of networks may have to fragment the datagrams even further if they direct them onto routes that use a smaller frame size than the incoming frame; for example, from FDDI to Ethernet.
From Hop to Hop
Routers inspect only the network portion (netid) of the address and direct the incoming datagrams to the appropriate outgoing port for the next hop. Because routers are mostly aware of only the devices that are directly connected to them, they move datagrams one hop at a time. Eventually, if the routing tables are correctly updated, the datagrams reach their destination. Routers use routing protocols to exchange current routing information about the networks and hosts that are directly connected to them.
Routing Table Example
If a router receives packets for a remote network, it sends them out the port that will reach the next router. Router ports are entirely different from socket ports. Router ports are physical pathways to and from the router connected via cable. Socket ports are logical assignments made to running programs.
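A routing table lookup can be sketched with Python's ipaddress module. The table entries and port labels below are invented for illustration, and the rule of preferring the most specific matching network (longest prefix) is the common convention in modern routers rather than something stated in this entry.

```python
import ipaddress

# Hypothetical routing table: destination network -> outgoing router port.
ROUTES = [
    ("10.1.0.0/16", "port 1"),
    ("10.1.5.0/24", "port 2"),
    ("0.0.0.0/0", "port 0 (default route)"),
]


def route(destination: str, routes=ROUTES) -> str:
    """Pick the outgoing port for the most specific matching network."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, out_port in routes:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matches.append((network.prefixlen, out_port))
    if not matches:
        raise LookupError(f"no route to {destination}")
    # The longest prefix is the most specific match.
    return max(matches)[1]


print(route("10.1.5.9"))   # port 2
print(route("10.1.9.9"))   # port 1
print(route("8.8.8.8"))    # port 0 (default route)
```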
Data Link Layer 2 - Ethernet
IP can connect directly to Ethernet, Token Ring, FDDI, ATM, SONET, X.25, frame relay and other networks. Since Ethernet is the most widely used data link protocol, or network access method, we use it in our example. Ethernet wraps the IP datagrams into its own frame format, which includes a header with source and destination MAC addresses (station addresses) and a trailer that contains checksum data.
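The framing step can be sketched as follows: a header with destination and source MAC addresses plus a type field, then the datagram, then a 4-byte checksum trailer. The helper names are hypothetical, the MAC addresses are made up, and zlib.crc32 is used here as a stand-in for the frame check sequence; padding of short frames and other details are omitted.

```python
import struct
import zlib


def mac_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    raw = bytes(int(part, 16) for part in mac.split(":"))
    if len(raw) != 6:
        raise ValueError(f"not a MAC address: {mac!r}")
    return raw


def build_frame(dst_mac: str, src_mac: str, datagram: bytes) -> bytes:
    """Wrap an IP datagram in a simplified Ethernet frame."""
    ethertype = struct.pack("!H", 0x0800)           # 0x0800 marks an IPv4 payload
    body = mac_bytes(dst_mac) + mac_bytes(src_mac) + ethertype + datagram
    trailer = struct.pack("<I", zlib.crc32(body))   # checksum trailer
    return body + trailer


def frame_is_intact(frame: bytes) -> bool:
    """Recompute the checksum and compare it with the trailer."""
    body, trailer = frame[:-4], frame[-4:]
    return struct.unpack("<I", trailer)[0] == zlib.crc32(body)


frame = build_frame("66:77:88:99:aa:bb", "00:11:22:33:44:55", b"ip-datagram")
damaged = frame[:20] + bytes([frame[20] ^ 0x01]) + frame[21:]   # flip one bit
print(len(frame), frame_is_intact(frame), frame_is_intact(damaged))
```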
Ethernet Packets Can Collide
Ethernet uses the CSMA/CD (carrier sense multiple access/collision detection) access method to broadcast the frames onto the wire. If two stations transmit at the same time, their frames collide, and they each back off and wait a random amount of time (measured in milliseconds or less) before trying again. The data link layer is responsible for node to node transmission. If an Ethernet frame is received with errors, the checksum in its trailer exposes the damage and the frame is discarded; Ethernet retransmits only after a collision, and recovering a discarded frame is left to a higher layer such as TCP.
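The random wait after a collision can be sketched with the truncated binary exponential backoff rule used by classic shared Ethernet: after the nth collision a station waits a random number of slot times between 0 and 2^k - 1, where k is the smaller of n and 10. The function name is hypothetical, and the 51.2-microsecond slot time applies to 10 Mbps Ethernet.

```python
import random

SLOT_TIME_US = 51.2    # slot time for 10 Mbps Ethernet, in microseconds
MAX_ATTEMPTS = 16      # after 16 failed attempts the frame is given up


def backoff_delay_us(collisions: int, rng=random) -> float:
    """Return the random wait, in microseconds, after the nth collision."""
    if collisions < 1:
        raise ValueError("collisions must be at least 1")
    if collisions > MAX_ATTEMPTS:
        raise RuntimeError("too many collisions; the frame is dropped")
    k = min(collisions, 10)            # the range stops growing after 10
    slots = rng.randrange(2 ** k)      # random integer in 0 .. 2**k - 1
    return slots * SLOT_TIME_US


for attempt in (1, 2, 3, 10):
    print(attempt, backoff_delay_us(attempt))
```

Because the range doubles with each successive collision, two stations that keep colliding become increasingly unlikely to pick the same wait.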
LAN to WAN
If IP datagrams start out in a LAN, go to a wide area network (WAN) and then to a LAN at the other side, the Ethernet LAN frames are converted into WAN frames by a router and back again to Ethernet frames by the router at the other side.
Onto the Wire
The data link layer is responsible for node to node transmission within a subnetwork. When IP datagrams traverse several routers, each router strips off the incoming frame and wraps the same datagram in a new frame for the next hop.
Packets, Datagrams or Frames?
The message starts out in one host, goes down the protocol stack, over the wire, and back up the stack on the receiving host. The counterpart protocols unpackage the frames, datagrams and packets and deliver the data to the application for processing.
Although the technically correct terms are TCP segments, IP datagrams and Ethernet frames, they all ride over packet-switched networks and are frequently called packets at all stages.
IP Is "The" Standard
IP has become the worldwide standard protocol for all forms of electronic communications, including data, voice and video. The amount of data communications is increasing far more than voice traffic, and it is expected that all data, voice and video will ride over IP-based networks in the future. See IP on Everything
Transporting IP packets over a LAN is typically done via Ethernet. Over the WAN, IP generally rides over SONET or on ATM on top of SONET. In the future, IP is expected to ride directly over DWDM fiber (rightmost diagram).
Summary of the TCP/IP Stack
Perhaps the simplest reference ever written on the subject is "An Introduction to TCP/IP" by John Davidson (Springer-Verlag, 1988). Although written decades ago and only 100 pages, it is the easiest read on the subject you will ever find.
The Bibles for TCP/IP have been "Internetworking with TCP/IP," Volumes I, II and III, by Douglas E. Comer. Updated to its 6th edition, Volume I covers the principles, protocols and architecture of the subject. (Prentice Hall, 2006).