TCP/IP Layers and the OSI Model
With TCP/IP layers, each layer is responsible for a different facet of the communication. Layers are beneficial because a layered design allows developers to evolve different portions of the system separately. The most frequently mentioned concept of protocol layering is based on a standard called the Open Systems Interconnection (OSI) model as defined by the International Organization for Standardization (ISO).
Open Systems Interconnection (OSI)
The OSI Reference Model is a seven-layer model that identifies the steps and functions that must be completed when computers communicate over a network.
The seven layers combined are often referred to as a "Network Stack." The OSI model only provides guidelines on how computers should communicate over a network. It does not define specific procedures or protocols.
The seven layers of the OSI Reference Model are:
- Application Layer/Layer 7 - Specifies network-related functions for a user application or program so that it can communicate with another application over a network. Note that this is not the user interface itself; user software programs interact with the Application Layer.
- Presentation Layer/Layer 6 - Accepts the data from the Application Layer and converts or encodes it into a standard format that the Application Layer on the other computer can understand. For example, text can be encoded as ASCII or HTML, while graphics can be encoded using JPEG or TIFF. The Presentation Layer can also include standard data compression or data encryption schemes.
- Session Layer/Layer 5 - Establishes, manages, and ends the connections or sessions between the applications on the communicating computers.
- Transport Layer/Layer 4 - Takes data from the upper layer, converts it into a format that can be transmitted over the network, and manages the flow of data between the two hosts that are communicating.
- Network Layer/Layer 3 - Receives a segment from Layer 4, adds a header to it to create a "packet," and sends the packet to Layer 2. Layer 3 is responsible for delivering the packet to the destination computer. If there is more than one route to the destination computer, the Network Layer chooses the best path for the packet to take. The Network Layer treats each packet independently.
- Data Link Layer/Layer 2 - Receives packets from Layer 3. It adds another header to form a "frame." The Data Link Layer can also add a trailer to the frame, such as a cyclic redundancy check (CRC). A CRC is a checksum computed over each frame so that the receiver can detect whether the frame was corrupted in transit. Finally, the Data Link Layer translates the frame into binary digits, or bits, for Layer 1. The Data Link Layer includes those protocols and methods for establishing connectivity to a neighbor sharing the same medium.
- Physical Layer/Layer 1 - This is where the binary digits, or bits, move across a physical medium. Layer 1 defines the electrical or optical signal that equals a one, and the signal that equals a zero. Physical Layer standards include cabling specifications, electrical or optical signaling, and lower-level framing of ones and zeros.
A protocol is a formal set of rules or procedures computers must understand, accept, and use to be able to communicate over a network. Different protocols are used at different layers of the OSI model.
As the data moves down the OSI layers, the data is encapsulated in headers and possibly trailers/footers at each layer. When a lower level receives the information, it treats the entire package as data.
At the receiving end, each layer examines and removes its corresponding header and trailer/footer.
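The wrapping and unwrapping described above can be sketched in a few lines of Python. This is only a toy model: the bracketed header and trailer strings and the function names are placeholders invented for illustration, not real protocol formats.

```python
# Toy encapsulation: each layer prepends its own header, and Layer 2 also
# appends a trailer. The marker strings are placeholders, not real formats.
L4_HDR, L3_HDR, L2_HDR, L2_TRL = b"[L4]", b"[L3]", b"[L2]", b"[/L2]"


def encapsulate(app_data: bytes) -> bytes:
    segment = L4_HDR + app_data          # Transport Layer: segment
    packet = L3_HDR + segment            # Network Layer: packet
    return L2_HDR + packet + L2_TRL      # Data Link Layer: frame


def strip(data: bytes, header: bytes, trailer: bytes = b"") -> bytes:
    """Remove one layer's header (and trailer, if any) from data."""
    if not (data.startswith(header) and data.endswith(trailer)):
        raise ValueError("unexpected header or trailer")
    return data[len(header):len(data) - len(trailer)]


def decapsulate(frame: bytes) -> bytes:
    packet = strip(frame, L2_HDR, L2_TRL)   # receiver's Layer 2
    segment = strip(packet, L3_HDR)         # receiver's Layer 3
    return strip(segment, L4_HDR)           # receiver's Layer 4
```

Each layer treats everything handed down from above as opaque bytes, so `encapsulate(b"GET /")` simply nests the data inside three wrappers, and `decapsulate` peels them off in the reverse order.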
Multiplexing, Demultiplexing, and Encapsulation
With a layered architecture, it's natural to perform protocol multiplexing. This allows different protocols to coexist on the same infrastructure. It also allows multiple instantiations of the same protocol object to be used simultaneously without being confused.
At each layer a different sort of identifier is used for determining which protocol or stream of information belongs together. Most link layer technologies include a protocol identifier field in each packet to indicate which protocol is being carried in the link-layer frame.
When an object, called a protocol data unit (PDU), at one layer is carried by a lower layer, it is said to be encapsulated (as opaque data) by the next layer down. This is the essence of encapsulation: each layer treats the data from above as opaque, uninterpretable information.
Most commonly a layer prepends the PDU with its own header. The header is used for multiplexing data when sending, and for the receiver to perform demultiplexing, based on a demultiplexing (demux) identifier. In TCP/IP networks such identifiers are commonly hardware addresses, IP addresses, and port numbers.
Different network devices implement different subsets of the protocol stack. End hosts tend to implement all the layers. Routers implement layers below the transport layer, and switches implement link-layer protocols and below.
Routers are capable of interconnecting different types of link-layer networks and must implement the link-layer protocols for each of the network types they interconnect.
Layers above the network layer use end-to-end protocols. The network layer provides a hop-by-hop protocol.
The TCP/IP Suite
TCP/IP (ARPANET) Reference Model
The TCP/IP reference model is a simpler, four-layer model developed by the DoD and the IETF. There are no official session or presentation layers. In addition, there are several "adjunct" or helper protocols that do not fit well into the standard layers yet perform critical functions for the operation of the other protocols.
It defines specific protocols at each layer. The four layers include:
- Application Layer - Includes all the same functionality as the Application Layer in the OSI model, but also manages encoding, data compression, encryption, and sessions.
- Transport Layer - Includes the same functionality as the Transport Layer in the OSI model.
- Internet Layer - Includes the same functionality as the Network Layer in the OSI model.
- Network Access Layer - Focuses on how data is transmitted over any type of physical network, regardless of whether it's a LAN or a WAN. Don't confuse this with the OSI Network Layer.
The Five-Layer Model
The five-layer model is a combination of the OSI and TCP/IP reference model. It takes the TCP/IP model and breaks the Network Access Layer back out to include both the Physical Layer and Data Link Layers:
- Application Layer/Layer 5
- Transport Layer/Layer 4
- Network Layer/Layer 3
- Data Link Layer/Layer 2
- Physical Layer/Layer 1
Application Layer protocols specify details such as how data should be encoded, compressed, or encrypted, and how sessions should be managed.
Some common Application Layer protocols include:
The Transport Layer provides an end-to-end service to applications running on end hosts. The Transport Layer specifies which Application Layer protocol should be used to process the data on the receiving computer.
Each Application Layer protocol is assigned a unique numerical identifier, called a port, which is used to identify the protocol. For example, HTTP: Port 80, DNS: Port 53, SMTP: Port 25.
The destination port number is analogous to the recipient's name on an envelope. Just like there might be multiple people living at a single address, multiple programs might be running on the destination computer. Each Application Layer protocol has a unique port number so incoming data can find the correct program.
The sending computer also adds a source port number. This source port number is similar to the name in a letter's return address. The source port number, usually a random number, uniquely identifies the connection on the sending side.
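The roles of the destination and source ports can be observed with two UDP sockets on the loopback interface. The sketch below is an illustration under stated assumptions: the function name `show_ports` is made up, and the "server" binds to port 0 so that the operating system picks a free port, whereas a real service would bind to its assigned port.

```python
import socket


def show_ports():
    """Send one datagram over loopback and report the ports involved."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        # Port 0 asks the OS for any free port (illustration only).
        server.bind(("127.0.0.1", 0))
        server.settimeout(2.0)
        dest_port = server.getsockname()[1]

        # The client never chooses a port; the OS assigns its source port
        # when the first datagram is sent.
        client.sendto(b"hello", ("127.0.0.1", dest_port))
        source_port = client.getsockname()[1]

        # The receiver sees the sender's source port as the "return address".
        data, (peer_ip, peer_port) = server.recvfrom(1024)
        return data, source_port, peer_port, dest_port
```

The port reported by `recvfrom` on the receiving side matches the source port the operating system assigned to the sender, which is what lets a reply find its way back to the right program.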
There are two common transport layer protocols:
- User Datagram Protocol (UDP)
- Transmission Control Protocol (TCP)
UDP is a simple and fast "best effort" delivery protocol. It provides no delivery notification, retransmission, or recovery procedures. UDP allows applications to send datagrams that preserve message boundaries but imposes no rate control or error control. About all that UDP provides is a set of port numbers for multiplexing and demultiplexing data, plus a data integrity checksum. It is commonly used for short messages and time-sensitive data like DNS, online gaming, VoIP, and streaming video. Unlike TCP, UDP supports multicast delivery.
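How little UDP adds is visible in its header, which is just four 16-bit fields: source port, destination port, length, and checksum. The following sketch packs and unpacks that header with Python's struct module. The helper names are hypothetical, and the checksum is left at 0 rather than computed (in IPv4 a zero checksum means "not computed"); a real checksum also covers a pseudo-header taken from the IP layer.

```python
import struct

# UDP header: source port, destination port, length, checksum (8 bytes total).
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes,
                       checksum: int = 0) -> bytes:
    # The length field counts the header plus the payload.
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_datagram(datagram: bytes) -> dict:
    src, dst, length, checksum = UDP_HEADER.unpack_from(datagram)
    return {
        "src_port": src,
        "dst_port": dst,
        "length": length,
        "checksum": checksum,
        "payload": datagram[UDP_HEADER.size:length],
    }
```

For example, `parse_udp_datagram(build_udp_datagram(50000, 53, b"query"))` recovers the destination port 53 that the receiving host would use to hand the payload to its DNS software.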
TCP is a robust protocol providing delivery notification, error checking, and recovery procedures. It deals with problems such as packet loss, duplication, and reordering that are not repaired by the IP layer. The receiving computer tells the sending computer when the data was received. TCP operates in a connection-oriented fashion and does not preserve message boundaries. Examples of applications that use TCP include HTTP, SMTP and FTP.
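Because TCP delivers a byte stream rather than discrete messages, applications that need message boundaries usually add their own framing on top of it. One common approach, sketched below, is to prefix each message with its length. The helper names are hypothetical, and an in-memory `io.BytesIO` object stands in for the TCP connection so the example runs without a network.

```python
import io
import struct


def frame_message(message: bytes) -> bytes:
    """Prefix a message with its length as a 4-byte big-endian integer."""
    return struct.pack("!I", len(message)) + message


def read_exact(stream, n: int) -> bytes:
    """Read exactly n bytes; a stream may return fewer bytes per read."""
    buf = b""
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError("stream ended mid-message")
        buf += chunk
    return buf


def read_messages(stream) -> list:
    """Recover the original messages from a length-prefixed byte stream."""
    messages = []
    while True:
        first = stream.read(1)
        if not first:
            return messages                     # clean end of stream
        (length,) = struct.unpack("!I", first + read_exact(stream, 3))
        messages.append(read_exact(stream, length))
```

Feeding `read_messages` the concatenation `frame_message(b"hello") + frame_message(b"world!")` returns the two messages separately, even though the stream itself carries no boundary between them.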
TCP accepts data from the Application Layer protocol and cuts the data into smaller pieces called segments. A segment is one unit of data encapsulated at Layer 4, or Transport Layer. Each segment is divided into two parts, a header followed by data. The segment header contains the data's destination port number, which indicates which application layer protocol should be used to process the data on the receiving computer. It also specifies a source port number, which uniquely identifies the connection on the sending side, allowing the receiving computer to carry on multiple sessions with the sending computer without intermixing the data.
TCP uses sequence numbers to put segments back together in the correct order. The receiving side uses the sequence numbers to tell the sending side when segments have been received. If the sending side does not receive an acknowledgement of the segments sent within a reasonable amount of time, it resends the segments.
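The reordering and acknowledgement logic can be illustrated with a simplified receiver. This is a sketch under several assumptions, not how a real TCP stack is written: sequence numbers count bytes starting from `initial_seq`, wraparound and partially overlapping segments are ignored, and the function name is made up.

```python
def reassemble_stream(segments, initial_seq: int = 0):
    """Rebuild in-order data from (sequence number, data) pairs.

    Segments may arrive out of order or duplicated. Returns the contiguous
    bytes received so far and the next expected sequence number, which is
    what the receiver would report back as its acknowledgement.
    """
    pending = {}
    for seq, data in segments:
        pending.setdefault(seq, data)     # a duplicate segment is ignored

    stream = bytearray()
    next_seq = initial_seq
    while next_seq in pending:            # stop at the first gap
        data = pending.pop(next_seq)
        stream += data
        next_seq += len(data)
    return bytes(stream), next_seq
```

If the segments `(5, b"world")`, `(0, b"hello")`, and a duplicate `(5, b"world")` arrive in that order, the result is `(b"helloworld", 10)`. If a segment is missing, the acknowledgement stays at the start of the gap, and the sender eventually retransmits from there.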
There are two additional transport-layer protocols. The first is the Datagram Congestion Control Protocol (DCCP), specified in RFC4340. It provides a type of service midway between TCP and UDP: connection-oriented exchange of unreliable datagrams but with congestion control. Congestion control comprises a number of techniques whereby a sender is limited to a sending rate in order to avoid overwhelming the network.
The other transport protocol is called the Stream Control Transmission Protocol (SCTP), specified in RFC4960. SCTP provides reliable delivery like TCP but does not require the sequencing of data to be strictly maintained. It also allows for multiple streams to logically be carried on the same connection and provides a message abstraction, which differs from TCP. SCTP was designed for carrying signaling messages on IP networks that resemble those used in the telephone network.
The Network Layer (IP) provides an unreliable datagram service. The Network Layer receives data segments from the Transport Layer and adds a header to create a packet. A packet is one unit of data encapsulated at Layer 3. Each packet contains a header followed by the data. The packet's header specifies the data's source and destination IP addresses. Each packet header also specifies the IP protocol number, which indicates the upper layer Transport Layer protocol that is being used. Each Transport Layer protocol is assigned a unique identifier, or IP protocol number. Example IP protocol numbers include:
- UDP: IP protocol number 17
- TCP: IP protocol number 6
The PDU that IP sends to link-layer protocols is actually called an IP datagram and may be as large as 64KB for IPv4 and up to 4GB for IPv6. However, it is more common to call an IP datagram a packet. IP packets are datagrams, as each one contains the address of the layer 3 sender and recipient. A datagram is defined via RFC 1594 as "A self-contained, independent entity of data carrying sufficient information to be routed from the source to the destination computer without reliance on earlier exchanges between this source and destination computer and the transporting network."
The destination address of each datagram is used to determine where each datagram should be sent, and the process of making this determination and sending the datagram to its next hop is called forwarding.
There are three types of IP addresses, and the type affects how forwarding is performed:
- Unicast - destined for a single host
- Broadcast - destined for all hosts on a given network
- Multicast - destined for a set of hosts that belong to a multicast group
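Python's standard ipaddress module can be used to tell these address types apart. The function below is a rough sketch with a made-up name; it assumes the caller knows which network the address belongs to, and it ignores special cases such as /31 and /32 prefixes. IPv6 has no broadcast addresses, so the broadcast test is applied to IPv4 only.

```python
import ipaddress

LIMITED_BROADCAST = ipaddress.ip_address("255.255.255.255")


def classify_address(address: str, network: str) -> str:
    """Classify an address as unicast, broadcast, or multicast."""
    addr = ipaddress.ip_address(address)
    net = ipaddress.ip_network(network)
    if addr.is_multicast:
        return "multicast"
    if addr.version == 4 and (addr == net.broadcast_address
                              or addr == LIMITED_BROADCAST):
        return "broadcast"
    return "unicast"
```

With the network `192.168.1.0/24`, the address `192.168.1.10` is classified as unicast, `192.168.1.255` as broadcast, and `224.0.0.1` as multicast.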
Fitting large packets into link-layer PDUs, called frames, that may be smaller is handled by a function called fragmentation. In fragmentation, portions of a larger datagram are sent in multiple smaller datagrams called fragments and put back together (called reassembly) when reaching the destination.
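A simplified model of fragmentation and reassembly is sketched below. It follows the IPv4 convention that fragment offsets are expressed in units of 8 bytes and that every fragment except the last carries a "more fragments" flag, but the dictionary layout and function names are invented for illustration, and real fragments also carry a full IP header.

```python
def fragment(payload: bytes, max_fragment_payload: int) -> list:
    """Split a payload into fragments that each fit the given size."""
    if max_fragment_payload < 8:
        raise ValueError("fragments must carry at least 8 bytes")
    # Every fragment except the last must be a multiple of 8 bytes long,
    # because the offset field counts 8-byte units.
    chunk = max_fragment_payload - (max_fragment_payload % 8)
    fragments = []
    for start in range(0, len(payload), chunk):
        fragments.append({
            "offset": start // 8,
            "more_fragments": start + chunk < len(payload),
            "data": payload[start:start + chunk],
        })
    return fragments


def reassemble_fragments(fragments: list) -> bytes:
    """Put fragments back together, whatever order they arrived in."""
    ordered = sorted(fragments, key=lambda f: f["offset"])
    return b"".join(f["data"] for f in ordered)
```

Fragmenting a 100-byte payload with a 44-byte limit produces three fragments of 40, 40, and 20 bytes at offsets 0, 5, and 10, and only the last one has its "more fragments" flag cleared.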
The Internet Control Message Protocol (ICMP) is an adjunct to IP and can be labeled as an "unofficial" layer 3.5 protocol. It is used by the IP layer to exchange error messages and other vital information with the IP layer in another host or router. Applications such as ping and traceroute also use ICMP. ICMP messages are encapsulated within IP datagrams in the same way transport layer PDUs are.
The Internet Group Management Protocol (IGMP) is another protocol adjunct to IPv4. It is used with multicast addressing and delivery to manage which hosts are members of a multicast group.
Data Link Layer
The Data Link Layer receives packets and adds its own header to each packet to create a frame. A frame is one unit of data encapsulated at Layer 2. Each frame is divided into three parts: a header, the data (the Layer 3 packet), and a trailer.
The header contains the data's destination and source Layer 2 addresses. It also indicates which Layer 3 protocol should be used to process the data on the receiving computer.
The frame trailer is a checksum (CRC), which is used to verify data integrity.
Layer 2 then converts the data into ones and zeros.
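The integrity check performed with the frame trailer can be sketched with the CRC-32 function in Python's zlib module. The helper names are hypothetical, and the byte layout is chosen for readability; it is not claimed to match the bit and byte ordering of a real Ethernet frame check sequence on the wire.

```python
import struct
import zlib


def add_trailer(frame_body: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the frame body as a trailer."""
    return frame_body + struct.pack("!I", zlib.crc32(frame_body))


def check_trailer(frame: bytes) -> bool:
    """Recompute the CRC over the body and compare it with the trailer."""
    body, trailer = frame[:-4], frame[-4:]
    return struct.pack("!I", zlib.crc32(body)) == trailer
```

A frame produced by `add_trailer` passes `check_trailer`, while flipping even a single bit in the body makes the recomputed CRC disagree with the trailer, so the receiver can discard the frame.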
There is an "unofficial" layer 2.5. One of the most important protocols that operates here is the Address Resolution Protocol (ARP). It is a specialized protocol used with IPv4, and only with multi-access link-layer protocols, to convert between the addresses used by the IP layer and the addresses used by the link layer.
Layer 1 converts bits into electrical or optical signals and sends them across the physical medium.
Physical Layer specifications define characteristics such as cabling specifications, voltage levels, physical data rates, maximum transmission distances, and physical connectors.
Multiplexing, Demultiplexing, and Encapsulation in TCP/IP
At each layer there is an identifier that allows a receiving system to determine which protocol or data stream belongs together. Usually there is also addressing information at each layer.
The TCP/IP stack uses a combination of addressing information and protocol demultiplexing identifiers to determine if a datagram has been received correctly and, if so, what entity should process it. Several layers also check numeric values (e.g., checksums) to ensure that the contents have not been damaged in transit.
At the link layer, an arriving Ethernet frame contains a 48-bit destination address (also called a link-layer or Media Access Control (MAC) address) and a 16-bit field called the Ethernet type. A value of 0x0800 indicates that the frame contains an IPv4 datagram. Values of 0x0806 and 0x86DD indicate ARP and IPv6, respectively.
The frame is received and checked for errors, and the Ethernet Type field value is used to select which network-layer protocol should process it. The Ethernet header and trailer information is removed, and the remaining bytes, which constitute the frame's payload, are given to IP for processing.
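The link-layer demultiplexing step can be sketched as follows. The function name and the returned dictionary are assumptions made for illustration; the sketch also assumes a plain Ethernet header with no VLAN tag and that the trailer has already been checked and removed before the frame reaches this code.

```python
import struct

# Ethernet type values named in the text above.
ETHERTYPES = {0x0800: "IPv4", 0x0806: "ARP", 0x86DD: "IPv6"}


def format_mac(raw: bytes) -> str:
    return ":".join(f"{b:02x}" for b in raw)


def parse_ethernet(frame: bytes) -> dict:
    """Split a frame into header fields and payload."""
    # 6-byte destination, 6-byte source, 2-byte Ethernet type = 14 bytes.
    dst, src, ethertype = struct.unpack_from("!6s6sH", frame)
    return {
        "dst": format_mac(dst),
        "src": format_mac(src),
        "protocol": ETHERTYPES.get(ethertype, "unknown"),
        "payload": frame[14:],   # handed to the selected protocol
    }
```

A frame whose type field is 0x0806 would be reported with protocol "ARP", and its payload would be passed to the ARP code rather than to IP.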
IP checks a number of items, including the destination IP address in the datagram. If the datagram contains no errors in its header (IP does not check its payload), the 8-bit IPv4 Protocol field (called Next header in IPv6) is checked to determine which protocol to invoke next. Common values include 1 (ICMP), 2 (IGMP), 4 (IPv4), 6 (TCP), and 17 (UDP).
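The header check and the Protocol field lookup can be illustrated with a minimal IPv4 header parser. This is a sketch, not a full implementation: it builds and reads only a 20-byte header without options, the function names are made up, and the time-to-live of 64 is an arbitrary illustrative choice.

```python
import ipaddress
import struct

IPV4_HEADER = "!BBHHHBBH4s4s"   # 20-byte header without options
PROTOCOLS = {1: "ICMP", 2: "IGMP", 4: "IPv4", 6: "TCP", 17: "UDP"}


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_header(src: str, dst: str, protocol: int, payload_len: int) -> bytes:
    header = struct.pack(
        IPV4_HEADER, (4 << 4) | 5, 0, 20 + payload_len, 0, 0, 64, protocol, 0,
        ipaddress.IPv4Address(src).packed, ipaddress.IPv4Address(dst).packed)
    # The checksum occupies bytes 10-11 and is computed with that field zeroed.
    return header[:10] + struct.pack("!H", internet_checksum(header)) + header[12:]


def parse_ipv4_header(packet: bytes) -> dict:
    fields = struct.unpack_from(IPV4_HEADER, packet)
    header_len = (fields[0] & 0x0F) * 4
    # Only the header is covered by the checksum, never the payload.
    if internet_checksum(packet[:header_len]) != 0:
        raise ValueError("bad IPv4 header checksum")
    return {
        "protocol": PROTOCOLS.get(fields[6], "unknown"),
        "src": str(ipaddress.IPv4Address(fields[8])),
        "dst": str(ipaddress.IPv4Address(fields[9])),
    }
```

A header built with protocol value 17 parses back as "UDP", while changing any header byte after the checksum has been filled in causes `parse_ipv4_header` to reject the datagram.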
The resulting datagram (reassembled from fragments if necessary) is passed to the transport layer for processing. At the transport layer, most protocols use port numbers for demultiplexing to the appropriate receiving application.
Port numbers are 16-bit nonnegative integers (i.e., range 0-65535). Each IP address has 65,536 associated port numbers for each transport protocol that uses port numbers, and they are used for determining the correct receiving application.
Port numbers are divided into special ranges, including:
- Well-known: 0-1023
- Registered: 1024-49151
- Dynamic/private: 49152-65535
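These ranges translate directly into a small lookup function. The function name is an illustrative choice, not part of any standard library.

```python
def port_range(port: int) -> str:
    """Return the name of the range a port number falls into."""
    if not 0 <= port <= 65535:
        raise ValueError("port numbers are 16-bit values (0-65535)")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic/private"
```

For example, `port_range(80)` returns "well-known", `port_range(8080)` returns "registered", and `port_range(49152)` returns "dynamic/private".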
Usually, servers wishing to bind to a well-known port require special privileges such as administrator or "root" access.
The range of well-known ports is used for identifying many well-known services such as SSH (port 22), FTP (ports 20 and 21), Telnet (port 23), SMTP (port 25), DNS (port 53), HTTP (port 80), HTTPS (port 443), IMAP (port 143), IMAPS (port 993), SNMP (ports 161 and 162), LDAP (port 389), and several others.
The registered port numbers are available to clients or servers without special privileges, but IANA keeps a reserved registry for particular uses, so these port numbers should generally be avoided when developing new applications unless an IANA allocation has been procured.
In some circumstances the value of the port number matters little because the port number being used is transient. Such port numbers are called ephemeral port numbers. They are considered to be temporary because a client typically needs one only as long as the user running the client needs service.
The Internet Engineering Task Force (IETF) is one of the primary organizations for standardizing the various protocols and how they operate. This group meets three times each year in various locations around the world to develop, discuss, and agree on standards for the Internet's "core" protocols.
The IETF is a forum that elects leadership groups called the Internet Architecture Board (IAB) and the Internet Engineering Steering Group (IESG). The IAB is chartered to provide architectural guidance to activities in the IETF and to perform a number of other tasks such as appointing liaisons to other Standards-Defining Organizations (SDOs).
The IESG has decision-making authority regarding the creation and approval of new standards, along with modifications to existing standards. The "heavy lifting" or detailed work is generally performed by IETF working groups that are coordinated by working group chairs who volunteer for this task.
There are two other important groups that interact closely with the IETF. The Internet Research Task Force (IRTF) explores protocols, architectures, and procedures that are not deemed mature enough for standardization. The chair of the IRTF is a nonvoting member of the IAB. The IAB, in turn, works with the Internet Society (ISOC) to help influence and promote worldwide policies and education regarding Internet technologies and usage.
Request for Comments (RFC)
Every official standard in the Internet community is published as a Request for Comments, or RFC. RFCs can be created in a number of ways, and the publisher of RFCs (called the RFC editor) recognizes multiple document streams corresponding to the way an RFC has been developed. The current streams (as of 2010) include the IETF, IAB, IRTF, and independent submission streams. Prior to being accepted and published (permanently) as an RFC, documents exist as temporary Internet drafts while they receive comments and progress through the editing and review process.
Not all RFCs are standards. Only so-called standards-track category RFCs are considered to be official standards. Other categories include best current practice (BCP), informational, experimental, and historic. It is important to realize that just because a document is an RFC does not mean that the IETF has endorsed it as any form of standard.
RFCs are all available for free from a number of web sites, including http://www.rfc-editor.org.
A number of RFCs have special significance because they summarize, clarify, or interpret particular sets of other standards. For example, RFC 5000 defines the set of all other RFCs that are considered official standards as of mid-2008.
Other SDOs are responsible for defining protocols that merit attention. The most important of these groups include the Institute of Electrical and Electronics Engineers (IEEE), the World Wide Web Consortium (W3C), and the International Telecommunication Union (ITU).
Among other things, the IEEE is concerned with standards below layer 3 (e.g., Wi-Fi and Ethernet), and the W3C is concerned with application-layer protocols, specifically those related to Web technologies (e.g., HTML-based syntax). The ITU, and more specifically ITU-T (formerly CCITT), standardizes protocols used within the telephone and cellular networks.