By Kostya Ryvkin, Dave Houde and Tim Hoffman
From MCSE – Internetworking with Microsoft TCP/IP on Microsoft Windows NT 4.0
Published by Prentice Hall
In this chapter, we will take a look at the TCP/IP protocol stack to help us better understand how TCP/IP functions in the network. We'll look at the Department of Defense (DoD) protocol layers and the Open Systems Interconnect (OSI) model to see how protocols and utilities function at various layers. We'll discuss the protocols that make up the TCP/IP suite and look at some TCP/IP configuration and troubleshooting information.
At the end of this chapter, you will be able to:
TCP/IP made early WANs possible. TCP/IP's robust set of protocols provided complete networking support to connect all hosts and sites and rapidly became the standard for such activities. Over time, the TCP/IP suite of protocols and utilities has become much more than just a standard; it has helped us usher in a new era of computing. We can now configure machines around the world and monitor events on distant computers. Although local area networking standards remained in the realm of proprietary vendor standards until the mid-1980s, they have changed so much since then that today there is nearly total interoperability in the TCP/IP world. With TCP/IP, we can connect to the world. Before we do that, however, let's look closer at what the protocol suite is capable of doing.
A strict definition of the parts of the TCP/IP protocol can be found in the Requests for Comments (RFCs) listed in Table 2.1. You may find the RFCs at http://www.cis.ohio-state.edu/htbin/rfc , or you can look at all the RFC search options by going to http://www.cis.ohio-state.edu/hypertext/information/rfc.html .
Table 2.1 RFCs that Define TCP/IP
ISO/OSI and DoD Overview
TCP/IP is clearly more than just Transmission Control Protocol over Internet Protocol. When we speak of TCP/IP, we're really talking about several protocols and utilities that work together to permit interoperability of hosts on a network (local, metropolitan, or wide area). These protocols and utilities provide the means by which machines can connect to share information.
The Open Systems Interconnect (OSI) Model
The Open Systems Interconnect (OSI) model was developed by the International Organization for Standardization (ISO) and helps to identify how the functions of the protocols relate to each other. By showing how the functions relate, we'll define how the parts of the protocol stack connect to permit machines to communicate effectively. As we look at the OSI model, remember it is just a concept; we don't actually see it when two hosts work together. The model, however, is the standard, and to communicate we must adhere to the standard. If both computers trying to establish communications are configured according to the standard, communications will take place. If they're not, you may get error messages, services may fail to initialize, or you may get no communications at all.
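The OSI model contains seven layers, listed here from the top of the stack down to the wire: Application, Presentation, Session, Transport, Network, Data Link, and Physical.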
Figure 2.1 compares the layers of the ISO OSI model to the layers of the DoD model. These models give us a sense of how communication is expected to take place. Moving from the highest layers down to the wire, we see the application needs to be able to operate without being concerned about identifying all of the lower-level hardware and maintaining drivers for each device.
Figure 2.1 Comparison of OSI and DoD models
In the OSI model, the Application, Presentation and Session Layers provide services useful to applications in general. These services are separate from similar but distinct functions that take place at the lower levels. Error detection and correction, for instance, may take place at two different points in the protocol stack.
The Application Layer supports end-user applications by providing the application programming interfaces (sets of procedure calls) that drive actual user applications. This layer is responsible for working with the originated data stream and communicating with the lower layers. Some examples of application programming interfaces (APIs) are: Messaging API (MAPI), MS FAX API (FAPI), Telephony API (TAPI), and Internet Server API (ISAPI).
The Presentation Layer provides platform-to-platform translation of syntax for the purpose of data exchange. Modification of data according to a common set of rules is done at this layer. Compression and encryption, for instance, are accomplished here.
The Session Layer provides for the establishment and maintenance of sessions between applications, and for recovery from failures that occur between them. When two computers establish a session to share data, control of the flow and direction, and the recovery of missing or corrupt data, are the responsibility of this layer. Depending on the type of application, you might see a simplex, half-duplex, or full-duplex data flow. Simplex is a one-way data flow. Half-duplex permits data to flow in both directions, but in only one direction at a time. Full-duplex provides simultaneous two-way data flow. By providing appropriate checkpoint methods, the wire between two computers can stay full of data and only the data that does not make it properly to the distant end needs to be retransmitted.
The Transport Layer guarantees that the data is delivered in the right order and in a reliable manner. Here again, we consider error checking and correction as a means to put the information in the right order and to make certain the whole message is received.
The Network Layer provides routing between internetworks and shields the layers above from the details of the lower layers (the physical topology, for example). It is at this layer that we first find addressing (for example, the IP address).
Data Link Layer
The Data Link Layer provides reliable transfer of data across the physical link. The Data Link Layer functions to provide formatting, error detection, link management, and data flow control. Again we find addressing–this time at the hardware layer (for example, the hardware address of the Network Interface Card).
The Physical Layer accepts data from the Data Link Layer and puts it in the right format for the physical medium. This layer specifies the requirements for the wire such as the voltage levels (electrical properties), connector types (mechanical specifications), and handshake (procedural specifications of how to connect).
DoD Four-Layer Model
More than one theory can be used to identify how the components in the TCP/IP protocol stack connect dissimilar systems. The DoD four-layer model was the original example. Let's take a look at how each Microsoft TCP/IP component or utility fits this model.
The DoD four-layer model contains the Network Interface, Internet, Transport, and Application Layers.
Starting with the place where the signals go (the wire) and working our way up the protocol stack, we find the following layers:
Network Interface Layer
The lowest layer in the model is responsible for putting frames on the wire and pulling frames off the wire. To get information to the next higher level, which is where the routing and switching take place, there must be information that permits computers to find each other on the subnetwork, or subnet. This is the hardware address of the network card. The Network Interface Card (NIC) contains a hardware address that is mapped to and used by higher-level protocols to pass information up and down the stack and back and forth across the wire.
TCP/IP can be used in a wide variety of LAN, MAN, WAN, and dial-up environments. Supported LAN types include: Ethernet, Token Ring, Fiber Distributed Data Interface (FDDI), and ARCnet. Supported WAN types include: serial lines and packet-switched networks such as X.25, Frame Relay, and ATM. Dial-up is supported by Remote Access Service (RAS) on Windows NT computers, and will be discussed later in this book. Metropolitan Area Network (MAN) types of topologies supported using TCP/IP are the same as the previously mentioned WAN types.
Each of the LAN, MAN, WAN, and dial-up types has different requirements for cables, signaling, data encoding, and so on. The Network Interface Layer specifies the requirements equivalent to the Data Link and Physical Layers of the OSI model, as we noted in Figure 2.1.
The Internet Layer protocols provide three specific services:
Five protocols are implemented at this layer:
These protocols do their job by encapsulating packets into Internet datagrams and running all the necessary routing algorithms. (A datagram is a connectionless, one-way communication: it is sent with no confirmation of arrival, much as when you send a letter to someone.) The user data originates in one of the higher-level protocols and is passed down to the Internet Layer. IP then examines the destination IP address of the datagram to determine whether the destination is local or remote. If both machines are on the same network (local), the datagram is sent directly to the destination host. If the destination host is on a different network (remote), the datagram is forwarded to the default gateway (the locally attached gateway, or router, that leads to remote networks).
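The local-or-remote decision described above can be sketched in a few lines. This is only an illustration, not how any particular stack implements it: the function name `next_hop`, the addresses, and the use of the sender's subnet mask to define "the same network" are assumptions made for the example.

```python
import ipaddress

def next_hop(source_ip: str, netmask: str, destination_ip: str, default_gateway: str) -> str:
    """Return the address the datagram should be handed to next."""
    # Build the sender's own network from its address and subnet mask.
    local_net = ipaddress.ip_network(f"{source_ip}/{netmask}", strict=False)
    if ipaddress.ip_address(destination_ip) in local_net:
        return destination_ip   # local: deliver directly to the host
    return default_gateway      # remote: hand off to the router

print(next_hop("192.168.1.10", "255.255.255.0", "192.168.1.77", "192.168.1.1"))  # 192.168.1.77
print(next_hop("192.168.1.10", "255.255.255.0", "10.4.0.9", "192.168.1.1"))      # 192.168.1.1
```

In the first call both hosts share the 192.168.1.0 network, so the datagram goes straight to the destination; in the second it goes to the default gateway.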
When a network joins the Internet, the administrator must apply for and receive a valid IP network and host number from the Internet Network Information Center (InterNIC). The hosts carry out the functions mentioned here through the use of these numbers, which, when combined, are known as an IP address.
Note: The way information is handled under the OSI or DoD models is often referred to as encapsulation. Encapsulation is the process of adding a header to the data accepted from a higher-level protocol. When the application originates the data or sends a request to get data, the data or request moves down through the protocol stack, and at each level, a new header is added. This increases the total size of the information until it reaches the wire. The individual zeroes and ones are sent via the wire to the remote computer, where each of the headers is opened and peeled off, much like peeling the skin and layers off an onion. The header information is stripped off at each layer, and the information is sent upwards to, finally, reach the intended application.
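A toy sketch of encapsulation may make the onion image concrete. The bracketed labels below are hypothetical placeholders, not real protocol headers; the point is only that each layer prepends something on the way down and removes its own piece on the way up.

```python
# Hypothetical header labels, one per layer, top of the stack first.
LAYERS = ["APP", "TCP", "IP", "ETH"]

def encapsulate(payload: bytes) -> bytes:
    """Walk down the stack: each layer prepends its own header."""
    for name in LAYERS:
        payload = f"[{name}]".encode() + payload
    return payload

def decapsulate(frame: bytes) -> bytes:
    """Walk up the stack: each layer strips the header addressed to it."""
    for name in reversed(LAYERS):
        header = f"[{name}]".encode()
        if not frame.startswith(header):
            raise ValueError(f"expected {name} header")
        frame = frame[len(header):]
    return frame

frame = encapsulate(b"hello")
print(frame)               # b'[ETH][IP][TCP][APP]hello'
print(decapsulate(frame))  # b'hello'
```

Note that the frame on the wire is longer than the original data, which is the size increase the note describes.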
Transport protocols provide communications sessions between connected computers. The desired method of data delivery determines the transport protocol. The two transport protocols provided within TCP/IP are the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP). TCP provides the virtual circuit service to make end-to-end connection for user applications. Data transfer is made reliable through the use of connections and acknowledgements. UDP provides delivery, but does not use connections or acknowledgements, so it is less reliable but faster.
The terms "Host-to-Host" and "Transmission Layer" are used interchangeably with the term "Transport Layer." The Transport Layer is responsible for error detection and correction in the DoD model, and is analogous to the Transport Layer in the OSI model.
Microsoft implements two program interfaces at the Application Layer to allow applications to utilize the services of the TCP/IP protocol stack: Windows Sockets (WINSOCK) and NetBIOS.
The WINSOCK interface provides a standard API under Microsoft Windows to many transport protocols such as IPX and TCP/IP. This open standard library of function calls, data structures, and programming procedures permits Windows applications to take advantage of TCP/IP. This enables Windows NT to exchange data with foreign or non-NetBIOS systems.
NetBIOS provides a standard interface to protocols that support NetBIOS naming and message services, such as TCP/IP and NetBEUI. NetBIOS is used in Microsoft products to permit application communication with the lower protocol layers as well. Three ports provide NetBIOS support: UDP port 137 for the NetBIOS Name Service, UDP port 138 for the Datagram Service, and TCP port 139 for the Session Service.
Several standard TCP/IP utilities and services exist at the Application Layer. For example:
The Microsoft TCP/IP Protocol Suite
Now that we've seen the theoretical models that comprise the network standards that define the use of the elements of the TCP/IP protocol suite, let's take a look at the functionality of the Microsoft TCP/IP protocol suite.
Address Resolution Protocol
The purpose of Address Resolution Protocol (ARP) is to permit the successful mapping of an IP address to a hardware address. The process starts where one host sends a local broadcast to obtain a hardware address and puts the resulting information into a cache for future reference.
Let's suppose you try to ping a particular IP address. The first action in this procedure is a query to the existing ARP cache. If no match for the IP address is found in the cache, an ARP broadcast is sent. The target machine answers the broadcast with its hardware address and the calling machine stores the information in its cache. Once the calling machine has the target's hardware address, it can use directed communications from that point on. (When we say "directed communications," we're talking about a communication to a particular machine versus a "broadcast" to all machines on the local network.)
Resolving a Remote IP Address
If a host tries to resolve a remote host's address to its hardware address, there is a need to traverse a router. IP routers do not permit ARP broadcasts to go from one subnet to another to minimize network and Internet traffic in general. How then does a remote IP address get resolution? This is what happens when Host 1 initiates communication with a computer for which it does not have a hardware address (Host X):
Overview of the ARP Cache
The ARP cache on each host consists of static and dynamic entries that map IP addresses to hardware addresses. Each network interface configured for TCP/IP maintains its own ARP cache. The static entries remain in the ARP cache until the computer is restarted, or until they're manually deleted. An entry will be dynamically changed if the host receives an ARP broadcast for an IP address that is already in the cache but with a different hardware address than the existing entry. An address can be added manually to the ARP cache by typing arp -s IP_address hardware_address (e.g., arp -s 188.8.131.52 00-10-4B-86-76-3D). To delete an entry, type arp -d IP_address (e.g., arp -d 184.108.40.206). To view the information in the ARP cache, type arp -a.
Dynamic entries are added and deleted from the ARP cache based on the exchange of information between local and destination hosts. If a dynamic entry is not used within a two-minute period, it is deleted from the cache. If it is used within two minutes, the Time to Live (TTL) is extended to ten minutes.
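The aging rule for dynamic entries can be sketched as follows. The class and its method names are hypothetical, time is passed in explicitly so the behavior is easy to follow, and the sketch assumes the ten-minute lifetime is counted from when the entry was added; static entries are not modeled.

```python
UNUSED_LIFETIME = 120   # seconds: two minutes if the entry is never used
USED_LIFETIME = 600     # seconds: ten minutes once the entry has been used

class ArpCache:
    """Dynamic ARP entries only; 'now' is a timestamp in seconds."""

    def __init__(self):
        self._entries = {}   # ip -> (mac, added_at, expires_at)

    def add(self, ip, mac, now):
        self._entries[ip] = (mac, now, now + UNUSED_LIFETIME)

    def lookup(self, ip, now):
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, added_at, expires_at = entry
        if now >= expires_at:
            del self._entries[ip]   # aged out of the cache
            return None
        # Used in time: the lifetime is extended to ten minutes
        # (assumed here to be measured from when the entry was added).
        self._entries[ip] = (mac, added_at, added_at + USED_LIFETIME)
        return mac

cache = ArpCache()
cache.add("192.168.1.20", "00-10-4B-86-76-3D", now=0)
print(cache.lookup("192.168.1.20", now=60))    # 00-10-4B-86-76-3D
print(cache.lookup("192.168.1.20", now=601))   # None
```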
The ARP Utility
The ARP utility can be used to display and modify the IP-to-physical address translation tables used by ARP. It uses the following syntax:
ARP -a [IP_Addr] [-N NIC_Addr]
ARP -s IP_Addr MAC_Addr [NIC_Addr]
ARP -d IP_Addr [NIC_Addr]
ARP Packet Structure
The ARP packet structure is designed for IP address resolution. This structure can, however, be adapted to other types of address resolution. The actual packet structure is outlined in Table 2.2.
Table 2.2 ARP packet structure
Internet Control Message Protocol
The Internet Control Message Protocol (ICMP) is responsible for reporting the errors that occur when data packets are transmitted across a network. PING, as well as other utilities, uses ICMP to operate. If a ping request cannot be delivered, ICMP can notify the originator that the transmission was unsuccessful. ICMP messages are datagrams and are considered to be unreliable.
ICMP Packet Structure
While their length may vary, all ICMP packets use the same structure as defined in Table 2.3.
Table 2.3 ICMP packet structure
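As a concrete illustration of that common structure, the sketch below builds the kind of echo request that PING sends. It assumes the standard ICMP layout (an 8-bit type, an 8-bit code, and a 16-bit checksum, followed here by the identifier and sequence number that echo messages carry) and the usual Internet checksum; the function names are made up for the example.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over the data."""
    if len(data) % 2:
        data += b"\x00"                       # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                        # fold any carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    # Type 8, code 0 is an echo request; the checksum is computed
    # with the checksum field set to zero, then filled in.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

packet = build_echo_request(identifier=1, sequence=1, payload=b"ping")
print(len(packet))                 # 12
print(internet_checksum(packet))   # 0
```

A receiver can verify the message the same way: running the checksum over a correctly formed packet, checksum field included, yields zero.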
ICMP Source Quench Messages
Sometimes, during normal communications, hosts will send information faster than the routers, gateways, and links between them can handle it. Some routers can send an ICMP source quench message to request that a host transmit at a slower rate. A Windows NT TCP/IP host will accept source quench messages and comply by reducing its rate. A Windows NT computer that is being used as a router, however, will drop datagrams that cannot be buffered because it is not able to send source quench messages to the sending host.
Internet Group Management Protocol
The Internet Group Management Protocol (IGMP) is used to inform routers that a host or group of hosts, designated as members of a specific multicast group, is available on a given network. A multicast group is a set of hosts that are identified by a single destination address. Using IGMP, each router that supports multicasting is made aware of which host groups are on which networks. IGMP packets are sent as IP datagrams, which makes IGMP packets unreliable.
IGMP Packet Structure
The IGMP packet structure is defined in Table 2.4.
Table 2.4 IGMP packet structure
The Internet Protocol (IP) provides several necessary functions, such as the addressing and routing of packets between hosts. If packets need to be fragmented and reassembled, IP provides for this as well.
IP is considered connectionless, which means that it does not expect or need to be connected to the other side to do its job. There is no session established when IP is used by itself. Because there is no positive response from the target computer when it receives a communication, there is no guarantee that the communication will take place, and a "best effort" is used to get information to the other side. Because of this, data can sometimes be lost or received out of sequence, and neither the sending nor the receiving host knows about it. In this case, acknowledgement for the receipt of packets, and the sequencing of the received packets to place them in the correct order, are the responsibility of higher-layer transport protocols, such as TCP.
IP Packet Structure
An IP packet consists of a variable-length header that prefixes the IP data. The information contained in the header is outlined in Table 2.5.
Table 2.5 IP fields
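To show what reading such a header looks like in practice, the sketch below unpacks the fixed 20-byte portion of an IPv4 header. It assumes the standard IPv4 layout; the function name and the sample addresses are made up for the example, and any options that make the header longer are ignored.

```python
import socket
import struct

def parse_ipv4_header(packet: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header."""
    (version_ihl, _tos, total_length, _ident, _flags_frag,
     ttl, protocol, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    return {
        "version": version_ihl >> 4,
        "header_length": (version_ihl & 0x0F) * 4,   # the field counts 32-bit words
        "total_length": total_length,
        "ttl": ttl,
        "protocol": protocol,                        # 1 = ICMP, 6 = TCP, 17 = UDP
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }

# A hand-built sample header: version 4, five 32-bit words, TTL 128, TCP.
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0, 128, 6, 0,
                     socket.inet_aton("192.168.1.10"), socket.inet_aton("10.0.0.1"))
info = parse_ipv4_header(sample)
print(info["header_length"], info["ttl"], info["destination"])  # 20 128 10.0.0.1
```

The header-length field is what makes the header variable-length: a value greater than five means options follow the fixed fields.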
IP on the Router
When it traverses a router, the following happens to an IP packet:
Transmission Control Protocol
Transmission Control Protocol (TCP) is connection-oriented, meaning the local computer is expected to be "connected to" the remote host before data exchange takes place. TCP provides reliable delivery of information through the use of sequence numbers, acknowledgements, and a three-way handshake.
TCP uses byte stream communications, which is where data elements are handled as a sequence of bytes without any boundaries. Each segment of data is assigned its own sequence number so the data can be reassembled at the receiving end. To ensure that the data is received as transmitted, the receiving host must send an acknowledgement, or ACK, within a specific period of time. If the ACK is not received, the segment is retransmitted. If a segment is received in a corrupt or unusable condition, the host on the receiving end simply sends it to the bit bucket without sending an ACK. In the absence of an ACK, the sending station knows to resend the information.
TCP functions through numbered ports to provide specific delivery locations. Port numbers below 1024 are reserved as well-known ports; the earliest assignments used only numbers below 256. Table 2.6 shows some of TCP's commonly used ports.
Table 2.6 TCP ports
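A three-way handshake is the way two hosts establish a TCP connection before they exchange data. To do so, they must make sure they're properly synchronized to send and receive portions of the data, that they each know how much data the other can receive at one time, and that they've established a virtual connection. The handshake takes place in three steps: the initiating host sends a segment with the synchronization (SYN) flag set and its initial sequence number; the receiving host replies with a segment that acknowledges the request and carries its own SYN flag and initial sequence number; and the initiating host acknowledges that segment, which completes the connection.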
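In the usual description of TCP, the three segments are a SYN, a SYN+ACK, and an ACK. The sketch below models only the bookkeeping of that exchange, with no networking involved; the `Segment` class and the sample sequence numbers and window sizes are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    flags: str
    seq: int          # sender's sequence number
    ack: int = 0      # next sequence number expected from the other side
    window: int = 0   # how much data the sender of this segment can receive

def handshake(client_isn, server_isn, client_window, server_window):
    """Return the three segments exchanged to open a connection."""
    # 1. Client announces its initial sequence number.
    syn = Segment("SYN", seq=client_isn, window=client_window)
    # 2. Server acknowledges it and announces its own.
    syn_ack = Segment("SYN+ACK", seq=server_isn, ack=syn.seq + 1, window=server_window)
    # 3. Client acknowledges the server's sequence number.
    ack = Segment("ACK", seq=syn.seq + 1, ack=syn_ack.seq + 1, window=client_window)
    return [syn, syn_ack, ack]

for segment in handshake(client_isn=1000, server_isn=5000,
                         client_window=8192, server_window=4096):
    print(segment)
```

After the third segment, each side knows the other's starting sequence number and receive window, which covers the synchronization and buffer-size points made above.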
TCP Packet Structure
The TCP packet consists of a TCP header with the TCP data attached. The header consists of the ten fields outlined in Table 2.7.
Table 2.7 TCP header fields
To ensure the most efficient communications, TCP employs a technique called sliding windows to keep data streams full of send and receive data. Each machine involved in data communication maintains two buffers (sliding windows), one for sending data and one for receiving data. Each of these windows is sized in relation to the amount of data the computer can buffer. The entire process is relatively simple:
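As a rough illustration of the windowing idea, the sketch below simulates a loss-free transfer in which the sender may have only a fixed number of unacknowledged segments outstanding. It is a simplification under stated assumptions: real TCP windows are measured in bytes rather than segments, acknowledgements can cover several segments at once, and the function name is made up for the example.

```python
def send_with_window(segment_count, window_size):
    """Return the order of send and ack events for a loss-free transfer."""
    events = []
    base = 0            # oldest segment not yet acknowledged
    next_to_send = 0    # next segment waiting to go out
    while base < segment_count:
        # Fill the window: keep sending until window_size segments are in flight.
        while next_to_send < segment_count and next_to_send - base < window_size:
            events.append(("send", next_to_send))
            next_to_send += 1
        # An acknowledgement for the oldest segment slides the window forward.
        events.append(("ack", base))
        base += 1
    return events

for event in send_with_window(segment_count=4, window_size=2):
    print(event)
# ('send', 0), ('send', 1), ('ack', 0), ('send', 2),
# ('ack', 1), ('send', 3), ('ack', 2), ('ack', 3)
```

The sender never waits for an acknowledgement before sending the second segment, which is how the window keeps the connection busy.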
User Datagram Protocol
The User Datagram Protocol (UDP) is a "connectionless" protocol that does not establish a session or provide for guaranteed delivery. By connectionless, we mean that UDP packets are sent out over the network very much like a telegram: the receiving computer does not send an acknowledgement. The message is sent and we must assume it has been received. This is distinct from a telephone call, where we are able to establish two-way communication to ensure the person on the other end of the line has received and understood our message. Much like IP, UDP guarantees neither delivery nor the proper sequencing of delivered packets. If these are important to the application using UDP, the application or a higher-level protocol must supply an additional level of checking. While UDP does utilize a checksum for error checking, this is an optional field and not enforced by the protocol.
UDP is most often used in one-to-many communications of small amounts of data. Later in this book, we'll discuss broadcast "messages," especially in relation to the resolution of NetBIOS names to IP addresses. Normally when we talk of "broadcasts" in the context of TCP/IP, we're referring to UDP traffic.
UDP functions through distinct UDP ports. Although TCP and UDP may use the same port number in some instances, these numbers do not represent the same port. A UDP port is a 16-bit address that exists only to transmit datagram information to the correct location above the Transport Layer of the protocol stack–simply a location for sending messages. UDP ports can receive more than one message at a time and are identified by "well-known" port numbers. Before it can use UDP, an application must supply an IP address and port number for the target of its message. Table 2.8 defines the "well-known" UDP port numbers.
Table 2.8 UDP ports
UDP Packet Structure
The UDP packet consists of an eight-byte UDP header with the UDP data appended. The header consists of the four fields outlined in Table 2.9.
Table 2.9 UDP header fields
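The eight-byte header is small enough to build by hand. The sketch below assumes the standard field order (source port, destination port, length, checksum, each 16 bits); the function name and port values are made up for the example, and the checksum is left at zero to reflect that it is optional.

```python
import struct

def build_udp_header(source_port, destination_port, payload, checksum=0):
    """Pack the four 16-bit UDP header fields in network byte order."""
    length = 8 + len(payload)   # header plus data, in bytes
    return struct.pack("!HHHH", source_port, destination_port, length, checksum)

header = build_udp_header(1025, 53, b"query")
print(len(header))                     # 8
print(struct.unpack("!HHHH", header))  # (1025, 53, 13, 0)
```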
Ports and Sockets
Our protocol discussion has, thus far, taken us from the wire, through the Network Interface Card, all the way up to the Transport Layer of the DoD model. The only remaining step is to see how the data flows to and from the applications that use and create it. The vehicles that accomplish this last step are ports and sockets. Figure 2.2 provides an overall view of where they fit into the data transmission picture.
Figure 2.2 Ports and Sockets
A port provides a location for sending messages. It functions as a multiplexed message queue, which means that it can receive more than one message at a time. Ports are identified by a numerical value between 0 and 65,535. The port numbers for client-side TCP/IP applications are assigned dynamically by the operating system when a request for service is made. The port numbers for well-known server-side applications are assigned by a group called the Internet Assigned Numbers Authority (IANA) and do not change. These well-known port numbers are documented in RFCs 1060 and 1700. You can find the port numbers in the following ASCII text file:
A socket is a bi-directional "pipe" for exchanging data between networked computers. The Windows Sockets API is a networking API used by Windows programmers in building Windows applications that will communicate over a network. The API consists of a set of calls which perform defined functions and pass information back and forth to the lower protocol layers. An application creates a socket when it specifies the IP address of an intended host, the type of service requested (TCP for connection-based requests, UDP for connectionless-based requests), and the port that the particular application will use. Sockets are identified within a host through the use of unique protocol port numbers.
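The three ingredients named above (an IP address, a type of service, and a port) are visible in almost any sockets program. The sketch below uses Python's `socket` module rather than the Windows Sockets API itself, and runs a one-shot echo over the loopback address; the helper names are made up for the example.

```python
import socket
import threading

def echo_once(server_sock):
    """Accept one connection and send back whatever arrives."""
    conn, _ = server_sock.accept()
    with conn:
        conn.sendall(conn.recv(1024))

def round_trip(message: bytes) -> bytes:
    # SOCK_STREAM selects TCP; SOCK_DGRAM would select UDP.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))    # port 0: let the OS assign a free port
        server.listen(1)
        address = server.getsockname()   # the (IP address, port) pair
        worker = threading.Thread(target=echo_once, args=(server,))
        worker.start()
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect(address)      # the client's own port is assigned dynamically
            client.sendall(message)
            reply = client.recv(1024)
        worker.join()
    return reply

print(round_trip(b"hello"))  # b'hello'
```

Binding to port 0 mirrors the dynamic assignment described for client-side ports; a real server would bind to its well-known port instead.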
In completing this chapter, you should have developed a good understanding of the parts and functions of TCP/IP. This will help you better understand the material we will present in subsequent chapters.
To better understand how data flows through the TCP/IP components, we reviewed the seven-layer OSI model and the four-layer DoD model and saw that data moves between layers during its journey from the application to the wire. We learned that, as data moves throughout the layers, header information is added (for transmission) or removed (for reception).
We then took a close look at the basic protocols that make up the TCP/IP suite. We found that ARP finds a computer's hardware address when the IP address is known. We learned that ICMP reports errors in the transmission of network data and that IGMP supports multicasting. The IP was found to be a connectionless protocol that operated at the layers between those concerned with physical transmission and those concerned with transport functions. We saw that IP can pass data to two transport protocols: TCP for connection-oriented communication and UDP for connectionless communication. Finally, we saw that the data handled by TCP or UDP makes its way to and from the application through ports and sockets.