Linux, really a child of the Internet, offers all the necessary networking tools and features for integration into all types of network structures. This chapter introduces TCP/IP, the customary Linux protocol, and discusses its various services and special features. Network access using a network card can be configured with YaST. The central configuration files are discussed and some of the most essential tools described. Only the fundamental mechanisms and the relevant network configuration files are covered in this chapter.
Linux and other Unix operating systems use the TCP/IP protocol. It is not a single network protocol, but a family of network protocols that offer various services. TCP/IP was developed based on an application used for military purposes and was defined in its present form in an RFC in 1981. RFC stands for Request for Comments. They are documents that describe various Internet protocols and implementation procedures for the operating system and its applications. Since then, the TCP/IP protocol has been refined, but the basic protocol has remained virtually unchanged.
The RFC documents describe the setup of Internet protocols. To expand your knowledge about any of the protocols, refer to the appropriate RFC document. They are available online at http://www.ietf.org/rfc.html.
The services listed in Table 21.1. “Several Protocols in the TCP/IP Protocol Family” are provided for the purpose of exchanging data between two machines via TCP/IP. Networks combined by TCP/IP, comprising a world-wide network, are also referred to, in their entirety, as “the Internet.”
Table 21.1. Several Protocols in the TCP/IP Protocol Family
|TCP||Transmission Control Protocol: A connection-oriented, reliable protocol. The data to transmit is first sent by the application as a stream of data, then converted by the operating system to the appropriate format. The data arrives at the respective application on the destination host in the original data stream format in which it was initially sent. TCP detects whether any data has been lost during transmission and ensures that it is not mixed up. TCP is used wherever the data sequence matters.|
|UDP||User Datagram Protocol: A connectionless, unreliable protocol. The data to transmit is sent in the form of packets generated by the application. The order in which the data arrives at the recipient is not guaranteed and data loss is a possibility. UDP is suitable for record-oriented applications. It features lower latency than TCP.|
|ICMP||Internet Control Message Protocol: Essentially, this is not a protocol for the end user, but a special control protocol that issues error reports and can control the behavior of machines participating in TCP/IP data transfer. In addition, a special echo mode is provided by ICMP that can be viewed using the program ping.|
|IGMP||Internet Group Management Protocol: This protocol controls the machine behavior when implementing IP multicast. The following sections do not contain more information regarding IP multicasting, because of space limitations.|
Almost all hardware protocols work on a packet-oriented basis. The data to transmit is packaged in packets, as it cannot be sent all at once. This is why TCP/IP only works with small data packets. The maximum size of a TCP/IP packet is approximately 64 kilobytes. The packets are normally quite a bit smaller, as the network hardware can be a limiting factor. The maximum size of a data packet on an ethernet is about fifteen hundred bytes. The size of a TCP/IP packet is limited to this amount when the data is sent over an ethernet. If more data is transferred, more data packets need to be sent by the operating system.
IP (Internet protocol) is where the unreliable data transfer takes place. TCP (transmission control protocol), to a certain extent, is simply the upper layer for the IP platform serving to guarantee reliable data transfer. The IP layer itself is, in turn, supported by the bottom layer, the hardware-dependent protocol, such as ethernet. Professionals refer to this structure as the layer model. See Figure 21.1. “Simplified Layer Model for TCP/IP”.
The diagram provides one or two examples for each layer. As you can see, the layers are ordered according to abstraction levels. The lowest layer is very close to the hardware. The uppermost layer, however, is almost a complete abstraction from the hardware. Every layer has its own special function. The special functions of each layer are mostly implicit in their description. The bit transfer layer (the physical layer) and the security layer (the data link layer) represent the physical network used (such as ethernet).
While layer 1 deals with cable types, signal forms, signal codes, and the like, layer 2 is responsible for accessing procedures (which host may send data?) and error correction. Layer 1 is called the physical layer. Layer 2 is called the data link layer.
Layer 3 is the network layer and is responsible for remote data transfer. The network layer ensures that the data arrives at the correct remote destination and can be delivered to it.
Layer 4, the transport layer, is responsible for application data. It ensures that data arrives in the correct order and is not lost. While the data link layer is only there to make sure that the data as transmitted is the correct one, the transport layer protects it from being lost.
Finally, layer 5 is the layer where data is processed by the application itself.
For every layer to serve its designated function, additional information regarding each layer must be saved in the data packet. This takes place in the header of the packet. Every layer attaches a small block of data, called the protocol header, to the front of each emerging packet. A sample TCP/IP data packet traveling over an ethernet cable is illustrated in Figure 21.2. “TCP/IP Ethernet Packet”.
The checksum is located at the end of the packet, not at the beginning. This simplifies things for the network hardware. The largest amount of payload data possible in one packet is 1460 bytes in an ethernet network.
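The 1460-byte figure can be reproduced with a little arithmetic. The following is a minimal sketch, assuming an ethernet frame that carries 1500 bytes and the minimum header sizes of 20 bytes each for IPv4 and TCP (no options). The constant and function names are illustrative and do not come from any SUSE tool.

```python
import math

ETHERNET_MTU = 1500  # bytes an ethernet frame can carry (assumed value)
IP_HEADER = 20       # minimum IPv4 header without options (assumption)
TCP_HEADER = 20      # minimum TCP header without options (assumption)


def max_tcp_payload(mtu: int = ETHERNET_MTU) -> int:
    """Return the payload bytes left after the IP and TCP headers."""
    return mtu - IP_HEADER - TCP_HEADER


def packets_needed(data_len: int, mtu: int = ETHERNET_MTU) -> int:
    """Return how many packets are needed to carry data_len bytes."""
    return math.ceil(data_len / max_tcp_payload(mtu))


if __name__ == "__main__":
    print(max_tcp_payload())       # 1460
    print(packets_needed(10_000))  # 7
```

Header options, if present, reduce the usable payload further, so the result should be read as an upper bound.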
When an application sends data over the network, the data passes through each layer, all implemented in the Linux kernel except layer 1 (network card). Each layer is responsible for preparing the data so it can be passed to the next layer below. The lowest layer is ultimately responsible for sending the data. The entire procedure is reversed when data is received. Like the layers of an onion, in each layer the protocol headers are removed from the transported data. Finally, layer 4 is responsible for making the data available for use by the applications at the destination. In this manner, one layer only communicates with the layer directly above or below it. For applications, it is irrelevant whether data is transmitted via a 100 MBit/s FDDI network or via a 56-kbit/s modem line. Likewise, it is irrelevant for the data line which kind of data is transmitted, as long as packets are in the correct format.
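The way headers are added on the way down and removed on the way up can be sketched with a toy model. This is only an illustration under simplifying assumptions: the bracketed text labels stand in for real binary protocol headers, the ethernet checksum trailer is left out, and the names `send` and `receive` are hypothetical.

```python
# Headers are added from the transport layer down to the data link layer.
LAYERS = ["TCP", "IP", "ETH"]


def send(payload: bytes) -> bytes:
    """Wrap the payload in one placeholder header per layer."""
    packet = payload
    for name in LAYERS:
        packet = f"[{name}]".encode() + packet
    return packet


def receive(packet: bytes) -> bytes:
    """Strip the headers in reverse order, outermost first."""
    for name in reversed(LAYERS):
        header = f"[{name}]".encode()
        if not packet.startswith(header):
            raise ValueError(f"missing {name} header")
        packet = packet[len(header):]
    return packet


if __name__ == "__main__":
    wire = send(b"hello")
    print(wire)           # b'[ETH][IP][TCP]hello'
    print(receive(wire))  # b'hello'
```

Each function only touches the header of its own layer, which mirrors the rule that a layer communicates only with the layer directly above or below it.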
The discussion in the following sections is limited to IPv4 networks. For information about IPv6 protocol, the successor to IPv4, refer to Section 21.2. “IPv6 — The Next Generation Internet”.
Every computer on the Internet has a unique 32-bit address. These 32 bits (or 4 bytes) are normally written as illustrated in the second row in Example 21.1. “How an IP Address is Written”.
Example 21.1. How an IP Address is Written
IP Address (binary):  11000000 10101000 00000000 00010100
IP Address (decimal):      192.     168.       0.      20
In decimal form, the four bytes are written in the decimal number system, separated by periods. The IP address is assigned to a host or a network interface. It cannot be used anywhere else in the world. There are certainly exceptions to this rule, but these play a minimal role in the following passages.
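The conversion between the two notations can be sketched in a few lines. This is an illustrative helper, not part of any system tool, and the function names are assumptions.

```python
def to_binary(dotted: str) -> str:
    """Convert dotted decimal notation to four space-separated bytes."""
    octets = [int(part) for part in dotted.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a valid IPv4 address: {dotted!r}")
    return " ".join(f"{o:08b}" for o in octets)


def to_dotted(binary: str) -> str:
    """Convert four space-separated binary bytes to dotted decimal."""
    return ".".join(str(int(bits, 2)) for bits in binary.split())


if __name__ == "__main__":
    print(to_binary("192.168.0.20"))
    # 11000000 10101000 00000000 00010100
    print(to_dotted("11000000 10101000 00000000 00010100"))
    # 192.168.0.20
```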
The ethernet card itself has its own unique address, the MAC, or media access control address. It is 48 bits long, internationally unique, and is programmed into the hardware by the network card vendor. There is, however, an unfortunate disadvantage of vendor-assigned addresses — MAC addresses do not make up a hierarchical system, but are instead more or less randomly distributed. Therefore, they cannot be used for addressing remote machines. The MAC address still plays an important role in communication between hosts in a local network and is the main component of the protocol header of layer 2.
The points in IP addresses indicate the hierarchical system. Until the 1990s, IP addresses were strictly categorized in classes. However, this system proved too inflexible and was discontinued. Now, classless routing (CIDR, classless interdomain routing) is used.
Netmasks were conceived so that a host, for example the one with the IP address 192.168.0.1, can determine where another host, such as the one with the IP address 192.168.0.20, is located. To put it simply, the netmask on a host with an IP address defines what is internal and what is external. Hosts located internally (“in the same subnetwork”) can be reached directly. Hosts located externally (“not in the same subnetwork”) can only be reached via a gateway or router. Because every network interface can receive its own IP address, it can get quite complicated.
Before a network packet is sent, the following runs on the computer: the destination IP address is linked to the netmask via a logical AND, and the address of the sending host is likewise linked to the netmask via a logical AND. If there are several network interfaces available, normally all possible sender addresses are checked. The results of the AND links are compared. If there are no discrepancies in this comparison, the destination, or receiving host, is located in the same subnetwork. Otherwise, it must be accessed via a gateway. The more “1” bits are located in the netmask, the fewer hosts can be accessed directly and the more hosts can be reached via a gateway. Several examples are illustrated in Example 21.2. “Linking IP Addresses to the Netmask”.
Example 21.2. Linking IP Addresses to the Netmask
IP address (192.168.0.20):  11000000 10101000 00000000 00010100
Netmask    (255.255.255.0): 11111111 11111111 11111111 00000000
---------------------------------------------------------------
Result of the link:         11000000 10101000 00000000 00000000
In the decimal system:           192.     168.       0.       0

IP address (213.95.15.200): 11010101 01011111 00001111 11001000
Netmask    (255.255.255.0): 11111111 11111111 11111111 00000000
---------------------------------------------------------------
Result of the link:         11010101 01011111 00001111 00000000
In the decimal system:           213.      95.      15.       0
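The comparison described above can be sketched with the standard `ipaddress` module. The function names and the second sample address (192.168.1.20) are illustrative assumptions. A real kernel consults its routing table rather than calling code like this.

```python
import ipaddress


def network_part(address: str, netmask: str) -> str:
    """Link an address to a netmask with a logical AND."""
    addr = int(ipaddress.IPv4Address(address))
    mask = int(ipaddress.IPv4Address(netmask))
    return str(ipaddress.IPv4Address(addr & mask))


def same_subnet(source: str, destination: str, netmask: str) -> bool:
    """Return True if both addresses yield the same AND result."""
    return network_part(source, netmask) == network_part(destination, netmask)


if __name__ == "__main__":
    mask = "255.255.255.0"
    print(network_part("192.168.0.20", mask))                 # 192.168.0.0
    print(same_subnet("192.168.0.20", "192.168.0.1", mask))   # True
    print(same_subnet("192.168.0.20", "192.168.1.20", mask))  # False
```

When `same_subnet` returns False, the packet would have to be handed to a gateway.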
The netmasks appear, like IP addresses, in decimal form divided by periods. Because the netmask is also a 32-bit value, four number values are written next to each other. Which hosts are gateways or which address domains are accessible over which network interfaces must be configured.
To give another example: all machines connected with the same ethernet cable are usually located in the same subnetwork and are directly accessible. When the ethernet is divided by switches or bridges, these hosts can still be reached.
However, the economical ethernet is not suitable for covering larger distances. You must transfer the IP packets to another type of hardware (such as FDDI or ISDN). Devices for this transfer are called routers or gateways. A Linux machine can carry out this task. The respective option is referred to as ip_forwarding.
If a gateway has been configured, the IP packet is sent to the appropriate gateway. This then attempts to forward the packet in the same manner — from host to host — until it reaches the destination host or the packet's TTL (time to live) expires.
Table 21.2. Specific Addresses
|Base network address||This is the netmask AND any address in the network, as shown in Example 21.2. “Linking IP Addresses to the Netmask” under Result. This address cannot be assigned to any hosts.|
|Broadcast address||This basically says, “Access all hosts in this subnetwork.” To generate this, the netmask is inverted in binary form and linked to the base network address with a logical OR. The above example therefore results in 192.168.0.255. This address cannot be assigned to any hosts.|
|Local host||The address 127.0.0.1 is strictly assigned to the “loopback device” on each host. A connection can be set up to your own machine with this address.|
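The base network address and the broadcast address from the table can be derived with plain bit operations. The following is a minimal sketch, and the function name is an assumption made for illustration.

```python
import ipaddress


def base_and_broadcast(address: str, netmask: str) -> tuple[str, str]:
    """Return (base network address, broadcast address) as strings."""
    addr = int(ipaddress.IPv4Address(address))
    mask = int(ipaddress.IPv4Address(netmask))
    base = addr & mask                      # netmask AND address
    inverted = ~mask & 0xFFFFFFFF           # inverted netmask, kept to 32 bits
    broadcast = base | inverted             # base address OR inverted netmask
    return (str(ipaddress.IPv4Address(base)),
            str(ipaddress.IPv4Address(broadcast)))


if __name__ == "__main__":
    print(base_and_broadcast("192.168.0.20", "255.255.255.0"))
    # ('192.168.0.0', '192.168.0.255')
    print(ipaddress.IPv4Address("127.0.0.1").is_loopback)
    # True
```

The mask with `0xFFFFFFFF` is needed because Python integers are not limited to 32 bits, so a bare `~mask` would be negative.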
As IP addresses must be unique all over the world, you cannot just come up with your own random addresses. There are three address domains to use to set up a private IP-based network. With these, you cannot set up any connections to the rest of the Internet, unless you apply certain tricks, because these addresses are not routed on the Internet. These address domains are specified in RFC 1597 (since superseded by RFC 1918) and listed in Table 21.3. “Private IP Address Domains”.
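A program can check whether an address falls into one of the private domains. The sketch below assumes the three ranges as they are published in RFC 1918, the successor to RFC 1597. The ranges are taken from that RFC rather than from this section, and the function name is illustrative.

```python
import ipaddress

# The three private ranges as given in RFC 1918 (assumption: not taken
# from this chapter, which only refers to them by table).
PRIVATE_NETWORKS = [
    ipaddress.IPv4Network("10.0.0.0/8"),
    ipaddress.IPv4Network("172.16.0.0/12"),
    ipaddress.IPv4Network("192.168.0.0/16"),
]


def is_private(address: str) -> bool:
    """Return True if the IPv4 address lies in a private range."""
    ip = ipaddress.IPv4Address(address)
    return any(ip in network for network in PRIVATE_NETWORKS)


if __name__ == "__main__":
    print(is_private("192.168.0.20"))  # True
    print(is_private("172.32.0.1"))    # False
```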
DNS assists in assigning an IP address to one or more names and assigning a name to an IP address. In Linux, this conversion is usually carried out by a special type of software known as BIND. The machine that takes care of this conversion is called a name server. The names make up a hierarchical system in which each name component is separated by dots. The name hierarchy is, however, independent of the IP address hierarchy described above.
Consider a complete name, such as laurent.suse.de, written in the format hostname.domain. A full name, referred to as a fully qualified domain name (FQDN), consists of a host name and a domain name (suse.de). The latter also includes the top level domain or TLD (de).
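The parts of a fully qualified domain name can be separated with simple string handling. This is a minimal sketch that follows the hostname.domain format described above. The function name and the dictionary keys are assumptions made for illustration.

```python
def split_fqdn(fqdn: str) -> dict[str, str]:
    """Split an FQDN into host name, domain name, and top level domain."""
    labels = fqdn.rstrip(".").split(".")  # tolerate a trailing root dot
    if len(labels) < 2 or not all(labels):
        raise ValueError(f"not a fully qualified name: {fqdn!r}")
    return {
        "host": labels[0],
        "domain": ".".join(labels[1:]),
        "tld": labels[-1],
    }


if __name__ == "__main__":
    print(split_fqdn("laurent.suse.de"))
    # {'host': 'laurent', 'domain': 'suse.de', 'tld': 'de'}
```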
TLD assignment has become quite confusing for historical reasons. Traditionally, three-letter domain names are used in the USA. In the rest of the world, the two-letter ISO national codes are the standard. In addition to that, multiletter TLDs were introduced in 2000 that represent certain spheres of activity (for example, .info, .name, .museum).
In the early days of the Internet (before 1990), the file /etc/hosts was used to store the names of all the machines represented over the Internet. This quickly proved to be impractical in the face of the rapidly growing number of computers connected to the Internet. For this reason, a decentralized database was developed to store the host names in a widely distributed manner. This database, similar to the name server, does not have the data pertaining to all hosts in the Internet readily available, but can dispatch requests to other name servers.
The top of the hierarchy is occupied by root name servers. These root name servers manage the top level domains and are run by the Network Information Center, or NIC. Each root name server knows about the name servers responsible for a given top level domain. Information about top level domain NICs is available at http://www.internic.net.
DNS can do more than just resolve host names. The name server also knows which host is receiving e-mails for an entire domain — the mail exchanger (MX).
For your machine to resolve an IP address, it must know about at least one name server and its IP address. Easily specify such a name server with the help of YaST. If you have a modem dial-up connection, you may not need to configure a name server manually at all. The dial-up protocol provides the name server address as the connection is made. The configuration of name server access with SUSE LINUX is described in Section 21.7. “DNS — Domain Name System”.
The protocol whois is closely related to DNS. With this program, quickly find out who is responsible for any given domain.