System & Architecture II: Networking & Internetworking

0 Introduction

作为 System & Architecture 的第二部分,我个人感觉这门课教得一塌糊涂……


  • Java(TM) Network Programming and Distributed Computing
  • Distributed Systems - Concepts and Design (5th Edition)
  • Computer Networking - A Top-down Approach (6th Edition)

1 Networking and Internetworking

1.1 Introduction

1.1.1 Networks Components

  • Networks are built from a variety of:
    • transmission media (wirs, cable, fibre, and wireless channels)
    • hardware devices (routers, switches, bridges, hubs, repeaters, and network interfaces)
    • software components (protocol stacks, communication handles, and drivers)

1.1.2 Some Concepts

  • Communication Subsystem: the collection of hardware and software components that provide the communication facilities
  • Hosts: the computers and other devices that use the network for communication purposes
  • Node: any computer or switching device attached to a network

1.2 Networking Issues for Distributed Systems

1.2.1 Performance

  • Latency: the delay that occurs after a send operation is executed and before data starts to arrive at the destination computer (i.e. the time required to transfer an empty message)
  • Data Transfer Rate: the speed at which data can be transferred between two computers in the network once transmission has begun (usually quoted in bits per second)
  • Message Transmission Time = Latency + Length / Data Transfer Rate

On the Internet, round-trip latencies are 5~500ms, with means of 20~200ms depending on distance.

So, requests transmitted across the Internet are 10~100 times slower than those sent on fast local networks.

The bulk of this time difference derives from switching delays at routers and contention for network circuits.

1.2.2 Scalability

The growth since 2005 then has been so rapid and diverse that it is difficult to find recent reliable statistics

1.2.3 Reliability

  • Many APPs are able to recover from communication failures
  • Application-level software performs the detection and correction of communication errors
  • Most physical transmission media have high reliability
  • When errors occur they are usually due to failures in the software at the sender or receiver rather than errors in the network. For example: failure by the receiving computer to accept a packet, or buffer overflow.

1.2.4 Security

  • Firewall: the first level of defense running on a gateway, to protect the resources in organization's computers from access by external users or processes, and to control the use of resources outside the firewall by users inside the organization
  • Gateway: a computer standing at the network entry point to an organization's intranet
  • Cryptography: end-to-end authentication, privacy, and security, applied at a level above the communication subsystem

1.2.5 Mobility

  • Mobile devices need intermittent connection to many different subnets

1.2.6 Quality of Service

  • including the ability to meet deadlines when transmitting and processing streams of real-time multimedia data
  • This imposes major requirements on computer networks
  • Applications that transmit multimedia data require guaranteed bandwidth and bounded latencies for the communication channels that they use

1.2.7 Multicasting

  • Between pairs of processes: most communicatoin in distributed systems
  • One-to-many communication: sends to several destinations

1.3 Types of Network

Example Range Bandwidth (Mbps) Latency (ms)
LAN Ethernet 1 ~ 2 km 10 ~ 10, 000 1 ~ 10
WAN IP routing worldwide 0.01 ~ 600 100 ~ 500
MAN ATM 2 ~ 50 km 1 ~ 600 10
Internetwork Internet worldwide 0.5 ~ 600 100 ~ 500
WPAN Bluetooth 10 ~ 30 m 0.5 ~ 2 5 ~ 20
WLAN Wifi 0.15 ~ 1.5 km 11 ~ 108 5 ~ 20
WMAN WiMAX 5 ~ 50 km 1.5 ~ 20 5 ~ 20
WWAN 3G phone cell: 1 ~ -5 348 ~ 14.4 100 ~ 500

1.4 Network Errors

  • Packets lost:

    • external interference in wireless networks
    • processing delays and buffer overflow at switches and at the destination node
  • Out-of-order packets: principally in wide area networks where separate packets are individually routed

  • Duplicate packets: a consequence of an assumption by the sender that a packet has been

    lost. The packet is retransmitted, and both the original and the retransmitted copy then turn

    up at the destination.

1.5 Network Principles

1.5.1 Packet Transmission

  • Packet-switching: the basis for all computer networks
    • Use: enables data packets addressed to different destinations to share a single communications link
    • Mechanism: queued in a buffer and transmitted when the link is available
  • Message: logical units of information (of arbitrary length), subdivided into packets
  • Packet: a sequence of binary data (an array of bits or bytes) of restricted length (together with addressing information sufficient to identify the source and destination computers)
  • Packets is of restricted length:
    • so that each computer in the network can allocate sufficient buffer storage to hold the largest possible incoming packet;
    • to avoid the undue delays that would occur in waiting for communication channels to become free if long messages were transmitted without subdivision.

1.5.2 Data Streaming

  • Streaming: the transmission and display of audio and video in real time
    • Feature: requires much higher bandwidths than most other forms of communication
  • ATM (Asynchronous Transfer Mode)

1.5.3 Switching Schemes

  • Switching Schemes: to transmit information between two arbitrary nodes (A network consists of a set of nodes connected together by circuits.)
  • Types of switching in networking:
    • Broadcast: everything is transmitted to every node, and it is up to potential receivers to notice transmissions addressed to them. (Ethernet, wireless networking….)
    • Circuit switching: plain old telephone system (POTS)
    • Packet switching: store-and-forward network (pretty much like traditional the postal system)
    • Frame relay: overcome the delay problems by switching small packets (called frames) on the fly (ATM networks)

1.5.4 Protocals

  • Protocol: a well-known set of rules and formats to be used for communication between processes in order to perform a given task
    • containing two important parts:
      • a specification of the sequence of messages that must be exchanged
      • a specification of the format of the data in the messages.
Encapsulation as it is applied in layered protocols
Encapsulation as it is applied in layered protocols
  • Protocol layers: Network software is arranged in a hierarchy of layers
    • Each layer:
      • presents an interface to the layers above it that extends the properties of the underlying communication system
      • represented by a module in every computer connected to the network
    • On the sending side, each layer (except the topmost, or application layer) accepts items of data in a specified format from the layer above it and applies transformations to encapsulate the data in the format specified for that layer before passing it to the layer below for further processing.
    • On the receiving side, the converse transformations are applied to data items received from the layer below before they are passed to the layer above.
Protocol layers in the ISO Open Systems Interconnection (OSI) protocol model
Protocol layers in the ISO Open Systems Interconnection (OSI) protocol model
  • Protocol suites / Protocol stack: a complete set of protocol layers reflecting the layered structure
OSI protocol summary
OSI protocol summary

Differently, internetwork protocol suites include an application layer, a transport layer and an internetwork layer.

Internetwork protocols are overlaid on underlying networks. The network interface layer accepts internetwork packets and converts them into packets suitable for transmission by the network layers of each underlying network.

Internetwork layers
Internetwork layers
  • Packet assembly: The task of dividing messages into packets before transmission and reassembling them at the receiving computer is usually performed in the transport layer.

The network-layer protocol packets consist of a header and a data field.

In most network technologies, the data field is variable in length, with the maximum length called the maximum transfer unit (MTU). If the length of a message exceeds the MTU of the underlying network layer, it must be fragmented into chunks of the appropriate size, with sequence numbers for use on reassembly, and transmitted in multiple packets. (An internetwork packet is the unit of data transmitted over an internetwork.)

  • Ports: software-defined destination points at a host computer, attached to processes, enabling data transmission to be addressed to a specific process at a destination node.

    Port numbers below 1023 are defined as well-known ports whose use is restricted to privileged processes in most operating systems.

    The ports between 1024 and 49151 are registered ports for which IANA holds service descriptions.

    The remaining ports up to 65535 are available for private purposes.

    (In practice, all of the ports above 1023 can be used for private purposes instead,)

  • Addressing: a numeric identifier that uniquely identifies a host computer and enables it to be located by nodes that are responsible for routing data to it. In the Internet every host computer is assigned an IP number,

  • Packet delivery

    • Datagram packet delivery (used in most of networks): the delivery of each packet is a ‘one-shot’ process; no setup is required, and once the packet is delivered the network retains no information about it. (Every datagram packet contains the full network address of the source and destination hosts.)
    • Virtual circuit packet delivery: must be set up with a virtual circuit before packets can pass from a source host A to destination host B. (The addresses are not needed, because packets are routed at intermediate nodes by reference to the virtual circuit number.)
Routing in a wide area network
Routing in a wide area network
  • Routing: a function that is required in all networks except those LANs that provide direct connections between all pairs of attached hosts. (In large networks, adaptive routing is employed.)

    • Routing algorithm: determines the routes for the transmission of packets to their destinations, implemented by a program in the network layer at each node

      • Tasks:

        • Determining the route for each packet
        • Dynamically update its knowledge of the network based on traffic monitoring and the detection of configuration changes or failures
      • A simple routing algorithm (distance vector algorithm):

        Routing tables for the network above
        Routing tables for the network above

        A router exchanges information about the network with its neighbouring nodes by sending a summary of its routing table using a router information protocol (RIP).

        The RIP actions performed at a router are described informally as follows:

        1. Periodically, and whenever the local routing table changes, send the table (in a summary form) to all accessible neighbours. That is, send an RIP packet containing a copy of the table on each non-faulty outgoing link.
        2. When a table is received from a neighbouring router, if the received table shows a route to a new destination, or a better (lower-cost) route to an existing destination, update the local table with the new route.
        // Pseudo-code for RIP routing algorithm

        Send: Each t seconds or when Tl (local table) changes, send Tl on each non-faulty outgoing link.

        Receive: Whenever a routing table Tr (recived table) is received on link n:

        for all rows Rr in Tr {
        if ( ≠ n) {
        Rr.cost = Rr.cost + 1; = n;
        if (Rr.destination is not in Tl)
        // add new destination to Tl
        add Rr to Tl;
        for all rows Rl in Tl {
        if (Rr.destination = Rl.destination and (Rr.cost < Rl.cost or = n))
        // Rr.cost < Rl.cost : remote node has better route
        // = n : remote node is more authoritative
        Rl = Rr;

        To deal with faults, each router monitors its links and acts as follows:

        When a faulty link n is detected, set cost to ∞ for all entries in the local table that refer to the faulty link and perform the Send action.

        Thus the information that the link is broken is represented by an infinite value for the cost to the relevant destinations. When this information is propagated to neighbouring nodes it will be processed according to the Receive action (note g+1 = g) and then propagated further until a node is reached that has a working route to the relevant destinations, if one exists. The node that still has a working route will eventually propagate its table, and the working route will replace the faulty one at all nodes.

1.5.5 Internet Protocols (TCP/IP Suite of Protocols)

TCP/IP layers
TCP/IP layers
Encapsulation as it occurs when a message is transmitted via TCP over an Ethernet
Encapsulation as it occurs when a message is transmitted via TCP over an Ethernet
  • TCP (Transmission Control Protocol): a reliable connection-oriented transport protocol
  • IP (Internet Protocol): the underlying ‘network’ protocol (not the only network layer involved in the implementation of Internet communication) of the Internet virtual network (i.e. IP datagrams provide the basic transmission mechanism for the Internet and other TCP/IP networks)
  • UDP (User Datagram Protocol): a datagram protocol that does not guarantee reliable transmission

Many application services and application-level protocols now exist based on TCP/IP, including the Web (HTTP), email (SMTP, POP), netnews (NNTP), file transfer (FTP) and Telnet (telnet).

The programmer's conceptual view of a TCP/IP Internet
The programmer's conceptual view of a TCP/IP Internet

UDP • UDP is almost a transport-level replica of IP. • It provides a means of transmitting messages of up to 64 kbytes in size (the maximum packet size permitted by IP). • A UDP datagram is encapsulated inside an IP packet. • It has a short header that includes: • the source and destination port numbers; • a length field; • and a checksum (which is optional). • UDP offers no guarantee of delivery. • It incurs no setup costs and it requires no administrative acknowledgement messages.

TCP • TCP is connection-oriented. • This means, before any data is transferred, the sending and receiving processes must cooperate in the establishment of a bidirectional communication channel. • The connection is simply an end-to-end agreement to perform reliable data transmission. • Intermediate nodes (such as routers) have no knowledge of TCP connections. • The IP packets that transfer the data in a TCP transmission do not necessarily all follow the same route.

  • TCP provides reliable delivery of arbitrarily long sequences of bytes.
  • Some main features of TCP:
    • Sequencing (i.e., it retains order): By assigning sequence numbers toeach segment of the stream, the receiving process can put thesegments back together in the original order.
    • Flow (congestion) control: The sender takes care not to overwhelmthe receiver or the intervening nodes.
    • Checksum: Each segment carries a checksum covering the headerand the data in thesegment. If a received segment does not match itschecksum, the segment is dropped

2 Socket

2.1 Sockets

  • A socket is an abstraction through which an application may send and receive data.
  • A socket allows an application to “plug in” to the network and communicate with other applications that are also plugged in to the same network.
  • Information written to the socket by an application on one machine can be read by an application on a different machine, and vice versa.

3 Remote Procedure Call

3.1 Steps of a Remote Procedure Call

  1. Client procedure calls client stub in normal way;
  2. Client stub builds message, calls local OS;
  3. Client’s OS sends message to remote OS;
  4. Remote OS gives message to server stub;
  5. Server stub unpacks parameters, calls server;
  6. Server does work, returns result to the stub;
  7. Server stub packs it in message, calls local OS;
  8. Server’s OS sends message to client's OS;
  9. Client’s OS gives message to client stub;
  10. Stub unpacks result, returns to client.