Distributed Systems: Concepts and Design
To study the general characteristics of interprocess communication and the particular characteristics of both datagram and stream communication in the Internet.
To be able to write Java applications that use the Internet protocols and Java serialization.
To be aware of the design issues for Request-Reply protocols and how collections of data objects may be represented in messages (RMI and language integration are left until Chapter 5).
To be able to use the Java API to IP multicast and to consider the main options for reliability and ordering in group communication.
The material in Chapter 3 is relevant for networking, where the emphasis is on a single interconnection between a pair of components. Chapter 4 is concerned with distributed systems in which there are many components and interconnections and the emphasis is on the logical relationship between components, e.g. client and server.
Datagram communication is a useful building block for lightweight interprocess communication protocols (e.g. the Request-Reply protocol) because it adds the least overhead to the resulting protocols. Stream communication is a useful alternative, mainly because it can simplify the programming task.
Client and server programs deal with data objects which must be marshalled into a standard form before they can be passed in messages. The standard form (or external data representation) deals with both data structures and primitive data items. CORBA's CDR is designed for use by a variety of programming languages.
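The idea of flattening a data structure into an agreed external form can be illustrated with a short sketch. The code below is a simplified, CDR-style layout and is not wire-compatible with real CORBA CDR: it assumes big-endian integers, strings written as a 4-byte length followed by UTF-8 bytes, and zero padding to a 4-byte boundary. The Person type and the SimpleMarshaller class are hypothetical names chosen for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical data object used only for this illustration.
final class Person {
    final String name;
    final String place;
    final int year;

    Person(String name, String place, int year) {
        this.name = name;
        this.place = place;
        this.year = year;
    }
}

// Simplified CDR-style marshalling (an assumption, not real CDR).
final class SimpleMarshaller {
    // Number of zero bytes needed to reach the next 4-byte boundary.
    private static int padding(int length) {
        return (4 - length % 4) % 4;
    }

    // Writes a 4-byte length, the UTF-8 bytes, then zero padding.
    private static void putString(ByteBuffer buf, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        buf.putInt(bytes.length);
        buf.put(bytes);
        for (int i = 0; i < padding(bytes.length); i++) {
            buf.put((byte) 0);
        }
    }

    // Reads a string written by putString and skips its padding.
    private static String getString(ByteBuffer buf) {
        int length = buf.getInt();
        byte[] bytes = new byte[length];
        buf.get(bytes);
        buf.position(buf.position() + padding(length));
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Flattens a Person into a byte sequence; fields are written in a fixed order.
    static byte[] marshal(Person p) {
        // ByteBuffer is big-endian by default; 1024 bytes is enough for this sketch only.
        ByteBuffer buf = ByteBuffer.allocate(1024);
        putString(buf, p.name);
        putString(buf, p.place);
        buf.putInt(p.year);
        buf.flip();
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    // Rebuilds a Person; the receiver must know the field order in advance.
    static Person unmarshal(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        String name = getString(buf);
        String place = getString(buf);
        int year = buf.getInt();
        return new Person(name, place, year);
    }
}
```

Note that the message carries no type information: sender and receiver are assumed to agree on the order and types of the fields, which is the main contrast with Java serialization.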
Request-reply protocols are designed to support client-server communication in which a client asks a server to perform an operation on an object and return the result. This relationship determines the protocol as follows: (i) the client needs one primitive (doOperation) and the server needs two (getRequest and sendReply); (ii) since clients normally wait for replies, doOperation is synchronous; (iii) a server must be able to receive getRequest messages while performing operations. To deal with failure, requests are re-transmitted. To ensure that an operation is performed at most once, duplicate requests are filtered and replies are saved for re-transmission.
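The server-side part of this behaviour, filtering duplicates and saving replies, can be sketched without any networking. The class below is a minimal in-memory illustration, not the protocol as implemented in any particular system. It assumes each request carries a unique identifier (for example a client identifier plus a sequence number); AtMostOnceServer, handle and executions are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Sketch of duplicate filtering with a reply history (names are illustrative).
final class AtMostOnceServer {
    // History of replies already sent, keyed by request identifier.
    private final Map<String, String> history = new HashMap<>();
    private final UnaryOperator<String> operation;
    private int executions = 0;

    AtMostOnceServer(UnaryOperator<String> operation) {
        this.operation = operation;
    }

    // Handles one request message. A re-transmitted request (same identifier)
    // is answered from the history instead of executing the operation again.
    synchronized String handle(String requestId, String argument) {
        if (history.containsKey(requestId)) {
            return history.get(requestId); // duplicate: re-send the saved reply
        }
        executions++;
        String reply = operation.apply(argument);
        history.put(requestId, reply);
        return reply;
    }

    // Number of times the operation has actually been performed.
    synchronized int executions() {
        return executions;
    }
}
```

In this sketch the history grows without bound; a real server would need some rule for discarding old entries, which is a useful point for class discussion.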
Communication between the members of a group of processes is the basis of replicated services. IP multicast allows a process to send an IP packet to a set of computers that form a multicast group. It provides unreliable multicast, which is useful for applications such as finding discovery services. Other applications require a multicast service with stricter reliability and ordering characteristics. The latter are presented in Chapter 14.
Students often have considerable difficulty forming a clear failure model for basic interprocess communication and for the protocols built over it. In particular, they tend to associate communication errors with network failures. They should be encouraged to think about the usual cause of lost messages and carry out Exercise 4.3. They also tend to think that TCP is entirely reliable; they should study its failure model on page 139 and do Exercise 4.6.
The use of handles (for objects) in Java object serialization is tricky. See Exercises 4.10 - 4.11.
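One way to make the effect of handles visible is to write the same object twice to one ObjectOutputStream, changing it in between. The sketch below does this; HandleDemo and sizesSeenByReader are hypothetical names. The second writeObject call emits only a handle to the object already written, so the reader does not see the change unless the writer calls reset() first.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

// Illustrative demonstration of handles in Java object serialization.
final class HandleDemo {
    // Writes a list, adds an element, writes the list again, then reads both
    // copies back. Returns the sizes of the two lists as seen by the reader.
    static int[] sizesSeenByReader(boolean resetBetweenWrites) {
        try {
            ArrayList<String> list = new ArrayList<>();
            list.add("a");

            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(list);      // full object state is written
                list.add("b");              // change made after the first write
                if (resetBetweenWrites) {
                    out.reset();            // forget the handles issued so far
                }
                out.writeObject(list);      // a handle only, unless reset() was called
            }

            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                List<?> first = (List<?>) in.readObject();
                List<?> second = (List<?>) in.readObject();
                return new int[] { first.size(), second.size() };
            }
        } catch (IOException | ClassNotFoundException e) {
            // Not expected with in-memory streams; wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }
}
```

Without the reset the reader sees sizes 1 and 1, because both reads yield the same object; with the reset it sees 1 and 2.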
Remote object references are introduced briefly in Section 4.3.3 because they need to be passed in the request-reply protocol (they may cause problems for students without much experience of OOP). They are revisited in Chapter 5.
Although the request-reply protocol appears to be straightforward, students do not always appreciate the effort required by the protocol to ensure that an operation is performed at most once. This is important for Chapter 5.
The Java API provides students with a simple means of writing programs that use datagrams and streams. Many simple exercises can be built following on from Exercises 4.3-6. Students should be encouraged to see these two protocols as building blocks for higher-level protocols such as the request-reply protocol. See Exercises 4.12-16.
Compare CORBA CDR with Java serialization, emphasizing the common requirement to represent primitive data types and programming data structures in messages. Point out that CORBA provides a language-independent representation. In contrast, Java serialization deals with objects and object references and includes type information.
Before teaching the request-reply protocol, discuss the structure of a local invocation, pointing out that the target object as well as the method and arguments must be specified. Extend this to the need to specify a remote object reference to the target object as well as the method and arguments. Consider the effects of a client or a server failing independently and of lost messages, and then introduce the use of timeouts, re-transmission of request messages and the history of saved replies.
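The client side of timeouts and re-transmission can be sketched over UDP as follows. This is a minimal illustration under stated assumptions, not a complete protocol: messages are plain UTF-8 strings, there are no request identifiers (a real protocol needs them to match replies to requests), and the lossy echo server exists only to simulate a lost request on the loopback interface. RetryingClient, startLossyEchoServer and demo are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

// Sketch of a doOperation that re-transmits the request after a timeout.
final class RetryingClient {
    static String doOperation(InetAddress host, int port, String request,
                              int timeoutMillis, int maxAttempts) throws IOException {
        byte[] out = request.getBytes(StandardCharsets.UTF_8);
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setSoTimeout(timeoutMillis); // receive() now throws on timeout
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                socket.send(new DatagramPacket(out, out.length, host, port));
                try {
                    byte[] buf = new byte[1024];
                    DatagramPacket reply = new DatagramPacket(buf, buf.length);
                    socket.receive(reply);
                    return new String(reply.getData(), 0, reply.getLength(),
                                      StandardCharsets.UTF_8);
                } catch (SocketTimeoutException e) {
                    // No reply in time: fall through and re-transmit the request.
                }
            }
        }
        throw new IOException("no reply after " + maxAttempts + " attempts");
    }

    // Test helper (an assumption for this sketch): an echo server on the
    // loopback interface that silently ignores the first few requests.
    static DatagramSocket startLossyEchoServer(int requestsToDrop) throws IOException {
        DatagramSocket server = new DatagramSocket(0, InetAddress.getLoopbackAddress());
        Thread thread = new Thread(() -> {
            byte[] buf = new byte[1024];
            int seen = 0;
            try {
                while (true) {
                    DatagramPacket req = new DatagramPacket(buf, buf.length);
                    server.receive(req);
                    if (++seen <= requestsToDrop) {
                        continue; // simulate a lost request message
                    }
                    String text = new String(req.getData(), 0, req.getLength(),
                                             StandardCharsets.UTF_8);
                    byte[] rep = ("echo:" + text).getBytes(StandardCharsets.UTF_8);
                    server.send(new DatagramPacket(rep, rep.length, req.getSocketAddress()));
                }
            } catch (IOException closed) {
                // The socket was closed by the caller; the thread ends here.
            }
        });
        thread.setDaemon(true);
        thread.start();
        return server;
    }

    // Runs one exchange in which the first request is dropped.
    static String demo() {
        try (DatagramSocket server = startLossyEchoServer(1)) {
            return doOperation(InetAddress.getLoopbackAddress(), server.getLocalPort(),
                               "ping", 200, 3);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the server in this sketch keeps no history, a re-transmitted request is simply executed again; combining it with duplicate filtering is what gives at-most-once behaviour.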
For group communication, use IP multicast as an example of a useful protocol with a well-understood fault model. Introduce some applications that have stronger requirements on reliability and ordering.