Pages

Saturday, 4 May 2013

Thoughts on "A Note on Distributed Computing"

A Note on Distributed Computing by Jim Waldo, Geoff Wyant, Ann Wollrath, and Sam Kendall is a widely cited paper. I have been reading and trying to understand it for sometime. It's available here - http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.7628

The title of the paper is innocuous but it's much more than a "note". It analyzes the key differences between local and distributed computing, and explains why attempts to unify their programming models are misguided because of the fundamental differences underlying them.


The authors used to be part of the erstwhile Sun Microsystems when they wrote it - it dates from 1994. Later some of them were members of the JINI technology team and also wrote RMI, and if you look at the Java RMI source code,  you can see some of their names.


But in 1994 when this paper was written, Java had not emerged yet. There was no J2EE and CORBA was still young.

I wish to share my thoughts after reading it, and the realization that the opinions expressed in it influenced the design of Java's RMI.


Unification

Briefly, unification = unification of the local and the distributed programming models. Note that we are talking about distributed object oriented systems here.

The unification attempts assume that objects are essentially of a single top level type (like in Java), which might span different address spaces on the same or different machines (like different JVMs on the same or different machines in the case of Java) and they can communicate in the same way irrespective of where they are located. In other words, location (same JVM versus another JVM in another country) is merely an implementation detail that can be abstracted away behind the interfaces used to communicate between two objects without any side effects.


Such a (hypothetical) system would have the following characteristics

  1. Program functionality is not affected by the location of the object on which an operation has been invoked. Or viewing it from a slightly higher level, there is a single design as to how a system communicates irrespective of whether it's deployed in one address space or in multiple ones. 
  2. Maintenance and upgrades can be done to individual objects without affecting the rest of the system. 
  3. There is no need to handle failure and performance issues in the system design.
  4. Object interfaces are always the same regardless of the context (i.e. remote or local)
The authors contend that all these statements are flawed. I'll not attempt to go into those details - the paper explains them well.

The paper then goes onto examine the 4 areas where local and distributed computing differ drastically:
    Latency

    Memory Access
    Partial Failure
    Concurrency

Those of us who have worked on distributed enterprise and internet software have come across these. These 4 differences cannot be papered over to present a 'unified' view of objects which lie on different machines.

 

Java RMI
If you look at RMI, you can see its design influenced by the assumption that the above 4 points are invalid.
  • "Remote" objects have to extend the java.rmi.Remote interface. "Remote" objects - objects that can be invoked from another JVM - are different from local objects.
  • Remote (inter-JVM) method calls have to explicitly handle the java.rmi.RemoteException, which is a checked exception, thus highlighting the fact that a distributed call is subject to modes of failure that are non-existent in a local call. In fact, it extends java.io.IOException and the javadoc is explicit about network issues "is the common superclass for a number of communication-related exceptions that may occur during the execution of a remote method call".
Let's look at #2 again. From the paper:
"As long as the interfaces between objects remain constant, the implementations of those objects can be altered at will".
Premonition of SOA, anyone? This concept would be familiar today to anybody who is acquainted with the fundamental principles of service oriented system design (replace 'object' with 'service'). But since these statements are challenged and refuted later in the paper, the question naturally arises - how come SOA is successful?  

SOA assumes that things are independent and distributed services, and any invocation of a service assumes that there are failure modes which exist because of the communication's distributed nature. This builds on the same RMI concept as having to explicitly throw RemoteException when making a remote (distributed) call. This same concept is taken into consideration while writing any SOA system, which is another way of saying that the authors of the paper were correct.

Note: A short and readable summary of Java RMI is to be found in Jim Waldo's book Java: The Good Parts.

No comments:

Post a Comment