Unit I: Distributed Systems

Introduction to Distributed Computing – 
Distributed computing is a field of computer science that studies distributed systems. A distributed system consists of multiple autonomous computers that communicate through a computer network. The computers interact with each other in order to achieve a common goal. A computer program that runs in a distributed system is called a distributed program, and distributed programming is the process of writing such programs.
Distributed computing also refers to the use of distributed systems to solve computational problems. In distributed computing, a problem is divided into many tasks, each of which is solved by one or more computers.
They communicate with each other by message passing.

Scope of Problems –
1) In business enterprises, distributed computing generally has meant putting various steps in business processes at the most efficient places in a network of computers. In the typical transaction using the 3-tier model, user interface processing is done in the PC at the user’s location, business processing is done in a remote computer, and database access and processing is done in another computer that provides centralized access for many business processes. Typically, this kind of distributed computing uses the client/server communications model.

2) More recently, distributed computing is used to refer to any large collaboration in which many individual personal computer owners allow some of their computer’s processing time to be put at the service of a large problem. The best-known example is the SETI@home project, in which individual computer owners can volunteer some of their multitasking processing cycles (while concurrently still using their computer) to the Search for Extraterrestrial Intelligence (SETI) project. This computing-intensive problem uses your computer (and thousands of others) to download and search radio telescope data.

Parallel and distributed computing

Parallel computing is a form of computation in which many calculations are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently (“in parallel”). There are several different forms of parallel computing: bit-level, instruction-level, data, and task parallelism.

Distributed systems are groups of networked computers, which have the same goal for their work. The terms “concurrent computing“, “parallel computing“, and “distributed computing” have a lot of overlap, and no clear distinction exists between them. The same system may be characterised both as “parallel” and “distributed”; the processors in a typical distributed system run concurrently in parallel. Parallel computing may be seen as a particular tightly coupled form of distributed computing, and distributed computing may be seen as a loosely coupled form of parallel computing. Nevertheless, it is possible to roughly classify concurrent systems as “parallel” or “distributed” using the following criteria:

  • In parallel computing, all processors have access to a shared memory. Shared memory can be used to exchange information between processors.
  • In distributed computing, each processor has its own private memory (distributed memory). Information is exchanged by passing messages between the processors.
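The two criteria above can be sketched in Java. This is an illustrative toy (the class and method names are invented): two threads updating a single shared counter model the shared-memory (parallel) style, while two “nodes” that keep private totals and exchange a message model the distributed-memory style.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class MemoryModels {
    // Parallel style: both workers read and write one shared counter.
    static int sharedMemorySum() throws InterruptedException {
        AtomicInteger shared = new AtomicInteger(0);
        Thread a = new Thread(() -> { for (int i = 0; i < 1000; i++) shared.incrementAndGet(); });
        Thread b = new Thread(() -> { for (int i = 0; i < 1000; i++) shared.incrementAndGet(); });
        a.start(); b.start();
        a.join(); b.join();
        return shared.get();
    }

    // Distributed style: each "node" has private memory; the only way to see
    // another node's state is to receive a message from it.
    static int messagePassingSum() throws InterruptedException {
        BlockingQueue<Integer> channel = new ArrayBlockingQueue<>(1);
        Thread sender = new Thread(() -> {
            int privateTotal = 0;                  // visible only to this node
            for (int i = 0; i < 1000; i++) privateTotal++;
            try { channel.put(privateTotal); } catch (InterruptedException ignored) {}
        });
        sender.start();
        int received = channel.take();             // message passing, not shared memory
        sender.join();
        return received + 1000;                    // receiver adds its own private total
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(sharedMemorySum());     // 2000
        System.out.println(messagePassingSum());   // 2000
    }
}
```

Both styles compute the same result; the difference is only in how the two workers exchange information.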

A standard illustration (not reproduced here) shows the difference between distributed and parallel systems. Figure (a) is a schematic view of a typical distributed system; as usual, the system is represented as a network topology in which each node is a computer and each line connecting the nodes is a communication link. Figure (b) shows the same distributed system in more detail: each computer has its own local memory, and information can be exchanged only by passing messages from one node to another by using the available communication links. Figure (c) shows a parallel system in which each processor has direct access to a shared memory.
The situation is further complicated by the traditional uses of the terms parallel and distributed algorithm, which do not quite match the above definitions of parallel and distributed systems. Nevertheless, as a rule of thumb, high-performance parallel computation in a shared-memory multiprocessor uses parallel algorithms, while the coordination of a large-scale distributed system uses distributed algorithms.

Distributed Systems: Networking

The main types of network used to support distributed systems are: personal area networks, local area networks, wide area networks, metropolitan area networks and the wireless variants of them. Internetworks such as the Internet are constructed from networks of all these types.

Personal area networks (PANs) • PANs are a subcategory of local networks in which the various digital devices carried by a user  are connected by a low-cost, low-energy network. Wired PANs are not of much significance because few users wish to be encumbered by a network of wires on their person, but wireless personal area networks (WPANs) are of increasing importance due to the number of  personal devices such as mobile phones, tablets, digital cameras, music players and so on that are now carried by many people.

Local area networks (LANs) • LANs carry messages at relatively high speeds between computers connected by a single communication medium, such as twisted copper wire, coaxial cable or optical fibre. A segment is a section of cable that serves a department or a floor of a building and may have many computers attached. No routing of messages is required within a segment, since the medium provides direct connections between all of the computers connected to it. The total system bandwidth is shared between the computers connected to a segment.

Wide area networks (WANs) • WANs carry messages at lower speeds between nodes that are often in different organizations and  may be separated by large distances. They may be located in different cities, countries or continents. The communication medium is a set of communication circuits linking a set of dedicated computers called routers. They manage the communication network and route  messages or packets to their destinations. In most networks, the routing operations introduce a delay at each point in the route, so the total latency for the transmission of a message depends on the route that it follows and the traffic loads in the various network segments that it traverses. In current networks these latencies can be as high as 0.1 to 0.5 seconds.

Metropolitan area networks (MANs) • This type of network is based on the high-bandwidth copper and fibre optic cabling recently installed in some towns and cities for the transmission of video, voice and other data over distances of up to 50 kilometres. A variety of technologies have been used to implement the routing of data in MANs, ranging from Ethernet to ATM.

Wireless local area networks (WLANs) • WLANs are designed for use in place of wired LANs to provide connectivity for mobile  devices, or simply to remove the need for a wired infrastructure to connect computers within homes and office buildings to each other and the Internet. They are in widespread use in several variants of the IEEE 802.11 standard (WiFi), offering bandwidths of 10–100 Mbps over ranges up to 1.5 kilometres.

Wireless metropolitan area networks (WMANs) • The IEEE 802.16 WiMAX standard is targeted at this class of network. It aims  to provide an alternative to wired connections to home and office buildings and to supersede 802.11 WiFi networks in some applications.

Wireless wide area networks (WWANs) • Most mobile phone networks are based on digital wireless network technologies such as the GSM (Global System for Mobile communication) standard, which is used in most countries of the world. Mobile phone networks are designed to operate over wide areas (typically entire countries or continents) through the use of cellular radio connections; their data transmission facilities therefore offer wide area mobile connections to the Internet for portable devices. The cellular networks mentioned above offer relatively low data rates – 9.6 to 33 kbps – but the ‘third generation’ (3G) of mobile phone networks is now available, with data transmission rates in the range of 2–14.4 Mbps while stationary and 384 kbps while moving (for example in a car). The underlying technology is referred to as UMTS (Universal Mobile Telecommunications System). A path has also been defined to evolve UMTS towards 4G data rates of up to 100 Mbps.

Remote Procedure Calls

Message passing between a pair of processes can be supported by two message communication operations, send and receive, defined in terms of destinations and messages. To communicate, one process sends a message (a sequence of bytes) to a destination and another process at the destination receives the message. This activity involves the communication of data from the sending process to the receiving process and may involve the synchronization of the two processes.

Synchronous and asynchronous communication • A queue is associated with each message destination. Sending processes cause messages to be added to remote queues and receiving processes remove messages from local queues. Communication between the sending and receiving processes may be either synchronous or asynchronous.

In the synchronous form of communication, the sending and receiving processes synchronize at every message. In this case, both send and receive are blocking operations. Whenever a send is issued the sending process (or thread) is blocked until the corresponding receive is issued. Whenever a receive is issued by a process (or thread), it blocks until a message arrives.

In the asynchronous form of communication, the use of the send operation is nonblocking in that the sending process is allowed to proceed as soon as the message has been copied to a local buffer, and the transmission of the message proceeds in parallel with the sending process. The receive operation can have blocking and non-blocking variants. In the non-blocking variant, the receiving process proceeds with its program after issuing a receive operation, which provides a buffer to be filled in the background,
but it must separately receive notification that its buffer has been filled, by polling or interrupt.
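The send and receive variants described above can be sketched with Java’s `BlockingQueue` standing in for the queue associated with a message destination. This is an illustration of the concepts only, not a real network transport; the method names are invented.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class MessagePassing {
    // Asynchronous send: returns as soon as the message is in the destination's buffer.
    static boolean asyncSend(BlockingQueue<String> queue, String msg) {
        return queue.offer(msg);
    }

    // Non-blocking receive: returns null when no message has arrived yet,
    // so the caller must poll again (or be notified) later.
    static String nonBlockingReceive(BlockingQueue<String> queue) {
        return queue.poll();
    }

    // Blocking receive: the caller waits until a message arrives.
    static String blockingReceive(BlockingQueue<String> queue) throws InterruptedException {
        return queue.take();
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> destination = new ArrayBlockingQueue<>(10);

        System.out.println(nonBlockingReceive(destination));   // null: nothing has arrived

        // A sender that transmits after a short simulated network delay.
        Thread sender = new Thread(() -> {
            try {
                TimeUnit.MILLISECONDS.sleep(50);
                destination.put("hello");   // put() would block only if the buffer were full
            } catch (InterruptedException ignored) {}
        });
        sender.start();

        System.out.println(blockingReceive(destination));      // blocks until "hello" arrives
        sender.join();
    }
}
```

A fully synchronous send (blocking until the matching receive is issued) would need an unbuffered rendezvous, which `java.util.concurrent.SynchronousQueue` provides.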

Transaction processing systems

A distributed transaction is an operations bundle in which two or more network hosts are involved. Usually, hosts provide transactional resources, while the transaction manager is responsible for creating and managing a global transaction that encompasses all operations against such resources. Distributed transactions, like any other transactions, must have all four ACID properties, where atomicity guarantees all-or-nothing outcomes for the unit of work (operations bundle).

In computing, the XA standard is a specification by The Open Group for distributed transaction processing (DTP). It describes the interface between the global transaction manager and the local resource manager. The goal of XA is to allow multiple resources (such as databases, application servers, message queues, transactional caches, etc.) to be accessed within the same transaction, thereby preserving the ACID properties across applications. XA uses a two-phase commit to ensure that all resources either commit or roll back any particular transaction consistently (all do the same).

XA with Two-Phase Commit Operation–

This example shows basic two-phase COMMIT functionality for a distributed transaction.
This class includes a createXid() method to form transaction IDs for purposes of this example. It also includes doSomeWork1() and doSomeWork2() methods to perform SQL operations.

// You need to import the java.sql package to use JDBC
import java.sql.*;
import javax.sql.*;
import oracle.jdbc.driver.*;
import oracle.jdbc.pool.*;
import oracle.jdbc.xa.OracleXid;
import oracle.jdbc.xa.OracleXAException;
import oracle.jdbc.xa.client.*;
import javax.transaction.xa.*;

class XA4
{
  public static void main(String[] args) throws SQLException
  {
    try
    {
      String URL1 = "jdbc:oracle:oci8:@";
      String URL2 =
        "jdbc:oracle:thin:@(description=(address=(host=dlsun991)(protocol=tcp)" +
        "(port=5521))(connect_data=(sid=rdbms2)))";

      DriverManager.registerDriver(new OracleDriver());

      // You can put a database name after the @ sign in the connection URL.
      Connection conna = DriverManager.getConnection(URL1, "scott", "tiger");

      // Prepare a statement on the first connection
      Statement stmta = conna.createStatement();

      Connection connb = DriverManager.getConnection(URL2, "scott", "tiger");

      // Prepare a statement on the second connection
      Statement stmtb = connb.createStatement();

      // Create an XADataSource instance for each database
      OracleXADataSource oxds1 = new OracleXADataSource();
      oxds1.setURL("jdbc:oracle:oci8:@");
      oxds1.setUser("scott");
      oxds1.setPassword("tiger");

      OracleXADataSource oxds2 = new OracleXADataSource();
      oxds2.setURL("jdbc:oracle:thin:@(description=(address=(host=dlsun991)" +
                   "(protocol=tcp)(port=5521))(connect_data=(sid=rdbms2)))");
      oxds2.setUser("scott");
      oxds2.setPassword("tiger");

      // Get an XA connection to each underlying data source
      XAConnection pc1 = oxds1.getXAConnection();
      XAConnection pc2 = oxds2.getXAConnection();

      // Get the physical connections
      Connection conn1 = pc1.getConnection();
      Connection conn2 = pc2.getConnection();

      // Get the XA resources
      XAResource oxar1 = pc1.getXAResource();
      XAResource oxar2 = pc2.getXAResource();

      // Create the Xids with the same global id but distinct branch qualifiers
      Xid xid1 = createXid(1);
      Xid xid2 = createXid(2);

      // Start both transaction branches
      oxar1.start(xid1, XAResource.TMNOFLAGS);
      oxar2.start(xid2, XAResource.TMNOFLAGS);

      // Do something with conn1 and conn2
      doSomeWork1(conn1);
      doSomeWork2(conn2);

      // End both branches -- this is required before prepare
      oxar1.end(xid1, XAResource.TMSUCCESS);
      oxar2.end(xid2, XAResource.TMSUCCESS);

      // Phase 1: prepare the resource managers (each votes on the outcome)
      int prp1 = oxar1.prepare(xid1);
      int prp2 = oxar2.prepare(xid2);

      System.out.println("Return value of prepare 1 is " + prp1);
      System.out.println("Return value of prepare 2 is " + prp2);

      // Commit only if every branch voted XA_OK (or was read-only)
      boolean do_commit = true;

      if (!((prp1 == XAResource.XA_OK) || (prp1 == XAResource.XA_RDONLY)))
        do_commit = false;

      if (!((prp2 == XAResource.XA_OK) || (prp2 == XAResource.XA_RDONLY)))
        do_commit = false;

      System.out.println("do_commit is " + do_commit);
      System.out.println("Is oxar1 same as oxar2? " + oxar1.isSameRM(oxar2));

      // Phase 2: commit or roll back each branch that voted XA_OK
      if (prp1 == XAResource.XA_OK) {
        if (do_commit)
          oxar1.commit(xid1, false);
        else
          oxar1.rollback(xid1);
      }

      if (prp2 == XAResource.XA_OK) {
        if (do_commit)
          oxar2.commit(xid2, false);
        else
          oxar2.rollback(xid2);
      }

      // Close connections
      conn1.close();
      conn1 = null;
      conn2.close();
      conn2 = null;

      pc1.close();
      pc1 = null;
      pc2.close();
      pc2 = null;
    } catch (SQLException sqe) {
      sqe.printStackTrace();
    } catch (XAException xae) {
      if (xae instanceof OracleXAException) {
        System.out.println("XA Error is " +
            ((OracleXAException) xae).getXAError());
        System.out.println("SQL Error is " +
            ((OracleXAException) xae).getOracleError());
      }
    }
  }

  static Xid createXid(int bids) throws XAException
  {
    byte[] gid = new byte[1];  gid[0] = (byte) 9;
    byte[] bid = new byte[1];  bid[0] = (byte) bids;
    byte[] gtrid = new byte[64];
    byte[] bqual = new byte[64];
    System.arraycopy(gid, 0, gtrid, 0, 1);  // same global transaction id for both branches
    System.arraycopy(bid, 0, bqual, 0, 1);  // distinct branch qualifiers
    Xid xid = new OracleXid(0x1234, gtrid, bqual);
    return xid;
  }

  // Sketch of the SQL work on the first branch (the table and values are illustrative)
  static void doSomeWork1(Connection conn) throws SQLException
  {
    Statement stmt = conn.createStatement();
    stmt.executeUpdate("insert into emp values (100, 'employee1')");
    stmt.close();
  }

  // Sketch of the SQL work on the second branch (the table and values are illustrative)
  static void doSomeWork2(Connection conn) throws SQLException
  {
    Statement stmt = conn.createStatement();
    stmt.executeUpdate("insert into emp values (200, 'employee2')");
    stmt.close();
  }
}

Distributed File Systems:

NFS

The Sun Network Filesystem (NFS) protocol provides transparent remote access to shared files across networks. The NFS protocol is designed to be portable across different machines, operating systems, network architectures, and transport protocols. This portability is achieved through the use of Remote Procedure Call (RPC) primitives built on top of an eXternal Data Representation (XDR).

External Data Representation (RFC 1014)

The eXternal Data Representation (XDR) standard provides a common way of representing a set of data types over a network.

The specification is written using the RPC data description language. For more information, see RFC 1014, “XDR: External Data Representation Standard”. Although automated RPC/XDR compilers exist to generate server and client “stubs”, NFS does not require their use. Any software that provides equivalent functionality can be used, and if the encoding is exactly the same it can interoperate with other implementations of NFS. XDR is a standard for the description and encoding of data. It is useful for transferring data between different computer architectures, and has been used to communicate data between such diverse machines as the SUN WORKSTATION, VAX, IBM-PC, and Cray.

XDR uses a language to describe data formats. The language can be used only to describe data; it is not a programming language. For each data type in the language we show a general paradigm declaration. Note that angle brackets (< and >) denote variable-length sequences of data and square brackets ([ and ]) denote fixed-length sequences of data.

Integer

An XDR signed integer is a 32-bit datum that encodes an integer in
the range [-2147483648, 2147483647].  The integer is represented in
two’s complement notation.  The most and least significant bytes are
0 and 3, respectively.  Integers are declared as follows:

int identifier;

  (MSB)                   (LSB)
+-------+-------+-------+-------+
|byte 0 |byte 1 |byte 2 |byte 3 |                      INTEGER
+-------+-------+-------+-------+
<------------32 bits------------>
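As a sketch of this encoding: Java’s `ByteBuffer` uses big-endian byte order by default, which matches the XDR layout (byte 0 is the most significant byte). The class and method names below are invented for illustration.

```java
import java.nio.ByteBuffer;

public class XdrInt {
    // Encode a signed 32-bit integer in XDR form: four bytes, big-endian,
    // two's complement, with the MSB in byte 0 and the LSB in byte 3.
    static byte[] encode(int value) {
        return ByteBuffer.allocate(4).putInt(value).array(); // big-endian by default
    }

    static int decode(byte[] data) {
        return ByteBuffer.wrap(data).getInt();
    }

    public static void main(String[] args) {
        // 1 encodes as 00 00 00 01: MSB first.
        for (byte b : encode(1)) System.out.printf("%02x ", b);
        System.out.println();

        // The full range round-trips, including the minimum value.
        System.out.println(decode(encode(-2147483648)));  // -2147483648
    }
}
```

Since XDR enumerations have the same representation as signed integers, the same four-byte encoding applies to enum values such as RED = 2.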

Enumeration

Enumerations have the same representation as signed integers.
Enumerations are handy for describing subsets of the integers.
Enumerated data is declared as follows:

enum { name-identifier = constant, … } identifier;

For example, the three colors red, yellow, and blue could be
described by an enumerated type:

enum { RED = 2, YELLOW = 3, BLUE = 5 } colors;

Stateless Servers

The NFS protocol was intended to be as stateless as possible.  That is, a server should not need to maintain any protocol state
information about any of its clients in order to function correctly.

Inherently stateful operations such as file or record locking and remote execution were implemented as separate services.
File System Model

NFS assumes a file system that is hierarchical, with directories as all but the bottom level of files.  Each entry in a directory (file,
directory, device, etc.) has a string name.  Different operating systems may have restrictions on the depth of the tree or the names
used, as well as using different syntax to represent the “pathname”, which is the concatenation of all the “components” (directory and
file names) in the name.  A “file system” is a tree on a single server (usually a single disk or physical partition) with a specified
“root”.  Some operating systems provide a “mount” operation to make all file systems appear as a single tree, while others maintain a
“forest” of file systems.

http://www.ietf.org/rfc/rfc1094.txt

Server/Client Relationship

The NFS protocol is designed to allow servers to be as simple and
general as possible.  Sometimes the simplicity of the server can be a
problem, if the client wants to implement complicated filesystem
semantics.

For example, some operating systems allow removal of open files.  A
process can open a file and, while it is open, remove it from the
directory.  The file can be read and written as long as the process
keeps it open, even though the file has no name in the filesystem.
It is impossible for a stateless server to implement these semantics.
The client can do some tricks such as renaming the file on remove,
and only removing it on close.

Every NFS client can also potentially be a server, and remote and
local mounted filesystems can be freely intermixed.  This leads to
some interesting problems when a client travels down the directory
tree of a remote filesystem and reaches the mount point on the server
for another remote filesystem.  Allowing the server to follow the
second remote mount would require loop detection, server lookup, and
user revalidation.  Instead, we decided not to let clients cross a
server’s mount point.  When a client does a LOOKUP on a directory on
which the server has mounted a filesystem, the client sees the
underlying directory instead of the mounted directory.

For example, if a server has a file system called “/usr” and mounts
another file system on  “/usr/src”, if a client mounts “/usr”, it
does NOT see the mounted version of “/usr/src”.

Permission Issues

The NFS protocol, strictly speaking, does not define the permission
checking used by servers.  However, it is expected that a server will
do normal operating system permission checking using AUTH_UNIX style
authentication as the basis of its protection mechanism.  The server
gets the client’s effective “uid”, effective “gid”, and groups on
each call and uses them to check permission.  There are various
problems with this method that can be resolved in interesting ways.

Using “uid” and “gid” implies that the client and server share the
same “uid” list.  Every server and client pair must have the same
mapping from user to “uid” and from group to “gid”.  Since every
client can also be a server, this tends to imply that the whole
network shares the same “uid/gid” space.  AUTH_DES (and the next
revision of the NFS protocol) uses string names instead of numbers,
but there are still complex problems to be solved.

Another problem arises due to the usually stateful open operation.
Most operating systems check permission at open time, and then check
that the file is open on each read and write request.  With stateless
servers, the server has no idea that the file is open and must do
permission checking on each read and write call.  On a local
filesystem, a user can open a file and then change the permissions so
that no one is allowed to touch it, but will still be able to write
to the file because it is open.  On a remote filesystem, by contrast,
the write would fail.  To get around this problem, the server’s
permission checking algorithm should allow the owner of a file to
access it regardless of the permission setting.

A similar problem has to do with paging in from a file over the
network.  The operating system usually checks for execute permission
before opening a file for demand paging, and then reads blocks from
the open file.  The file may not have read permission, but after it
is opened it does not matter.  An NFS server can not tell the
difference between a normal file read and a demand page-in read.  To
make this work, the server allows reading of files if the “uid” given
in the call has either execute or read permission on the file.
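The two server-side relaxations described above (the owner override for the open-then-chmod case, and accepting either read or execute permission for READ so that demand paging works) can be modelled in a toy check. All names here are invented for illustration; this is not part of the NFS protocol itself.

```java
public class PermissionCheck {
    // A file has an owner uid and (simplified) permission bits for read and execute.
    record FileAttr(int ownerUid, boolean readBit, boolean execBit) {}

    // A stateless server must re-check permission on every READ call.
    static boolean mayRead(int callerUid, FileAttr f) {
        if (callerUid == f.ownerUid()) {
            return true;                    // owner override: owners always get access
        }
        return f.readBit() || f.execBit();  // read OR execute suffices for READ,
                                            // so paging in an execute-only file works
    }

    public static void main(String[] args) {
        FileAttr execOnly = new FileAttr(1001, false, true);
        System.out.println(mayRead(2002, execOnly));  // true: execute bit allows READ

        FileAttr locked = new FileAttr(1001, false, false);
        System.out.println(mayRead(1001, locked));    // true: owner override
        System.out.println(mayRead(2002, locked));    // false: no permission, not owner
    }
}
```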

http://www.ietf.org/rfc/rfc3530.txt

Overview of NFS version 4 Features

RPC and Security
Kerberos V5
Kerberos is a computer network authentication protocol which works on the basis of “tickets” to allow nodes communicating over a non-secure network to prove their identity to one another in a secure manner. Its designers aimed primarily at a client–server model, and it provides mutual authentication: both the user and the server verify each other’s identity. Kerberos requires a trusted third party (the key distribution centre) and may optionally use public-key cryptography during certain phases of authentication. Kerberos uses port 88 by default.

To enable in-band security negotiation, the NFS version 4 protocol  has added a new operation which provides the client a method of
querying the server about its policies regarding which security  mechanisms must be used for access to the server’s filesystem
resources.  With this, the client can securely match the security  mechanism that meets the policies specified at both the client and
server.

Procedure and Operation Structure

A significant departure from the previous versions of the NFS  protocol is the introduction of the COMPOUND procedure.  For the NFS
version 4 protocol, there are two RPC procedures, NULL and COMPOUND.  The COMPOUND procedure is defined in terms of operations and these operations correspond more closely to the traditional NFS procedures. With the use of the COMPOUND procedure, the client is able to build simple or complex requests.  These COMPOUND requests allow for a reduction in the number of RPCs needed for logical filesystem operations.  For example, without previous contact with a server a client will be able to read data from a file in one request by combining LOOKUP, OPEN, and READ operations in a single COMPOUND RPC.
With previous versions of the NFS protocol, this type of single request was not possible.
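The COMPOUND idea can be sketched as a list of operations evaluated in order. This is a rough illustration of the batching behaviour only (an NFSv4 server evaluates a COMPOUND's operations sequentially and stops at the first failure); it is not the real NFSv4 wire format, and all names are invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class CompoundSketch {
    // An operation has a name and an action that succeeds or fails.
    record Op(String name, Supplier<Boolean> action) {}

    // Evaluate the operations in order; stop at the first failure and
    // return the names of the operations that completed.
    static List<String> evaluate(List<Op> ops) {
        List<String> completed = new ArrayList<>();
        for (Op op : ops) {
            if (!op.action().get()) break;   // remaining operations are not evaluated
            completed.add(op.name());
        }
        return completed;
    }

    public static void main(String[] args) {
        // One COMPOUND request replaces three separate RPCs.
        List<Op> request = List.of(
            new Op("LOOKUP", () -> true),    // resolve the file name
            new Op("OPEN",   () -> true),    // open the file
            new Op("READ",   () -> true));   // read data
        System.out.println(evaluate(request));  // [LOOKUP, OPEN, READ]
    }
}
```

The point of the batching is latency: the client pays one network round trip instead of three for the LOOKUP/OPEN/READ sequence.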
AFS

The Andrew File System (AFS) was developed by Carnegie Mellon University as part of the Andrew Project. It is named after Andrew Carnegie and Andrew Mellon. Its primary use is in distributed computing.

The Andrew Project was a distributed computing environment developed at Carnegie Mellon University (CMU) beginning in 1982.

AFS has several benefits over traditional networked file systems, particularly in the areas of security and scalability. It is not uncommon for enterprise AFS deployments to exceed 25,000 clients. AFS uses Kerberos for authentication, and implements access control lists on directories for users and groups. Each client caches files on the local filesystem for increased speed on subsequent requests for the same file. This also allows limited filesystem access in the event of a server crash or a network outage.

Read and write operations on an open file are directed only to the locally cached copy. When a modified file is closed, the changed portions are copied back to the file server. Cache consistency is maintained by callback mechanism. When a file is cached, the server makes a note of this and promises to inform the client if the file is updated by someone else.
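The callback promise described above can be sketched as follows. This is a toy model of the bookkeeping only, not the actual AFS implementation; the class and method names are invented for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class CallbackSketch {
    // For each file, the set of clients holding a cached copy (a callback promise).
    private final Map<String, Set<String>> callbacks = new HashMap<>();

    // A client fetches a file: the server records a callback promise for it.
    void fetch(String file, String client) {
        callbacks.computeIfAbsent(file, f -> new HashSet<>()).add(client);
    }

    // A client closes a modified copy: the server breaks the callbacks of
    // every other cacher, telling them their cached copy is stale.
    Set<String> storeModified(String file, String writer) {
        Set<String> toNotify = callbacks.getOrDefault(file, new HashSet<>());
        toNotify.remove(writer);                          // the writer has the new version
        callbacks.put(file, new HashSet<>(Set.of(writer)));
        return toNotify;                                  // clients that must invalidate
    }

    public static void main(String[] args) {
        CallbackSketch server = new CallbackSketch();
        server.fetch("/afs/paper.txt", "clientA");
        server.fetch("/afs/paper.txt", "clientB");
        // clientB writes and closes the file; clientA is notified.
        System.out.println(server.storeModified("/afs/paper.txt", "clientB"));  // [clientA]
    }
}
```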
