LITERATURE REVIEW - CHUBBY SYNCHRONIZATION

Abstract

This review analyzes Google's Chubby distributed lock service. It attempts to understand Chubby and to compare it with traditional distributed lock services. The Chubby lock service is intended to provide coarse-grained locking as well as reliable (though low-volume) storage for a loosely-coupled distributed system.

Introduction

A distributed lock manager (DLM) provides distributed software applications with a means to synchronize their accesses to shared resources. DLMs have been used as the foundation for several successful clustered file systems, in which the machines in a cluster can use each other's storage via a unified file system, with significant advantages for performance and availability. The main performance benefit comes from solving the problem of disk cache coherency between participating computers. The DLM is used not only for file locking but also for coordination of all disk access.

Chubby is a distributed lock service intended for coarse-grained synchronization of activities within Google's distributed systems; it has found wider use as a name service and repository for configuration information. Its design is based on well-known ideas that have meshed well: distributed consensus among a few replicas for fault tolerance, consistent client-side caching to reduce server load while retaining simple semantics, timely notification of updates, and a familiar file system interface.

Chubby provides an interface much like a distributed file system with advisory locks, but the design emphasis is on availability and reliability, as opposed to high performance. The purpose of the lock service is to allow its clients to synchronize their activities and to agree on basic information about their environment.

Overview of Chubby

The primary goals of Chubby are:

  • Reliability and availability to a moderately large set of clients
  • Easy-to-understand semantics

Throughput and storage capacity were considered secondary.

The paper mentions that Chubby's client interface is similar to that of a simple file system that performs whole-file reads and writes, augmented with advisory locks and with notification of various events such as file modification.
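To make the shape of this interface concrete, here is a minimal in-memory sketch. It is my own toy model and not Chubby's client API: the class name ToyCell, its method names, and the path "/ls/demo/config" are all assumptions chosen for illustration. It shows whole-file reads and writes, and a lock that is advisory, meaning it conflicts only with other lock attempts and does not block file access.

```python
class ToyCell:
    """Hypothetical in-memory stand-in for a lock-service cell (not the real API)."""

    def __init__(self):
        self.files = {}        # path -> bytes (entire file contents)
        self.lock_holder = {}  # path -> id of the client holding the lock

    def write(self, path, data):
        # Whole-file write: the complete contents are replaced in one step.
        self.files[path] = bytes(data)

    def read(self, path):
        # Whole-file read: there is no seek or partial read.
        return self.files[path]

    def try_acquire(self, path, client):
        # Advisory lock: it only conflicts with other attempts to lock.
        holder = self.lock_holder.get(path)
        if holder is None or holder == client:
            self.lock_holder[path] = client
            return True
        return False

    def release(self, path, client):
        if self.lock_holder.get(path) == client:
            del self.lock_holder[path]


cell = ToyCell()
cell.write("/ls/demo/config", b"v1")
got_lock_a = cell.try_acquire("/ls/demo/config", "client-a")
got_lock_b = cell.try_acquire("/ls/demo/config", "client-b")
# The write below succeeds even though client-a holds the lock,
# because the lock is advisory rather than mandatory.
cell.write("/ls/demo/config", b"v2")
```

In this sketch the second client fails to take the lock, yet nothing stops a write to the locked file; cooperating clients are expected to check the lock themselves.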

Before Chubby, Google used ad hoc methods for primary election (when work could be duplicated without harm), or required operator intervention (when correctness was essential). Deploying Chubby improved the availability of systems and removed the need for human intervention when a node failed.

The paper does not present any new algorithms or techniques. Instead, the design looks at the lock service from a developer's point of view and aims to make it easy for developers to adopt. Development typically starts with a prototype that handles little load, so the code may make no provision for the unavailability of a system. Availability becomes a concern only once the service gains clients.

These are the two key design decisions:

• A lock service, as opposed to a library or service for consensus

• To serve small files, which permits elected primaries to advertise themselves and their parameters
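The two decisions above combine naturally in primary election, which can be sketched as follows. This is an illustration under my own assumptions, not code from the paper: TinyLockService, campaign, and find_primary are hypothetical names, and the service is a single in-process object rather than a replicated cell. Each candidate tries to take the same lock; the winner writes its address into the small file, and everyone else reads that file to find the primary.

```python
class TinyLockService:
    """Hypothetical single-process stand-in for a lock service with small files."""

    def __init__(self):
        self.holders = {}   # path -> name of the lock holder
        self.contents = {}  # path -> small file contents

    def try_acquire(self, path, client):
        if path not in self.holders:
            self.holders[path] = client
            return True
        return self.holders[path] == client

    def set_contents(self, path, data):
        self.contents[path] = data

    def get_contents(self, path):
        return self.contents.get(path)


def campaign(service, path, name, address):
    """Try to become primary; on success, advertise the address in the file."""
    if service.try_acquire(path, name):
        service.set_contents(path, address)
        return True
    return False


def find_primary(service, path):
    """Clients locate the primary by reading the small file."""
    return service.get_contents(path)


service = TinyLockService()
a_won = campaign(service, "/ls/demo/primary", "server-a", "10.0.0.1:8000")
b_won = campaign(service, "/ls/demo/primary", "server-b", "10.0.0.2:8000")
primary_address = find_primary(service, "/ls/demo/primary")
```

The point of the design choice is visible here: because the lock and the advertisement live in the same small file, no second service is needed to tell clients who won.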

The design decisions have been taken from observations and the expected use of the environment.

The expected use of the environment is as follows:

• To enable thousands of clients to observe the file through which a primary advertises itself, preferably without needing many servers.

• Need for an event notification mechanism to avoid polling.

• Caching of files, since many clients will poll files periodically even when they do not need to.

• Consistent caching, because developers are confused by non-intuitive caching semantics.

• Necessity for security mechanisms, including access control.

Chubby has two main components that communicate via RPC: a server, and a library that client applications link against.

A Chubby cell consists of a small set of servers known as replicas, placed so as to reduce the likelihood of correlated failure. The replicas use a distributed consensus protocol to elect a master. The master must obtain votes from a majority of the replicas, plus promises that those replicas will not elect a different master for an interval of a few seconds known as the master lease. The master lease is periodically renewed by the replicas provided the master continues to win a majority of the vote.
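The voting rule and the master lease can be illustrated with a short sketch. This is a deliberate simplification of my own and not the consensus protocol Chubby actually runs: the Replica class, try_become_master, and the five-second lease are assumptions, and time is passed in explicitly instead of being read from a clock. A replica that has promised its vote refuses other candidates until the promise expires, and a candidate becomes master only with a strict majority.

```python
class Replica:
    """Hypothetical replica that remembers which candidate it promised to."""

    def __init__(self, name):
        self.name = name
        self.promised_to = None
        self.promise_expires = 0.0

    def request_vote(self, candidate, now, lease_seconds):
        # Grant the vote if there is no live promise to a different candidate.
        free = self.promised_to is None or now >= self.promise_expires
        if free or self.promised_to == candidate:
            self.promised_to = candidate
            self.promise_expires = now + lease_seconds  # the master lease
            return True
        return False


def try_become_master(candidate, replicas, now, lease_seconds=5.0):
    """Return True if the candidate wins votes from a strict majority."""
    votes = sum(r.request_vote(candidate, now, lease_seconds) for r in replicas)
    return votes > len(replicas) // 2


replicas = [Replica(f"replica-{i}") for i in range(5)]
a_elected = try_become_master("A", replicas, now=0.0)    # A wins all votes
b_blocked = try_become_master("B", replicas, now=1.0)    # lease still held by A
a_renewed = try_become_master("A", replicas, now=3.0)    # renewal extends to t=8
b_too_early = try_become_master("B", replicas, now=6.0)  # renewed lease blocks B
b_elected = try_become_master("B", replicas, now=8.0)    # lease has expired
```

The sketch ignores message loss, clock drift, and partially granted votes, all of which a real consensus protocol has to handle.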

Clients can subscribe to a range of events, which include:

• File contents modified

• Child node added, removed, or modified; used to implement mirroring.

• Chubby master failed over; this warns clients that other events may have been lost, so data must be rescanned.

• A handle (and its lock) has become invalid; this typically suggests a communications problem.

• Lock acquired; can be used to determine when a primary has been elected.

• Conflicting lock request from another client; allows the caching of locks.
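The event list above maps onto a simple publish and subscribe pattern, sketched below. The names Event, EventBus, and LocalCopyClient are hypothetical and mine; this is not Chubby's callback mechanism or its internal cache, only an illustration of how a client might react to two of the events. A modification event discards the local copy of one file, while a master fail-over discards everything, since other events may have been lost.

```python
from collections import defaultdict
from enum import Enum, auto


class Event(Enum):
    FILE_CONTENTS_MODIFIED = auto()
    CHILD_NODE_CHANGED = auto()
    MASTER_FAILED_OVER = auto()
    HANDLE_INVALID = auto()
    LOCK_ACQUIRED = auto()
    CONFLICTING_LOCK_REQUEST = auto()


class EventBus:
    """Hypothetical dispatcher delivering events to subscribed callbacks."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event, callback):
        self._subscribers[event].append(callback)

    def publish(self, event, path=None):
        for callback in list(self._subscribers[event]):
            callback(event, path)


class LocalCopyClient:
    """Client that keeps local copies of files and reacts to events."""

    def __init__(self, bus):
        self.local = {}  # path -> last contents this client read
        bus.subscribe(Event.FILE_CONTENTS_MODIFIED, self._on_modified)
        bus.subscribe(Event.MASTER_FAILED_OVER, self._on_failover)

    def _on_modified(self, event, path):
        self.local.pop(path, None)  # re-read this one file on next use

    def _on_failover(self, event, path):
        self.local.clear()  # events may have been lost: rescan everything


bus = EventBus()
client = LocalCopyClient(bus)
client.local = {"/ls/demo/a": b"1", "/ls/demo/b": b"2"}
bus.publish(Event.FILE_CONTENTS_MODIFIED, "/ls/demo/a")
remaining_after_modify = sorted(client.local)
bus.publish(Event.MASTER_FAILED_OVER)
remaining_after_failover = dict(client.local)
```

Subscribing in this way is what removes the need for clients to poll the server for changes.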

Chubby uses caching, protocol-conversion servers, and simple load adaptation to allow it to scale to tens of thousands of client processes per Chubby instance.

Backups provide both disaster recovery and a means for initializing the database of a newly replaced replica without placing load on replicas that are in service. Every few hours, the master of each Chubby cell writes a snapshot of its database to a GFS file server.

Because there is just one master per cell, and its machine is identical to those of the clients, the clients can overwhelm the master by a huge margin. Thus, the most effective scaling techniques reduce communication with the master by a significant factor. Chubby uses partitioning and proxies to improve scalability.
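As I read the paper, partitioning assigns each node to a partition by hashing the name of its parent directory and taking the result modulo the number of partitions, so that all files in one directory are served by the same partition. The sketch below illustrates that rule. The function name partition_for and the choice of SHA-256 are my assumptions; the paper does not say which hash function is used.

```python
import hashlib
import posixpath


def partition_for(path, num_partitions):
    """Pick a partition from the hash of the node's parent directory.

    SHA-256 is an arbitrary stand-in for the unspecified hash function.
    """
    directory = posixpath.dirname(path)
    digest = hashlib.sha256(directory.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


NUM_PARTITIONS = 4  # hypothetical cell split into four partitions
p_config = partition_for("/ls/demo/service/config", NUM_PARTITIONS)
p_primary = partition_for("/ls/demo/service/primary", NUM_PARTITIONS)
```

Both paths share the parent directory "/ls/demo/service", so they land on the same partition, and most client calls then involve only one partition's master.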

Analysis

The abstraction and interface presented to clients by Chubby is novel. It would have been possible to give clients a more lock-oriented view, but this method allows Chubby to be more flexible and programmable, while still providing the services it set out to provide.

The result of this paper is a design for a distributed lock manager that can support a wide range of services and that works well for them. In addition, this paper provides a lot of useful information about how applications will use a lock service and also, how they won't use it.

The paper reports no controlled experimental results, but the fact that Chubby supports a large number of clients and has been at the core of some of Google's most used services suggests that it works quite well.

Chubby brings together existing techniques. It relies primarily on Paxos and on prior work in distributed file systems. To compare Chubby to other similar lock services one would need to set up controlled experiments and test how long it takes to obtain and release locks, how usable the API is (perhaps with a user study), how stable the service is, and how well it recovers from failures.

Chubby borrows many of its ideas from distributed file systems, such as caching. It also utilizes the idea that many objects may be represented as a file, which is an idea present in Unix. The idea of providing a lock service is present in VMS.

Because the techniques Chubby uses are all well known and already proven, the system as a whole is more reliable.

Chubby appears as a piece of data center architecture research; its social research into how well programmers understand the concepts and how they use them is useful as well.

Chubby locks are heavier-weight, and need sequencers to allow external resources to be protected safely.
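The role of a sequencer can be shown with a small sketch. In the paper, a sequencer describes the state of a lock just after acquisition: the lock's name, the mode in which it was acquired, and a lock generation number. Everything else below is my own assumption for illustration, including the class names, the is_valid check, and the simplification that every acquire takes the lock over from the previous holder. The external resource rejects any request that carries a sequencer from an older generation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sequencer:
    lock_name: str
    mode: str        # "exclusive" or "shared"
    generation: int  # increases each time the lock changes hands


class ToyLockService:
    """Hypothetical lock service that hands out sequencers."""

    def __init__(self):
        self.generation = {}  # lock name -> current generation number

    def acquire(self, lock_name, mode="exclusive"):
        # Simplification: each acquire replaces the previous holder.
        gen = self.generation.get(lock_name, 0) + 1
        self.generation[lock_name] = gen
        return Sequencer(lock_name, mode, gen)

    def is_valid(self, seq):
        return self.generation.get(seq.lock_name) == seq.generation


class ProtectedResource:
    """External resource that checks the sequencer before accepting a write."""

    def __init__(self, lock_service):
        self.lock_service = lock_service
        self.value = None

    def write(self, seq, value):
        if not self.lock_service.is_valid(seq):
            raise PermissionError("stale sequencer: lock has changed hands")
        self.value = value


locks = ToyLockService()
resource = ProtectedResource(locks)
old_seq = locks.acquire("/ls/demo/lock")  # first holder
new_seq = locks.acquire("/ls/demo/lock")  # lock passes to a second holder
resource.write(new_seq, "from current holder")
try:
    resource.write(old_seq, "from delayed old holder")
    stale_rejected = False
except PermissionError:
    stale_rejected = True
```

This is why a delayed request from a former lock holder cannot corrupt the resource: its sequencer no longer matches the current generation.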

Acknowledgements

I would like to acknowledge that I referred to many online reviews to help me understand the Chubby lock service, and I have included a few opinions that substantiated my understanding.

References

  1. Mike Burrows, The Chubby lock service for loosely-coupled distributed systems, Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation, p.24-24, November 06-08, 2006, Seattle, WA
  2. Wikipedia