History of distributed file systems

As a recap, a distributed file system (DFS) is a file system whose files are distributed among multiple file servers. It is important to note that in a distributed file system, the client sees a single, global namespace that encompasses all the files across all the file servers. A DFS requires metadata management so that clients can locate the required files and file blocks across the file servers. Distributed file systems are typically deployed on multiple file-sharing nodes and are intended for use by many users at the same time. As with any shared resource, several design considerations come into play: performance, consistency, fault tolerance, and availability, among others.
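
To make the role of metadata management concrete, here is a minimal toy sketch (not taken from any real system; the class name, block size, and server names are illustrative assumptions) of how a metadata service might map a path in the global namespace to the servers holding each of its blocks:

    # Toy illustration of DFS metadata management (illustrative only; not a real system's API).
    BLOCK_SIZE = 64 * 1024 * 1024  # assumed block size, in bytes

    class MetadataService:
        """Maps each path in the global namespace to its blocks and their locations."""
        def __init__(self):
            self.namespace = {}  # path -> list of (block_id, [server, ...]) in file order

        def add_file(self, path, block_locations):
            # Register a file and the servers holding each of its blocks.
            self.namespace[path] = block_locations

        def locate(self, path, offset):
            # Return (block_id, servers) for the block containing byte `offset`.
            return self.namespace[path][offset // BLOCK_SIZE]

    # Example: a two-block file, each block replicated on three servers.
    meta = MetadataService()
    meta.add_file("/logs/2024-01-01.log", [
        ("blk_0001", ["serverA", "serverB", "serverC"]),
        ("blk_0002", ["serverB", "serverC", "serverD"]),
    ])
    print(meta.locate("/logs/2024-01-01.log", 70 * 1024 * 1024))  # -> ('blk_0002', [...])

A client would consult such a service first, then contact the listed servers directly to read or write the actual blocks.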

Origins and evolution of distributed file systems

File system design has largely been influenced by the UNIX file system and BSD's Fast File System (FFS), both of which are local file systems. Recall that the primary focus of these file systems is to organize data on disk in a manner that is fast and reliable.

Networked file systems such as NFS later emerged to allow users to share files over a network. NFS uses a client-server architecture, in which a server shares the data it holds with a number of clients. It is a simple protocol that is still used to this day for sharing files over a network. In NFS, files cannot be distributed across multiple servers in a coordinated fashion; each server simply shares a set of files. There is no consistent global view of the namespace, either: clients can mount NFS shares anywhere within their local file system tree. Hence, this approach is limited in its ability to scale to thousands of clients and servers, and is largely confined to local area networks (LANs).
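
One way to see the lack of a global namespace is that each client chooses where an NFS export appears in its own directory tree, so the same remote file can end up with different absolute paths on different clients. The mount points below are hypothetical:

    # Illustration: the same NFS export mounted at different local paths on two clients
    # (hypothetical mount points; NFS itself imposes no common namespace).
    import posixpath

    export_relative_path = "reports/q1.txt"   # path of a file within the server's export

    client_a_mount = "/mnt/shared"            # where client A mounted the export
    client_b_mount = "/data/nfs/teamshare"    # where client B mounted the same export

    print(posixpath.join(client_a_mount, export_relative_path))  # /mnt/shared/reports/q1.txt
    print(posixpath.join(client_b_mount, export_relative_path))  # /data/nfs/teamshare/reports/q1.txt
    # Same remote file, two different names -- there is no single global namespace.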

The Andrew File System (AFS) is an early example of a true distributed file system. AFS enables cooperating hosts (clients and servers) to efficiently share file system resources across both local area and wide area networks. AFS is organized into cells, administrative groupings of servers that each present a single cohesive file system; cells can be combined to form a single global namespace. A client that accesses a file in AFS first copies the entire file locally. Changes to the file are made locally for as long as the file is open; when the file is closed, the AFS client syncs the changes back to the server. An evolution of AFS is CODA, a distributed file system that improves on AFS, particularly with respect to sharing semantics and replication. Both AFS and CODA are POSIX compliant, which means that they work with existing UNIX applications without any modifications.
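
This whole-file caching behaviour can be sketched roughly as follows (an illustrative toy, not AFS code; the class and method names are invented): the client copies the entire file from the server on open, serves reads and writes from the local copy, and writes changes back only on close.

    # Simplified sketch of AFS-style whole-file caching (illustrative only; not the AFS client).
    class InMemoryServer:
        """Stand-in for a file server holding whole files."""
        def __init__(self, files):
            self.files = dict(files)
        def fetch(self, path):
            return self.files[path]
        def store(self, path, data):
            self.files[path] = data

    class WholeFileCachingClient:
        """AFS-style session semantics: whole-file cache, write-back on close."""
        def __init__(self, server):
            self.server = server
            self.cache = {}     # path -> locally cached file contents
            self.dirty = set()  # paths modified since open

        def open(self, path):
            # On open, the entire file is copied from the server to the client.
            self.cache[path] = self.server.fetch(path)

        def read(self, path):
            # Reads are served from the local copy.
            return self.cache[path]

        def write(self, path, data):
            # Writes only modify the local copy while the file is open.
            self.cache[path] = data
            self.dirty.add(path)

        def close(self, path):
            # On close, modified files are synced back to the server.
            if path in self.dirty:
                self.server.store(path, self.cache[path])
                self.dirty.discard(path)
            del self.cache[path]

    server = InMemoryServer({"/afs/example/doc.txt": "v1"})
    client = WholeFileCachingClient(server)
    client.open("/afs/example/doc.txt")
    client.write("/afs/example/doc.txt", "v2")
    client.close("/afs/example/doc.txt")         # the change reaches the server here
    print(server.fetch("/afs/example/doc.txt"))  # v2

Note that other clients only observe the new contents after the writer closes the file, which is the essence of AFS's sharing semantics.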

In 2003, Google revealed the design of its distributed file system, the Google File System (GFS),2 which was designed from scratch to provide efficient, reliable access to data using large clusters of commodity hardware. GFS is designed to store very large files as chunks (typically 64 MB in size) that are replicated across multiple servers. Although GFS presents a single client view of the namespace like AFS, the locations of file chunks are exposed to clients, giving them the opportunity to fetch chunks from the closest available replica. GFS, however, is not POSIX compliant, which means that applications have to use a special API to work with it. The Hadoop Distributed File System (HDFS) is an open-source variant of GFS, which we will explore in detail in this module.
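
As a rough illustration of the chunking arithmetic, the snippet below computes which chunk indices a byte range of a large file touches, assuming the 64 MB chunk size mentioned above; the function and the replica lists are purely illustrative:

    # Illustration of GFS/HDFS-style chunking arithmetic (64 MB chunks, as described above).
    CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB
    MB = 1024 * 1024

    def chunks_for_range(offset, length, chunk_size=CHUNK_SIZE):
        """Return the chunk indices that a read of `length` bytes at `offset` touches."""
        first = offset // chunk_size
        last = (offset + length - 1) // chunk_size
        return list(range(first, last + 1))

    # A 200 MB read starting 100 MB into a file spans chunk indices 1 through 4.
    print(chunks_for_range(100 * MB, 200 * MB))  # [1, 2, 3, 4]

    # Because chunk locations are exposed, a client can pick the closest replica per chunk
    # (here simply the first server in a hypothetical, pre-sorted list).
    replicas = {1: ["rack1/srv3", "rack2/srv7"], 2: ["rack1/srv4", "rack3/srv2"]}
    closest = {chunk: servers[0] for chunk, servers in replicas.items()}
    print(closest)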

In 2006, Ceph was first described in a paper by Weil et al.1 Ceph is designed to be a distributed object storage service that can scale to hundreds of thousands of machines while storing petabytes of data. Applications talk to Ceph through various APIs, ranging from a native object API that is similar in spirit to the way GFS works, to a POSIX-compliant file system API called Ceph FS. Ceph also supports a block device abstraction, which makes it suitable for storing virtual machine images.
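
POSIX compliance is what allows such a file system to sit underneath unmodified applications: assuming Ceph FS is mounted at a hypothetical path such as /mnt/cephfs, ordinary file operations work unchanged, whereas the equivalent on GFS would have to go through its special client API.

    # With a POSIX-compliant DFS mounted into the local tree (mount point is hypothetical),
    # applications use the standard file API with no DFS-specific code at all.
    mount_point = "/mnt/cephfs"  # assumed mount point, for illustration only

    with open(f"{mount_point}/projects/notes.txt", "w") as f:
        f.write("stored on the distributed file system\n")

    with open(f"{mount_point}/projects/notes.txt") as f:
        print(f.read())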

Google has since evolved GFS into a system known as Colossus.3


References

  1. Weil, S. A., Brandt, S. A., Miller, E. L., Long, D. D. E., & Maltzahn, C. (2006). Ceph: A scalable, high-performance distributed file system. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI), 307–320.
  2. Ghemawat, S., Gobioff, H., & Leung, S.-T. (2003). The Google file system. In Proceedings of the 19th ACM Symposium on Operating Systems Principles (SOSP).
  3. McKusick, K., & Quinlan, S. (2010, March). GFS: Evolution on fast-forward. Communications of the ACM.