Summary
- The Hadoop Distributed File System (HDFS) is an open-source distributed file system modeled on the Google File System (GFS).
- HDFS is designed to run on a cluster of nodes and supports the MapReduce programming model by providing a distributed file system (DFS) for the model's I/O data.
- HDFS has a common, cluster-wide namespace; is able to store large files; is optimized for write-once, read-many access; and is designed to provide high availability in the presence of node failures.
- HDFS follows a primary-secondary topology; the NameNode handles the metadata, and the data is stored on the DataNodes.
- Files in HDFS are split into blocks (also called chunks), with a default size of 128 MB.
- Each block is replicated across the cluster, three times by default (the replication factor).
- HDFS assumes a tree-style cluster topology, optimizes file access to improve performance, and attempts to place block replicas across racks.
- The original HDFS design follows immutable semantics and does not allow existing files to be opened for writes. Newer versions of HDFS support file appends.
- HDFS is strongly consistent because a file write is marked complete only after all the replicas have been written.
- The NameNode tracks DataNode health using a heartbeat mechanism; DataNodes that stop responding are marked dead, and the blocks they held are re-replicated on other nodes to restore the desired replication factor.
- The NameNode is a single point of failure (SPOF) in the original HDFS design. A secondary NameNode can be designated to periodically copy metadata from the primary NameNode but does not provide full failover redundancy.
- HDFS provides high bandwidth for MapReduce, high reliability, low costs per byte, and good scalability.
- HDFS is inefficient with small files (owing to the large default block size), is non-POSIX-compliant, and does not allow existing file data to be overwritten, although newer versions support appends.
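As a concrete example of the block size and replication factor described above, both are configured in `hdfs-site.xml` via the standard `dfs.blocksize` and `dfs.replication` properties (the values shown here are the defaults):

```xml
<!-- hdfs-site.xml: cluster-wide defaults for block size and replication -->
<configuration>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value> <!-- 128 MB, the default block size -->
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value> <!-- default replication factor -->
  </property>
</configuration>
```

Both settings can also be overridden per file at write time, so a cluster can mix, say, cold archival data at a lower replication factor with hot data at the default.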
- Ceph is a storage system designed for cloud applications. Ceph is based on a distributed object store with services layered on top of it.
- At the core of Ceph is RADOS, a self-managing, fault-tolerant, and scalable cluster of object storage daemons (OSDs) and monitor nodes.
- An object in RADOS is hashed to a placement group, which is then mapped to a set of OSDs using the CRUSH algorithm.
- RADOS can be accessed through librados, a native client library with bindings for several programming languages.
- Applications can also access RADOS data as objects through the RADOS Gateway, which supports the S3 and Swift protocols over a REST interface.
- RADOS can also export block devices through the RADOS Block Device (RBD) service; these can be used as disk images for virtual machines.
- CephFS is a filesystem layered over RADOS. It is implemented with dedicated metadata servers that track filesystem metadata and dynamically partition the filesystem tree among themselves; metadata updates are also journaled to RADOS for fault tolerance.
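The two-step placement described above (object → placement group → OSDs) can be sketched in a few lines of Python. This is a deliberately simplified stand-in: the real CRUSH algorithm also accounts for cluster topology, device weights, and failure domains, and the pool size, OSD names, and hash choice here are assumptions for illustration only.

```python
import hashlib

NUM_PGS = 64                            # assumed number of placement groups in the pool
OSDS = [f"osd.{i}" for i in range(8)]   # hypothetical OSD daemons
REPLICAS = 3                            # replicas per placement group

def object_to_pg(obj_name: str) -> int:
    """Step 1: hash the object name to a placement group."""
    h = int(hashlib.md5(obj_name.encode()).hexdigest(), 16)
    return h % NUM_PGS

def pg_to_osds(pg: int) -> list[str]:
    """Step 2: map a placement group to an ordered set of distinct OSDs.

    A toy stand-in for CRUSH; the real algorithm is a deterministic
    pseudo-random walk over the cluster map, not simple modular striping.
    """
    start = pg % len(OSDS)
    return [OSDS[(start + i) % len(OSDS)] for i in range(REPLICAS)]

pg = object_to_pg("my-object")
print(f"object 'my-object' -> PG {pg} -> OSDs {pg_to_osds(pg)}")
```

The key property this sketch shares with CRUSH is that placement is computed, not looked up: any client with the same cluster map can locate an object's OSDs without consulting a central directory.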