Ceph: A Distributed Storage System with Two Decades of Innovation

Ceph is celebrating 20 years of continuous evolution, solidifying its place as one of the most powerful open-source storage solutions. Initially created in 2004 by Sage Weil during his Ph.D. research at the University of California, Santa Cruz, Ceph has grown from an academic project into a global, community-driven platform trusted by enterprises and cloud providers alike. Its journey from academic origins to its current position in the data storage world is a testament to its innovative architecture and adaptability.

What is Ceph?

Ceph is an open-source, distributed storage platform designed to provide scalable and highly reliable storage for cloud computing and data-intensive applications. It is often used in private and public cloud environments due to its versatility and deep integration with projects such as OpenStack, Proxmox and Kubernetes.

Ceph’s architecture is fundamentally distributed, meaning data is always spread across multiple servers, or nodes, within a cluster. This setup ensures high availability, fault tolerance, and scalability, enabling organizations to manage ever-growing amounts of data with minimal administrative overhead.

The diagram below shows some of the clients and protocols that Ceph supports.

What’s inside the Ceph Cluster?

A Ceph cluster consists of multiple storage nodes (a minimum of 3, with 10 or more recommended) that run so-called Object Storage Devices (OSDs), which are essentially disk drives that store and manage data. These OSDs work in unison to replicate data, manage failures, and distribute workloads efficiently, and they can be backed by any kind of drive – HDD, SSD, or NVMe. The cluster also includes monitoring and management components, known as Monitors (MONs), which maintain the health of the cluster and ensure data consistency. This combination of OSDs and MONs, along with other components like the Metadata Servers (MDS) for CephFS, creates a robust, scalable system that can grow seamlessly as more storage nodes are added.

The diagram below shows the following components:

  • OSD Nodes – storage hosts with multiple disk drives where data is saved. You typically have 4-10 drives per host with one OSD process per drive. Each OSD also communicates with MONs and MGRs.


  • Monitors (MONs) – services that maintain a master copy of the storage cluster map. All clients contact a Ceph Monitor to retrieve the current cluster map, which is what enables them to bind to storage pools and read and write data. It is recommended to run at least 3 Monitor daemons per cluster.


  • Managers (MGRs) – daemons that provide additional monitoring and external interfaces. They keep track of runtime metrics, cluster utilization, and load, and they also provide the Ceph Dashboard. A minimum of 2 Managers is required, with 3 recommended for high availability.


  • RADOS-GW (Gateway) – a RESTful interface for object storage. RADOS-GW supports both the S3 and Swift APIs, and both can be used simultaneously, even against the same objects stored in the cluster (a short S3 client sketch follows this list).


  • iSCSI-GW (Gateway) – exports RADOS Block Device (RBD) images as SCSI disks. The iSCSI protocol allows clients (initiators) to send SCSI commands to storage devices (targets) over a TCP/IP network stack. This is useful when you have clients that don’t support Ceph block storage natively.


  • Metadata Servers (MDS) – manage metadata for files and directories stored on the Ceph File System (CephFS).
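
For a quick feel of the gateway, any standard S3 client works against RADOS-GW. Below is a minimal sketch using Python and boto3; the endpoint URL, bucket name, and credentials are placeholders standing in for a user created on your own gateway (for example with radosgw-admin).

```python
import boto3

# Minimal sketch: talk to RADOS-GW over its S3-compatible API.
# Endpoint, bucket name, and keys are placeholders for your own gateway.
s3 = boto3.client(
    "s3",
    endpoint_url="http://rgw.example.com:8080",  # your RADOS-GW endpoint (assumption)
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

s3.create_bucket(Bucket="demo-bucket")
s3.put_object(Bucket="demo-bucket", Key="hello.txt", Body=b"Hello from Ceph RGW")

obj = s3.get_object(Bucket="demo-bucket", Key="hello.txt")
print(obj["Body"].read())  # b'Hello from Ceph RGW'
```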

Key Features of Ceph

  1. Unified Storage: Object, Block, and File Storage

Ceph is unique in its ability to provide object storage, block storage, and file storage – all within a single platform and at the same time. This flexibility makes it suitable for various use cases, from cloud storage solutions to virtual machine hosting and file/object sharing. For object storage, the Ceph Object Gateway provides a RESTful S3-compatible interface used by many applications for backups and content distribution. RBD (RADOS Block Device) offers block-level storage, ideal for virtualized environments, while CephFS provides a distributed file system accessible from Linux and Windows clients.
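
To illustrate the block-storage side, here is a minimal sketch using the librbd Python bindings to create a small RBD image and write to it. The ceph.conf path, the pool name "rbd", and the image name are assumptions for a test cluster.

```python
import rados
import rbd

# Connect to the cluster via a local ceph.conf (path is an assumption).
cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("rbd")  # pool name is an assumption

# Create a 1 GiB RBD image and do a tiny write/read round trip.
rbd.RBD().create(ioctx, "demo-image", 1 * 1024**3)
with rbd.Image(ioctx, "demo-image") as image:
    image.write(b"hello block device", 0)  # write at offset 0
    print(image.read(0, 18))               # read the same bytes back

ioctx.close()
cluster.shutdown()
```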

  2. RADOS (Reliable Autonomic Distributed Object Store)

At the core of Ceph is RADOS, a highly reliable and self-healing object store. RADOS is responsible for distributing, replicating, and recovering data across the cluster, maintaining data integrity automatically without manual intervention. This has made Ceph a go-to choice for mission-critical workloads that demand high reliability and redundancy.
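
As a minimal sketch of how an application talks to RADOS directly, the snippet below uses the librados Python bindings to store and read back a single object; the ceph.conf path and pool name are assumptions, and replication and recovery of the stored object happen transparently according to the pool’s rules.

```python
import rados

# Connect using a local ceph.conf and the default keyring (paths are assumptions).
cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
print("cluster fsid:", cluster.get_fsid())

ioctx = cluster.open_ioctx("mypool")  # the pool must already exist (assumption)
ioctx.write_full("greeting", b"stored by RADOS")  # replicated per the pool's rules
print(ioctx.read("greeting"))         # b'stored by RADOS'

ioctx.close()
cluster.shutdown()
```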

  3. CRUSH Algorithm

Ceph uses the Controlled Replication Under Scalable Hashing (CRUSH) algorithm to manage data placement across the cluster. Unlike traditional methods that rely on a centralized table to track data locations, CRUSH dynamically calculates where data should reside based on user-defined rules. This eliminates bottlenecks and ensures efficient use of storage resources across large and complex infrastructures.
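
The toy sketch below illustrates the underlying idea – any client can compute an object’s location deterministically, with no lookup table to consult. It is deliberately simplified and is not the real CRUSH algorithm, which also walks a hierarchy of failure domains (hosts, racks, rooms) and applies the user-defined rules mentioned above.

```python
import hashlib

# Toy illustration of table-free, deterministic placement (NOT real CRUSH).
def place(object_name: str, num_pgs: int, replicas: int, num_osds: int):
    # 1. Hash the object name to a placement group (PG).
    pg = int(hashlib.md5(object_name.encode()).hexdigest(), 16) % num_pgs
    # 2. Pseudo-randomly pick distinct OSDs for that PG.
    chosen, i = [], 0
    while len(chosen) < replicas:
        osd = int(hashlib.md5(f"{pg}:{i}".encode()).hexdigest(), 16) % num_osds
        if osd not in chosen:
            chosen.append(osd)
        i += 1
    return pg, chosen

# Every client computes the same answer without asking a central table.
print(place("hello.txt", num_pgs=128, replicas=3, num_osds=12))
```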

  4. Scalability and Fault Tolerance

Ceph is designed to scale effortlessly. As storage needs increase, additional nodes can be added to the Ceph cluster, enhancing both capacity and performance. Moreover, Ceph’s self-healing capabilities allow it to detect failures and redistribute data without human intervention, ensuring minimal downtime and consistent performance.

  5. BlueStore Storage Engine

In its 12th release, known as “Luminous,” Ceph introduced the BlueStore storage engine. This innovation significantly improved performance by bypassing traditional file systems, allowing Ceph to manage storage devices such as HDDs, SSDs, and NVMe drives directly.

BlueStore uses an embedded RocksDB key/value store to manage metadata and verifies checksums on every read, so only validated data is returned to the client. This architecture boosts efficiency and minimizes performance overhead.
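
As a rough conceptual sketch (not BlueStore’s actual implementation), the snippet below mimics the checksum-on-read idea, with a plain dictionary standing in for the RocksDB metadata store and zlib.crc32 standing in for BlueStore’s default crc32c.

```python
import zlib

metadata = {}    # object name -> expected checksum (RocksDB stand-in)
data_store = {}  # object name -> raw bytes (on-disk blocks stand-in)

def write(name: str, payload: bytes) -> None:
    data_store[name] = payload
    metadata[name] = zlib.crc32(payload)  # store checksum alongside metadata

def read(name: str) -> bytes:
    payload = data_store[name]
    if zlib.crc32(payload) != metadata[name]:
        raise IOError(f"checksum mismatch for {name}: data is corrupt")
    return payload  # only verified data is returned to the caller

write("block-42", b"important bytes")
print(read("block-42"))
```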

  6. Community-Driven and Open Source

Ceph is backed by a vibrant open-source community, with contributions from developers worldwide. Its development has been supported by institutions such as Lawrence Livermore National Laboratory and companies like Red Hat, which acquired Inktank, the company behind Ceph, in 2014. In 2018, the Linux Foundation launched the Ceph Foundation, further bolstering the project’s growth and industry-wide support.

Ceph’s Early Days and Evolution

Ceph’s origins trace back to Sage Weil’s quest to develop a scalable, distributed file system that could handle high-performance computing (HPC) workloads. The initial design of Ceph, focused on pushing intelligence to the edges of the system, contrasted sharply with the centralized management approach taken by other storage systems like Lustre and Google File System. Weil’s approach allowed Ceph to avoid single points of failure and achieve a high level of autonomy. 

Throughout its first decade, Ceph gained traction in academic and research institutions, but its big break came when DreamHost, a web hosting company co-founded by Weil, supported its development between 2007 and 2011. During this period, Ceph became more stable, new features were introduced, and a roadmap for broader adoption was established. Red Hat’s acquisition of Inktank in 2014 helped Ceph become production-ready for enterprise use, providing professional support and further accelerating its development. Today, the largest Ceph clusters store over 50 PB of data, and Ceph serves as the software-defined storage layer in more than 50% of OpenStack installations, demonstrating a high degree of software maturity.

A Game-Changer in the Storage World

Ceph’s distinctive features, such as its dynamic distributed metadata management (using Dynamic Subtree Partitioning) and the CRUSH algorithm, set it apart from other storage solutions. By allowing data replication and recovery to happen autonomously at the storage node level, Ceph ensures faster recovery times and reduces wasted capacity by avoiding the need for dedicated spare disks.

This approach has made Ceph a trusted solution for industries requiring high data availability and rapid recovery from disk and storage node failures.

Looking Ahead: What’s next for Ceph?

As Ceph enters its third decade, its relevance continues to grow, particularly in the realms of AI, machine learning, and cloud infrastructure. The Ceph Foundation, under the continued guidance of Weil and a committed community, is focused on positioning Ceph as a key player in these evolving fields. Ceph’s ability to scale, coupled with its robust and self-healing architecture, ensures it will remain a cornerstone of modern data storage solutions.

Ceph is a pioneering approach to distributed storage that has not only stood the test of time but is poised to help drive the future of storage technology.

At Cloudification, we trust Ceph as one of the essential components in our cloud solution – c12n. Its scalability, flexibility, and reliability make it an indispensable part of our open-source cloud distribution.

Whether you’re considering Ceph for your organization or exploring open-source alternatives to VMware, our team at Cloudification is here to help. Contact us to learn how we can assist with your Ceph deployment, or explore c12n.cloud for an end-to-end secure cloud environment free of vendor lock-in.