Distributed data storage subsystem

1. Overview

The subsystem provides object (S3-compatible), block, and file storage in the Platform.

2. Subsystem functions

  • Provision of object, block, and file storage

  • Storage monitoring

  • Scaling and disaster recovery of object, block, and file storage

3. Subsystem technical design

The following diagram shows the components that comprise the Distributed data storage subsystem and their interaction with other subsystems.

Diagram: distributed data storage.drawio

You can find more detailed information on the Ceph and Rook components in the Ceph architecture and Rook architecture documentation.

The most widely used storage types on the Platform are:

  • Object Storage — a scalable, distributed storage system for unstructured data, accessible via an S3-compatible API (see the example below);

  • Block Storage — high-performance, distributed block storage for virtual servers or containers.
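
For example, services reach the object storage through standard S3 client libraries. The sketch below is illustrative only: the RGW endpoint, credentials, and object key are assumptions, and the real values are supplied to services by the Platform configuration.

```python
import boto3

# Illustrative sketch of S3-compatible access to the object storage (Ceph RGW).
# The endpoint, credentials, and object key are placeholder assumptions; the
# Platform provides the real values via configuration and secrets.
s3 = boto3.client(
    "s3",
    endpoint_url="https://ceph-rgw.example.internal",  # RGW endpoint (assumption)
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Store a digital document and read it back.
s3.put_object(Bucket="lowcode-file-storage", Key="documents/sample.pdf", Body=b"%PDF-...")
obj = s3.get_object(Bucket="lowcode-file-storage", Key="documents/sample.pdf")
print(len(obj["Body"].read()), "bytes")
```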

4. Subsystem glossary

RGW

RADOS Gateway, the object storage interface of Ceph. It provides a RESTful API for storing and retrieving data in Ceph clusters.

OSD

Object Storage Device, a fundamental component of the Ceph storage system. It is responsible for receiving and storing data on physical devices in the Ceph cluster. Each OSD controls its own local storage, and data is distributed between several OSDs in the cluster to ensure reliability and fault tolerance.

OCS

OpenShift Container Storage - a data storage solution based on Ceph technology that integrates with the OpenShift container orchestration platform.

MDS

Metadata Server - a CephFS component that manages file system metadata.

CephFS

Ceph File System.

CSI

Container Storage Interface - the standard interface that provides interoperability between data storage systems and container orchestration platforms such as Kubernetes or OpenShift.

RBD

RADOS Block Device - block storage that uses Ceph technology to store data.
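
To illustrate how CSI and RBD work together, the following sketch requests an RBD-backed block volume by creating a PersistentVolumeClaim through the Kubernetes API. The StorageClass name ocs-storagecluster-ceph-rbd is an assumption based on default OCS naming and may differ on a particular cluster.

```python
from kubernetes import client, config

# Request an RBD-backed block volume through the Ceph CSI driver by creating
# a PVC. The StorageClass name is an assumption (default OCS naming); check
# the available storage classes on the actual cluster.
config.load_kube_config()
core = client.CoreV1Api()

pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "demo-rbd-volume"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "storageClassName": "ocs-storagecluster-ceph-rbd",  # assumption
        "resources": {"requests": {"storage": "5Gi"}},
    },
}
core.create_namespaced_persistent_volume_claim(namespace="default", body=pvc)
```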

OSD Map

A data structure in Ceph that contains information on the state and location of OSDs in the Ceph cluster. The OSD Map records the state of each OSD (active, inactive, failed), the interconnections between OSDs, and OSD grouping in the distributed storage system.
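
The OSD map can be inspected with the Ceph CLI, for example from the rook-ceph-tools pod. A minimal sketch, assuming the ceph binary and cluster credentials are available where it runs:

```python
import json
import subprocess

# Dump the OSD map as JSON and summarize the state of each OSD.
# Assumes the `ceph` CLI and cluster credentials are available
# (e.g. inside the rook-ceph-tools pod).
raw = subprocess.run(
    ["ceph", "osd", "dump", "--format", "json"],
    check=True, capture_output=True, text=True,
).stdout

osd_map = json.loads(raw)
print("epoch:", osd_map["epoch"])
for osd in osd_map["osds"]:
    state = ("up" if osd["up"] else "down") + "/" + ("in" if osd["in"] else "out")
    print(f"osd.{osd['osd']}: {state}")
```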

MDS Map

A data structure in Ceph that describes the MDS state: when the map was created and last modified, the metadata storage pool, and the list of metadata servers with their states (active or inactive).

5. Subsystem components

Component name | Namespace | Deployment | Source | Repository | Function
Ceph dashboard | openshift-storage | rook-ceph-dashboard | 3rd-party | github:/red-hat-storage/ocs-operator, github:/rook-operator, gerrit:/infrastructure/storage | Viewing the main Ceph metrics, storage state, and distributed data storage system logs
Rook Ceph Operator | openshift-storage | rook-ceph-operator | 3rd-party | | Auxiliary software that orchestrates Ceph storage
OpenShift Container Storage Operator | openshift-storage | ocs-operator | 3rd-party | | Auxiliary software that orchestrates OpenShift Container Storage
Ceph Metadata Server | openshift-storage | rook-ceph-mds | 3rd-party | | Manages file metadata in Ceph storage
Ceph Manager | openshift-storage | rook-ceph-mgr | 3rd-party | | Provides Ceph storage monitoring and interaction with external monitoring and management systems
Ceph Monitor | openshift-storage | rook-ceph-mon | 3rd-party | | Keeps the Ceph storage state map and the OSD map
Ceph Object Storage Device | openshift-storage | rook-ceph-osd | 3rd-party | | Ceph software that interacts with OpenShift cluster logical disks
Ceph Object Gateway | openshift-storage | rook-ceph-rgw | 3rd-party | | Provides a gateway to the Amazon S3-compatible object storage API
Ceph RBD CSI Driver | openshift-storage | csi-rbdplugin | 3rd-party | | Integrates Ceph RBD block devices with the OKD container orchestration system
CephFS CSI Driver | openshift-storage | csi-cephfsplugin | 3rd-party | | Integrates the CephFS file system with the OKD container orchestration system
OCS Metrics Exporter | openshift-storage | ocs-metrics-exporter | 3rd-party | | Prometheus exporter that gathers OCS and Ceph metrics for monitoring and further analysis
Rook Ceph Crash Collector | openshift-storage | rook-ceph-crashcollector | 3rd-party | | Gathers and aggregates information on crashes in Ceph
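
A quick way to check that these components are healthy is to list the pods in the openshift-storage namespace. A minimal sketch using the Kubernetes Python client, assuming a kubeconfig with read access to that namespace:

```python
from kubernetes import client, config

# Assumes a kubeconfig with read access to the openshift-storage namespace.
config.load_kube_config()
core = client.CoreV1Api()

# List the Data storage subsystem pods and flag any that are not fully running.
pods = core.list_namespaced_pod(namespace="openshift-storage").items
for pod in pods:
    ready = all(cs.ready for cs in (pod.status.container_statuses or []))
    marker = "OK  " if pod.status.phase == "Running" and ready else "WARN"
    print(f"{marker} {pod.metadata.name}: {pod.status.phase}")
```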

6. Object storage data classification

Bucket | Owner subsystem | Description
lowcode-file-storage | Business process execution subsystem | Temporary storage of digital documents uploaded during business process execution
datafactory-ceph-bucket | Registry data management subsystem | Storage of signed data while it is written to the Registry
file-ceph-bucket | Registry data management subsystem | Storage of Registry digital documents
response-ceph-bucket | Registry data management subsystem | Temporary storage of data transferred during inter-service interaction
file-excerpt-bucket | Excerpt forming subsystem | Collection of generated and signed excerpts from the Registry
excerpt-signature-bucket (deprecated) | Excerpt forming subsystem | Storage of generated excerpts from the Registry
excerpt-templates | Excerpt forming subsystem | Storage of excerpt templates
user-import | Registry regulations modelling subsystem | Storage of files containing lists of officers to be imported into the Registry
user-import-archive | Registry regulations modelling subsystem | Storage of files containing lists of officers imported into the Registry

8. Subsystem quality attributes

8.1. Scalability

The Distributed data storage subsystem is designed for horizontal scaling to hundreds or even thousands of storage nodes, providing data storage at an extensive scale. The subsystem supports dynamic scaling, which allows clusters to scale up or down on demand.
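
As a sketch of what on-demand scaling can look like with the OCS operator, the snippet below increases the number of storage device sets in the StorageCluster custom resource. The resource name ocs-storagecluster and the storageDeviceSets field layout are assumptions based on the upstream ocs-operator and should be verified against the actual cluster.

```python
from kubernetes import client, config

# Illustrative sketch: add storage capacity by increasing the device set count
# in the StorageCluster custom resource. Resource name and field layout are
# assumptions based on the upstream ocs-operator.
config.load_kube_config()
api = client.CustomObjectsApi()

sc = api.get_namespaced_custom_object(
    group="ocs.openshift.io", version="v1",
    namespace="openshift-storage", plural="storageclusters",
    name="ocs-storagecluster",
)
sc["spec"]["storageDeviceSets"][0]["count"] += 1  # one more set of OSD devices

api.replace_namespaced_custom_object(
    group="ocs.openshift.io", version="v1",
    namespace="openshift-storage", plural="storageclusters",
    name="ocs-storagecluster", body=sc,
)
```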

8.2. Reliability

The Distributed data storage subsystem uses data replication and erasure coding (EC) to avoid data loss and to provide fault tolerance. In case of a node or device failure, the subsystem automatically re-replicates the affected data onto operational nodes to keep it reliably stored.
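
The trade-off between the two protection schemes can be seen in a quick capacity calculation (the numbers are illustrative, not Platform settings):

```python
# Rough usable-capacity arithmetic for the two protection schemes mentioned
# above (illustrative numbers only).
raw_tb = 120  # total raw capacity across all OSDs, in TB

# 3-way replication: every object is stored three times.
replicated_usable = raw_tb / 3            # 40 TB usable, tolerates 2 lost copies

# Erasure coding k=4, m=2: data split into 4 data chunks plus 2 parity chunks.
k, m = 4, 2
ec_usable = raw_tb * k / (k + m)          # 80 TB usable, tolerates 2 lost chunks

print(replicated_usable, ec_usable)
```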

8.3. Resilience

The Distributed data storage subsystem remains operational even when it encounters network problems or storage node failures. Thanks to dynamic load balancing, its data distribution methods, and a fault-tolerant design, it remains resilient to hardware and software problems.

8.4. Performance

The Distributed data storage subsystem provides high performance and throughput thanks to parallel reads and writes of storage objects (data is broken into small parts and distributed across several OSDs using the CRUSH placement algorithm) and adaptive load balancing.
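
CRUSH itself also accounts for cluster topology (failure domains) and OSD weights, but the toy sketch below shows the core idea it relies on: deterministic, hash-based placement that lets any client compute an object's location without a central lookup table.

```python
import hashlib

def place_object(object_name: str, osd_ids: list[int], replicas: int = 3) -> list[int]:
    """Toy stand-in for CRUSH: deterministically map an object to OSDs.

    Real CRUSH also takes failure domains and OSD weights into account;
    this sketch only illustrates the hash-based placement idea.
    """
    # Rank OSDs by a hash of (object, osd) and take the top `replicas`.
    ranked = sorted(
        osd_ids,
        key=lambda osd: hashlib.sha256(f"{object_name}:{osd}".encode()).hexdigest(),
    )
    return ranked[:replicas]

# Any client computing this mapping gets the same result, so reads and writes
# can go directly to the responsible OSDs in parallel.
print(place_object("documents/sample.pdf", osd_ids=list(range(12))))
```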