
Administration and Deployment Guide

SUSE Enterprise Storage 4

Publication Date: 02/28/

SUSE LLC
10 Canal Park Drive, Suite 200
Cambridge MA 02141
USA
https://www.suse.com/documentation

Copyright © 2017 SUSE LLC

Copyright © 2010-2014, Inktank Storage, Inc. and contributors.

The text of and illustrations in this document are licensed by Inktank Storage under a Creative Commons Attribution-ShareAlike 4.0 International ("CC-BY-SA"). An explanation of CC-BY-SA is available at http://creativecommons.org/licenses/by-sa/4.0/legalcode. In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must provide the URL for the original version.

This document is an adaptation of original works found at http://ceph.com/docs/master/ (2015-01-30).

Red Hat, Red Hat Enterprise Linux, the Shadowman logo, JBoss, MetaMatrix, Fedora, the Infinity Logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries. Linux® is the registered trademark of Linus Torvalds in the United States and other countries. Java® is a registered trademark of Oracle and/or its affiliates. XFS® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries. MySQL® is a registered trademark of MySQL AB in the United States, the European Union and other countries. All other trademarks are the property of their respective owners.


Contents


17.4 Managing RADOS Gateway Access - Managing S3 Access • Managing Swift Access
17.5 Multi-site Object Storage Gateways - Terminology • Example Cluster Setup • System Keys • Naming Conventions • Default Pools • Creating a Realm • Deleting the Default Zonegroup • Creating a Master Zonegroup • Creating a Master Zone • Creating a Secondary Zone • Adding RADOS Gateway to the Second Cluster

18 Ceph iSCSI Gateway
18.1 iSCSI Block Storage - The Linux Kernel iSCSI Target • iSCSI Initiators
18.2 General Information about lrbd
18.3 Deployment Considerations
18.4 Installation and Configuration - Install SUSE Enterprise Storage and Deploy a Ceph Cluster • Installing the ceph_iscsi Pattern • Create RBD Images • Export RBD Images via iSCSI • Optional Settings • Advanced Settings
18.5 Connecting to lrbd-managed Targets - Linux (open-iscsi) • Microsoft Windows (Microsoft iSCSI initiator) • VMware
18.6 Conclusion

19 Clustered File System
19.1 Ceph Metadata Server - Adding a Metadata Server • Configuring a Metadata Server
19.2 CephFS - Creating CephFS • Mounting CephFS • Unmounting CephFS • CephFS in /etc/fstab
19.3 Managing Failover - Configuring Standby Daemons • Examples

 - About This Guide
 - I SUSE ENTERPRISE STORAGE
  • 1 About SUSE Enterprise Storage
  • 1.1 Introduction
  • 1.2 Additional Information
    • 2 System Requirements
  • 2.1 Minimal Recommendations per Storage Node
  • 2.2 Minimal Recommendations per Monitor Node
  • 2.3 Minimal Recommendations for RADOS Gateway Nodes
  • 2.4 Minimal Recommendations for iSCSI Nodes
  • 2.5 Naming Limitations
 - II CLUSTER DEPLOYMENT AND UPGRADE
    • 3 Introduction
    • 4 Deploying with DeepSea and Salt
  • 4.1 Introduction to DeepSea - Organization and Important Locations
  • 4.2 Deploying with DeepSea and Salt
  • 4.3 Configuration and Customization - The policy.cfg File • Customizing the Default Configuration
    • 5 Deploying with ceph-deploy
  • 5.1 Ceph Layout
  • 5.2 Network Recommendations
  • 5.3 Preparing Each Ceph Node
  • 5.4 Cleaning Previous Ceph Environment
  • 5.5 Running ceph-deploy
    • 6 Deploying with Crowbar
  • 6.1 Installing and Setting Up the Crowbar Admin Server - Prepare Software Repositories
  • 6.2 Deploying the Ceph Nodes - Logging In • Node Installation • Barclamps • Deploying Ceph
    • 7 Upgrading from Previous Releases
  • 7.1 General Upgrade Procedure
  • 7.2 Upgrade from SUSE Enterprise Storage 2.1 to
  • 7.3 Upgrade from SUSE Enterprise Storage 3 to
    • III OPERATING A CLUSTER
    • 8 Introduction
    • 9 Operating Ceph Services
  • 9.1 Starting, Stopping, and Restarting Services using Targets
  • 9.2 Starting, Stopping, and Restarting Individual Services
  • 9.3 Identifying Individual Services
  • 9.4 Service Status
    • 10 Determining Cluster State
  • 10.1 Checking Cluster Health
  • 10.2 Watching a Cluster
  • 10.3 Checking a Cluster’s Usage Stats
  • 10.4 Checking a Cluster’s Status
  • 10.5 Checking OSD Status
  • 10.6 Checking Monitor Status
  • 10.7 Checking Placement Group States
  • 10.8 Using the Admin Socket
    • 11 Authentication with cephx
  • 11.1 Authentication Architecture
  • 11.2 Key Management - Background Information • Managing Users • Keyring Management • Command Line Usage
    • 12 Stored Data Management
  • 12.1 Devices
  • 12.2 Buckets
  • 12.3 Rule Sets
  • 12.4 CRUSH Map Manipulation - Editing a CRUSH Map • Add/Move an OSD • Adjust an OSD’s CRUSH Weight • Remove an OSD • Move a Bucket
  • 12.5 Mixed SSDs and HDDs on the Same Node
    • 13 Managing Storage Pools
  • 13.1 Operating Pools - Set Pool Values • Get Pool Values • Set the Number of Object Replicas • Get the Number of Object Replicas
    • 14 Snapshots
  • 14.1 RBD Snapshots - Cephx Notes • Snapshot Basics • Layering
  • 14.2 Pool Snapshots - Make a Snapshot of a Pool • Remove a Snapshot of a Pool
    • 15 Erasure Coded Pools
  • 15.1 Creating a Sample Erasure Coded Pool
  • 15.2 Erasure Code Profiles
  • 15.3 Erasure Coded Pool And Cache Tiering
    • 16 Cache Tiering
  • 16.1 Tiered Storage Terminology
  • 16.2 Points to Consider
  • 16.3 When to Use Cache Tiering
  • 16.4 Cache Modes
  • 16.5 Setting Up an Example Tiered Storage - Configuring a Cache Tier
 - IV ACCESSING CLUSTER DATA
    • 17 Ceph RADOS Gateway
  • 17.1 Managing RADOS Gateway with ceph-deploy - Installation • Listing RADOS Gateway Installations • Removing RADOS Gateway from a Node
  • 17.2 Managing RADOS Gateway Manually - Installation • Configuring RADOS Gateway
  • 17.3 Operating the RADOS Gateway Service
    • 20 NFS-Ganesha: Export Ceph Data via NFS
  • 20.1 Installation
  • 20.2 Configuration - NFS-Ganesha Common Configuration • NFS Access to CephFS Data • NFS Access to RADOS Gateway Buckets
  • 20.3 Starting NFS-Ganesha Related Services
  • 20.4 Verifying the Exported NFS Share
  • 20.5 Mounting the Exported NFS Share
 - V MANAGING CLUSTER WITH GUI TOOLS
    • 21 openATTIC
  • 21.1 Installing openATTIC - Installing Required Packages • openATTIC Initial Setup • Removing openATTIC from the Salt master Node
  • 21.2 openATTIC Web User Interface
  • 21.3 Dashboard
  • 21.4 Ceph Related Tasks - RADOS Block Devices (RBDs) • Managing Pools • Listing Nodes • Viewing the Cluster CRUSH Map
    • 22 Calamari
  • 22.1 Installing Calamari with ceph-deploy
  • 22.2 Installing Calamari Using Crowbar
 - VI INTEGRATION WITH VIRTUALIZATION TOOLS
    • 23 Using libvirt with Ceph
  • 23.1 Configuring Ceph
  • 23.2 Preparing the VM Manager
  • 23.3 Creating a VM
  • 23.4 Configuring the VM
  • 23.5 Summary
    • 24 Ceph as a Back-end for QEMU KVM Instance
  • 24.1 Installation
  • 24.2 Usage
  • 24.3 Creating Images with QEMU
  • 24.4 Resizing Images with QEMU
  • 24.5 Retrieving Image Info with QEMU
  • 24.6 Running QEMU with RBD
  • 24.7 Enabling Discard/TRIM
  • 24.8 QEMU Cache Options
 - VII BEST PRACTICES
    • 25 Introduction
  • 25.1 Reporting Software Problems
    • 26 Hardware Recommendations
  • 26.1 Can I Reduce Data Replication
  • 26.2 Can I Reduce Redundancy Similar to RAID 6 Arrays?
  • 26.3 What is the Minimum Disk Size for an OSD node?
  • 26.4 How Much RAM Do I Need in a Storage Server?
  • 26.5 OSD and Monitor Sharing One Server
  • 26.6 How Many Disks Can I Have in a Server
  • 26.7 How Many OSDs Can Share a Single SSD Journal
    • 27 Cluster Administration
  • 27.1 Using ceph-deploy on an Already Setup Server
  • 27.2 Adding OSDs with ceph-disk
  • 27.3 Adding OSDs with ceph-deploy
  • 27.4 Adding and Removing Monitors - Adding a Monitor • Removing a Monitor
  • 27.5 Usage of ceph-deploy rgw
  • 27.6 RADOS Gateway Client Usage - S3 Interface Access • Swift Interface Access
  • 27.7 Automated Installation via Salt
  • 27.8 Restarting Ceph services using DeepSea
  • 27.9 Node Management - Adding Ceph OSD Nodes • Removing Ceph OSD Nodes • Removing and Reinstalling Salt Cluster Nodes
    • 28 Monitoring
  • 28.1 Usage Graphs on Calamari
  • 28.2 Checking for Full OSDs
  • 28.3 Checking if OSD Daemons are Running on a Node
  • 28.4 Checking if Monitor Daemons are Running on a Node
  • 28.5 What Happens When a Disk Fails?
  • 28.6 What Happens When a Journal Disk Fails?
    • 29 Disk Management
  • 29.1 Adding Disks
  • 29.2 Deleting disks
  • 29.3 How to Use Existing Partitions for OSDs Including OSD Journals
    • 30 Recovery
  • 30.1 'Too Many PGs per OSD' Status Message
  • 30.2 Calamari Has a Stale Cluster
  • 30.3 'nn pg stuck inactive' Status Message
  • 30.4 OSD Weight is
  • 30.5 OSD is Down
  • 30.6 Fixing Clock Skew Warnings
    • 31 Accountancy
  • 31.1 Adding S3 Users
  • 31.2 Removing S3 Users
  • 31.3 User Quota Management
  • 31.4 Adding Swift Users
  • 31.5 Removing Swift Users
  • 31.6 Changing S3 and Swift User Access and Secret Keys
    • 32 Tune-ups
  • 32.1 How Does the Number of Placement Groups Affect the Cluster Performance?
  • 32.2 Can I Use SSDs and Hard Disks on the Same Cluster?
  • 32.3 What are the Trade-offs of Using a Journal on SSD?
    • 33 Integration
  • 33.1 Storing KVM Disks in Ceph Cluster
  • 33.2 Storing libvirt Disks in Ceph Cluster
  • 33.3 Storing Xen Disks in Ceph Cluster
  • 33.4 Mounting and Unmounting an RBD Image
      • 34 Cluster Maintenance and Troubleshooting
    • 34.1 Creating and Deleting Pools from Calamari
    • 34.2 Managing Keyring Files
    • 34.3 Creating Client Keys
    • 34.4 Revoking Client Keys
    • 34.5 Checking for Unbalanced Data Writing
    • 34.6 Time Synchronization of Nodes
    • 34.7 Upgrading Software
    • 34.8 Increasing the Number of Placement Groups
    • 34.9 Adding a Pool
  • 34.10 Deleting a Pool
  • 34.11 Troubleshooting - Sending Large Objects with rados Fails with Full OSD • Corrupted XFS File System
      • 35 Performance Diagnosis
    • 35.1 Finding Slow OSDs
    • 35.2 Is My Network Causing Issues?
      • 36 Server Maintenance
    • 36.1 Adding a Server to a Cluster
    • 36.2 Removing a Server from a Cluster
    • 36.3 Increasing File Descriptors
      • 37 Networking
    • 37.1 Setting NTP to a Ceph Cluster
  • 37.2 Firewall Settings for Ceph
  • 37.3 Adding a Private Network to a Running Cluster
 - Glossary
 - A Example Procedure of Manual Ceph Installation
 - B Documentation Updates
    • B.1 February, 2017 (Release of SUSE Enterprise Storage 4 Maintenance Update 1)
    • B.2 December, 2016 (Release of SUSE Enterprise Storage 4)
    • B.3 June, 2016 (Release of SUSE Enterprise Storage 3)
    • B.4 January, 2016 (Release of SUSE Enterprise Storage 2.1)
    • B.5 October, 2015 (Release of SUSE Enterprise Storage 2)


2 Feedback

Several feedback channels are available:

User Comments
We want to hear your comments about and suggestions for this manual and the other documentation included with this product. Use the User Comments feature at the bottom of each page in the online documentation, or go to http://www.suse.com/documentation/feedback.html and enter your comments there.

Mail
For feedback on the documentation of this product, you can also send a mail to doc-team@suse.de. Make sure to include the document title, the product version, and the publication date of the documentation. To report errors or suggest enhancements, provide a concise description of the problem and refer to the respective section number and page (or URL).

3 Documentation Conventions

The following typographical conventions are used in this manual:

/etc/passwd : directory names and file names

placeholder : replace placeholder with the actual value

PATH : the environment variable PATH

ls , --help : commands, options, and parameters

user : users or groups

Alt, Alt–F1 : a key to press or a key combination; keys are shown in uppercase as on a keyboard

File , File Save As : menu items, buttons

Dancing Penguins (Chapter Penguins, ↑Another Manual): a reference to a chapter in another manual.


4 About the Making of This Manual

This book is written in Geekodoc, a subset of DocBook (see http://www.docbook.org). The XML source files were validated by xmllint, processed by xsltproc, and converted into XSL-FO using a customized version of Norman Walsh's stylesheets. The final PDF can be formatted through FOP from Apache or through XEP from RenderX. The authoring and publishing tools used to produce this manual are available in the package daps. The DocBook Authoring and Publishing Suite (DAPS) is developed as open source software. For more information, see http://daps.sf.net/.
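For orientation only, a build along these lines can be approximated with the standard DocBook tool chain. This is a sketch, not the actual build of this manual; the file and stylesheet names are placeholders.

  # Validate the DocBook XML sources (file names are hypothetical)
  xmllint --noout --valid book.xml

  # Transform the XML into XSL-FO using a customized stylesheet
  xsltproc -o book.fo custom-stylesheets/fo/docbook.xsl book.xml

  # Format the XSL-FO into the final PDF with Apache FOP
  fop -fo book.fo -pdf book.pdf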


1 About SUSE Enterprise Storage

1.1 Introduction

SUSE Enterprise Storage is a distributed storage solution designed for scalability, reliability, and performance, based on Ceph technology. Unlike conventional systems, which use allocation tables to store and fetch data, Ceph uses a pseudo-random data distribution function to place data, which reduces the number of look-ups required in storage. Data is stored on intelligent object storage devices (OSDs) by daemons that automate data management tasks such as data distribution, data replication, failure detection, and recovery. Ceph is both self-healing and self-managing, which reduces administrative and budget overhead.

The Ceph storage cluster uses two mandatory types of nodes—monitors and OSD daemons:

Monitor
Monitoring nodes maintain information about the cluster health state, a map of the other monitoring nodes, and a CRUSH map. Monitor nodes also keep a history of changes performed to the cluster.

OSD Daemon
An OSD daemon stores data and manages the data replication and rebalancing processes. Each OSD daemon handles one or more OSDs, which can be physical disks, partitions, or logical volumes. OSD daemons also communicate with monitor nodes and provide them with the state of the other OSD daemons.
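A quick way to see both node types in a running cluster is to query the cluster from any node that holds an admin keyring. These commands are illustrative; the output depends entirely on your cluster.

  # Summary of cluster health, monitors, and OSDs
  ceph -s

  # Monitors currently in the monitor map
  ceph mon stat

  # OSD daemons and their position in the CRUSH hierarchy
  ceph osd tree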

The Ceph storage cluster can use the following optional node types:

Metadata Server (MDS)
The metadata servers store metadata for the Ceph file system. By using MDS, you can execute basic file system commands such as ls without overloading the cluster.

RADOS Gateway
RADOS Gateway is an HTTP REST gateway for the RADOS object store. You can also use this node type when using the Ceph file system.
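If these optional roles are deployed, their presence can be verified with commands such as the following. The gateway host name and port are examples only and depend on your configuration.

  # State of the metadata server(s), if CephFS is in use
  ceph mds stat

  # Probe a RADOS Gateway instance over HTTP (host and port are placeholders)
  curl http://rgw.example.com:80/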


Note: Each Node Type on a Separate Server

We strongly recommend installing only one node type on a single server.

The Ceph environment has the following features:

Controlled, Scalable, Decentralized Placement of Replicated Data Using CRUSH
The Ceph system uses a unique map called CRUSH (Controlled Replication Under Scalable Hashing) to assign data to OSDs in an efficient manner. Data assignment offsets are computed rather than looked up in tables, which removes the disk look-ups required by conventional allocation-table-based systems and reduces communication between the storage and the client. A client armed with the CRUSH map and metadata such as the object name and byte offset knows where to find the data or on which OSD to place it.

CRUSH maintains a hierarchy of devices and the replica placement policy. As new devices are added, data from existing nodes is moved to the new devices to improve distribution with regard to workload and resilience. As part of the replica placement policy, weights can be assigned to devices so that some are favored over others. For example, Solid State Devices (SSDs) can be given higher weights and conventional rotational hard disks lower weights for better overall performance. CRUSH is designed to distribute data optimally and to make efficient use of the available devices. CRUSH supports different ways of data distribution, such as the following:

n-way replication (mirroring)
RAID parity schemes
Erasure Coding
Hybrid approaches such as RAID-
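To see how CRUSH places data in a particular cluster, the compiled CRUSH map can be extracted and decompiled, and device weights can be adjusted. The OSD ID and weight below are example values, not recommendations.

  # Extract the compiled CRUSH map and decompile it into readable text
  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt

  # Favor a device (for example, an SSD-backed OSD) by raising its CRUSH weight
  ceph osd crush reweight osd.2 1.5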

Reliable Autonomic Distributed Object Storage (RADOS)
The intelligence in the OSD daemons allows tasks such as data replication and migration to be handled automatically for self-management and self-healing. By default, data written to Ceph storage is replicated within the OSDs; the level and type of replication are configurable. In case of failures, the CRUSH map is updated and data is written to new (replicated) OSDs. The intelligence of the OSD daemons enables them to handle data replication, data migration, failure detection, and recovery automatically and autonomously. This also allows the creation of various pools for different sorts of I/O.
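As an illustration of how configurable the replication level is, the following commands create a pool and set the number of object replicas. The pool name, placement group count, and replica count are example values.

  # Create a replicated pool with 128 placement groups
  ceph osd pool create example-pool 128

  # Keep three copies of every object stored in this pool
  ceph osd pool set example-pool size 3

  # Verify the setting
  ceph osd pool get example-pool size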