ZFS on CentOS 7

Step 1: Add ZFS Repository

First, we need to check which version of CentOS is currently installed using the following command:

cat /etc/redhat-release

After the CentOS version is verified, we can add the ZFS on Linux repository using the following command:

yum install -y http://download.zfsonlinux.org/epel/zfs-release.el7_8.noarch.rpm

Step 2: DKMS vs kABI

DKMS and kABI are the two ways the ZFS kernel module can be built and loaded. With DKMS, the module has to be recompiled against the new kernel every time the CentOS kernel is updated. With kABI-tracking kmod packages no recompilation is necessary. In this guide we are going to use kABI. We can enable it by editing the ZFS repository file:

vi /etc/yum.repos.d/zfs.repo

The repository file should contain the following content, where the [zfs] section is for DKMS and [zfs-kmod] is for kABI. We can see that DKMS is enabled by default and kABI is disabled:

[zfs]
name=ZFS on Linux for EL7 - dkms
baseurl=https://download.zfsonlinux.org/epel/7.4/$basearch/
enabled=1
metadata_expire=7d
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-zfsonlinux

[zfs-kmod]
name=ZFS on Linux for EL7 - kmod
baseurl=https://download.zfsonlinux.org/epel/7.4/kmod/$basearch/
enabled=0
metadata_expire=7d
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-zfsonlinux

We are going to disable DKMS and enable kABI by editing the enabled= line in both sections as follows:

[zfs]
name=ZFS on Linux for EL7 - dkms
baseurl=https://download.zfsonlinux.org/epel/7.4/$basearch/
enabled=0
metadata_expire=7d
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-zfsonlinux

[zfs-kmod]
name=ZFS on Linux for EL7 - kmod
baseurl=https://download.zfsonlinux.org/epel/7.4/kmod/$basearch/
enabled=1
metadata_expire=7d
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-zfsonlinux
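Alternatively, if the yum-utils package is installed, the same change can be made from the command line instead of editing the file by hand:

yum-config-manager --disable zfs
yum-config-manager --enable zfs-kmod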

Step 3: Installing ZFS

With the repository fully configured, we are now ready to install ZFS using the following command:

yum install zfs -y
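Once the installation finishes, the installed module version can be confirmed before going any further (the exact version string and path will vary with the package version):

modinfo zfs | grep -E '^(filename|version)'

Next, make sure the module is loaded at boot: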

echo "modprobe zfs" >> /etc/rc.modules
chmod +x /etc/rc.modules

## a reboot is required for ZFS to start working

## configure the ARC size
vi /etc/modprobe.d/zfs.conf

# Uncomment and adjust to limit the ARC, e.g. 512 MB minimum / 2048 MB maximum:
# options zfs zfs_arc_min=536870912
# options zfs zfs_arc_max=2147483648

# Reference values (in bytes) for zfs_arc_max:
# 4GB  = 4294967296
# 8GB  = 8589934592
# 12GB = 12884901888
# 20GB = 21474836480
# 28GB = 30064771072
# 42GB = 45097156608

reboot

Step 4: Check ZFS Kernel Module

After rebooting is done, use the following command to check if the ZFS kernel module is loaded automatically:

lsmod | grep zfs

If the module is loaded properly, we should see it as follows:

zfs           3841048     0
zunicode       293493     1   zfs
zavl            21918     1   zfs
icp            299583     1   zfs
zcommon        38959     1   zfs
znvpair        98827     2   zfs,zcommon
spl            43928     4   icp,zfs,zcommon,znvpair

If for some reason the module is not loaded, we can manually load it using the following command and check again:

modprobe zfs

## Check arc min size
cat /sys/module/zfs/parameters/zfs_arc_min

## Check arc max size
cat /sys/module/zfs/parameters/zfs_arc_max

## Check arc current size
cat /proc/spl/kstat/zfs/arcstats | grep size | head -n 1
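If the zfs package on the system ships the arc_summary tool (on some older releases it is installed as arc_summary.py), it provides a more readable overview of the same ARC statistics:

arc_summary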

## Create a pool on a separate disk
## ashift
Diskname=disk_a
Diskdevice=/dev/sda

zpool create -o ashift=12 $Diskname $Diskdevice

## recordsize and access-time settings
zfs set recordsize=1M $Diskname
zfs set relatime=on $Diskname
zfs set atime=off $Diskname
# note: with atime=off, access times are not updated at all, so the relatime setting has no effect
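To confirm the pool picked up the intended settings, the properties can be read back:

zpool get ashift $Diskname
zfs get recordsize,atime,relatime $Diskname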

Step 5: Creating ZFS Pool

A ZFS pool combines multiple drives into a single unit of storage. Pools should always be created on disks which are not currently in use. When the storage needs to be expanded, simply add drives to the pool to increase the overall capacity. This allows storage to scale without restriction.

Step 5a: Pool Naming Convention

ZFS pool and dataset names must follow a strict naming convention (a few examples follow the list below):

  • The name can only contain alphanumeric characters plus the following four special characters:
    • Hyphen -
    • Underscore _
    • Colon :
    • Period .
  • The name must begin with a letter, with the following additional restrictions:
    • Reserved names cannot be used: log, mirror, raidz, raidz1, raidz2, raidz3, spare.
    • The name must not contain the percent (%) symbol.
    • The name must not begin with the pattern c[0-9].
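For example, assuming a spare disk /dev/sdx, the first name below would be accepted while the other two would be rejected by zpool create:

$ zpool create tank_01 /dev/sdx     # valid: starts with a letter, only allowed characters
$ zpool create 1tank /dev/sdx       # invalid: does not begin with a letter
$ zpool create raidz1 /dev/sdx      # invalid: raidz1 is a reserved name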

There are three types of pools that can be created in ZFS:

  • Striped Pool
  • Mirrored Pool
  • Raid-Z Pool

Each offers its own set of advantages and disadvantages, so it is important to decide up front which type of pool is going to be used, because once a pool is created its layout cannot be changed. To switch to a different layout, a new pool would need to be created, all data migrated from the old pool to the new one, and the old pool deleted.
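For example, such a migration can be done with a recursive snapshot and zfs send/receive; the pool names oldpool and newpool below are placeholders:

$ zfs snapshot -r oldpool@migrate
$ zfs send -R oldpool@migrate | zfs receive -F newpool
$ zpool destroy oldpool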

Step 5b: Creating Striped Pool

This is the basic ZFS storage pool, where incoming data is dynamically striped across all disks in the pool. Although this offers maximum write performance, it comes at a price: any single failed drive makes the pool completely unusable and data loss will occur. Besides the performance, the biggest advantage of a striped pool is that its total capacity equals the combined size of all disks. We can use the following command to create a ZFS striped pool:

$ zpool create pool_name /dev/sdX /dev/sdY

To increase the size of the striped pool, we can simply add a drive using the following command:

$ zpool add pool_name /dev/sdX

It is important to note here that, when a new disk is added to a striped pool, ZFS will not redistribute existing data onto the new disk, but will favour the newly added disk for new incoming data. The only way to redistribute existing data is to delete and recopy it, in which case the data will be striped across all disks.
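To see how data is spread over the individual disks, zpool list -v prints a per-disk breakdown of allocated space (pool_name is a placeholder):

$ zpool list -v pool_name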

Step 5c: Creating Mirrored Pool

As the name suggests, this pool consists of mirrored disks. There are no restrictions on how the mirror can be formed. The main caveat of a mirrored pool is that we lose 50% of the total disk capacity to the mirror.

To create a mirror pool of just two disks:

$ zpool create pool_name mirror /dev/sda /dev/sdb

To expand a mirror pool we simply need to add another group of mirrored disks:

$ zpool add pool_name mirror /dev/sdd /dev/sde /dev/sdf

When adding another mirror group, data is striped across the mirrored groups of disks. Although it is rare, it is also possible to create a mirror of more than two disks:

$ zpool create pool_name mirror /dev/sda /dev/sdb /dev/sdc

Step 5d: Creating Raid-Z1, Raid-Z2 or Raid-Z3 Pool

ZFS offers software-defined RAID pools for disk redundancy. Since it does not depend on hardware RAID, all disks of a pool can easily be relocated to another server after a server failure. All Raid-ZX levels in ZFS work similarly, differing only in fault tolerance: Raid-Z1, Raid-Z2 and Raid-Z3 can tolerate a maximum of 1, 2 and 3 disk failures respectively without any data loss.

To create Raid-Z1 we need a minimum of two drives:

$ zpool create pool_name raidz1 /dev/sda /dev/sdb

To create Raid-Z2 we need a minimum of 3 drives:

$ zpool create pool_name raidz2 /dev/sda /dev/sdb /dev/sdc

To create Raid-Z3 we need a minimum of 4 drives:

$ zpool create pool_name raidz3 /dev/sda /dev/sdb /dev/sdc /dev/sdd

When using any raidzX pool, it is important to keep in mind that a disk loss puts the pool under heavy load while the failed disk is being rebuilt (resilvered). The bigger the pool, the longer the resilver will take to complete.
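The rebuild progress can be followed with zpool status, which reports the resilver state and an estimated time to completion:

$ zpool status pool_name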

Once a Raid-ZX pool is created it cannot be expanded just by adding a new disk to it. In order to expand the pool we need to add another complete vdev. A vdev is a complete group of disks which can form a pool on its own, or combine with other vdevs to form a pool. For example, a Raid-Z3 consisting of 4 drives is one vdev. To expand the pool we need to add another mirror, Raid-Z1, Raid-Z2 or Raid-Z3 vdev. The following command expands a Raid-Z3 pool with another Raid-Z3 vdev:

$ zpool add pool_name raidz3 /dev/sde /dev/sdf /dev/sdg /dev/sdh

Step 6: Adding Cache/Log Disk

We can increase both read and write performance significantly by adding faster disks such as SSD or NVMe drives. Cache disks increase read performance while log disks increase write performance. These disks can be added during pool creation or after the pool has been created. Log disks can also be mirrored so that a single log-device failure does not put in-flight writes at risk; cache disks cannot be mirrored and are simply striped.

To add cache disks during pool creation to increase read performance:

$ zpool create pool_name mirror /dev/sda /dev/sdb cache /dev/sdk /dev/sdl

Note that it may take a while to achieve maximum read performance, because ZFS automatically copies the most frequently accessed data to the cache disks over time.

To add mirrored log disks during pool creation to increase write performance:

$ zpool create pool_name mirror /dev/sda /dev/sdb log mirror /dev/sdk /dev/sdl
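Cache and log disks can also be attached to an existing pool. For example, to add a cache disk, or a mirrored log, after the pool has been created (device names are placeholders):

$ zpool add pool_name cache /dev/sdm
$ zpool add pool_name log mirror /dev/sdk /dev/sdl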

Common ZFS Commands

To check pool status:

$ zpool status

To see the list of ZFS datasets:

$ zfs list

To import a ZFS pool which was created on another server:

$ zpool import pool_name
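Before moving a pool to another server, it should first be exported; running zpool import with no pool name then lists the pools that are available for import on the new server:

$ zpool export pool_name
$ zpool import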