ZFS is a stable, robust, and fault-tolerant file system with built-in RAID-like properties and drive pools. We show you how to create a ZFS drive pool and control access to it.
- What is ZFS?
- Installing ZFS
- ZFS Pool Types
- Creating a Mirrored Pool
- Verifying the Pool
- Defining the Mount Point
- Setting User Permissions
- Data is the New Gold
What is ZFS?
ZFS is an advanced file system that originated at Sun Microsystems for their Solaris operating system. It is now owned by Oracle Corporation following Oracle’s acquisition of Sun Microsystems, completed in 2010. However, Sun had released an open source version from 2005 onwards. This was ported to Linux, making it widely available. The open source version of ZFS is managed and maintained by the OpenZFS project.
ZFS is a high-capacity, fault-tolerant file system. The name originally stood for Zettabyte File System. Nowadays a single pool can store up to 256 zebibytes of data.
ZFS is exceptionally fault-tolerant. It combines features that deliver file system pooling, cloning and copying, and RAID-like functionality, natively. Every block of data is checksummed, so ZFS can detect corruption and, in redundant pools, repair it automatically.
We’re going to walk through the steps required to install ZFS on Ubuntu 20.04 (Focal Fossa) and to set up and use a drive pool. We’ll also describe a way to control access to the data in the pool.
Installing ZFS
To install ZFS, use this command:
sudo apt-get install zfsutils-linux
When the installation is complete, you can check that ZFS is present and correct by using the which and whereis commands.
which zfs
whereis zfs
The which command confirms that ZFS is in your command search path. The whereis command shows where the ZFS binaries are located, where its supporting and additional files are located, and that the man page has been installed too.
ZFS Pool Types
A ZFS pool is created by logically combining and treating different physical hard drives as though they were a single addressable entity.
There are two ways to do this. If you combine the hard drives as a striped or RAID 0 pool, you get to use all of the combined capacity of the hard drives. However, there is no redundant storage. If a hard drive fails it will break the file system and you will lose data.
The preferred—and strongly recommended—method is to create a mirrored or RAID 1 pool. With this type of pool, your capacity is limited to the size of the smallest drive in the pool but, even with the loss of a hard drive, the file system will be operational.
You can replace a failed drive with no loss of data, and no downtime. With a pool of three drives you could withstand a failure of two of the physical drives and still have an operational system with intact data.
The generic form of the command to create a striped pool is:
sudo zpool create pool-name drive-1 drive-2 drive-3 ...
To create a mirrored pool, we add the word “mirror” to the command:
sudo zpool create pool-name mirror drive-1 drive-2 drive-3 ...
Creating a Mirrored Pool
Before we can tell ZFS which hard drives to include in the pool, we need to identify them. To do so, use this command:
sudo blkid | grep /dev/sd
The blkid (print block device attributes) command lists the block devices in your system, and we’re piping its output through grep to pick out the /dev/sd devices. These are the hard drives. We can see the four hard drives fitted to this computer.
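If you want to see how the grep filter behaves before running it against real hardware, you can sketch it with some representative blkid-style lines. The device names, UUIDs, and TYPE values below are made up for illustration:

```shell
# Simulated blkid output piped through the same grep filter.
# The devices and attribute values here are illustrative only.
printf '%s\n' \
  '/dev/sda1: UUID="aaaa-1111" TYPE="ext4"' \
  '/dev/sdb: TYPE="zfs_member"' \
  '/dev/loop0: TYPE="squashfs"' \
  | grep /dev/sd
```

Only the /dev/sda1 and /dev/sdb lines survive the filter; the loop device is discarded.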
Linux identifies drives by letter and partitions by number.
- sda: The first hard drive.
- sdb: The second hard drive.
- sdc: The third hard drive.
- sdd: The fourth hard drive.
- sda1: The first partition on the first hard drive.
We’re going to use hard drives two, three, and four, so we will be using /dev/sdb, /dev/sdc, and /dev/sdd.
Here is the command to create the pool. Note that we are including the “mirror” parameter to create a RAID 1 pool, and we’re naming our pool “itenterpriser.” We’ll be able to refer to the pool by that name later.
sudo zpool create itenterpriser mirror /dev/sdb /dev/sdc /dev/sdd
Verifying the Pool
You’re quietly returned to the command prompt. Did anything actually happen? We can check the status of all of our ZFS pools using the zpool status command.
sudo zpool status
We have a single pool configured on this computer, and it is called “itenterpriser.”
Let’s use the df (disk free) command and pipe its output through grep (search using a regular expression) to locate entries with “itenterpriser” in them. The -h (human-readable) option tells df to show hard drive capacities in user-friendly units.
df -h | grep itenterpriser
This tells us two things. Our “itenterpriser” pool is mounted on “/itenterpriser”. As expected, although there are three hard drives in the pool, it has the capacity of just one of the hard drives.
We can cd into that location just as though it were any other directory in the computer’s file system.
cd /itenterpriser/
That’s working. Great. Now let’s destroy it.
Defining the Mount Point
By default, the pool will be mounted on a mount point in the root of the file system, and the mount point is named the same as the pool. Usually, you’d choose where to mount your pool.
To remove a pool, we use the zpool destroy command followed by the name of the pool:
sudo zpool destroy itenterpriser
Again, we’re silently returned to the command prompt. We’ll recreate our pool, this time using the -m (mount point) option to specify where we’d like the pool to be mounted.
sudo zpool create -m /usr/share/itenterpriser itenterpriser mirror /dev/sdb /dev/sdc /dev/sdd
Setting User Permissions
By default, only root can store information in the pool. To give other users write access to the pool, we need to follow a few steps.
We’re going to control who can access the pool. We’ll create a new user group, and set that user group as the group-owner of the data location. That means we can add and remove users from that group to grant or remove access to the data.
We use groupadd (create a new group) to add a user group, which we’re calling “ite-pool.” We then use the usermod (modify a user account) command to add a user to that group. The -a (append) and -G (groups) options combine to add the new group to the list of groups the user is already in.
sudo groupadd ite-pool
sudo usermod -a -G ite-pool dave
The user must log out and back in before they are seen as a member of the group.
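A quick way to confirm the new membership has taken effect after logging back in is to list the user’s groups with the id command:

```shell
# With no argument, id -nG lists the group names of the current user;
# pass a username (for example, id -nG dave) to check another account.
id -nG
```

If “ite-pool” appears in the output, the user is ready to use the shared directory.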
We’re going to create a directory in the pool and change its group ownership to the “ite-pool” group. We’ll then set the group file permissions for that directory to read, write, and execute. The effect of this is to grant those permissions to any users who are in the “ite-pool” group.
We could do this on the root folder of the pool, of course, but we gain flexibility and control by setting the permissions on the new directory. For example, we can create as many directories in the pool as we need, and configure different groups of users to have access to them. It also means the users don’t need to have read, write, and execute permissions across the entire pool.
To create a directory called “data” in the pool, we type:
sudo mkdir /usr/share/itenterpriser/data/
We’ll use chgrp
(change group ownership) to set the group owner of the directory to “ite-pool”:
sudo chgrp ite-pool /usr/share/itenterpriser/data/
To set group permissions for the directory we’ll use chmod. The “s” is the setgid bit. When it is set on a directory, any files and subdirectories created inside it inherit the directory’s group ownership, rather than the primary group of the user who created them.
sudo chmod g+rwsx /usr/share/itenterpriser/data/
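You can see setgid inheritance at work without touching the pool at all. This sketch uses a throwaway directory (the names here are arbitrary) and the current user’s own group:

```shell
# Create a scratch directory, set group rwx plus the setgid bit, and
# show that a newly created file picks up the directory's group owner.
scratch=$(mktemp -d)
chmod g+rwxs "$scratch"              # group read/write/execute plus setgid
touch "$scratch/example.txt"         # new file inherits the group
stat -c '%G' "$scratch"              # the directory's group owner...
stat -c '%G' "$scratch/example.txt"  # ...matches the new file's group
rm -rf "$scratch"
```

The two stat commands print the same group name, confirming the inheritance.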
What we’ve achieved is to set the group ownership of the “data” directory to “ite-pool.” Members of that group will have read, write, and execute permissions in that directory. Earlier, we added our current user “dave” to the “ite-pool” group.
If that user tries to create a file in the root of the pool they are denied permission. If they repeat that command inside the “data” folder, they’re granted permission and the file is created.
touch /usr/share/itenterpriser/text.txt
touch /usr/share/itenterpriser/data/text.txt
ls /usr/share/itenterpriser/data/
Our permissions are working.
Data is the New Gold
ZFS makes it simple to keep your data as safe and as accessible as possible. But fault-tolerant file systems don’t replace backups. What ZFS does is let you survive drive failures without downtime and without having to resort to restoring backups.
You must maintain your backup regime and schedule regular backups. But with ZFS you should need to turn to your backups much less frequently.