ZFS is one of my favorite file systems. I use it at home on my NAS server (running FreeBSD) as well as on my Fedora 32 desktop. Today, I’m going to show you (and my future self) how to replace a faulty disk in a zpool.

The server I’ll be working on is nas2 and the pool is tank.

List the status of the current pool.

root@nas2:~ # zpool status tank
  pool: tank
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: resilvered 472K in 0 days 00:00:02 with 0 errors on Sun Aug 23 16:17:21 2020
config:

        NAME                      STATE     READ WRITE CKSUM
        tank                      DEGRADED     0     0     0
          mirror-0                DEGRADED     0     0     0
            diskid/DISK-Z4Z46MXG  ONLINE       0     0     0
            620907994788427922    UNAVAIL      0     0     0  was /dev/diskid/DISK-Z1E4ZCM2
          mirror-1                ONLINE       0     0     0
            diskid/DISK-Z1E5PFCH  ONLINE       0     0     0
            diskid/DISK-Z4Z46XQP  ONLINE       0     0     0

errors: No known data errors

From the above output, we can see that one of the disks in the mirror-0 vdev is missing. (In fact, it was removed before the system was rebooted. I also inserted a new disk into the system.)

Let’s check which other device we can use to replace the missing disk.

root@nas2:~ # ls -1 /dev/diskid/
DISK-Z1E5PFCH
DISK-Z4Z46MXG
DISK-Z4Z46PK2
DISK-Z4Z46XQP
root@nas2:~ #

Comparing this list with the pool status shows that the only disk not being used by my zpool is DISK-Z4Z46PK2. Warning: Be careful if you’re following this guide to replace a disk in a zpool on your own system. Choosing the wrong disk might destroy another working file system.
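
The comparison between the device listing and the pool status can also be done with a small script instead of by eye. The following is only a sketch: the function name unused_disks and the temporary file paths are made up for illustration, and it only knows about whatever status text you feed it. A disk used by a different pool, or by a non-ZFS file system, would still show up as "unused", so treat the output as a hint and not as proof.

```sh
#!/bin/sh
# Print every name from a device list that does not appear anywhere in the
# saved `zpool status` text.
# $1: file containing `zpool status` output
# $2: file containing one device name per line (e.g. from `ls -1 /dev/diskid/`)
# (unused_disks is a hypothetical helper name, not part of ZFS.)
unused_disks() {
    status_file=$1
    list_file=$2
    while IFS= read -r disk; do
        # Skip empty lines in the listing
        [ -n "$disk" ] || continue
        # -F matches the name as a fixed string, -q suppresses output
        if ! grep -Fq -- "$disk" "$status_file"; then
            printf '%s\n' "$disk"
        fi
    done < "$list_file"
}

# Intended use on the live system (not run here):
#   zpool status > /tmp/status.txt
#   ls -1 /dev/diskid/ > /tmp/disks.txt
#   unused_disks /tmp/status.txt /tmp/disks.txt
```

Running zpool status without a pool name, as in the commented usage, covers all imported pools, which is a little safer than checking tank alone.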

So, I can now proceed to replace this disk using the command: zpool replace [POOL_NAME] [OLD_DISK] [NEW_DISK].

root@nas2:~ # zpool replace tank /dev/diskid/DISK-Z1E4ZCM2 /dev/diskid/DISK-Z4Z46PK2
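
Because a typo here can be costly, it may help to build the command from the status output and read it over before running it. The sketch below pulls the numeric GUID of the UNAVAIL member out of the status text and prints (but does not run) a replace command. The helper names unavail_guids and print_replace_cmd are hypothetical. It assumes that zpool replace accepts the GUID shown in the status output in place of the old device path, which is how I understand it to work, but check zpool(8) on your system before relying on it.

```sh
#!/bin/sh
# Read `zpool status` text on stdin and print the first column of every row
# whose STATE column is UNAVAIL and whose name is purely numeric (a GUID).
# Rows such as "replacing-1 UNAVAIL" are skipped because the name is not numeric.
unavail_guids() {
    awk '$2 == "UNAVAIL" && $1 ~ /^[0-9]+$/ { print $1 }'
}

# Print a replace command for review; nothing is executed.
# $1: pool name, $2: old device path or GUID, $3: new device path
print_replace_cmd() {
    printf 'zpool replace %s %s %s\n' "$1" "$2" "$3"
}

# Intended use on the live system (not run here):
#   guid=$(zpool status tank | unavail_guids)
#   print_replace_cmd tank "$guid" /dev/diskid/DISK-Z4Z46PK2
```

Once the printed line looks right, it can be pasted into the root shell as-is.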

The next thing we have to do is wait for the resilver to complete.

root@nas2:~ # zpool status tank
  pool: tank
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sun Sep 20 12:03:51 2020
        300G scanned at 52.2M/s, 156G issued at 2.11G/s, 300G total
        7.03G resilvered, 52.13% done, 0 days 00:01:07 to go
config:

        NAME                        STATE     READ WRITE CKSUM
        tank                        DEGRADED     0     0     0
          mirror-0                  DEGRADED     0     0     0
            diskid/DISK-Z4Z46MXG    ONLINE       0     0     0
            replacing-1             UNAVAIL      0     0     0
              620907994788427922    UNAVAIL      0     0     0  was /dev/diskid/DISK-Z1E4ZCM2
              diskid/DISK-Z4Z46PK2  ONLINE       0     0     0
          mirror-1                  ONLINE       0     0     0
            diskid/DISK-Z1E5PFCH    ONLINE       0     0     0
            diskid/DISK-Z4Z46XQP    ONLINE       0     0     0

errors: No known data errors
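
Instead of re-running zpool status by hand, the wait can be scripted. This is a minimal sketch that polls the status output for the "resilver in progress" text seen above. The function names and the default 60-second interval are my own choices for illustration, and the match depends on the exact wording of the scan line, which may differ between ZFS versions.

```sh
#!/bin/sh
# Exit 0 if the status text on stdin reports a running resilver, 1 otherwise.
# (resilver_in_progress and wait_for_resilver are hypothetical helper names.)
resilver_in_progress() {
    grep -q 'resilver in progress'
}

# Poll the pool until the resilver is no longer reported as in progress.
# $1: pool name, $2: seconds between checks (defaults to 60)
wait_for_resilver() {
    pool=$1
    interval=${2:-60}
    while zpool status "$pool" | resilver_in_progress; do
        sleep "$interval"
    done
    # Show the final state so the result can be checked for errors
    zpool status "$pool"
}

# Intended use on the live system (not run here):
#   wait_for_resilver tank 30
```

When the loop ends, the final status should show the new disk ONLINE in mirror-0 and the pool state back to ONLINE if the resilver finished without errors.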

That’s it.