Question: I am trying to decide on a filesystem and would like to know if it is possible to replace a failed drive in btrfs RAID without downtime.
Suppose I create a new btrfs filesystem using the command
mkfs.btrfs -d raid1 /dev/sdb /dev/sdc
Now suppose one day /dev/sdc fails. There are two possibilities. It can fail gradually, showing S.M.A.R.T. errors – in this situation I can add a new device with btrfs device add /dev/sde /mnt; btrfs filesystem balance /mnt and then remove the old one with btrfs device delete /dev/sdc /mnt.
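As a sketch, the gradual-failure sequence above (device names and mountpoint taken from the question; the filesystem is assumed to be mounted at /mnt) would be:

```shell
# Gradual failure: the old drive (/dev/sdc) is still readable,
# so everything can be done on the mounted filesystem.

# Add the replacement drive to the array:
btrfs device add /dev/sde /mnt

# Redistribute existing data across all devices, including the new one:
btrfs filesystem balance /mnt

# Remove the failing drive; btrfs migrates its remaining data off first:
btrfs device delete /dev/sdc /mnt
```

No unmount is required at any point in this sequence, since the failing drive is still readable.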
But if it suddenly fails, becoming unreadable… A web search says in this situation I must first unmount the filesystem, mount in degraded mode, add a new device, then remove the missing device.
umount /mnt
mount -o degraded /dev/sdb /mnt
btrfs device add /dev/sdf /mnt
btrfs device delete missing /mnt
An unmount is obviously a disruptive operation, so there would be downtime – any application using the filesystem would get an I/O error. But these kinds of btrfs “tutorials” look outdated, considering btrfs is under heavy development.
The question is: considering the current state of btrfs, is it possible to do this online, i.e. without unmounting?
If not, is there a software-only solution that can fulfill this need?
Answer: In Linux 3.8, btrfs replace start old_disk new_disk mountpoint was added. If you’re running a recent kernel, it provides exactly the functionality you are looking for: the replacement runs while the filesystem stays mounted.
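Using the device names from the question (failed /dev/sdc, replacement /dev/sdf, filesystem mounted at /mnt), the online replacement might look like this – a sketch, assuming a kernel of 3.8 or later and a matching btrfs-progs:

```shell
# Replace the failed drive online; -r only reads from the source
# device if no other good mirror exists. If /dev/sdc has vanished
# entirely, pass its device ID (shown by 'btrfs filesystem show')
# in place of the device path.
btrfs replace start -r /dev/sdc /dev/sdf /mnt

# The copy runs in the background; the filesystem stays mounted
# and usable. Check progress at any time with:
btrfs replace status /mnt
```

Once the status reports completion, /dev/sdf has fully taken over the role of /dev/sdc and no separate balance or delete step is needed.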