No, you can't actually buy Ironwolf disks with an OpenZFS logo on them. But since they're guaranteed SMR-free, they're a solid choice.
Jim Salter
Storage basics
As we all enter month three of the COVID-19 pandemic and look for new projects to keep us engaged (read: sane), can we interest you in learning the fundamentals of computer storage? Quietly this spring, we've already gone over some necessary basics, such as how to test the speed of your disks and what the heck RAID is. In the second of those stories, we even promised a follow-up exploring the performance of different multi-disk topologies in ZFS, the next-generation filesystem you've heard about thanks to its appearances everywhere from Apple to Ubuntu.
Well, today is the day to explore, ZFS-curious readers. Just know up front that, in the understated words of OpenZFS developer Matt Ahrens, "it's really complicated."
But before we get to the numbers (and they are coming, I promise!) for all the ways you can shape eight disks' worth of ZFS, we first need to talk about how ZFS stores your data on disk in the first place.
Zpools, vdevs and devices
This full pool diagram includes one of each of the three support vdev classes, and four RAIDz2 storage vdevs.

You usually don't want to create a "mutt" pool with mismatched vdev types and sizes, but nothing stops you if you want to.
To really understand ZFS, you need to pay attention to its actual structure. ZFS merges the traditional volume management and filesystem layers, and it uses a copy-on-write transactional mechanism. Both of these mean the system is structurally very different from conventional filesystems and RAID arrays. The first major building blocks to understand are zpools, vdevs, and devices.
zpool
The zpool is the topmost ZFS structure. A zpool contains one or more vdevs, each of which in turn contains one or more devices. Zpools are self-contained units: one physical computer may have two or more separate zpools on it, but each is entirely independent of the others. Zpools cannot share vdevs with one another.
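To make that hierarchy concrete, here's a minimal sketch of creating and inspecting a pool. The pool name tank and device names like sda are hypothetical placeholders, not anything prescribed by ZFS itself:

    # Create a pool named "tank" containing two mirror vdevs,
    # each built from two devices
    zpool create tank mirror sda sdb mirror sdc sdd

    # Show the resulting structure: the pool, its vdevs,
    # and the devices inside each vdev
    zpool status tank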
ZFS redundancy is at the vdev level, not the zpool level. There is absolutely no redundancy at the zpool level: if any storage vdev or SPECIAL vdev is lost, the entire zpool is lost with it.
Modern zpools can survive the loss of a CACHE or LOG vdev, though they may lose a small amount of dirty data if they lose a LOG vdev during a power outage or system crash.
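As an illustration, a LOG vdev can be attached to an existing pool after the fact; this is a sketch, with tank and the NVMe device names as hypothetical placeholders:

    # Add a mirrored LOG vdev to an existing pool. The pool can
    # survive the loss of this vdev, at worst dropping a small
    # amount of dirty data after a crash or power failure
    zpool add tank log mirror nvme0n1 nvme1n1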
It's a common misconception that ZFS "stripes" writes across the pool, but this is inaccurate. A zpool is not a funny-looking RAID0; it's a funny-looking JBOD, with a complex distribution mechanism subject to change.
For the most part, writes are distributed across the available vdevs in accordance with their available free space, so that in theory all vdevs will become full at the same time. In more recent versions of ZFS, vdev utilization may also be taken into account: if one vdev is significantly busier than another (due to read load, for example), it may be skipped temporarily for writes despite having the most free space available.
The utilization-awareness mechanism built into modern ZFS write distribution methods can decrease latency and increase throughput during periods of unusually high load, but it should not be mistaken for a license to freely mix slow rust disks and fast SSDs in the same pool. Such a mismatched pool will generally still perform as though it were entirely composed of the slowest device present.
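If you'd like to watch this distribution behavior for yourself, zpool iostat reports per-vdev statistics; a sketch, assuming a pool named tank:

    # Per-vdev I/O statistics, refreshed every five seconds.
    # Watch the write columns to see how ZFS spreads new writes
    # across vdevs rather than rigidly striping them
    zpool iostat -v tank 5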
vdev
Each zpool consists of one or more vdevs (short for virtual device). Each vdev, in turn, consists of one or more real devices. Most vdevs are used for plain storage, but several special support classes of vdev exist as well, including CACHE, LOG, and SPECIAL. Each of these vdev types can offer one of five topologies: single-device, RAIDz1, RAIDz2, RAIDz3, or mirror.
RAIDz1, RAIDz2, and RAIDz3 are special varieties of what storage greybeards call "diagonal parity RAID." The 1, 2, and 3 refer to how many parity blocks are allocated to each data stripe. Rather than dedicating entire disks to parity, RAIDz vdevs distribute that parity semi-evenly across the disks. A RAIDz array can lose as many disks as it has parity blocks; if it loses one more, it fails, and takes the zpool down with it.
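As a sketch, the RAIDz level is chosen at pool creation time. Each command below is an alternative layout for the same six hypothetical disks, so you would pick just one:

    # Six-disk RAIDz1: one parity block per stripe,
    # survives the loss of any single disk
    zpool create tank raidz1 sda sdb sdc sdd sde sdf

    # Six-disk RAIDz2: two parity blocks per stripe,
    # survives any two concurrent disk failures
    zpool create tank raidz2 sda sdb sdc sdd sde sdf

    # Six-disk RAIDz3: three parity blocks per stripe,
    # survives any three concurrent disk failures
    zpool create tank raidz3 sda sdb sdc sdd sde sdf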
Mirror vdevs are precisely what they sound like: in a mirror vdev, every block is stored on every device in the vdev. Although two-wide mirrors are the most common, a mirror vdev can contain any arbitrary number of devices; three-way mirrors are common in larger setups for their higher read performance and fault resistance. A mirror vdev can survive any failure, so long as at least one device in the vdev remains healthy.
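A sketch of the two most common mirror widths, again with hypothetical device names:

    # Two-way mirror: every block on both devices,
    # survives the loss of either one
    zpool create tank mirror sda sdb

    # Three-way mirror: every block on all three devices,
    # survives any two failures and can serve reads from all three
    zpool create tank mirror sda sdb sdc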
Single-device vdevs are also precisely what they sound like, and they're inherently dangerous. A single-device vdev cannot survive any failure, and if it's being used as a storage or SPECIAL vdev, its failure will take the entire zpool down with it. Be very, very careful here.
CACHE, LOG, and SPECIAL vdevs can be created using any of the above topologies. Remember, though, that loss of a SPECIAL vdev means loss of the pool, so a redundant topology is strongly encouraged.
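A sketch tying the three support classes together, with hypothetical NVMe device names:

    # CACHE vdev: left single-device on purpose, since its loss
    # costs nothing but read-cache warmth
    zpool add tank cache nvme0n1

    # LOG vdev: mirrored, so a single device failure can't
    # drop in-flight synchronous writes
    zpool add tank log mirror nvme1n1 nvme2n1

    # SPECIAL vdev: mirrored, because losing it loses the pool
    zpool add tank special mirror nvme3n1 nvme4n1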