OpenZFS 2.0.0 brings a host of new features and performance improvements to both Linux and BSD platforms.
This Monday, Brian Behlendorf, lead developer of ZFS on Linux, released OpenZFS 2.0.0 on GitHub. Along with many new features, the release ends the previous distinction between "ZFS on Linux" and ZFS elsewhere (e.g. on FreeBSD). This step took a long time – the FreeBSD community set out its side of the roadmap two years ago – but this is the release that makes it official.
OpenZFS 2.0.0 is already available on FreeBSD, where it can be installed on FreeBSD 12 systems via ports (which override the base-system ZFS) and will ship as the base ZFS implementation in the upcoming FreeBSD 13. On Linux, the situation is somewhat less certain and depends largely on the distribution.
Linux users who get their OpenZFS kernel modules built by DKMS will usually see the new version fairly quickly. Users of the better-supported but slower-moving Ubuntu will likely not see OpenZFS 2.0.0 until Ubuntu 21.10, almost a year away. For Ubuntu users willing to live on the edge, the popular but third-party, individually maintained jonathonf PPA could make it available significantly sooner.
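For the edge-dwellers, adding the jonathonf PPA looks roughly like this (a sketch – package names may vary by Ubuntu release, and the PPA is maintained independently of both Ubuntu and the OpenZFS project):

```shell
# Add the third-party PPA and install the DKMS-built ZFS modules
sudo add-apt-repository ppa:jonathonf/zfs
sudo apt update
sudo apt install zfs-dkms zfsutils-linux
```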
OpenZFS 2.0.0 modules can be built from source for Linux kernels 3.10 through 5.9. However, most users should stick to pre-built modules from their distribution or from established developers. "Far off the beaten path" is not a phrase that should generally apply to the file system holding your valuable data!
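For those who do build from source anyway, the general shape follows the project's standard autotools flow (a sketch only – consult the OpenZFS build documentation for your distribution's kernel-header package names and prerequisites):

```shell
# Fetch the tagged 2.0.0 release and build against the running kernel
git clone https://github.com/openzfs/zfs.git
cd zfs
git checkout zfs-2.0.0
sh autogen.sh
./configure
make -s -j"$(nproc)"
sudo make install
```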
Sequential resilver

Rebuilding degraded arrays in ZFS has historically been very different from conventional RAID. On nearly empty arrays, the ZFS rebuild – known as "resilvering" – was much faster, because ZFS only needs to touch the portion of the disk actually in use rather than cloning every sector of the entire drive. But that process involved a great deal of random I/O, so on nearly full arrays, conventional RAID's block-by-block rebuild of the whole disk was much faster. With sequential resilvering, ZFS gets the best of both worlds: largely sequential access, while unused portions of the affected disk(s) are still skipped.
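Sequential resilvering is opt-in at replacement time. A minimal sketch, assuming a hypothetical pool named tank and hypothetical device names:

```shell
# Replace a failed disk using the new sequential resilver (-s);
# a scrub runs afterward to verify checksums, since the sequential
# pass itself does not check them.
zpool replace -s tank /dev/sda /dev/sdb

# Watch rebuild progress
zpool status tank
```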
Persistent L2ARC

One of the most compelling features of ZFS is its advanced read cache, known as the ARC. Systems with very large, very hot working sets can also deploy an SSD-based read cache called the L2ARC, which is populated from blocks in the ARC that are about to be evicted. Historically, one of the biggest problems with the L2ARC was that although the underlying SSD is persistent, the L2ARC itself was not – it emptied on every reboot (or export and import of the pool). This new feature allows the data in the L2ARC to remain available and usable across pool import/export cycles (including system reboots), greatly increasing the potential value of an L2ARC device.
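Attaching an L2ARC device works as before; persistence is governed by a module parameter. A sketch, again with a hypothetical pool and device name:

```shell
# Add an SSD as an L2ARC cache device to pool "tank"
zpool add tank cache /dev/nvme0n1

# Persistent L2ARC is controlled by the l2arc_rebuild_enabled
# module parameter (enabled by default in 2.0.0)
cat /sys/module/zfs/parameters/l2arc_rebuild_enabled
```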
Zstd compression algorithm
OpenZFS offers transparent inline compression, controllable at per-dataset granularity. Traditionally, the most commonly used algorithm has been lz4, a streaming algorithm offering a relatively modest compression ratio but very low CPU usage. OpenZFS 2.0.0 adds support for zstd – an algorithm designed by Yann Collet (the author of lz4) that aims to provide compression ratios similar to gzip's with CPU usage closer to lz4's.
[Figure: compression ratio (y-axis) versus transfer speed (x-axis). Zstd levels 1-19 form the dark blue cluster in the upper left, zstd-fast is the light blue line running along the bottom of the graph, and lz4 is the single orange dot slightly to the right of zstd-fast's upright segment.]
These graphs are a little tricky to follow – but essentially, they show that zstd achieves its goals. On the compression (disk write) side, zstd-2 beats even gzip-9 on ratio while maintaining much higher throughput.
Compared to lz4, zstd-2 achieves 50 percent higher compression in return for a throughput penalty of 30 percent. On the decompression (disk read) side, the throughput penalty is slightly higher at around 36 percent.
Note that the throughput "penalties" described assume a negligible bottleneck on the storage medium itself. In practice, most CPUs can run rings around most storage media (even relatively slow CPUs and fast SSDs). ZFS users are generally used to lz4 compression speeding up the workload in the real world, not slowing it down!
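Enabling zstd on a dataset is a one-liner. A sketch, assuming a hypothetical pool/dataset named tank/data (only newly written blocks are compressed with the new algorithm; existing data keeps its old compression):

```shell
# Use zstd at its default level
zfs set compression=zstd tank/data

# Or pin an explicit level, e.g. the zstd-2 discussed above
zfs set compression=zstd-2 tank/data

# Verify the property and observe the achieved ratio
zfs get compression,compressratio tank/data
```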
Redacted replication

This one is a bit of a brain-breaker. Suppose there are portions of your data that you don't want to include in a backup made with ZFS replication. First, you clone the dataset. Next, you delete the sensitive data from the clone. Then you create a redaction bookmark on the parent snapshot, recording the blocks that changed between parent and clone. Finally, you can send the parent snapshot to its backup target with the --redact redaction_bookmark argument – and only the non-sensitive blocks are replicated to the backup target.
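The steps above can be sketched with the new zfs redact subcommand and the --redact flag to zfs send. All pool, dataset, and host names here are hypothetical:

```shell
# Snapshot the original dataset
zfs snapshot tank/data@backup

# Clone it, remove the sensitive files, and snapshot the clone
zfs clone tank/data@backup tank/data-clean
rm -rf /tank/data-clean/secrets
zfs snapshot tank/data-clean@clean

# Create a redaction bookmark recording the blocks that differ
zfs redact tank/data@backup my_redaction tank/data-clean@clean

# Send the original snapshot, omitting the redacted blocks
zfs send --redact my_redaction tank/data@backup | \
    ssh backuphost zfs receive backuppool/data
```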
Additional improvements and changes
In addition to the headline features described above, OpenZFS 2.0.0 offers fallocate support; improved and reorganized man pages; higher performance for zfs destroy, zfs send, and zfs receive; more efficient memory management; and optimized encryption performance. Meanwhile, some rarely used features – deduplicated send streams, dedupditto blocks, and the zfs_vdev_scheduler module option – have been deprecated.
For a full list of changes, see the original release announcement on GitHub at https://github.com/openzfs/zfs/releases/tag/zfs-2.0.0.