A write barrier is a kernel mechanism used to ensure that file system metadata is correctly written andordered on persistent storage, even when storage devices with volatile write caches lose power. File systems with write barriers enabled also ensure that data transmitted via fsync() is persistent throughout a power loss.
Enabling write barriers incurs a substantial performance penalty for some applications. Specifically, applications that use fsync() heavily or create and delete many small files will likely run much slower.
Import ance of Write Barriers
File systems take great care to safely update metadata, ensuring consistency. Journalled file systems bundle metadata updates into transactions and send them to persistent storage in the following manner:
1. First, the file system sends the body of the transaction to the storage device.
2. Then, the file system sends a commit block.
3. If the transaction and its corresponding commit block are written to disk, the file system assumes that the transaction will survive any power failure.
However, file system integrity during power failure becomes more complex for storage devices with extra caches. Storage target devices like local S-ATA or SAS drives may have write caches ranging from 32MB to 64MB in size (with modern drives). Hardware RAID controllers often contain internal write caches. Further, high end arrays, like those from NetApp, IBM, Hitachi and EMC (among others), also have large caches.
Storage devices with write caches report I/O as " complete" when the data is in cache; if the cache loses power, it loses its data as well. Worse, as the cache de-stages to persistent storage, it may change the original metadata ordering. When this occurs, the commit block may be present on disk without having the complete, associated transaction in place. As a result, the journal may replay these uninitialized transaction blocks into the file system during post-power-loss recovery; this will cause data inconsistency and corruption.
How Write Barriers Work
Write barriers are implemented in the Linux kernel via storage write cache flushes before and after the I/O, which is order-critical. After the transaction is written, the storage cache is flushed, the commit block is written, and the cache is flushed again. This ensures that:
The disk contains all the data.
No re-ordering has occurred.
With barriers enabled, an fsync() call will also issue a storage cache flush. This guarantees that file data is persistent on disk even if power loss occurs shortly after fsync() returns.
Enabling/Disabling Write Barriers
To mitigate the risk of data corruption during power loss, some storage devices use battery-backed write caches. Generally, high-end arrays and some hardware controllers use battery-backed write caches. However, because the cache's volatility is not visible to the kernel, Red Hat Enterprise Linux 7 enables write barriers by default on all supported journaling file systems
Write caches are designed to increase I/O performance. However, enabling write barriers means constantly flushing these caches, which can significantly reduce performance.
For devices with non-volatile, battery-backed write caches and those with write-caching disabled, you can safely disable write barriers at mount time using the -o no barri er option for mount.
However, some devices do not support write barriers; such devices will log an error message to /var/log/messages
Write barrier error messages per file system
|File Sysem||Error Message|
|ext3/ext4||JBD : barrier-based sync failed on device - disabling barriers|
|XFS||File system device - Disabling barriers, trial barrier write failed|
|btrfs||btrfs: disabling barriers on dev device|
Write Barrier Considerations
Some system configurations do not need write barriers to protect data. In most cases, other methods are preferable to write barriers, since enabling write barriers causes a significant performance penalty.
Disabling Write Caches
One way to alternatively avoid data integrity issues is to ensure that no write caches lose data on power failures. When possible, the best way to configure this is to simply disable the write cache. On a simple server or desktop with one or more SATA drives (off a local SATA controller Intel AHCI part), you can disable the write cache on the target SATA drives with the hd parm command, as in:
# hdparm -W0 /device/
Battery-Backed Write Caches
Write barriers are also unnecessary whenever the system uses hardware RAID controllers with battery-backed write cache. If the system is equipped with such controllers and if its component drives have write caches disabled, the controller will advertise itself as a write-through cache; this will inform the kernel that the write cache data will survive a power loss.
Most controllers use vendor-specific tools to query and manipulate target drives. For example, the LSI Megaraid SAS controller uses a battery-backed write cache; this type of controller requires the Meg aC l i 6 4 tool to manage target drives. To show the state of all back-end drives for LSI Megaraid SAS, use:
# MegaCli64 -LDGetProp -DskCache -LAll -aALL
To disable the write cache of all back-end drives for LSI Megaraid SAS, use:
# MegaCli64 -LDSetProp -DisDskCache -Lall -aALL
Hardware RAID cards recharge their batteries while the system is operational. If a system is powered off for an extended period of time, the batteries will lose their charge, leaving stored data vulnerable during a power failure.
High-end arrays have various ways of protecting data in the event of a power failure. As such, there is no need to verify the state of the internal drives in external RAID storage.
NFS clients do not need to enable write barriers, since data integrity is handled by the NFS server side. As such, NFS servers should be configured to ensure data persistence throughout a power loss (whether through write barriers or other means).