HomeFreeBSD

Direct IO support

Description

Direct IO support

Direct IO via the O_DIRECT flag was originally introduced in XFS by
IRIX for database workloads. Its purpose was to allow the database
to bypass the page and buffer caches to prevent unnecessary IO
operations (e.g. readahead) while preventing contention for system
memory between the database and kernel caches.

On Illumos, there is a library function called directio(3C) that
allows user space to provide a hint to the file system that Direct IO
is useful, but the file system is free to ignore it. The semantics
are also entirely a file system decision. Those that do not
implement it return ENOTTY.

Since the semantics were never defined in any standard, O_DIRECT is
implemented such that it conforms to the behavior described in the
Linux open(2) man page as follows.

  1. Minimize cache effects of the I/O. By design the ARC is already scan-resistant which helps mitigate the need for special O_DIRECT handling. Data which is only accessed once will be the first to be evicted from the cache. This behavior is in consistent with Illumos and FreeBSD. Future performance work may wish to investigate the benefits of immediately evicting data from the cache which has been read or written with the O_DIRECT flag. Functionally this behavior is very similar to applying the 'primarycache=metadata' property per open file.
  2. O_DIRECT _MAY_ impose restrictions on IO alignment and length. No additional alignment or length restrictions are imposed.
  3. O_DIRECT _MAY_ perform unbuffered IO operations directly between user memory and block device. No unbuffered IO operations are currently supported. In order to support features such as transparent compression, encryption, and checksumming a copy must be made to transform the data.
  4. O_DIRECT _MAY_ imply O_DSYNC (XFS). O_DIRECT does not imply O_DSYNC for ZFS. Callers must provide O_DSYNC to request synchronous semantics.
  5. O_DIRECT _MAY_ disable file locking that serializes IO operations. Applications should avoid mixing O_DIRECT and normal IO or mmap(2) IO to the same file. This is particularly true for overlapping regions. All I/O in ZFS is locked for correctness and this locking is not disabled by O_DIRECT. However, concurrently mixing O_DIRECT, mmap(2), and normal I/O on the same file is not recommended.

This change is implemented by layering the aops->direct_IO operations
on the existing AIO operations. Code already existed in ZFS on Linux
for bypassing the page cache when O_DIRECT is specified.

References:

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #224
Closes #7823

Details

Provenance
Brian Behlendorf <behlendorf1@llnl.gov>Authored on Aug 27 2018, 5:04 PM
GitHub <noreply@github.com>Committed on Aug 27 2018, 5:04 PM
Parents
rG5097b4e42513: Remove %changelog from spec file
Branches
Unknown
Tags
Unknown

Event Timeline

GitHub <noreply@github.com> committed rGa584ef260530: Direct IO support (authored by Brian Behlendorf <behlendorf1@llnl.gov>).Aug 27 2018, 5:04 PM