
OpenZFS 8997 - ztest assertion failure in zil_lwb_write_issue

Description

PROBLEM

When dmu_tx_assign is called from zil_lwb_write_issue, it's possible
for either ERESTART or EIO to be returned.

If ERESTART is returned, this will cause an assertion to fail directly
in zil_lwb_write_issue, where the code assumes the return value is
EIO if dmu_tx_assign returns a non-zero value. This can occur if the
SPA is suspended when dmu_tx_assign is called, and most often occurs
when running zloop.

If EIO is returned, this can cause assertions to fail elsewhere in the
ZIL code. For example, zil_commit_waiter_timeout contains the
following logic:

lwb_t *nlwb = zil_lwb_write_issue(zilog, lwb);
ASSERT3S(lwb->lwb_state, !=, LWB_STATE_OPENED);

In this case, if dmu_tx_assign returned EIO from within
zil_lwb_write_issue, the lwb variable passed in will not be issued
to disk. Thus, its lwb_state field will remain LWB_STATE_OPENED and
this assertion will fail. zil_commit_waiter_timeout assumes that after
it calls zil_lwb_write_issue, the lwb will be issued to disk, and
doesn't handle the case where this is not true; i.e. it doesn't handle
the case where dmu_tx_assign returns EIO.
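
The failure mode described above can be sketched with a toy model. This is not the ZIL code: the toy_ types and functions are hypothetical names invented for illustration, and the only behavior modeled is that a failed assignment leaves the lwb in its opened state, which is exactly what the caller's assertion rules out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the lwb states named in the text. */
typedef enum toy_lwb_state {
	TOY_LWB_OPENED,
	TOY_LWB_ISSUED
} toy_lwb_state_t;

typedef struct toy_lwb {
	toy_lwb_state_t state;
} toy_lwb_t;

/*
 * Toy version of the issue step. assign_err plays the role of the
 * value returned by dmu_tx_assign: on a non-zero value the function
 * bails out early and the lwb is never issued.
 */
bool
toy_lwb_write_issue(toy_lwb_t *lwb, int assign_err)
{
	if (assign_err != 0)
		return (false);	/* state is still TOY_LWB_OPENED */

	lwb->state = TOY_LWB_ISSUED;
	return (true);
}

/* The condition the caller asserts right after the issue call. */
bool
toy_caller_invariant_holds(const toy_lwb_t *lwb)
{
	return (lwb->state != TOY_LWB_OPENED);
}
```

In this model, calling toy_lwb_write_issue with any non-zero assign_err leaves toy_caller_invariant_holds false, which corresponds to the ASSERT3S in zil_commit_waiter_timeout tripping.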

SOLUTION

This change modifies the dmu_tx_assign function such that txg_how is
a bitmask, rather than a value of the txg_how_t enum type. Now, the previous
TXG_WAITED semantics can be used via TXG_NOTHROTTLE, along with
specifying either TXG_NOWAIT or TXG_WAIT semantics.

Previously, when TXG_WAITED was specified, TXG_NOWAIT semantics were
automatically invoked. This was not ideal when using TXG_WAITED within
zil_lwb_write_issue, leading to the problem described above. Rather, we
want to achieve the semantics of TXG_WAIT, while also preventing the
tx from being penalized via the dirty delay throttling.

With this change, zil_lwb_write_issue can achieve the semantics that
it requires by passing in the value TXG_WAIT | TXG_NOTHROTTLE to
dmu_tx_assign.

Further, consumers of dmu_tx_assign wishing to achieve the old
TXG_WAITED semantics can pass in the value TXG_NOWAIT | TXG_NOTHROTTLE.
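
To make the two flag combinations concrete, here is a minimal sketch of how a bitmask txg_how separates "wait or not" from "throttle or not". The flag values are assumed for illustration and should be checked against the DMU headers; toy_tx_assign, toy_result_t, and TOY_ERESTART are hypothetical names, and the retry loop is a simplification rather than the real dmu_tx_assign.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values assumed for illustration; see the DMU headers for the real ones. */
#define	TXG_NOWAIT	(0ULL)
#define	TXG_WAIT	(1ULL << 0)
#define	TXG_NOTHROTTLE	(1ULL << 1)

/* Stand-in for a retryable error such as ERESTART. */
#define	TOY_ERESTART	1

/* Hypothetical result of one toy assignment. */
typedef struct toy_result {
	int	err;			/* 0, or TOY_ERESTART if the caller must retry */
	int	attempts;		/* number of assignment attempts made */
	bool	throttle_eligible;	/* may the tx be delayed by the throttle? */
} toy_result_t;

/*
 * Toy assignment: the first busy_attempts tries hit a retryable
 * condition. With TXG_WAIT the call keeps retrying until it succeeds;
 * without it the first error is returned to the caller.
 * TXG_NOTHROTTLE is an independent bit that exempts the tx from the
 * dirty delay throttle.
 */
toy_result_t
toy_tx_assign(uint64_t txg_how, int busy_attempts)
{
	toy_result_t r = { 0, 0, false };

	for (;;) {
		r.attempts++;
		if (r.attempts > busy_attempts) {
			r.err = 0;
			break;
		}
		if (!(txg_how & TXG_WAIT)) {
			r.err = TOY_ERESTART;
			break;
		}
		/* TXG_WAIT: the real code would block here before retrying. */
	}

	r.throttle_eligible = (r.err == 0) && !(txg_how & TXG_NOTHROTTLE);
	return (r);
}
```

Under these assumptions, toy_tx_assign(TXG_WAIT | TXG_NOTHROTTLE, n) always ends with err equal to 0 and no throttling, which is the combination the text describes for zil_lwb_write_issue, while toy_tx_assign(TXG_NOWAIT | TXG_NOTHROTTLE, n) with n greater than 0 hands the retryable error back, matching the old TXG_WAITED behavior.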

Authored by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

Porting Notes:

  • Additionally updated zfs_tmpfile to use TXG_NOTHROTTLE

OpenZFS-issue: https://www.illumos.org/issues/8997
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/19ea6cb0f9
Closes #7084

Details

Provenance
Prakash Surya <prakash.surya@delphix.com> Authored on Jan 8 2018, 9:45 PM
Brian Behlendorf <behlendorf1@llnl.gov> Committed on Jan 27 2018, 4:19 AM
Parents
rG522db29275b8: zpool import -d to specify device path