Page MenuHomeFreeBSD

Add aio_writev and aio_readv
ClosedPublic

Authored by asomers on Dec 23 2020, 6:59 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Jan 21, 6:14 AM
Unknown Object (File)
Sat, Jan 18, 12:05 AM
Unknown Object (File)
Wed, Jan 15, 2:36 AM
Unknown Object (File)
Mon, Jan 13, 10:57 PM
Unknown Object (File)
Mon, Jan 13, 3:04 AM
Unknown Object (File)
Sat, Jan 11, 3:04 AM
Unknown Object (File)
Thu, Jan 9, 4:39 AM
Unknown Object (File)
Tue, Jan 7, 4:29 PM

Details

Summary

POSIX AIO is great, but it lacks vectored I/O functions. This commit fixes
that shortcoming by adding aio_writev and aio_readv. They aren't part of
the standard, but they're an obvious extension. They work just like their
synchronous equivalents writev and readv.

It isn't yet possible to use vectored aiocbs with lio_listio, but that could
be added in the future.

Test Plan

test cases added

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

This is the same as D27624, but rebased onto git, and after D27682.

sys/kern/vfs_aio.c
814

Extra ()

1276

So if an error occurs in the loop, after some iovs are already queued, you return -1 and aio_aqueue_file() re-queues them ? Am I missing something ?

asomers added inline comments.
sys/kern/vfs_aio.c
1276

If aio_qbio returns -1, then aio_aqueue_file will queue it using the slow path, aio_process_rw.

sys/kern/vfs_aio.c
1276

So you confirm my observation, and it is a valid bug.

sys/kern/vfs_aio.c
1276

No, it's not a bug. aio_queue_file tries the fast path first, and if that doesn't work it tries the slow path. Perhaps you're thinking that aio_process_rw calls aio_qbio? It doesn't. Only aio_queue_file does.

sys/kern/vfs_aio.c
1276

The loop processes iovs one by one. If e.g. first iov was queued, but second causes an error, aio_queue_file() requeues everything. So first iov is queued twice.

sys/kern/vfs_aio.c
1276

Ok, I see what you're saying. Some of the iovs will be issued twice in that case. I think I can fix that.

  • aio_qbio: fail vectored requests faster if they're going to be retried
  • aio: better handling for EFAULT during vectored operations
sys/kern/vfs_aio.c
1280

struct bio * (space before star)

BTW might be mallocarray(9) is better

1324

Why -i ?

Also the cleanup for kaio_buffer_count might be more logical in destroy_bios.

1358

You need to qremove and unhold pages for already processed buffers.

asomers added inline comments.
sys/kern/vfs_aio.c
1324

Leftover from an earlier revision. Should be just iovcnt.

asomers marked an inline comment as done.
  • Completely cleanup bios after EFAULT

The chelsio ones it looks like I can fixup later since they explicitly whitelist LIO_READ and LIO_WRITE.

sys/kern/vfs_aio.c
1659

This needs to be done in the syscalls themselves as different ABIs need to populate the uio. For example, the freebsd32 versions need to use freebsd32_copyinuio rather than copyinuio. Either that or you have to make a new ops->copyinuio callback that matches the copyinuio signature. It can use copyinuio for the native ABI and freebsd32_copyinuio for freebsd32.

asomers added inline comments.
sys/kern/vfs_aio.c
1659

Good catch!

asomers marked an inline comment as done.
  • Add notes about EFAULT to the man pages
  • Fix 32-bit build of one test, and improve an error message
  • Make freebsd32_copyinuio a public function
  • Fix vectored aio in 32-bit emulation

I think this looks ok, though I'm going to defer to kib@ about the bio questions he raised earlier and if those are all resolved to his satisfaction.

sys/kern/vfs_aio.c
1519

The blank line here in the old code is a style feature. :)

2780–2790

style nit: blank line before this comment.

This revision is now accepted and ready to land.Dec 31 2020, 11:49 PM
asomers marked 2 inline comments as done.
  • fix typo in comment
  • Fix an atomic access
  • Make the aio zvol tests parallelizable.
  • fixup to "Fix 32-bit build of one test"
  • style changes requested by jhb
This revision now requires review to proceed.Jan 1 2021, 3:09 AM

aio_biocleanup() and its uses looks fine.

I do have the hope that changes to generated files will be pushed as separate commit.

sys/kern/vfs_aio.c
2434–2435

Space before * in the cast type.

sys/sys/aio.h
99

Anonymous unions where added in C11. In other words, this change as is breaks system headers use in C99 mode.

Also AFAIR they are not in C++11, but you might want to recheck this claim.

105

aio_iovcnt is count of array elements, why it is int and not size_t, at least ? If this is done to match the uio_iovcnt type, then arguably it is type error (fortunately kernel-only).

It would definitely require an overflow check on lowering type rank assignment.

asomers added inline comments.
sys/sys/aio.h
99

Is it a requirement for our headers to all be strictly C99 compatible? I notice that we already have many anonymous unions that aren't guarded by #ifdef KERNEL. For example, struct kinfo_file includes an anonymous union.

Even so, I could do the stupid "named union with #defines for each element` hack, if you would prefer.

105

I did it to match the signature of pwritev, where iovcnt is an int. Do you think I should change it?

sys/sys/aio.h
99

I think it is a reasonable goal to have system headers usable for userspace in as many situations as possible.

Note that kinfo/user.h is very much FreeBSD-specific header, while aio.h is POSIX-mandated. POSIX still requires compliance with C99, so if we cannot provide aio.h usable with C99, we already lost POSIX.

105

I think it is a type error in preadv/pwritev as well. But I see that our preadv/pwritev have a signature matching Linux implementation. It is too late to fix this, perhaps.

Ok then.

  • style fix
  • Restore C99 compliance to aio.h
asomers added inline comments.
sys/sys/aio.h
99

The "named union with #defines" hack didn't work very well, because it clashed with local variables and other structs' fields that shared the same name. What do you think about this alternative? We're still POSIX compliant, if the right macros are defined. But we don't have to hack up the structure or abuse the preprocessor too badly.

sys/sys/aio.h
99

Can you show the hack that 'did not work very well'.

Your patch currently provides different ABI for struct aiocb depending on the compiler flags, and most likely does not work with C++. Even if you fixed what you intended to do, it is ugly and really not needed. We have named unions working in more complex situations without a glitch.

asomers added inline comments.
sys/sys/aio.h
99

How does it provide different ABIs depending on the compiler flags? The structure layout is the same.

Here is the named union hack:

--- a/sys/sys/aio.h
+++ b/sys/sys/aio.h
@@ -96,13 +96,13 @@ typedef struct aiocb {
 	int	aio_fildes;		/* File descriptor */
 	off_t	aio_offset;		/* File offset for I/O */
 	union {
-		volatile void *aio_buf;	/* I/O buffer in process space */
-		struct iovec *aio_iov;	/* I/O scatter/gather list */
-	};
+		volatile void *aio_un1_buf; /* I/O buffer in process space */
+		struct iovec *aio_un1_iov;  /* I/O scatter/gather list */
+	} aio_buf_un;
 	union {
-		size_t	aio_nbytes;	/* Number of bytes for I/O */
-		int	aio_iovcnt;	/* Length of aio_iov */
-	};
+		size_t	aio_un2_nbytes;	/* Number of bytes for I/O */
+		int	aio_un2_iovcnt;	/* Length of aio_iov */
+	} aio_nbytes_un;
 	int	__spare__[2];
 	void	*__spare2__;
 	int	aio_lio_opcode;		/* LIO opcode */
@@ -111,6 +111,11 @@ typedef struct aiocb {
 	struct	sigevent aio_sigevent;	/* Signal to deliver */
 } aiocb_t;
 
+#define	aio_buf		aio_buf_un.aio_un1_buf
+#define	aio_iov		aio_buf_un.aio_un1_iov
+#define	aio_nbytes	aio_nbytes_un.aio_un2_nbytes
+#define	aio_iovcnt	aio_nbytes_un.aio_un2_iovcnt
+
 #ifdef _KERNEL
 
 typedef void aio_cancel_fn_t(struct kaiocb *);

The aio_buf macro conflicted with the field of the same name in struct oaiocb, struct oaiocb32 and struct aiocb32. It would also conflict with any similarly named struct definitions or local variables in user code.

sys/sys/aio.h
99

First, lets clear my note about ABI difference. I thought that your union hack changes alignment requirements but it seems that you are right.

That said, several notes:

  1. userspace must not use any symbols starting with aio_, they are reserved by POSIX. See vol.2 2.2.2 Namespace section.
  2. oaiocb of course can be fixed, while problem with aiocb32 perhaps means that this structure definition does not properly match struct aiocb and this needs fixing.

But looking at the stuff from some distance, why do you need aio_iov at all ? aio_buf is typed so that assignment of struct iovec * to it should just work. Then aio_nbytes is size_t. So if you use aio_buf/aio_nbytes as the pointer to iovec and counter, everything would work without the surgery on the struct aiocb.

sys/sys/aio.h
99

You mean why not just overload aio_buf and aio_nbytes? I certainly could've. I used the union instead because I thought it would be clearer. As a matter of style, I don't like reusing even local variables for different purposes, let alone struct members. I'll update the review to show you what that option looks like.

  • Don't use union types in struct aiocb
  • Add notes about sys/uio.h to the man pages

Please do not push generated files in the same commit.

This revision is now accepted and ready to land.Jan 2 2021, 10:54 PM
This revision was automatically updated to reflect the committed changes.
arichardson added inline comments.
tests/sys/aio/aio_test.c
1639

This test and vectored_zvol_poll appear to be failing in CI recently (https://ci.freebsd.org/job/FreeBSD-main-amd64-test/19158/testReport/junit/sys.aio/aio_test/vectored_unaligned/). Any ideas why this could be happening?

I am not sure if it has ever succeeded in CI, most previous runs show it as skipped.

tests/sys/aio/aio_test.c
1639

No, and I can't reproduce that failure myself. It might be a race condition in ZFS, or maybe for some other reason the zvol doesn't have the expected name.