
add a kernel flag for VOP_COPY_FILE_RANGE to return after 1 second
ClosedPublic

Authored by rmacklem on Sep 4 2021, 1:57 AM.
Tags
None
Referenced Files
F108601586: D31829.diff
Sun, Jan 26, 7:41 PM
Subscribers
None

Details

Summary

Although it is not specified in the RFCs, the concept that
an NFSv4 server should reply within a reasonable time is accepted
practice within the NFSv4 community.

Without this patch, the NFSv4.2 server attempts to reply to
a Copy operation within 1 second by limiting the copy to
vfs.nfs.maxcopyrange bytes (default 10 Mbytes). This is crude at
best, given the large variation in I/O subsystem performance.

This patch adds a kernel-only flag, COPY_FILE_RANGE_TIMEO1SEC,
that the NFSv4.2 server can specify. The flag tells VOP_COPY_FILE_RANGE()
to return after approximately 1 second with a partial result. The patch
implements this in vn_generic_copy_file_range(), which is used by
vop_stdcopy_file_range().

Test Plan

Tested against both the FreeBSD and Linux NFSv4.2 clients.
For copies where the source file is on disk (not yet read into the
buffer cache for UFS), behaviour is as desired, with the NFSv4.2
Copy returning after 1 second.

When the same copy test is repeated on UFS (with the file being
copied from now in the buffer cache), the RPC times range from
1 to 2.5 seconds. This occurs because a single vn_rdwr(UIO_WRITE, ...)
can take more than 1 second to complete. I believe this is because
the buffer cache flushes many writes at once.

For this application, having the RPC take up to 2.5 seconds in
this unusual case is acceptable, since none of the RFCs sets a limit
on RPC RTT (or defines 1 second as a desired RTT).
--> 1 second is simply the value for "reasonable time" I have typically used.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

rmacklem created this revision.

Looks pretty safe. How will that flag ever get set?

sys/kern/vfs_syscalls.c
4883 ↗(On Diff #94644)

Wouldn't this check make the patch always return EINVAL for any kernel flags?
I think it should be moved into sys_copy_file_range().

That said, I do not see why the flag should be limited to the kernel.

sys/kern/vfs_syscalls.c
4883 ↗(On Diff #94644)

Yes. I never tested setting the flag from userland.

The NFS server sets it as an argument to vn_copy_file_range(),
which passes it to VOP_COPY_FILE_RANGE(). I didn't include
the trivial NFS server patch, because I figured it would be committed
separately, after this one.

The only argument against allowing this flag on the syscall is "Linux compatibility".
I avoided "API debates" by making the syscall "Linux compatible" and the Linux
syscall does not have any defined flags yet, as far as I know.
What do others think?

4965 ↗(On Diff #94644)

The only two places where kern_copy_file_range() gets called are
sys_copy_file_range() and linux_copy_file_range(). Since the
latter needs the "flags != 0" --> EINVAL check, the question
is whether the FreeBSD syscall should allow the flag.

Either way, the above line should just be deleted.

Delete useless line that cleared kernel flags
after a check for flags == 0.
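
The split under discussion can be modelled in userland as below. This is a hedged sketch and not the code in the diff: the function names, the flag value, and the rejection of unknown bits on the in-kernel path are assumptions made for illustration. It shows the syscall-level entry refusing any non-zero flags, while an in-kernel caller such as the NFS server may pass the kernel-only flag.

```c
#include <errno.h>

/* Hypothetical stand-in for the kernel-only flag; the value is illustrative. */
#define SKETCH_KFLAG_TIMEO1SEC	0x80000000u
#define SKETCH_KFLAGS		SKETCH_KFLAG_TIMEO1SEC

/*
 * Models the in-kernel path: kernel-only flags are accepted.
 * Rejecting unknown bits here is an assumption of this sketch.
 */
static int
sketch_in_kernel_entry(unsigned int flags)
{
	if ((flags & ~SKETCH_KFLAGS) != 0)
		return (EINVAL);
	return (0);
}

/*
 * Models the syscall-level check: userland may not set any flags,
 * which matches the "flags != 0" --> EINVAL behaviour discussed above.
 */
static int
sketch_syscall_entry(unsigned int flags)
{
	if (flags != 0)
		return (EINVAL);
	return (sketch_in_kernel_entry(flags));
}
```

With the check at the syscall level, there is nothing left to clear afterwards, since flags is already known to be zero on that path.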

sys/kern/vfs_syscalls.c
4883 ↗(On Diff #94644)

I think that might be exposed later if we really want this in userspace.

I am okay with the flag in general; I just wonder whether it could be made slightly more tunable rather than fixed at 1 second.

Well, if you think there might be future uses of different
timeouts, the high order 8 bits could be defined as the
timeout value (when non-zero) in units of 1/10 second.
--> That would use up 8 of the 32 bits, but since Linux
has not defined any flags yet, I think that's ok.

Even 1/10 second resolution isn't always going to be achieved.
I mentioned the case where a vn_rdwr() call takes more than
one second.

What do you guys think?
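
The encoding suggested above might look roughly like this. It is a sketch only: the macro and function names are hypothetical, and no such layout was committed as part of this revision.

```c
#include <stdint.h>

/* Hypothetical layout: the top 8 bits of the 32-bit flags word hold the timeout. */
#define SKETCH_TIMEO_SHIFT	24
#define SKETCH_TIMEO_MASK	0xff000000u

/* Store a timeout in tenths of a second (1..255); 0 means no timeout. */
static uint32_t
sketch_set_timeo(uint32_t flags, uint8_t tenths)
{
	return ((flags & ~SKETCH_TIMEO_MASK) |
	    ((uint32_t)tenths << SKETCH_TIMEO_SHIFT));
}

/* Extract the timeout and convert it to milliseconds. */
static uint32_t
sketch_get_timeo_ms(uint32_t flags)
{
	return (((flags & SKETCH_TIMEO_MASK) >> SKETCH_TIMEO_SHIFT) * 100u);
}
```

Under this layout a value of 10 would mean the 1 second used by the patch, and the largest representable timeout would be 25.5 seconds.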

This point is fair enough.

This revision is now accepted and ready to land. Sep 6 2021, 6:37 AM