Page MenuHomeFreeBSD

Improve POSIX compliance for RLIMIT_FSIZE
ClosedPublic

Authored by kib on Sep 18 2022, 8:12 PM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Nov 15, 7:40 AM
Unknown Object (File)
Thu, Oct 24, 2:53 AM
Unknown Object (File)
Sun, Oct 20, 10:20 AM
Unknown Object (File)
Oct 1 2024, 3:06 PM
Unknown Object (File)
Sep 22 2024, 7:29 AM
Unknown Object (File)
Sep 11 2024, 12:40 PM
Unknown Object (File)
Sep 9 2024, 9:42 PM
Unknown Object (File)
Sep 9 2024, 9:42 PM
Subscribers

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

kib requested review of this revision.Sep 18 2022, 8:12 PM

Thanks for fixing this. As for what the correct behavior should be when the file doesn't yet exceed the limit but a write would cross that threshold, I'm agnostic. I think the old behavior of aborting the write entirely was fine. But the new behavior is fine too. I can fix truncate with fusefs after you merge this change.

This revision is now accepted and ready to land.Sep 18 2022, 8:42 PM

As for what the correct behavior should be when the file doesn't yet exceed the limit but a write would cross that threshold, I'm agnostic. I think the old behavior of aborting the write entirely was fine. But the new behavior is fine too.

If the request would cause the file size to exceed the soft file size limit for the process and there
is no room for any bytes to be written, the request shall fail and the implementation shall
generate the SIGXFSZ signal for the thread.

From the decription of write() in IEEE Std 1003.1™-2017

sys/fs/tmpfs/tmpfs_vnops.c
660
663

I think VFS_TO_TMPFS(vp->v_mount)->tm_maxfilesize deserves a local variable.

sys/ufs/ffs/ffs_vnops.c
895

In dofilewrite(), we compare resid before and after the operation to determine whether to return to userspace, and to determine the return value for write(). Won't this clamping break it?

kib marked 3 inline comments as done.Sep 21 2022, 10:45 PM

Fix the issue with uio_resid:

  • move both limit check and max file size check into new helper vn_rlimit_fsizex(), and clamps uio_resid
  • the helper returns original uio_resid, which must be passed to vn_rlimit_fsizex_res() on return
  • _res() adjusts uio_resid back to the original value minus processed bytes

vn_rlimit_fsize() is kept around for unconverted filesystems, but is implemented by using compat mode in vn_rlimit_fsizex()

This revision now requires review to proceed.Sep 21 2022, 10:49 PM
sys/kern/vfs_vnops.c
2393

A comment or some other kind of documentation is needed to know when to use this variant.

2458

or maybe "... can handle writes truncated to the file size limit".

2482

Why not simply drop the const qualifier from the function?

kib marked 3 inline comments as done.Sep 22 2022, 2:34 PM
kib added inline comments.
sys/kern/vfs_vnops.c
2482

Because I think it would be wrong, the function does not modify *uio. DECONST there is an implementation detail.

Also, after all filesystems are converted, vn_rlimit_fsize() should be removed.

kib marked an inline comment as done.

Update comments

After Peter testing; fixed bugs.

Always update *orig_resid, since vn_rlimit_fsizex_res() is called unconditionally.
Interpret maxfsz == 0 as no file size limit.
Do not exclude file size limit check for td == NULL case (kernel write).
Improve comments.

markj added inline comments.
sys/kern/vfs_vnops.c
2434

Is there some specific reason putpages does not set uio_td? Other kernel writers (e.g., md thread) do set it.

This revision is now accepted and ready to land.Sep 24 2022, 4:24 PM
sys/kern/vfs_vnops.c
2434

I suspect it comes down to the question of the ownership of the writes. vnode_pager is executed in context which is, generally, different from the thread that dirtied the pages. In particular, this was a bug I have in the intermediate version of vn_rlimit_fsizex().

Current nfs client forcibly set uio_td to curthread in nfs_putpages(), I did not checked what did the old client. Perhaps other network filesystems could be affected too.

Note that e.g. md(4) worker thread is the owner of writes, so setting uio_td to md thread is correct.