cred: convert the refcount from int to long
ClosedPublic
Actions

Authored by mjg on Mar 22 2023, 9:09 PM.

Details

Reviewers

emaste
markj

Commits

rG37337709d333: cred: convert the refcount from int to long
rG21d29c519221: cred: make the refcount signed

Summary

commit 998a1c313a986dc7c7457d0b5369381f65fa1330
Author: Mateusz Guzik <mjg@FreeBSD.org>
Date:   Wed Mar 22 20:42:04 2023 +0000

    cred: convert the refcount from int to long
    
    On 64-bit platforms this sorts out worries about mitigating bugs which
    overflow the counter without pessimizng anything, most notably it avoids
    whacking per-thread operation in favor of refcount(9) API.
    
    The struct already had two instances of 4 byte padding with 256 bytes in
    size, cr_flags gets moved around to avoid growing it.
    
    32-bit platforms could also get the extended counter, but I did not do
    it as one day(tm) the mutex protecting centralized operation should be
    replaced with atomics and 64-bit ops on 32-bit platforms remain quite
    penalizing.
    
    While worries of counter overflow are addressed, the following is not:
    - counter *underflows*
    - buffer overruns from adjacent allocations
    - UAF due to stale cred pointer
    
    As such, while lipstick was placed, the pig should not be participating
    in any beauty pageants.
    
    Reviewed by:
    Differential Revision:

commit ceb8f401fcc5d956b9b92cff6aa6946a932f48bf
Author: Mateusz Guzik <mjg@FreeBSD.org>
Date:   Wed Mar 22 21:44:55 2023 +0000

    cred: make ref signed
    
    There are asserts on the count being > 0, but they are less useful than
    they can be because the type itself is unsigned. The kernel is compiled
    with -frapv, making wraparound perfectly defined.

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Skipped

Unit

Tests Skipped

Event Timeline

mjg created this revision.Mar 22 2023, 9:09 PM

Herald added a subscriber: imp. · View Herald TranscriptMar 22 2023, 9:09 PM

mjg requested review of this revision.Mar 22 2023, 9:09 PM

jrtc27 added a subscriber: jrtc27.Mar 22 2023, 9:15 PM

jrtc27 added inline comments.

sys/kern/kern_prot.c
1854	Please use `__SIZEOF_LONG__` or some other similar variable for these; CHERI architectures are neither ILP32 nor LP64 but the native integer size is still 32/64-bit depending on the base architecture (only ever 64-bit in CheriBSD). Though really for the formats I'd be inclined to say just use intmax_t, it's not performance-critical and involves (undocumented) tight coupling between here and ucred.h. Though why not just use long itself?
sys/sys/ucred.h
64	This was previously unsigned, is signed intended? The summary doesn't make any reference to this change.

I'm going to have to sleep on whether to do 64 bit everywhere.

Will post an updated review one way or the other tomorrow.

mjg updated this revision to Diff 119287.Mar 22 2023, 10:09 PM

mjg retitled this revision from cred: make the refcount 64-bit on 64-bit platforms to cred: convert the refcount from int to long.

mjg edited the summary of this revision. (Show Details)

mjg marked 2 inline comments as done.

I have no objection, though I suspect that you cannot change the layout of ucred on stable branches.

Not handling underflow seems like a pretty basic weakness. I don't understand why we can't simply check for over/underflow when manipulating the counter. Yes, it has non-zero cost, but we do similar checking for all refcount adjustments already.

zlei added a subscriber: zlei.Mar 24 2023, 5:33 PM

zlei added inline comments.

sys/kern/kern_prot.c
1863	I see `crcowget()` is called during thread creating. Can we possibly reach 2^32 threads? That sounds horrible. Maybe user want be warned when tuning and `kern.maxproc * kern.threads.max_threads_per_proc` exceed 2^32 .
1870	I don't understand why we can't simply check for over/underflow when manipulating the counter +1 for that. So we might assert a overflow on increase. KASSERT(cr->cr_ref > 0, ("overflow")); And bench on a machine with huge amount of resources.
2058	And assert an underflow on decrease. KASSERT(cr->cr_ref > 0, ("will underflow"));

And bench on a machine with huge amount of resources.

So we have such a REAL machine with 4096 cpu cores and 8 hardware threads per core, and run average 8192 soft threads per hardware thread, then this machine will have total ~ 2^28 threads. Then a 32bit u_int is still sufficient.

I doubt we will have such machine and real usage in years.

In D39220#893122, @markj wrote:

I have no objection, though I suspect that you cannot change the layout of ucred on stable branches.

No, but if someone wants this in stable, they can use the spare fields for the new ref.

Not handling underflow seems like a pretty basic weakness. I don't understand why we can't simply check for over/underflow when manipulating the counter. Yes, it has non-zero cost, but we do similar checking for all refcount adjustments already.

How the refcount API could possibly address it? You drop what it sees like the last reference and past that the object is getting freed. Subsequent attempts at manipulating the counter are pretty UB. There is no leeway here.

mjg added inline comments.Mar 27 2023, 3:15 PM

sys/kern/kern_prot.c
1870	That's impossible to do here because the value is completely arbitrary as long as cr_users > 0 -- then the real can only be found by summing up all the counts from all threads.
2058	see the above comment why this can't work

At best some of the asserts could be converted into runtime checks, but then again the refcount API has no advantage over any of it and has the disadvantage of not scaling.

zlei added inline comments.Mar 28 2023, 3:31 AM

sys/kern/kern_prot.c
1870	That's impossible to do here because the value is completely arbitrary as long as cr_users > 0 -- then the real can only be found by summing up all the counts from all threads. I don't quite understand the statement by summing up all the counts from all threads . As both `cr->cr_users` and `cr->cr_ref` is mutex protected, it is impossible that `cr->cr_ref` got touched by other threads (if they follows the order lock and write), so I think the overflow (of `cr_ref`) can be safely and stably caught. Am I misreading? That is indeed not enough as I see `crunuse()` and `crunusebatch()` also increase `cr_ref`. See comments below.
1883	KASSERT(cr->cr_ref >= td->td_ucredref, ("overflow"));
1911	KASSERT(cr->cr_ref >= ref, ("overflow"));

mjg added inline comments.Mar 28 2023, 8:04 AM

sys/kern/kern_prot.c
1870	i refer you to the comment starting with 'Credential management' in the file

This revision was not accepted when it landed; it landed in state Needs Review.Mar 29 2023, 5:04 AM

Closed by commit rG21d29c519221: cred: make the refcount signed (authored by mjg). · Explain Why

This revision was automatically updated to reflect the committed changes.

mjg added a commit: rG21d29c519221: cred: make the refcount signed.

mjg added a commit: rG37337709d333: cred: convert the refcount from int to long.

In D39220#894483, @mjg wrote:

In D39220#893122, @markj wrote:

Not handling underflow seems like a pretty basic weakness. I don't understand why we can't simply check for over/underflow when manipulating the counter. Yes, it has non-zero cost, but we do similar checking for all refcount adjustments already.

How the refcount API could possibly address it? You drop what it sees like the last reference and past that the object is getting freed. Subsequent attempts at manipulating the counter are pretty UB. There is no leeway here.

Certainly it cannot be addressed in general, but there are specific bug patterns which can be mitigated this way. For instance:

A thread decrements the reference count to zero but erroneously fails to invoke the object's destructor and free it. If something else later tries to manipulate the counter, it'll stay frozen.
A thread decrements an object refcount to zero and invokes a destructor, which releases other references and invokes other destructors, one of which incorrectly tries to release a reference on the original object (e.g., because some code path incorrectly treats a weak (i.e., not-counted) reference as a strong reference. In this scenario, we'd avoid a double free of the original object.

Such a mitigation might not help in the face of some particular bug, but it's all UB once buggy refcount handling triggers a double free or a UAF anyway.