Page MenuHomeFreeBSD

kern/intr: have intr_event_destroy() return success on NULL event
Needs ReviewPublic

Authored by ehem_freebsd_m5p.com on Feb 15 2023, 3:17 AM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Dec 17, 12:58 PM
Unknown Object (File)
Mon, Dec 9, 3:07 AM
Unknown Object (File)
Nov 25 2024, 10:11 PM
Unknown Object (File)
Nov 17 2024, 6:21 AM
Unknown Object (File)
Nov 17 2024, 4:55 AM
Unknown Object (File)
Oct 23 2024, 3:15 AM
Unknown Object (File)
Oct 23 2024, 3:14 AM
Unknown Object (File)
Oct 23 2024, 2:52 AM
Subscribers

Details

Reviewers
markj
jhb
kib
Summary

A NULL event likely indicates the event was already released. In such
case success is more useful than failure, so return 0. This follows the
pattern of gracefully handling NULL events.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 49794
Build 46685: arc lint + arc unit

Event Timeline

Trying for other reviewers since this has already been sitting around for months with no action.

"My kingdom for a review"?

The real issue is if a caller actually checks the return code, returning an error would cause the caller to go into error recovery mode. This could be retry the destroy later, or might very well be a panic(). This is more useful to tell the caller to proceed with release/destroy action, not go into error recovery.

I think this is fairly arbitrary. intr_event_destroy() does not get called frequently, unlike, say, free(). So it's reasonable to be less permissive, since intr_event_destroy(NULL) is likely the result of a programming mistake.

I made an argument along the same lines in D38599. As a matter of principle, I prefer code like this to fail as early as possible, so I don't really agree with the change. That's not an absolute position, but so far there is no example code to make me reconsider.

A handy example is D38599. It could test isrc->isrc_event before calling intr_event_destroy(), but if intr_event_destroy() will gracefully handle ie == NULL then it is easier not to bother. If you prefer to have a panic() sooner rather than later, the

if (ie == NULL)
        return (EINVAL);

should be taken out and hopefully a panic() will ensue on line 539. If intr_event_destroy() is going to check for ie == NULL then it should do something useful if that occurs. Returning EINVAL is almost certain not to be useful and thus returning 0 seems more sensible.

I think one real issue is what question is the caller most interested in having answered? (since C functions have a single return, only 1 question can be answered) Another is how is the caller likely to react if intr_event_destroy() gives an error return?

For the first, I think the most likely question is "Has everything been cleaned up?" ie can cleanup steps proceed? Whereas returning an error for NULL amounts to answering, "Was at least 1 cleanup step done and did cleanup finish?" ie NULL would mean no cleanup steps were done, even though cleanup did complete.

For the second a few recovery strategies might be employed. If the caller is doing lazy event release, then it may remain on a queue of events to be released at a later time (or perhaps remains on an active list). If the caller one does the call once, then they might call printf("Warning! Failed to release event, leaking memory!\n");, then drop pointers on the floor on the principle things are in an unknown state and not touching things is most likely to keep the system operational for longer. Another implementation might simply call panic("Failed to release event, no recovery possible.");.

For what I've got, the second strategy is the one in use. This seems the least harm and most likely to survive long enough to catch someone's attention. If the first strategy is used, a thread has been unnecessarily placed in an infinite loop. If the third strategy is used, the system will panic when basically everything was fine.

If a NULL event is passed there isn't really anything for the caller to do. The caller should behave as if everything was fine, since no action can improve the situation. I could see a printf("Warning: NULL event passed to %s.", __func__); inside a block with the return (0);, since intr_event_destroy() knows the situation.

The argument against a printf() is INTRNG does lazy allocation of events. Thus passing in a NULL likely indicates no event was ever allocated and nothing has gone wrong. PowerPC isn't quite as lazy, but due to some laziness would likely fall into the same case.

Perhaps passing in a NULL pointer is nominally wrong, but returning an error is near certain to exacerbate an otherwise recoverable situation.

To put it a different way, intr_event_destroy() is called from recovery and shutdown routines. Callers have likely already received an error status from another function (since they passed a NULL event), yet now they're trying to recover and get yet another error indicator. Recovery is impossible.

A few callers will panic() in debug builds, yet simply ignore in release builds. I've got something which emits a warning message about leaking memory since it is dumping everything on the floor. For that one panic() is too severe since there is hope the administrator can still do a clean shutdown.