All callers of sctp_aloc_assoc() mark the PCB as connected after a
successful call (for one-to-one-style sockets). In all cases this is
done without the PCB lock, so the PCB's flags can be corrupted. We also
do not atomically check whether a one-to-one-style socket is a listening
socket, which violates various assumptions in solisten_proto().
We need to hold the PCB lock across all of sctp_aloc_assoc() to fix
this. In order to do that without introducing lock order reversals, we
have to hold the global info lock as well.
So:
- Convert sctp_aloc_assoc() so that the inp and info locks are consistently held. It returns with the association lock held, as before. - Fix an apparent bug where we failed to remove an association from a global hash if sctp_add_remote_addr() fails. - sctp_select_a_tag() is called when initializing an association, and it acquires the global info lock. To avoid lock recursion, push locking into its callers. - Introduce sctp_aloc_assoc_connected(), which atomically checks for a listening socket and sets SCTP_PCB_FLAGS_CONNECTED.
There is still one edge case in sctp_process_cookie_new() where we do
not update PCB/socket state correctly. I'm not sure how best to fix
that yet, but it looks hard to trigger.
I believe this fixes quite a few of the outstanding syzbot reports.