Why? Most trivial point, it shaves around 600 bytes from the dynamic binaries on amd64. Less trivial, the removed code is no longer part of the ABI, and we can ship updates to it with libc updates. Right now most of the csu is linked into the binaries and require us to do somewhat tricky ABI compat when it needs to change. For instance, the init_array change would be much simpler and does not require note tagging if we have init calling code in libc. This could be improved more, by splitting dynamic and static initialization. For instance, &_DYNAMIC tests can be removed then. I left this for later, after this change stabilizes.
[Only amd64/i386 are tested ATM]