armv8crypto: Use cursors to access crypto buffer data
Closed · Public

Authored by markj on Feb 26 2021, 6:31 PM.

Details

Reviewers

jmg
jhb
gonzo

Group Reviewers

arm64

Commits

rG26b08c5d21b5: armv8crypto: Use cursors to access crypto buffer data

Summary

Currently armv8crypto copies the scheme used in aesni(9), where payload
data and output buffers are allocated on the fly if the crypto buffer is
not virtually contiguous. This scheme is simple but incurs a lot of
overhead: for an encryption request with a separate output buffer we
have to:

- allocate a temporary buffer to hold the payload,
- copy input data into the buffer,
- copy the encrypted payload to the output buffer, and
- zero the temporary buffer before freeing it.

We have a handy crypto buffer cursor abstraction now, so reimplement the
armv8crypto routines using that instead of temporary buffers. This
introduces some extra complexity, but not a lot. The driver still
allocates an AAD buffer for AES-GCM if necessary.
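
To illustrate the idea, here is a minimal userspace sketch of walking a non-contiguous buffer block by block with a cursor. It is not the driver code: `struct seg`, `struct cursor`, the `cursor_*` helpers, `toy_block()` and `transform_in_place()` are hypothetical stand-ins (loosely modeled on the opencrypto cursor routines), the XOR is a placeholder for an AES block operation, and segments are assumed to be non-empty. Blocks that are contiguous within a segment are processed in place; only a block that straddles a segment boundary goes through a 16-byte stack buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_LEN 16

/* Hypothetical stand-ins for a scatter/gather buffer and its cursor. */
struct seg {
	uint8_t	*base;
	size_t	 len;		/* assumed to be non-zero */
};

struct cursor {
	const struct seg *segs;
	size_t	 nsegs;
	size_t	 idx;		/* current segment */
	size_t	 off;		/* offset within the current segment */
};

/* Return the contiguous bytes at the cursor and how many there are. */
static uint8_t *
cursor_segment(const struct cursor *cc, size_t *len)
{
	if (cc->idx == cc->nsegs) {
		*len = 0;
		return (NULL);
	}
	*len = cc->segs[cc->idx].len - cc->off;
	return (cc->segs[cc->idx].base + cc->off);
}

/* Move the cursor forward by n bytes, crossing segments as needed. */
static void
cursor_advance(struct cursor *cc, size_t n)
{
	while (n > 0) {
		assert(cc->idx < cc->nsegs);
		size_t avail = cc->segs[cc->idx].len - cc->off;
		size_t step = n < avail ? n : avail;

		cc->off += step;
		n -= step;
		if (cc->off == cc->segs[cc->idx].len) {
			cc->idx++;
			cc->off = 0;
		}
	}
}

/* Copy n bytes from the cursor to buf (or back if to_cursor), advancing. */
static void
cursor_copy(struct cursor *cc, uint8_t *buf, size_t n, int to_cursor)
{
	while (n > 0) {
		size_t avail;
		uint8_t *p = cursor_segment(cc, &avail);
		assert(p != NULL);
		size_t step = n < avail ? n : avail;

		if (to_cursor)
			memcpy(p, buf, step);
		else
			memcpy(buf, p, step);
		cursor_advance(cc, step);
		buf += step;
		n -= step;
	}
}

/* Placeholder for one AES block operation; NOT a real cipher. */
static void
toy_block(uint8_t *dst, const uint8_t *src)
{
	for (size_t i = 0; i < BLOCK_LEN; i++)
		dst[i] = src[i] ^ 0x5a;
}

/* Transform len bytes (a multiple of BLOCK_LEN) in place via the cursor. */
static void
transform_in_place(struct cursor *cc, size_t len)
{
	uint8_t block[BLOCK_LEN];
	size_t seglen;

	for (; len > 0; len -= BLOCK_LEN) {
		uint8_t *p = cursor_segment(cc, &seglen);

		if (seglen >= BLOCK_LEN) {
			/* Fast path: the whole block is contiguous. */
			toy_block(p, p);
			cursor_advance(cc, BLOCK_LEN);
		} else {
			/* The block straddles segments: bounce via the stack. */
			struct cursor start = *cc;

			cursor_copy(cc, block, BLOCK_LEN, 0);
			toy_block(block, block);
			cursor_copy(&start, block, BLOCK_LEN, 1);
		}
	}
}
```

In this sketch the only copying left is the 16-byte bounce for straddling blocks, and saving `start` before the copy-out is what lets the result be written back to the same position.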

Some profiling of a sendfile+KTLS workload on an Altra indicates that we
spend almost as much CPU time copying and zeroing as we do encrypting.
I am doing some profiling of ipsec on an espressobin now to see if we
get any improvements or degradations with smaller payloads.

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

No Lint Coverage

Unit

No Test Coverage

Build Status

Buildable 37552
Build 34441: arc lint + arc unit

Event Timeline

markj created this revision. · Feb 26 2021, 6:31 PM

Herald added a reviewer: jmg. · Feb 26 2021, 6:31 PM

Herald added subscribers: andrew, imp.

markj requested review of this revision. · Feb 26 2021, 6:31 PM

Harbormaster completed remote builds in B37419: Diff 84761. · Feb 26 2021, 6:31 PM

markj added a parent revision: D28949: opencrypto: Add a routine to copy a crypto buffer cursor. · Feb 26 2021, 6:31 PM

markj added reviewers: jhb, gonzo, arm64. · Feb 26 2021, 6:32 PM

markj added inline comments.

sys/crypto/armv8/armv8_crypto.c
482–483	I am not sure why we bother zeroing an AAD buffer.

markj added a subscriber: gallatin. · Feb 26 2021, 6:44 PM

FWIW, I have a patch to use cursors for aesni, but it actually made things a bit slower for KTLS when I tried it. isal(4) does use cursors though.

https://github.com/freebsd/freebsd-src/compare/master...bsdjhb:aesni_cursor

Rebase

Harbormaster completed remote builds in B37552: Diff 85077. · Mar 3 2021, 9:41 PM

Rebase

Harbormaster completed remote builds in B44415: Diff 102667. · Feb 11 2022, 3:33 PM

@gallatin reports a 10% increase in throughput with this change, without any increase in CPU usage. Not sure how it compares to ossl yet.

Note that ossl(4) doesn't have AES-GCM bindings yet. Not sure how much AES-CBC gallatin@ is testing with? That said, I think in general using cursors to avoid copies when possible, and using FPU_KERN_NOCTX to avoid more expensive save/restores are directions we want to be moving in. ossl(4) and isal(4) use both of those.

The implementation looks fine to me.

sys/crypto/armv8/armv8_crypto.c
482–483	That is possibly not warranted. Note that in some cases AAD might not be on the wire (the ESN in IPsec comes to mind, or the TLS sequence number for TLS). It might be simpler to zero it than to worry about it.

This revision is now accepted and ready to land. · Feb 14 2022, 6:11 PM

Closed by commit rG26b08c5d21b5: armv8crypto: Use cursors to access crypto buffer data (authored by markj). · Feb 16 2022, 3:04 AM

This revision was automatically updated to reflect the committed changes.

markj added a commit: rG26b08c5d21b5: armv8crypto: Use cursors to access crypto buffer data.