Page MenuHomeFreeBSD

Fix ggatec request handling
ClosedPublic

Authored by jo_bruelltuete.com on Jul 27 2021, 1:02 AM.
Tags
None
Referenced Files
F107355512: D31318.diff
Sun, Jan 12, 11:24 PM
Unknown Object (File)
Dec 10 2024, 12:52 PM
Unknown Object (File)
Dec 7 2024, 5:35 PM
Unknown Object (File)
Dec 7 2024, 5:35 PM
Unknown Object (File)
Dec 7 2024, 5:35 PM
Unknown Object (File)
Nov 25 2024, 7:13 AM
Unknown Object (File)
Nov 25 2024, 2:36 AM
Unknown Object (File)
Nov 24 2024, 11:30 PM

Details

Summary
Test Plan

Tested in a VM, commenting out the new code in turn to trigger that particular error. The messages show up in the log.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

jo_bruelltuete.com created this revision.

Can you reproduce the problem on demand? If so, you should add a test case to tests/sys/geom/gate

I did some experiments with ggate in late 2019 and was concerned about the lack of sanity checking - I managed to get the client system to generate a 1GiB read request, which the server choked on. From my notes at the time:
4659 ggatec CALL recvfrom(0x4,0xffffbffddf08,0x3ffbc000,0x40<MSG_WAITALL>,0,0)
4659 ggatec RET recvfrom -1 errno 14 Bad address
Unfortunately, I didn't keep a record of how I had generated that request. (And for other reasons, I lost interest in ggate at that time).

I've recently resurrected my interest in ggate and had noticed that the code hadn't been cleaned up. Whilst I wasn't able to reproduce that issue, I was concerned about the lack of validation of data read off the network. I have contemplated similar patches but not actually implemented them.

I agree with the patches but agree with asomers that it would be better if a test case could be added.

Re testing, it's easy to repro if you can generate a request that is not read or write. I don't know of a reliable way to do that. At the moment zpool-create does that... but only if the right sysctls are set?
And looking at https://github.com/freebsd/freebsd-src/blob/main/tests/sys/geom/class/gate/ggate_test.sh, I'm not sure what to make of that, it's not obvious whats going on with the tests...

sbin/ggate/ggatec/ggatec.c
147 ↗(On Diff #92774)

This is the real fix: fail requests that are not read or write. ie fail delete or flush requests.
Without this a full device delete (for example initiated by zpool create) would create a bogus read (or write or whatever) request with a huge size.

155 ↗(On Diff #92774)

This is just hardening.

235 ↗(On Diff #92774)

This has security implications. I've emailed secteam@ but got no response.

Re testing, it's easy to repro if you can generate a request that is not read or write. I don't know of a reliable way to do that. At the moment zpool-create does that... but only if the right sysctls are set?

What about fsyncing the ggate device file? That ought to generate BIO_FLUSH, I would think.

And looking at https://github.com/freebsd/freebsd-src/blob/main/tests/sys/geom/class/gate/ggate_test.sh, I'm not sure what to make of that, it's not obvious whats going on with the tests...

It's just a basic smoke test. It fires up ggated and ggatec on the same server and does a basic I/O test with dd.

sbin/ggate/ggatec/ggatec.c
155 ↗(On Diff #92774)

The kernel can absolutely handle a read(2) or write(2) operation larger than MAXPHYS. It will just split it up before sending it to the disks. So unless you know of a particular failure that a long request can trigger, I would not preemptively add this check.

235 ↗(On Diff #92774)

Where does the buffer overflow happen exactly? MAXPHYS normally only matters in-kernel. It can also vary at runtime on FreeBSD 13, which makes this usage suspicious.

Re testing, it's easy to repro if you can generate a request that is not read or write. I don't know of a reliable way to do that.

trim(8), perhaps? It should generate BIO_DELETE requests.

sbin/ggate/ggatec/ggatec.c
155 ↗(On Diff #92774)

The failure is in ggatec handling the over-sized response from ggated, see the other change below. If ggatec sends off a large request, ggated will dutifully respond with a large response but that response will not fit into the statically sized buffer of ggatec and overflow. By failing that request here (instead of crashing out later) we are dealing with it and letting the requesting process now that it failed (otherwise it'll just hang).

235 ↗(On Diff #92774)

See local variable char buf[MAXPHYS] up top. It's statically sized. Any malformed (intentional or not) response from ggated (or someone in a priviledged network position) will overwrite stack.

sbin/ggate/ggatec/ggatec.c
235 ↗(On Diff #92774)

In that case the condition should be if (ggio.gctl_length > sizeof(buf)). And wow, this is quite the classic buffer overflow.

What do we want the test to verify? That the fixed version only supports reads and writes? That feels a bit limiting because adding bio_delete or bio_flush support looks pretty straight forward.
Any opinions?

What do we want the test to verify? That the fixed version only supports reads and writes? That feels a bit limiting because adding bio_delete or bio_flush support looks pretty straight forward.
Any opinions?

It should verify that a BIO_FLUSH or BIO_DELETE gets a suitable error message, rather than "ggatec sometimes stops working".

While I have your attention here, can someone assign the bug reports linked in the description to geom@freebsd.org? Right now they are dangling assigned to nobody.

Maybe found another problem, this time with the ggate kernel module. I'm pretty unsure about some details, esp around ioctl, but here goes:
I think it's possible to craft ioctl with ggate to exfiltrate kernel memory.

  1. Right now ggatec sends commands (and receives them) by sending an ioctl to the ggctl device. It passes a pointer to a structure that matches on the kernel side. Notice that the ioctl does not have a size parameter, so cannot copyin the data. Instead the ggate-ioctl handler receives the corresponding user mode address as a direct-map address, https://github.com/freebsd/freebsd-src/blob/main/sys/geom/gate/g_gate.c#L736 can be see by setting kern.geom.gate.debug=4.
  1. lets say we are doing a G_GATE_CMD_CREATE. https://github.com/freebsd/freebsd-src/blob/main/sys/geom/gate/g_gate.c#L465. The ioctl structure has strings as arrays embedded, https://github.com/freebsd/freebsd-src/blob/main/sys/geom/gate/g_gate.h#L120.
  1. by filling gctl_readprov member with garbage and not null terminating we can trigger a printf of its contents: https://github.com/freebsd/freebsd-src/blob/main/sys/geom/gate/g_gate.c#L553. So far this is only our own data.
  1. But we can position this structure at the end of a page and fill the remaining members with non-zeros. The following G_GATE_DEBUG will leak the contents of the next physical page into the system console.
  1. That next physical page can be from a different process! The address passed in to the ioctl handler is from the direct map.

Does that sound about right?

A quick fix would be to check that gctl_readprov is null terminated.

Maybe found another problem, this time with the ggate kernel module. I'm pretty unsure about some details, esp around ioctl, but here goes:
I think it's possible to craft ioctl with ggate to exfiltrate kernel memory.

  1. Right now ggatec sends commands (and receives them) by sending an ioctl to the ggctl device. It passes a pointer to a structure that matches on the kernel side. Notice that the ioctl does not have a size parameter, so cannot copyin the data. Instead the ggate-ioctl handler receives the corresponding user mode address as a direct-map address, https://github.com/freebsd/freebsd-src/blob/main/sys/geom/gate/g_gate.c#L736 can be see by setting kern.geom.gate.debug=4.
  1. lets say we are doing a G_GATE_CMD_CREATE. https://github.com/freebsd/freebsd-src/blob/main/sys/geom/gate/g_gate.c#L465. The ioctl structure has strings as arrays embedded, https://github.com/freebsd/freebsd-src/blob/main/sys/geom/gate/g_gate.h#L120.
  1. by filling gctl_readprov member with garbage and not null terminating we can trigger a printf of its contents: https://github.com/freebsd/freebsd-src/blob/main/sys/geom/gate/g_gate.c#L553. So far this is only our own data.
  1. But we can position this structure at the end of a page and fill the remaining members with non-zeros. The following G_GATE_DEBUG will leak the contents of the next physical page into the system console.
  1. That next physical page can be from a different process! The address passed in to the ioctl handler is from the direct map.

Does that sound about right?

A quick fix would be to check that gctl_readprov is null terminated.

Nope, all good. The ioctl code has encoded the size of the structure and sys_ioctl does the right thing.

I have been experimenting with zfs send to geli on ggate and had ggatec processes die reporting "ioctl(/dev/ggctl): Cannot allocate memory." and replaced line ggatec.c:134 with the following:

			g_gate_xlog("ioctl(START, {%d, %#x, %#jx}): %s.",
			    ggio.gctl_unit, ggio.gctl_cmd,
			    (intmax_t)ggio.gctl_length, strerror(error));

(I think that's a worthwhile patch since the existing error message is not particularly helpful). That revealed that FreeBSD can try to issue a BIO_WRITE that is 1 page larger than MAXPHYS. That surprised me - I didn't expect writes exceeding MAXPHYS.

One other comment: MAXPHYS now (as of 13.0) depends on the architecture: ILP32 architectures retain the previous 128KiB whilst other architectures now have a 1MiB MAXPHYS. This means that it's somewhat more likely that the two ends of a ggate link will have different MAXPHYS values - which I believe could also trigger errors (though I'm not in a position to easily test this).

As another source of problems: ggatec uses separate receive and send threads. The default stacksize for everything except the initial thread is 1MiB but (on 64-bit 13.x and later) it allocates a 1MiB array on the stack, which is unsafe. Either the send and recv buffers should be made static, allocated dynamically or pthread_attr_setstacksize() needs to be used to enlarge the stack. (I discovered this the hard way when I doubled the buffer size to work around by first point above).

sbin/ggate/ggatec/ggatec.c
156 ↗(On Diff #93065)

As noted above, this should be "sizeof(buf)".

158 ↗(On Diff #93065)

I'm not familiar with the kernel side of geom_gate but I would expect ENOMEM to be more appropriate here.

Yeah lots of stuff broken here. We can fix it all in a later patch. For now I really want to plug that remote code execution hole.

sbin/ggate/ggatec/ggatec.c
156 ↗(On Diff #93065)

No, that's misleading: the size of this buf here (the send buf) is not what matters. This buf is a separate local variable, distinct from the buf in recv_thread.

I'm still on 12-stable with all my machines, and will not be able to upgrade for a few more months...

Things I would change in a follow up patch:

  • make the buffers heap-alloc'd
  • recv thread deals with its own buffer sizing and adjusts as necessary, send thread does not need to check
  • get rid of all the endian swapping: each sender sends in their native, receiver swaps if necessary, use the version field as a bom.
sbin/ggate/ggatec/ggatec.c
235 ↗(On Diff #92774)

MAXPHYS normally only matters in-kernel. It can also vary at runtime on FreeBSD 13

Got any pointers? Is it a boot-time-config?

sbin/ggate/ggatec/ggatec.c
158 ↗(On Diff #93065)

From user-mode pov of ggatec "not supported" seems good: I do not support this large request.
No idea what the internals of geom expect or how they handle different error codes.

My comments aside, I believe this is worth committing now. Further improvements are needed but this corrects some clear bugs that are relatively easily triggered.

sbin/ggate/ggatec/ggatec.c
156 ↗(On Diff #93065)

In that case, I think the existing comment should be augmented to make it clear that the test here is against the size of "buf" in recv_thread.

158 ↗(On Diff #93065)

My point is that this error code is processed by the kernel code and needs to chosen with that in mind. I did some digging and the error code is passed into sys/geom/geom_io.c:g_io_deliver() and that code treats ENOMEM specially. The error code will then be forwarded to the bio originator.

This revision is now accepted and ready to land.Aug 3 2021, 9:17 AM

Thanks for review and comments, everyone!

Philip will do the commit, I think.

the security patch has landed. do we still want the tests?

Yes, I think we should still add the tests. What's the status? Do you need to rebase now that the ggatec patch is in?

there's a rebase in https://github.com/bruelltuete/freebsd-src/commit/999394f2930d341b422a5421ff931dfb024eefc7, if that's easy to use (as a patch).
i dont actually know how to commit anything to src, never done that before.
do you commit via phabricator, like click a button here in the ui somewhere? does the patch in this review need to be updated?

there's a rebase in https://github.com/bruelltuete/freebsd-src/commit/999394f2930d341b422a5421ff931dfb024eefc7, if that's easy to use (as a patch).
i dont actually know how to commit anything to src, never done that before.
do you commit via phabricator, like click a button here in the ui somewhere? does the patch in this review need to be updated?

There's no merge button, if that's what you're asking. Committing a revision requires a committer to run a couple of CLI commands to pull the patch out of phabricator and commit it. And yes, please do update the review with the rebased change.

Add test for ggatec to check handling of non-read/write request (e.g. trim).

This revision now requires review to proceed.Aug 27 2021, 7:31 PM
This revision is now accepted and ready to land.Aug 27 2021, 8:47 PM