Page MenuHomeFreeBSD

ntpd: Change default stack limit to 4096 pages.
Needs RevisionPublic

Authored by dgr_semihalf.com on Apr 19 2021, 10:45 AM.
Tags
None
Referenced Files
F102422986: D29832.diff
Tue, Nov 12, 1:55 AM
Unknown Object (File)
Oct 11 2024, 7:08 PM
Unknown Object (File)
Oct 11 2024, 7:08 PM
Unknown Object (File)
Oct 11 2024, 6:50 PM
Unknown Object (File)
Oct 3 2024, 10:03 PM
Unknown Object (File)
Oct 3 2024, 8:00 PM
Unknown Object (File)
Sep 26 2024, 4:39 AM
Unknown Object (File)
Sep 25 2024, 2:39 PM
Subscribers

Details

Reviewers
mw
emaste
imp
kib
cy
Summary

Default stack limit of 50 pages almost always causes ntpd to segfault
when ASLR stack gap is enabled. This happens because in default
FreeBSD install the maximum stack gap size is around 15.36MiB and
the stack gap that is created is almost always bigger than stack
limit. This results in immediate segfault after setrlimit call.
Changing the default to 4096 4K pages resolves this problem.

Stack limit can be configured to previous value by adding
'rlimit stacksize 50' to ntp.conf.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

dgr_semihalf.com created this revision.
This revision is now accepted and ready to land.Apr 24 2021, 6:55 AM

Does ntpd still mlocall(2) itself? Issue there, I believe, is that all this memory, including gap, is wired.

Why not disable stack gap instead, with elfctl -e +noaslrstkgap ntpd (I think this is the right syntax)?

In D29832#672594, @kib wrote:

Does ntpd still mlocall(2) itself? Issue there, I believe, is that all this memory, including gap, is wired.

Why not disable stack gap instead, with elfctl -e +noaslrstkgap ntpd (I think this is the right syntax)?

As far as I know ntpd does not mlockall(2) istelf anymore. However, it still calls setrlimit to set maximum stack, which results in the process dying with default ASLR stack gap. This is clearly visible when using ktrace on nptd - setrlimit is the last thing that ntpd does before receiving SIGSEGV.

Using elfctl to disable stack gap will work, I sent the patch proposing it earlier in D29553 (along with D29550, D29551 and D29552, which allow it to actually work), but @imp suggested to tweak the stack size.

Using elfctl to disable stack gap will work, I sent the patch proposing it earlier in D29553 (along with D29550, D29551 and D29552, which allow it to actually work), but @imp suggested to tweak the stack size.

IMO using elfctl is the right approach.

Do D29533 is fine with me (I cannot accept it because it is abandoned). For build system changes and perhaps for elfctl endianess change, you should ask maintainers.

I re-opened D29553. I'm leaving this up as I don't know which will be the accepted method for solving the problem.

In D29832#672594, @kib wrote:

Does ntpd still mlocall(2) itself? Issue there, I believe, is that all this memory, including gap, is wired.

Why not disable stack gap instead, with elfctl -e +noaslrstkgap ntpd (I think this is the right syntax)?

We already disable stack gap at line 447 of ntpd.c. Upstream inserted this patch after we applied it locally.

usr.sbin/ntp/config.h
293

Do you have examples of segfault that this resolves? There is no special O/S specific handling of DFLT_RLIMIT_STACK in their ./configure. I'd be interested in finding out more before we make this change and recommend our upstream create a special case for FreeBSD.

In D29832#673048, @cy wrote:
In D29832#672594, @kib wrote:

Does ntpd still mlocall(2) itself? Issue there, I believe, is that all this memory, including gap, is wired.

Why not disable stack gap instead, with elfctl -e +noaslrstkgap ntpd (I think this is the right syntax)?

We already disable stack gap at line 447 of ntpd.c. Upstream inserted this patch after we applied it locally.

Currently, ntpd does not start when ASLR is enabled, which means that calling procctl does not help. However, either disabling stackgap with elfctl or significantly increasing the stack limit help with this issue.

If I understand correctly, when using procctl the stack gap still exists but we can write to it. Because of this gap our stack is bigger than the limit being set, so we're getting a segfault right after calling setrlimit. Elfctl, by comparison, stops stack gap from being created. However, I'm not 100% sure that this is what happens, as I'm not an expert when it comes to FreeBSD ASLR implementation.

The problem can be easily reproduced by installing recent -CURRENT snapshot with enable_aslr selected in installer. Starting ntpd should then result in a segfault. The fact that this happens after setrlimit is clearly visible when using ktrace, so I'm attaching ktrace.out. I used the snapshot from 15th April. There is also a recent PR253208, which mentions this problem.

usr.sbin/ntp/config.h
293

Well, ntpd simply doesn't start with ASLR enabled on CURRENT. Segfault happens right after call to setrlimit.

In D29832#673048, @cy wrote:
In D29832#672594, @kib wrote:

Does ntpd still mlocall(2) itself? Issue there, I believe, is that all this memory, including gap, is wired.

Why not disable stack gap instead, with elfctl -e +noaslrstkgap ntpd (I think this is the right syntax)?

We already disable stack gap at line 447 of ntpd.c. Upstream inserted this patch after we applied it locally.

Currently, ntpd does not start when ASLR is enabled, which means that calling procctl does not help. However, either disabling stackgap with elfctl or significantly increasing the stack limit help with this issue.

This is incorrect. I use ASLR and ntpd and have done so for a couple of years on -CURRENT.

If I understand correctly, when using procctl the stack gap still exists but we can write to it. Because of this gap our stack is bigger than the limit being set, so we're getting a segfault right after calling setrlimit. Elfctl, by comparison, stops stack gap from being created. However, I'm not 100% sure that this is what happens, as I'm not an expert when it comes to FreeBSD ASLR implementation.

There is no segfault after calling procctl(). The stack gap exists when ntpd starts. It calls procctl(), then forks itself disabling the stack gap and avoiding the segfault.

The problem can be easily reproduced by installing recent -CURRENT snapshot with enable_aslr selected in installer. Starting ntpd should then result in a segfault. The fact that this happens after setrlimit is clearly visible when using ktrace, so I'm attaching ktrace.out. I used the snapshot from 15th April. There is also a recent PR253208, which mentions this problem.

I have taken that bug and will look at it. I am not sure why that user is having a problem while I am not. The only thing that comes to mind might be a possible -n option. But, again, I'm running -CURRENT as of yesterday here and I have no such problem. PR/253208 will require more investigation.

Ok, I can finally reproduce the problem with kern.elf64.aslr.pie_enable=1. PIE introduces a new set of problems that were not addressed by procctl() at the time.

Whatever we do here will need to be pushed back to nwtime.org.

Adjusting the stack to 200, like OpenBSD does, resolves this issue when PIE is enabled.

case "$host" in
 *-*-openbsd*)
    ans=200
    ;;
 *) ans=50
    ;;
esac

We can either set stack to 200 in config.h (like upstream does for OpenBSD) or test the sysctl and if it's enabled adjust the stack to 200 and adjust it at around line 1060 of ntpd.c. I've reached out to the folks at nwtime.org for their input. The simplest is to adjust the stack to 200.

Whatever we do here will need to be applied to the port for those who want to enable drivers that are not enabled in base.

cy requested changes to this revision.Apr 27 2021, 6:57 PM

Juxtapositioned to this is that enabling PIE causes firefox to segfault but that's for another review.

Anyhow, I'm ok with this if we make the above two changes.

usr.sbin/ntp/config.h
293

Let's adjust this to 200 for now until the folks at nwtime.org get back to me. This is simpler than adding logic to ntpd.c.

usr.sbin/ntp/ntpd/ntp.conf
115

This comment is incorrect. It should read: results in segfault when PIE is enabled.

This revision now requires changes to proceed.Apr 27 2021, 6:57 PM

Having done a couple of reboots since enabling PIE and increasing stack to 200, now after this last reboot about an hour ago it now fails regardless of how much stack I give it (even obscene amounts). Increasing stack to 200, 1024, 20000, 200000, all of which I've tried, will not work after this latest reboot.

BTW, the following does not work during this boot since uptime:

kern.elf64.aslr.stack_gap: 3
kern.elf64.aslr.honor_sbrk: 1
kern.elf64.aslr.pie_enable: 1
kern.elf64.aslr.enable: 0

Notice that ASLR is disabled but PIE is enabled.

But ASLR enabled and PIE disabled works:

kern.elf64.aslr.stack_gap: 3
kern.elf64.aslr.honor_sbrk: 1
kern.elf64.aslr.pie_enable: 0
kern.elf64.aslr.enable: 1

The only solution is D29553.

The work around works on my sandbox machine today but does not on the laptop when PIE is enabled on both. Without PIE and with ASLR there is no need for the workaround.

So, I've looked at this more and setting the stack size to 4096 pages should technically also resolve this problem. I tested this restarting the daemon multiple times and in my case it worked. However, the problem will occur randomly due to random nature of stack gap randomization. By default, the maximum size of stack gap is 15.36MiB on amd64, I'm not sure about other platforms. The default value of kern.elf64.aslr.stack_gap is 3, so in case of amd64, where the default stack limit is equal to 512MiB, that gives us the 15.36MiB number. I used 4096 as this means that we are setting the stack limit to 16MiB, which should leave enough stack for ntpd to work - I think that 15.56MiB is needed (15.36 of stack gap and 200K taken from the fact that we were limiting this to 200K previously), but I rounded this up to 16MiB.

Setting rlimit through ntp.conf to a higher value will not work, as from what I've seen, ntpd first sets the stack limit to default value then reads the configuration file and then calls setrlimit again to set the value to the one specified in config.

Of course, setting the ELF flag to disable ASLR stack gap as in D29553 will also work. I do in fact prefer the elfctl approach, however I will leave this revision up for a while.

usr.sbin/ntp/ntpd/ntp.conf
115

The problem in here is the random stack gap, that's why D29553 works. Setting kern.elf64.aslr.pie_enable=0 means that the stack gap randomization is also disabled. kern.elf64.aslr.stack_gap=0 should also resolve the problem.

So, I've looked at this more and setting the stack size to 4096 pages should technically also resolve this problem. I tested this restarting the daemon multiple times and in my case it worked. However, the problem will occur randomly due to random nature of stack gap randomization. By default, the maximum size of stack gap is 15.36MiB on amd64, I'm not sure about other platforms. The default value of kern.elf64.aslr.stack_gap is 3, so in case of amd64, where the default stack limit is equal to 512MiB, that gives us the 15.36MiB number. I used 4096 as this means that we are setting the stack limit to 16MiB, which should leave enough stack for ntpd to work - I think that 15.56MiB is needed (15.36 of stack gap and 200K taken from the fact that we were limiting this to 200K previously), but I rounded this up to 16MiB.

Technically, adjusting the stack should work. It worked on my laptop after two reboots and after each of two more it didn't. But consistently works on my sandbox so far. Increasing the stack is an inconsistent solution.

Setting rlimit through ntp.conf to a higher value will not work, as from what I've seen, ntpd first sets the stack limit to default value then reads the configuration file and then calls setrlimit again to set the value to the one specified in config.

Increasing the stack does work in ntp.conf because it forks itself after that. This is how procctl() works.

Of course, setting the ELF flag to disable ASLR stack gap as in D29553 will also work. I do in fact prefer the elfctl approach, however I will leave this revision up for a while.

I think this is the stopgap solution until our upstream has had a chance to look at it. I made them aware of it. My closer contact there tells me their senior developer (perly) is aware of it now.