The LTO issue has been fixed. Although -flto is currently commented out for some reason,
it was not removed entirely, so it is presumably expected to be re-enabled.
Details
- Reviewers: se
- Commits: rG6ae90f59851c: gh-bc: don't disable LTO on powerpc64
Diff Detail
- Repository: rG FreeBSD src repository
- Lint: Lint Not Applicable
- Unit: Tests Not Applicable
Event Timeline
I have just pushed a commit that re-enables LTO - you may need to rebase your patch since a nearby line has been modified in commit 77606d5a8c98.
I don't understand why this program is special and should default to using LTO. Shouldn't there be a global WITH_LTO switch instead that builds all base system programs with LTO?
I suggested introducing LTO macros in the base system several years ago, but got no positive feedback.
My suggestion was to allow base system components to be tagged for compilation with LTO, with the actual options determined by the framework depending on the compiler and target architecture.
This particular program benefits a lot from LTO because of its structure: the author first developed an abstract library to manipulate vectors (originally in C++, then ported to C), then based the implementation of bc on this library. If these vector operations had been implemented as macros or as inline functions, the compiler would simplify them automatically after expansion, because constant arguments would let run-time tests be resolved at compile time. But these vector functions live in a stand-alone library that can be used and tested independently of the bc and dc programs. LTO allows the compiler to inline the small, optimized fragments of complex vector operations in many places (including inner loops), leading to a measured reduction of about 30% in the run-time of complex operations.
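To illustrate the point about constant parameters, here is a minimal sketch. The names (struct vec, vec_get, sum_longs) are hypothetical and only loosely inspired by the idea of a stand-alone vector library; this is not the actual bc code. If vec_get lived in a separate source file, the compiler could not inline it into sum_longs without LTO, so the from_end test and the element-size multiplication would run on every loop iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic vector; NOT the real bc data structure. */
struct vec {
	unsigned char *data;	/* element storage */
	size_t len;		/* number of elements in use */
	size_t elem_size;	/* size of one element in bytes */
};

/*
 * "Library" side: generic accessor. When from_end is non-zero, idx
 * counts back from the last element. In a real project this would sit
 * in its own translation unit, so only LTO could inline it into callers.
 */
void *
vec_get(const struct vec *v, size_t idx, int from_end)
{
	assert(v != NULL && idx < v->len);
	if (from_end)
		idx = v->len - 1 - idx;
	return (v->data + idx * v->elem_size);
}

/*
 * "Program" side: from_end is the constant 0 here. Once vec_get is
 * inlined, the compiler can drop the from_end branch entirely.
 */
long
sum_longs(const struct vec *v)
{
	long total = 0;

	for (size_t i = 0; i < v->len; i++)
		total += *(long *)vec_get(v, i, 0);
	return (total);
}
```

With the two halves in separate files, building both with -flto gives the optimizer the chance to perform this inlining at link time; whether it actually does so depends on the compiler and optimization level.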
I'm all for the introduction of framework support for LTO, which probably should add the required options for LTO to CFLAGS, but also provide a macro that can be tested in the source code, e.g. in case LTO on some compiler/architecture causes issues with specific code sequences.
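A rough sketch of how such a source-level macro might be used follows. EXAMPLE_WITH_LTO is a made-up name standing in for whatever the framework would define; no such macro exists in the base system today. The noinline attribute is a GCC/Clang extension, used here as one possible way to shield a problematic function from cross-module inlining.

```c
#include <assert.h>
#include <stddef.h>

/*
 * EXAMPLE_WITH_LTO is hypothetical: a build framework could define it
 * (e.g. with -DEXAMPLE_WITH_LTO) whenever it adds LTO options to CFLAGS.
 */
#ifdef EXAMPLE_WITH_LTO
/* Keep the marked function out of inlining when LTO is active. */
#define EXAMPLE_LTO_SENSITIVE __attribute__((noinline))
#else
#define EXAMPLE_LTO_SENSITIVE
#endif

/* Stand-in for a code sequence that misbehaves under LTO on some target. */
EXAMPLE_LTO_SENSITIVE
unsigned int
byte_sum(const unsigned char *buf, size_t len)
{
	unsigned int sum = 0;

	assert(buf != NULL);
	for (size_t i = 0; i < len; i++)
		sum += buf[i];
	return (sum);
}
```

The function behaves identically with or without the macro defined; only the compiler's freedom to inline it changes.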
Given the advantages offered by "thin" LTO without the high impact on link times of "full" LTO, it might be possible to build most of the userland code on most platforms with -flto=thin by default, BTW. This program is small enough to be compiled with -flto=full without a noticeable link-time cost, though. Therefore, it might be useful to support both thin and full LTO in the framework (as hints to the build system on whether to use LTO and which method to use).
Thanks for the explanation. If the program is structured in such a way that LTO is needed for acceptable performance, that sounds like a good reason to default to LTO to me. I would love to see a global switch to build all base system programs with LTO. If you get around to working on this, please add me as a reviewer :)