Building the Linux kernel using LTO #90
@InBetweenNames I would use it on my router too. I think at the time the GCC LTO toolchain wasn't very mature and few were able to make much use of it, particularly on embedded Linux, where there would be the most interest. Without that buy-in the kernel devs weren't going to let the patches in. Perhaps resurrecting the patch set and getting it working again could be successful now that LTO support is pretty ubiquitous in distros and most embedded devs must be using it by now for their user space.
|
Seems some remnants of those patches are still in the kernel (notably DISABLE_LTO so it doesn't use it for vdso), so I tried with 4.19.1. Formerly used scripts/gcc-ld but that didn't work for me, so I used gold. I doubt it's accomplishing anything built this way (size barely changed with other defaults). Despite using gcc-ar, it was also complaining about the LTO plugin unless -ffat-lto-objects was used. The patchset used to use -fwhole-program too but that didn't work. Nonetheless, thought I'd do the crazy thing and build the kernel with:
Which.. worked.. and booted fine. I am now the proud owner of a kernel that is 30% bigger than before, probably not faster, and set out to kill my dog, but thankfully running in QEMU away from my dog. |
It might be interesting to compare the speed of some syscall- / kernel-bound workloads when successfully built with LTO. Anyone with an idea on how to start benchmarking our gains or losses? |
Not sure, but if you check the kernel mailing list plenty of those benchmarks have been done in the past. I remember seeing pretty big gains with LTO, but not sure if those reflected into any gain for daily usage. |
One thing about LTO is that you have to build as many of your modules into the kernel as possible, so it knows what it can eliminate when linking. That means you get the biggest gains on a completely static kernel (this of course breaks some things that load firmware etc.; some of that you can work around by building in the blobs though). |
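To see how far a given configuration is from the "completely static" case described above, one rough check is to count how many options are still set to `=m`. This is only a sketch: the function name `count_config_modes` is made up for illustration, and it assumes a standard kernel `.config` where each option sits on its own `CONFIG_FOO=y` or `CONFIG_FOO=m` line.

```shell
#!/bin/sh
# Count options set to =m (loadable module) and =y (built in) in a kernel .config.
# count_config_modes is a hypothetical helper name, not part of the kernel tree.
# Usage: count_config_modes /path/to/.config
count_config_modes() {
    config_file="$1"
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true"
    modular=$(grep -c '^CONFIG_[A-Za-z0-9_]*=m$' "$config_file" || true)
    built_in=$(grep -c '^CONFIG_[A-Za-z0-9_]*=y$' "$config_file" || true)
    echo "modular=$modular builtin=$built_in"
}
```

Running `count_config_modes .config` in the kernel source directory before and after converting modules to built-in gives a quick idea of how much is left outside the LTO link.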
Andi Kleen rebased his LTO patches for the kernel on 4.20 recently. I've tried it out but had no luck and hit several module errors along the way. Nevertheless, you can find these patches here: https://github.com/andikleen/linux-misc/tree/lto-420-1 |
^ Didn't experiment much but gave it a quick try and it built fine for me with my configuration, and it looks like it's using the gcc-ld script and working properly. I do have gold as my default linker (been using it even for the kernel). I imagine it may make more of a difference on a less-lean kernel, but my resulting 4.20 kernel is about 1% smaller than my old one. Didn't try to boot it and have no idea about any performance gains. |
@ionenwks I'm trying to replicate the steps on a gentoo system to build an LTO'd kernel. However, I always error out on the linking portion: |
Hmm... I tried again both with the lto-420-1 branch from back then along with same configuration and the newer lto-5.1-3, and I'm getting the same errors as you now (using gcc 8.3.0 and ld.bfd 2.32). Not sure what I was using back then but looking at the date I assume I was on gcc 8.2 and binutils 2.30 I think? It's only something I tried real quick, I had no intention to stick with that for now (or boot it). Edit: Retried with gold as default (switched back to bfd a while ago), doesn't work either, not with current toolchain anyway. |
@ionenwks thank you for taking the time to check through the issue! I was afraid it was a toolchain version issue, so I wonder if this is a reportable bug? |
I was able to build 5.0-1 successfully, however I did not test it and the system it was on is now gone.
|
@jiblime You can find his patchset on the kernel mailing list but it won't really help: https://lkml.org/lkml/2017/11/27/1052 |
@Promaethius Thanks for the link. I'm currently trying to edit arch/x86/entry/vdso/Makefile to work. At the very bottom you can try appending flags after ${LD} but nothing has worked for me, even the options to specifically suppress the error. I went and checked a regular kernel and I noticed that it's normal(?) for a hidden symbol to be there. Both commands run were: 5.1-3 LTO:
5.2.8 kernel:
|
@jiblime I found this on the gcc site today: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fuse-linker-plugin-916
I've witnessed Andi Kleen's patchset passing |
That explains why he uses -fwhole-program and -fipa-cp-clone, since collect2 would be used instead of a linker. I'm assuming he's doing that for compatibility, as GCC documentation claims it's likely to increase code size vs. bfd/gold. I wonder if GentooLTO would be able to do something better... I believe it's a glibc issue. I've upgraded to sys-libs/glibc-2.30::gentoo and have been able to get past it. Currently recompiling since some paravirtualization options, not sure which, cause it to error. https://sourceware.org/ml/libc-alpha/2019-08/msg00029.html
It emits a warning, I'm still not sure why since Andi Kleen filters LTO out of it from what I can tell. Warnings emitted with V=2
CC arch/x86/entry/vdso/vdso32-setup.o - due to target missing
So as I understand, it would be a huge issue to have a textrel in a/the vdso because it'd be a vulnerability in a security feature. Gentoo's wiki actually has a guide on finding and fixing textrels: https://wiki.gentoo.org/wiki/Hardened/Textrels_Guide But hopefully there's no need to recreate anything. While the vdso*.so files have a textrel flag marked on them,
Glibc 2.29, GCC 9.1.0
It did also emit this, though:
So it looks like it can be possible, but definitely experimental and not a daily driver for myself. I'm going to be grabbing GCC 9.2 now so I won't be getting to it anytime soon (btw, I added 20G of swap with -j5 and it still failed, dammit), but if Glibc 2.30 is the fix, I think it'd be worth a shot to try using this kernel for testing. If you were to use a linker instead of collect2 you can run replace What's interesting is that his newest version (as far as I can tell) lacks explicit linker usage but his older versions use |
Andi Kleen's lto-5.7-2 branch builds and I am currently running it. I've applied the 5.7.14 patch, Gentoo distro patches, and a few other misc. patches with no rejects. Notes:
The size of my LTO'd kernel is 22M, modules folder is 800K. Vs. my normal kernel at 11M and modules folder at 71M
Semi-related: GCC 10's -O2 might be slightly slower than GCC 9's -O2 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96337#c15
If you have a -march=znver1/znver2 processor and run x86_64 multilib, rebuilding the current GCC 10.2.0 would mean a nice performance boost with this patch: Refer to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95435 Correction 1: I incorrectly assumed modules weren't supported with |
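The sizes quoted in the comment above (22M kernel plus 800K modules folder with LTO, versus 11M kernel plus 71M modules folder without) are easier to compare as combined footprints. The arithmetic below is only a sketch: it assumes "M" and "K" mean MiB and KiB, and the variable names are made up for illustration.

```shell
#!/bin/sh
# Sizes quoted in the comment above, converted to KiB (assuming 1M = 1024K).
lto_kernel_k=$((22 * 1024)); lto_modules_k=800
std_kernel_k=$((11 * 1024)); std_modules_k=$((71 * 1024))

# Combined footprint of kernel image plus modules folder
lto_total_k=$((lto_kernel_k + lto_modules_k))
std_total_k=$((std_kernel_k + std_modules_k))

echo "LTO total:    ${lto_total_k}K"
echo "normal total: ${std_total_k}K"
# Integer percentage of the normal footprint (rounded down)
echo "LTO / normal: $((lto_total_k * 100 / std_total_k))%"
```

Under those assumptions the LTO image alone is twice as large, but image and modules together come out well below the non-LTO total, so comparing only the kernel images hides most of the difference.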
Can you describe in what way? Cheers for the gcc links too |
oooh, imma test |
@jiblime Could you list the patches applied? All are from gentoo's ebuild? |
I haven't built it yet, but this patch applies fine to gentoo-sources-5.8 (just a diff from the lto-5.8.0-1 branch) https://gist.github.com/telans/728b63dd07c41c9ca6e2ca3d4431db8e Doesn't build for me unfortunately, lots of:
|
https://cdn.kernel.org/pub/linux/kernel/v5.x/patch-5.7.14.xz |
that patch is already applied... nvm, messed up something, ended with upstream master somehow... |
@telans
Are you using ld.gold as your default linker? The Linux kernel needs either GCC/ld.bfd or Clang/ld.lld. #338
@barolo I generally download a vanilla tarball from kernel.org (v5.7, v5.8, etc) and apply the Gentoo patches and incremental patches afterwards. That way I don't have to worry about rejected patches as often |
@jiblime thanks for the branch, made it much easier for me. Compiling |
Compiled almost cleanly for me and didn't take that long either; had a bunch of "-Wstringop-overflow" warnings for the Bluetooth module. Didn't boot for me, with an error related to scsi. |
Narrowed it down: hidpp/logitech's stuff makes it crash, and it doesn't switch to amdgpu output. @jiblime it seems like you've had similar issues, how did you solve them? |
Nope, using ld.bfd (or at least I haven't changed it).
Forcing Same issue with |
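Since the question above was which linker is the default, a quick way to check is to look at the first line of `ld --version`. The sketch below assumes the usual banner strings ("GNU ld", "GNU gold", "LLD"); the function names `classify_linker` and `default_linker` are hypothetical helpers, not existing tools.

```shell
#!/bin/sh
# Map the first line of `ld --version` output to a short linker name.
# The matched substrings are assumptions about the usual version banners.
classify_linker() {
    case "$1" in
        *"GNU gold"*) echo gold ;;
        *LLD*)        echo lld ;;
        *"GNU ld"*)   echo bfd ;;
        *)            echo unknown ;;
    esac
}

# Report what the plain `ld` found on PATH resolves to.
default_linker() {
    classify_linker "$(ld --version 2>/dev/null | head -n 1)"
}
```

Calling `default_linker` prints `bfd`, `gold`, `lld`, or `unknown`. This only reports what plain `ld` is; a kernel build that sets `LD=` explicitly can still use something else.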
Update, managed to run it and reach the desktop. The issue was with building all modules in. Edit. It seems that all of those are modules that weren't built in, so it seems that initramfs isn't working for me Can't really compare it yet, since it seems to use diff schedulers than I had with zen kernel, and spends more time at lower frequencies, would have to bench it properly to test it seriously. I can already tell though that building that kernel is significantly faster under it |
My gut tells me it has something to do with the -fPIE flag
On Sun, Aug 9, 2020, 3:27 AM Greg Shuiske wrote:
Update, managed to run it and reach the desktop. The issue was with building all modules in.
So I took my working config as base, used genkernel and made sure that it runs without LTO enabled first, then enabled LTO.
Ended with a bunch of drivers disabled, most importantly for network and sata, luckily my main is a pcie one.
Each module had disagrees about version of symbol module_layout in dmesg, gonna investigate it now.
|
@Promaethius I've solved that by changing the ones with warnings to built-in. It's running fine so far, gonna bench it with something now. |
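To find which modules need switching to built-in, the "disagrees about version of symbol module_layout" messages mentioned earlier can be pulled out of the kernel log. This is a sketch: it assumes each message has the form `modname: disagrees about version of symbol module_layout`, optionally preceded by a bracketed timestamp, and the function name is made up for illustration.

```shell
#!/bin/sh
# Read kernel log text on stdin and print the unique module names that
# reported a module_layout version mismatch.
# modules_with_layout_mismatch is a hypothetical helper name.
modules_with_layout_mismatch() {
    grep 'disagrees about version of symbol module_layout' \
        | sed -e 's/^\[[^]]*\] *//' -e 's/:.*$//' \
        | sort -u
}
```

Typical use would be `dmesg | modules_with_layout_mismatch`; the resulting names are the candidates to set to `=y` in the config.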
By the way, the patch from Sami Tolvanen over here for Clang (ClangBuiltLinux/linux#1369 (comment)) [I've modified it to use with CONFIG_LTO_GCC] also helps GCC LTO a bit to bring TRIM_UNUSED_KSYMS forward, but it still errors out later at the end of the build process with the following errors (just grabbed the last ones from the console, there are more Module-License errors): ERROR: modpost: missing MODULE_LICENSE() in drivers/hwmon/pmbus/ltc3815.o |
@ms178 i am not getting any of the errors you're describing with andikleen's patchset, though i have no modules. also, i don't think what you're doing is a healthy approach as the way gcc and clang do lto are fundamentally different so i doubt you can succeed that way. i don't know if it helps you in any way, shape or form but these are my
0 ~: gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-apathy-linux-musl/11.1.0/lto-wrapper
Target: x86_64-apathy-linux-musl
Configured with: ../configure --prefix=/usr --libexecdir=/usr/libexec --mandir=/usr/share/man --infodir=/usr/share/info --host=x86_64-apathy-linux-musl --build=x86_64-apathy-linux-musl --target=x86_64-apathy-linux-musl --with-pkgversion=apathy --enable-checking=release --enable-languages=c,c++,lto --enable-cxx-flags='-w -pipe -O2 -mtune=native -march=native' --with-linker-hash-style=gnu --with-system-zlib --enable-__cxa_atexit --enable-bootstrap --enable-linker-build-id --enable-lto --enable-plugin --enable-shared --enable-threads=posix --enable-tls --without-included-gettext --without-isl --disable-default-pie --disable-default-ssp --disable-fixed-point --disable-libmpx --disable-libmudflap --disable-libsanitizer --disable-libstdcxx-pch --disable-multilib --disable-nls --disable-rpath --disable-static --disable-symvers --disable-werror
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.1.0 (apathy)
0 ~: ld.bfd -v
GNU ld (GNU Binutils) 2.36.1
you can try doing a |
@mssx86 Well, problems were to be expected with my experiment of today, which I just wanted to share with the community here. The result was interesting as it brought some progress even with GCC while not fixing the issue entirely (trimmed KSYMS are a known issue with both Clang LTO and GCC LTO; only Clang's ThinLTO works reliably with Sami's patch and that option now, FullLTO ran into the same boot issue). Regarding my boot problems, Andi just told me that he was able to reproduce it and has it on his radar to debug further. My kernel is already trimmed down heavily and way slimmer than anything Ubuntu ships by default, and building without modules had no effect; building it with less aggressive compiler flags had no effect either. |
What's the easiest way of applying LTO patches to a gentoo-src kernel? |
@barolo not aware whether the current andikleen lto branches apply to current kernels but what i do is cloning the repo with |
Do they apply cleanly to a corresponding gentoo-src kernel? |
only one way to find out like i said in the beginning of my first post. |
Your way worked beautifully, applied to 5.12.10-gentoo with everything built-in, GCC 11. |
glad to hear.
that is also the case with my kernel config w/ gcc 11.1.0 stable, 5.8 patchset reduced ~1.5 megs iirc. |
Unfortunately that kernel is unstable for me, I get occasional hard locks, possibly related to gpu drivers [AMD], gonna try clang. |
Does anyone have idea how to enable clang build for Andi's kernel? I have only options for GCC |
@barolo you don't need andikleen's patchset for clang lto, clang lto patchset is the work of sami tolvanen from google and 5.12+ already have the support for it in the tree. to enable it, you need to invoke all |
Clang one seems to be stable, but again, it's even bigger. |
So what about GCC 11? I have a fully LTO-ized system with 1304 packages (KDE Plasma desktop). From what I've read here I am none the wiser about how to build an LTO'd kernel with GCC 11. Is there a way? |
I'd like to know the state too. Last time I tried, external modules were a no-go... |
@hedmo - can you share your command to compile kernel with LTO? |
Last time I tested it was: make -j8 AR=gcc-ar NM=gcc-nm KCFLAGS="-march=skylake -O3 -falign-functions=32 -fipa-pta -fno-semantic-interposition -fgraphite-identity -floop-nest-optimize -flto=8 -ffat-lto-objects" Beware of my -march=skylake |
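The command above hard-codes the job count and `-march`. A sketch of the same idea with both passed in as parameters follows; it only prints the command rather than running it, keeps a trimmed flag list, and the function name `lto_make_cmd` is made up for illustration. As the next reply points out, the flags alone are not enough without an LTO patchset applied to the tree.

```shell
#!/bin/sh
# Print (do not run) a make invocation in the style of the comment above,
# with the job count and -march value passed in instead of hard-coded.
# lto_make_cmd is a hypothetical helper; the flag list is deliberately shortened.
lto_make_cmd() {
    njobs="$1"
    march="$2"
    printf 'make -j%s AR=gcc-ar NM=gcc-nm KCFLAGS="-march=%s -O3 -flto=%s -ffat-lto-objects"\n' \
        "$njobs" "$march" "$njobs"
}
```

For example, `lto_make_cmd "$(nproc)" native` prints a command line sized to the current machine, which can be reviewed before being run by hand.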
@hedmo it's not that simple, you need certain modifications to the linux kernel tree, i don't know if andikleen still ports it to newer kernels but check his work. linked multiple times above. |
Clang one is Google backed and is upstreamed, there's no money behind GCC one. |
@mssx86 I am just wondering about the state of LTO and GCC. As I said, last time I was testing, external modules were a no-go. That was with kernel 5.8.x. |
Can kernel 5.10 boot normally now when built with GCC? |
There's no effort whatsoever for GCC LTO support; there isn't even a config option for it in recent kernels. The Clang one is maintained by Google. It will get even worse with the inclusion of Rust in the kernel: almost everything that requires cross-LTO needs to be compiled with Clang nowadays. |
I tried building Linux kernel 5.11.0 with LTO enabled in VMware and installing it. After reboot, it got stuck at "Booting the kernel". Does anyone know the reason? GCC 10.3.1 |
https://lore.kernel.org/lkml/[email protected]/T/#md8014ad799b02221b67f33584002d98ede6234eb New patches are out for GCC LTO (full). I tried a 6.1-rc5 compilation, but sadly it failed. |
I got a 6.1-rc5 kernel built with the patchset above. |
So, the same we observed on latest patches, but earlier patch versions gave smaller kernels somehow (at least for me). @ptr1337 |
Thanks for your answer. I will test this. |
I'm currently running into issues when DEBUG/BTF (CONFIG_DEBUG_INFO_BTF=y) is enabled.
I have tested different pahole versions (1.24, 1.23, latest git) but they have not helped. |
I find it interesting that there hasn't been more push to build the kernel using LTO. I've found a couple of mailing list threads about it, including a patchset to let it happen, but there wasn't a lot of interest upstream. I've created this issue as a way to track what the current LTO progress in the kernel is, and possibly even add some patchsets to let it happen. I know I'd for sure use it on my router if I could with OpenWRT.