x86_64-pc-linux-gnu-ar issue with LTO and gcc10 #490

Open
StefanSalewski opened this issue Mar 10, 2020 · 35 comments

@StefanSalewski

Six weeks ago I cloned my hard disk partition and started using GentooLTO with gcc10, with gcc itself compiled with LTO and PGO.

Until yesterday it worked reasonably well, but now emerging basic packages like libinput or dev-libs/icu fails with messages like

x86_64-pc-linux-gnu-ar: creating uconvmsg/libuconvmsg.a
Two passes with the same argument (-amdgpu-argument-reg-usage-info) attempted to be registered!
config.status: creating extra/uconv/uconv.1
-- return status = 139
Error generating library file. Failed command: x86_64-pc-linux-gnu-ar r uconvmsg/libuconvmsg.a uconvmsg/uconvmsg_dat.o
Error generating assembly code for data.

The core message is "Two passes with the same argument (-amdgpu-argument-reg-usage-info) attempted to be registered!" and it is printed when the ar tool runs.

Ar is from binutils, and installing a different binutils version fails with the same message.

I tried switching back to gcc 9.2, but I got the same issue.

Currently I have no idea what causes the problem. Could it be the ar program itself? I ran

nuc /tmp/portage/dev-libs/libinput-1.14.3/work/libinput-1.14.3-build # x86_64-pc-linux-gnu-ar csrD liblibinput-util.a 'libinput-util@sta/src_libinput-util.c.o'
Two passes with the same argument (-amdgpu-argument-reg-usage-info) attempted to be registered!
Segmentation fault (core dumped)

Current binutils is

nuc /home/stefan # emerge -pv binutils

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild   R   ~] sys-devel/binutils-2.34:2.34::gentoo  USE="gold nls plugins -default-gold -doc -multitarget -static-libs -test" 0 KiB

I cannot use a copy of ar from my original partition, as that is version 2.32 and it wants to load a matching lib.

But maybe the cause of the problem is not ar at all. I may switch back to gcc 9.2 and re-emerge, without LTO, all the packages that I emerged in the last weeks; maybe that will help.

@StefanSalewski
Author

I have indeed the feeling that my ar is broken.

stefan@nuc /tmp/www $ ls -lt
total 0
stefan@nuc /tmp/www $ echo "xxx 123" > xxx.o
stefan@nuc /tmp/www $ cat xxx.o 
xxx 123
stefan@nuc /tmp/www $ ar -q yyy.a xxx.o 
ar: creating yyy.a
Two passes with the same argument (-amdgpu-argument-reg-usage-info) attempted to be registered!
Segmentation fault (core dumped)

I am not really sure, as I have never used ar myself before. If it is broken, then the first question is why, and the next question is how I can fix it.
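The hand-run test above can be turned into a small repeatable check. The following is only a minimal sketch and not something from the thread: check_ar is a hypothetical helper name, and it relies on the shell convention that a process killed by SIGSEGV is reported as exit status 139 (128 + 11), which matches the "return status = 139" in the build log.

```shell
#!/bin/sh
# Sketch: smoke-test an ar binary in a scratch directory.
# check_ar is a hypothetical helper, not part of binutils.
# Pass a command name found on PATH or an absolute path.
check_ar() {
    ar_cmd=${1:-ar}
    scratch=$(mktemp -d) || return 2
    printf 'xxx 123\n' > "$scratch/xxx.o"
    # Run in a subshell so the caller's working directory is untouched.
    ( cd "$scratch" && "$ar_cmd" -q yyy.a xxx.o ) >/dev/null 2>&1
    status=$?
    rm -rf "$scratch"
    case $status in
        0)   echo "ok" ;;
        139) echo "segfault" ;;          # 128 + 11 (SIGSEGV)
        *)   echo "failed ($status)" ;;
    esac
    return "$status"
}
```

Calling check_ar ar and check_ar gcc-ar one after the other would show whether only the plain binutils entry point is affected.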

@StefanSalewski
Author

Well I just discovered that a gcc-ar exists, and that seems to work. So I hope I can link ar to gcc-ar to fix it.

stefan@nuc /tmp/www $ lt
total 0
stefan@nuc /tmp/www $ echo "xxx 123" > xxx.o
stefan@nuc /tmp/www $ cat xxx.o 
xxx 123
stefan@nuc /tmp/www $ gcc-ar -q yyy.a xxx.o 
/usr/lib/gcc/x86_64-pc-linux-gnu/10.0.1/../../../../x86_64-pc-linux-gnu/bin/ar: creating yyy.a
stefan@nuc /tmp/www $ ls -lt
total 8
-rw-r--r-- 1 stefan stefan 76 Mar 10 12:50 yyy.a
-rw-r--r-- 1 stefan stefan  8 Mar 10 12:49 xxx.o
stefan@nuc /tmp/www $ which gcc-ar
/usr/bin/gcc-ar
stefan@nuc /tmp/www $ ls -lt /usr/bin/gcc-ar
lrwxrwxrwx 1 root root 46 Mar  9 19:34 /usr/bin/gcc-ar -> /usr/x86_64-pc-linux-gnu/gcc-bin/10.0.1/gcc-ar

@StefanSalewski
Author

StefanSalewski commented Mar 10, 2020

I had to do the same for ranlib manually (maybe for other binutils tools too?)

/usr/bin # ls -lt

x86_64-pc-linux-gnu-ranlib -> x86_64-pc-linux-gnu-gcc-ranlib
x86_64-pc-linux-gnu-ar -> /usr/bin/gcc-ar

I guess that these links got broken somehow (not pointing to the gcc version) and it seems that gcc-config or eselect gcc do not fix the links when broken.

At least now it seems to work again, I was able to emerge net-misc/openssh again!

Well, the nm link is still wrong:

31 Feb 15 11:39 /usr/bin/x86_64-pc-linux-gnu-nm -> /usr/x86_64-pc-linux-gnu/bin/nm

So the accident happened on Feb 15 -- but I still wonder why.
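To see at a glance which of these links currently bypass the gcc wrappers, their targets can be printed in one go. This is a minimal sketch, assuming readlink from coreutils is available; show_tool_links is a hypothetical helper, and the default directory and target tuple are simply the ones that appear in this thread.

```shell
#!/bin/sh
# Sketch: print where the prefixed toolchain links point.
# show_tool_links is a hypothetical helper name.
show_tool_links() {
    bindir=${1:-/usr/bin}
    chost=${2:-x86_64-pc-linux-gnu}
    for tool in ar ranlib nm; do
        link="$bindir/$chost-$tool"
        if [ -L "$link" ]; then
            # Symlink: show its immediate target, as ls -l would.
            printf '%s -> %s\n' "$chost-$tool" "$(readlink "$link")"
        elif [ -e "$link" ]; then
            printf '%s (not a symlink)\n' "$chost-$tool"
        else
            printf '%s (missing)\n' "$chost-$tool"
        fi
    done
}
```

A target under /usr/x86_64-pc-linux-gnu/bin/ would indicate the plain binutils tool, while a gcc-ar or gcc-ranlib target would indicate the wrapper.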

@StefanSalewski
Author

Well, seems that the problem was and still is

# emerge -av binutils
[ebuild   R    ] sys-devel/binutils-2.33.1-r1:2.33::gentoo  USE="gold nls plugins -default-gold -doc -multitarget -static-libs -test" 0 KiB

which results again in

Mar 11 06:45  x86_64-pc-linux-gnu-ar -> /usr/x86_64-pc-linux-gnu/bin/ar

and ar stops working.

"eselect binutils set" does not fix the issue; it creates the links to the non-gcc versions too.

@Peter-Levine

Peter-Levine commented Mar 12, 2020

I ran into this while building both clang and binutils. -amdgpu-argument-reg-usage-info appears to be an LLVM flag, presumably present when LLVM is built with LLVM_TARGETS="AMDGPU". But the active toolchain was built entirely with GNU tools. When I unmerged llvm, all build problems disappeared. Maybe binutils or gcc components are somehow dynamically linking against llvm libraries?

@Peter-Levine

Running the offending ar command in gdb shows that a function in /usr/lib64/binutils/x86_64-pc-linux-gnu/2.33.1/libbfd-2.33.1.gentoo-sys-devel-binutils-st.so is calling a function in /usr/x86_64-pc-linux-gnu/binutils-bin/2.33.1/../2.33.1/../lib/bfd-plugins/LLVMgold.so. I don't have gold set as my linker and nothing was built with llvm. Building with -fuse-ld=bfd has no effect.

@StefanSalewski
Author

Thank you very much for your investigations. I cannot comment on the core of this issue as I do not know much about binutils and ar internals. But I am happy that my box is running well again after manually setting the ar link to gcc-ar.

@Peter-Levine

The problem appears fixed in git HEAD with binutils-9999.

@StefanSalewski
Author

Great. Then I will close this issue in the next week.

@Peter-Levine

Never mind. I spoke too soon. It popped up again. I unmerged llvm-10/clang-10/llvmgold-10 and have no problems with llvm-9/clang-9/llvmgold-9.

@StefanSalewski
Author

I have a new problem now: emerging sys-libs/glibc-2.30-r6 fails. The reason is that a different ar is called, namely

/usr/lib/gcc/x86_64-pc-linux-gnu/10.0.1/../../../../x86_64-pc-linux-gnu/bin/ar

which is

cd /usr/lib/gcc/x86_64-pc-linux-gnu/10.0.1/../../../../x86_64-pc-linux-gnu/bin/
nuc /usr/x86_64-pc-linux-gnu/bin # pwd
/usr/x86_64-pc-linux-gnu/bin

ar -> /usr/x86_64-pc-linux-gnu/binutils-bin/2.33.1/ar

And fixing this link manually does not work, I think I get a loop of symlinks when I try to fix it.

I assume that gcc-ar is not an executable on its own, but calls this link too.

I think I have completely removed clang10, but that is not enough. I guess I have to re-emerge some tools, maybe emerge binutils-9999? Might that help? Or would it be better to re-emerge binutils without LTO? It is a bit dangerous of course: I may end up in a situation where everything is completely broken, and I would have to switch back to my backup partition without LTO.

@StefanSalewski
Author

emerge -av binutils-libs binutils

for version 2.34 does not fix the problem. But what is interesting is that ar works fine when we give it the --plugin argument:

stefan@nuc /tmp $ /usr/x86_64-pc-linux-gnu/binutils-bin/2.34/ar -q yyy.a xxx.o 
Two passes with the same argument (-amdgpu-argument-reg-usage-info) attempted to be registered!
Segmentation fault (core dumped)
stefan@nuc /tmp $ /usr/x86_64-pc-linux-gnu/binutils-bin/2.34/ar --plugin=/usr/libexec/gcc/x86_64-pc-linux-gnu/10.0.1/liblto_plugin.so -q yyy.a xxx.o 
stefan@nuc /tmp $ 
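This is also what gcc-ar does: it is a thin wrapper that runs ar with GCC's LTO plugin passed via --plugin, so presumably ar then has no need to probe the bfd-plugins directory on its own. Below is a sketch of building that option generically. It assumes that gcc -print-file-name=liblto_plugin.so resolves the plugin path on the system (gcc prints the bare name back when it cannot find the file); plugin_opt is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: emit a --plugin option only when the plugin file really exists.
# plugin_opt is a hypothetical helper, not part of gcc or binutils.
plugin_opt() {
    if [ -f "$1" ]; then
        printf '%s\n' "--plugin=$1"
    fi
}

# Example use (commented out because it needs gcc and ar installed;
# the unquoted substitution assumes the plugin path has no spaces):
#   plugin=$(gcc -print-file-name=liblto_plugin.so)
#   ar $(plugin_opt "$plugin") -q yyy.a xxx.o
```

When the plugin cannot be located, plugin_opt prints nothing and ar is invoked exactly as before, so the helper never makes the call worse than the plain one.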

@StefanSalewski
Author

Maybe related, and there is a suggested fix:

void-linux/void-packages#18725

@StefanSalewski
Author

I think I have finally found the real cause of the problem:

$ ls -lt /usr/x86_64-pc-linux-gnu/binutils-bin/lib/bfd-plugins
total 8
lrwxrwxrwx 1 root root 60 Mar 10 16:29 liblto_plugin.so -> /usr/libexec/gcc/x86_64-pc-linux-gnu/10.0.1/liblto_plugin.so
lrwxrwxrwx 1 root root 41 Mar  9 13:49 LLVMgold.so -> ../../../../lib/llvm/10/lib64/LLVMgold.so

So on 9 March a failed attempt to install clang10 created a link in /usr/x86_64-pc-linux-gnu/binutils-bin/lib/bfd-plugins to LLVMgold.so of version 10, which was not working. And none of my attempts to uninstall clang10 reset that link. I have now manually pointed it back to the clang9 version, after which I was able to install glibc again, and I hope my whole box works again.

Maybe a reinstall of clang9 and llvm9 would have fixed that automatically?

I was not aware that clang can break gcc; I had considered the two independent of each other in the past.
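A quick audit of that directory shows which plugins ar may pick up and where each link leads. This is a minimal sketch, assuming readlink is available; audit_bfd_plugins is a hypothetical helper, and the default path is the one from the listing above. It only reports link targets and whether they exist; it cannot tell whether a plugin that exists actually works, which was the situation with the version 10 LLVMgold.so here.

```shell
#!/bin/sh
# Sketch: list the plugin symlinks in a bfd-plugins directory.
# audit_bfd_plugins is a hypothetical helper name.
audit_bfd_plugins() {
    dir=${1:-/usr/x86_64-pc-linux-gnu/binutils-bin/lib/bfd-plugins}
    for entry in "$dir"/*.so; do
        # Skip non-links (and the unexpanded pattern if nothing matched).
        [ -L "$entry" ] || continue
        # -e follows the link, so it is false for a dangling symlink.
        if [ -e "$entry" ]; then state=ok; else state=dangling; fi
        printf '%s -> %s [%s]\n' \
            "$(basename "$entry")" "$(readlink "$entry")" "$state"
    done
}
```

An LLVMgold.so entry pointing at an LLVM slot that is no longer wanted would be the thing to look for in the output.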

@wolfwood
Contributor

wolfwood commented Mar 24, 2020 via email

@Peter-Levine

Rebuilding llvm:10 without lto seems to have fixed the issue for me. Also, compiler-rt-sanitizers:10 won't build correctly with lto.

@wolfwood
Contributor

wolfwood commented Mar 25, 2020 via email

@InBetweenNames
Owner

FYI: sys-devel/llvmgold is what installs that symlink. I remember having a concern about that in the past, too, because you can have multiple clang slots on your system but the LLVMgold.so plugin will just use the highest available slot, no functionality to switch it around unlike eselect gcc. And LLVM/Clang don't bother to guarantee ABI compatibility across different versions. Even LLVM IR itself is unstable between different LLVM versions -- learned that one the hard way.

Now, this issue seems to pertain to GCC 10, which I haven't tested outside of some sandboxes yet. GCC shouldn't be touching LLVMgold.so regardless, as that was only really used for LTO with Clang before they switched to lld. So, it sounds like before we migrate to GCC 10 we'll need to do some more extensive testing. I think GCC 10 is -fno-common by default, for example, and that could induce a lot of breakage. Let's leave this issue open so we can refer back to it when GCC 10 reaches a stable release.

@wispoffates

Just a note: I ran into this with GCC 9.3.0 in my recent attempt to go back to LTO. I fixed it by removing sys-devel/llvmgold. I think firefox pulled it in at some point, but maybe it no longer requires it?

@TheGreatMcPain
Contributor

I'm running into this issue as well, but I was able to bypass it by removing AMDGPU from LLVM_TARGETS and re-building llvm/clang-10 using gcc.

I found that building llvm/clang-10 with gcc, and with LLVM_TARGETS=AMDGPU, causes

Two passes with the same argument (-amdgpu-argument-reg-usage-info) attempted to be registered!

If I compile llvm/clang-10 with clang, LLVM_TARGETS=AMDGPU will work, but causes other issues like www-client/firefox[pgo,lto,clang] having pgo profile merging failures.

Also, www-client/firefox[clang,lto] depends on llvmgold for some reason even though it uses lld for linking.

@ekaats

ekaats commented May 15, 2020

For me all issues were fixed by installing =sys-devel/binutils-2.34-r1 (currently not keyworded)

That said, I still cannot properly compile Firefox, but I don't think that is an LTO issue. Firefox compiles without pgo/clang but segfaults immediately at runtime. With clang it does not even build. For now I am on firefox-bin and I'll try again after a while. At least the rest of the system builds correctly with the latest binutils.

@Hello71
Contributor

Hello71 commented May 15, 2020

I fixed this problem by unmerging llvm and then rebuilding it. It didn't fix the segfault compiling compiler-rt-sanitizers though.

@TheGreatMcPain
Contributor

@Hello71 I think disabling LTO for llvm-10 fixes it.

Although, you can also use clang to compile llvm-10 which won't segfault with LTO, but will cause issues with pgo when compiling firefox. (At least this is what's happening on my system.)

@Hello71
Contributor

Hello71 commented May 15, 2020

sure, but this way you can keep lto.

@elsandosgrande
Contributor

If I am not mistaken, "[…] you can also use clang to compile llvm-10 which won't segfault with LTO, but will cause issues with pgo when compiling firefox. […]" also means that you keep link-time optimization, but through compiling this package with Clang instead of GCC.
From what I see, it also trades the segmentation fault when compiling compiler-rt-sanitizers in for being unable to compile Firefox with profile-guided optimizations.

@TheGreatMcPain
Contributor

@elsandosgrande I should have mentioned that I am using clang to compile firefox, since I think the pgo and lto useflags on firefox are not compatible without the clang useflag.

I'll see if the clang pgo problem also affects other packages, like python.

@TheGreatMcPain
Contributor

TheGreatMcPain commented May 16, 2020

So I re-emerged llvm-10 and clang-10 with clang as the compiler, using this inside my /etc/portage/env. (I also disable ccache on all packages that use clang due to Gentoo bug 709454.)

USE="clang"
CC="clang"
CXX="clang++"
CFLAGS="${CFLAGS} -fno-math-errno -fno-trapping-math -flto=thin"
CXXFLAGS="${CXXFLAGS} -fno-math-errno -fno-trapping-math -flto=thin"
LDFLAGS="-Wl,--lto-O2 -Wl,-O2 -Wl,--as-needed -fuse-ld=lld"
AR="llvm-ar"
NM="llvm-nm"
RANLIB="llvm-ranlib"

NOLDADD=1
USE_NONGNU=1

I was able to successfully emerge dev-lang/python:3.7 with pgo and clang, using those same environment variables. I'm in the process of re-emerging firefox to see if my issue got cleared up.

UPDATE: Firefox failed to build. Here's the build.log: build.log.tar.gz

@TheGreatMcPain
Contributor

TheGreatMcPain commented May 17, 2020

I have re-emerged llvm-10 using clang as compiler and without LTO using these environment variables.

USE="clang"
CC="clang"
CXX="clang++"
CFLAGS="${CFLAGS} -fno-math-errno -fno-trapping-math"
CXXFLAGS="${CXXFLAGS} -fno-math-errno -fno-trapping-math"
LDFLAGS="-Wl,-O2 -Wl,--as-needed -fuse-ld=lld"
AR="llvm-ar"
NM="llvm-nm"
RANLIB="llvm-ranlib"

NOLDADD=1
USE_NONGNU=1

I'm currently waiting for Firefox to finish re-emerging which has been going for about 2 and a half hours. Normally it will fail before the first hour, so this is a good sign.

UPDATE: Yup, Firefox successfully emerged with the useflags lto,pgo,clang. At this point I feel like I should make a new issue for this.

@BorisCarvajal

I've disabled -fipa-pta for llvm-10 and ar no longer crashes when loading the LLVMgold.so plugin.

@Althorion
Contributor

I stepped right into this. And I can’t rebuild LLVM or Clang with Clang using said env variables, because even Clang is broken.

Is there any hope for my system yet, or is it time to format?

@StefanSalewski
Author

BorisCarvajal, your hint works great for me.

My box recently tried to update to clang 10, and I got the error "Two passes with the same argument (-amdgpu-argument-reg-usage-info)" when building some tools like compiler-rt. Your tip fixed it:

$ grep llvm /etc/portage/package.cflags/ltoworkarounds.conf
sys-devel/llvm *FLAGS-="-fipa-pta"

Then rebuild llvm and after that the other packages like compiler-rt.
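For anyone scripting this, the workaround line can be added idempotently so that repeated runs do not duplicate it. This is only a sketch: add_flag_workaround is a hypothetical helper, while the line itself and the file path are the ones shown above.

```shell
#!/bin/sh
# Sketch: append the -fipa-pta workaround to a package.cflags file once.
# add_flag_workaround is a hypothetical helper name.
add_flag_workaround() {
    conf=$1
    line='sys-devel/llvm *FLAGS-="-fipa-pta"'
    mkdir -p "$(dirname "$conf")" || return 1
    # -F fixed string, -x whole-line match, -q quiet; append only if absent.
    grep -Fqx -- "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
}

# Example use (commented out; needs root on a GentooLTO system):
#   add_flag_workaround /etc/portage/package.cflags/ltoworkarounds.conf
#   emerge --oneshot sys-devel/llvm
#   # then re-emerge compiler-rt and the other affected packages the same way
```

Using --oneshot keeps the rebuilt packages out of the world file, which seems appropriate for a pure rebuild.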

Althorion, in March my box was also totally broken; gcc and clang refused to work. But I got gcc to work again by manually fixing some links as described at the top of this thread.

@telans
Contributor

telans commented Jul 20, 2020

@Althorion same as you. Can't rebuild clang or llvm at the moment. Did you end up getting it sorted? I'll give moving links around a go.

@Althorion
Contributor

@telans unfortunately no. I’ve been trying quite a lot of things and ended up with a system so broken it couldn’t even shut down, so I saved my @world set, blasted the whole thing and built it anew.

@telans
Contributor

telans commented Jul 21, 2020

For me at least, all that was needed was emerge -C llvmgold and rebuilding llvm without -fipa-pta. Llvm pulls llvmgold back in after merging.
