
Re: Compiler segfault when building the kernel




On Mon, 12 Jun 2017 10:45:17 +0300
Adrian Bunk <bunk@xxxxxxxxxx> wrote:

> On Fri, Jun 09, 2017 at 07:58:12AM -0400, Celejar wrote:
> > Hi,
> > 
> > I've been building kernels (vanilla from upstream) for years with
> > kernel-package (typical command line: "time make-kpkg -j2 --initrd
> > --revision 1.custom kernel_image"; .kernel-pkg.conf contains just the
> > line "root_cmd = fakeroot") without problem. Recently, the builds have
> > begun to fail with messages like these:

...

> > > ./include/linux/rcu_sync.h:29:48: internal compiler error: Segmentation fault
> > >  enum rcu_sync_type { RCU_SYNC, RCU_SCHED_SYNC, RCU_BH_SYNC };
> > >                                                 ^
> > > Please submit a full bug report,
> > > with preprocessed source if appropriate.
> > > See <file:///usr/share/doc/gcc-4.9/README.Bugs> for instructions.
> > >   CC      fs/posix_acl.o
> > > The bug is not reproducible, so it is likely a hardware or OS problem.

...

> > This occurred immediately following a cleaning of the source tree
> > ("make-kpkg ... clean"), the first one I've done in quite some time, so
> > I'm pretty sure that that's what triggered this, whatever the
> > underlying problem actually is.
> > 
> > Googling suggests that this sort of thing can be triggered by race
> > conditions caused by build systems' improper handling of
> > concurrency, e.g.:
> > 
> > https://askubuntu.com/questions/343490/the-bug-is-not-reproducible-so-it-is-likely-a-hardware-or-os-problem
> 
> That is just an incorrect answer from some random person.
> 
> Missing dependencies produce different kinds of errors,
> never internal compiler errors.

The suggestion is not that the bug is caused by the missing
dependencies, but rather that there's an underlying bug getting hit,
and the fact that it's not reproducible is due to a race condition
caused by improper concurrency handling.
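
One way to test the reproducibility question directly is to rerun the
same compile command several times and see whether it fails every time
or only sometimes. The following is a minimal sketch, not part of
kernel-package or gcc; the function names and the example command in
the comment are hypothetical. A failure on every run would point to a
deterministic bug for that input, while a mix of passes and failures
would fit a hardware or environment problem better.

```python
import subprocess
import sys
from collections import Counter


def tally_exit_codes(cmd, runs=5):
    """Run cmd `runs` times and count how often each exit code occurs."""
    codes = Counter()
    for _ in range(runs):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        codes[result.returncode] += 1
    return codes


def classify(codes):
    """Label a tally as always passing, always failing, or intermittent."""
    if set(codes) == {0}:
        return "always passes"
    if 0 not in codes:
        return "always fails"
    return "intermittent"


if __name__ == "__main__":
    # Hypothetical usage (the compile command is an assumption):
    #   python3 repeat.py gcc -c preprocessed.i -o /dev/null
    cmd = sys.argv[1:]
    if cmd:
        tally = tally_exit_codes(cmd)
        print(dict(tally), "->", classify(tally))
    else:
        print("usage: repeat.py COMMAND [ARGS...]")
```

Running it on a single preprocessed file rather than on the whole build
keeps the build system's concurrency out of the picture.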

> > For the last year or so, I've been building with -j2, so I tried again
> > without it. I still got the same error, but when I once again did a
> > clean and then rebuilt without -j2, the build succeeded.
> > 
> > Any ideas? Is this a bug I should be filing against kernel-package (or
> > anywhere else)?
> 
> Based on what you describe (the problem is not reproducible and the 
> problem started recently), there is a nearly 100% chance that it is

As I mentioned, the one significant recent change of which I am aware
is the fact that this is the first time in a very long time that I've
done a full kernel build (i.e., one preceded by a 'clean' of the source
tree). I am aware of no other changes.

> caused by a hardware defect on your machine.

I suppose that's always possible. This is a Lenovo W550s workstation,
purchased refurbished about a year ago.

> Were there any hardware changes or was there a move of the machine recently?

No hardware changes. The machine is a laptop - it's moved often.

> Are all fans still working?

There's only one fan, and it seems to be working fine. It goes up to
several thousand RPM, to a maximum of about 4K when under heavy
(artificial) load, e.g. sysbench. This has been the behavior for as
long as I've had the system.

> Do all temperatures look normal?

Yes. Mid 40s C when at rest, climbing as high as the high 70s when under
full load. Once again, this has always been the behavior of this system.
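
For logging these readings during a build instead of checking them by
eye, here is a minimal sketch that reads the Linux sysfs thermal zones.
It assumes the machine exposes /sys/class/thermal/thermal_zone*/temp
(values in millidegrees Celsius); which zones exist and what they
measure varies by hardware, and the function name is hypothetical.

```python
from pathlib import Path


def read_zone_temps(base="/sys/class/thermal"):
    """Return {zone_name: degrees_C} for each thermal zone found under base."""
    temps = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            # sysfs reports the temperature in millidegrees Celsius
            millideg = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # zone unreadable or not reporting a number; skip it
        temps[zone.name] = millideg / 1000.0
    return temps


if __name__ == "__main__":
    for name, deg_c in read_zone_temps().items():
        print(f"{name}: {deg_c:.1f} C")
```

Calling it in a loop while the compile runs would show whether the
temperature spikes shortly before a segfault.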

> Do all capacitors on the mainboard look OK?

Haven't opened it.

> Does a RAM testing tool like memtest86 succeed?

Will test when I have a chance - perhaps overnight tonight.

Celejar