
Re: Stretch stuck on boot




Thank you for the advice,


tv.debian@xxxxxxxxxxxxxx writes:

> On 28/01/2018 20:25, tomas@xxxxxxxxxx wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> On Sun, Jan 28, 2018 at 02:04:54PM +0000, Daniel Nemenyi wrote:
>>> Dear all,
>>>
>>> My laptop has for the second time got stuck on boot, and a seemingly
>>> random number of hard reboots has been necessary to get it running. This
>>> obviously worries me. I'm running Debian Stretch with full disk
>>> encryption and, after grub, this is what happens:
>>>
>>>   WARNING: Failed to connect to lvmetad. Falling back to device scanning.
>>
>> FWIW, I think this one is normal: the LVM metadata daemon isn't up yet
>> at this early stage (I can observe that warning in my boot process too).
>>
>>>   Volume group "hostname-vg" not found
>>>   Cannot process volume group hostname-vg
>>
>> Normal too: they will be there after cryptsetup unlocks things down
>> there:
>>
>>> Please unlock sda3_crypt: # password inserted
>>>   WARNING: Failed to connect to lvmetad. Falling back to device scanning.
>>>   Reading all physical volumes. This may take a while...
>>>   Found volume group "hostname-vg" using metadata type lvm2
>>>   WARNING: Failed to connect to lvmetad. Falling back to device scanning.
>>>   2 logical volume(s) in volume group "hostname-vg" now active.
>>> cryptsetup (sda3_crypt): set up successfully
>>> /dev/mapper/hostname--vg-root: clean 927829/14712832, 50294056/58823680
>>> blocks.
>>
>> Up to this, things look pretty normal.
>>
>>> At that point it gets stuck. The other time this happened the numbers
>>> were different (lower)
>>>
>>> When it does eventually boot, the next lines are:
>>> [48.862647] nouveau 0000:04:00.0: bus: MMIO write of 0000807f FAULT at 100c18
>>> [48.945694] nouveau 0000:04:00.0: bus: MMIO write of 0000807e FAULT at 100c1c
>>> ... and then normal boot.
>>>
>>> Is this a sign of the end of my laptop or SSD or is there something I
>>> can do?
>>
>> Given the intermittent nature, I'd lean towards flaky hardware (not
>> necessarily the SSD). What happens if you try to boot from an external
>> medium (e.g. a rescue system on a stick)?

It's an intermittent problem; sometimes it boots fine, so it's difficult
to test. But it's interesting to hear it might be more than just the
SSD, as I was thinking of just replacing it. This isn't the newest
machine...

>>
>> Ah, and while you are at it (and get your box to boot once more), make
>> a backup :)

Given that it could be an array of problems, and that I'm neck-deep in
work and can't risk a complete boot failure at the moment for time
reasons, I think I might make two, and look into buying a new box ;)

>>
>> Cheers
>> - -- tomás
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG v1.4.12 (GNU/Linux)
>>
>> iEYEARECAAYFAlpt5HAACgkQBcgs9XrR2kb29wCfT7l80UBcO9mMQJeJd31w2F18
>> 7OQAniAYtvGoz7FCa+zBahM/pqjMHJK9
>> =a2E1
>> -----END PGP SIGNATURE-----
>>
>
> Hi, this looks like a Nouveau GPU driver problem. Do you have an Nvidia
> graphics card in this laptop? Do you use the Nvidia drivers or the free
> Nouveau?

Well spotted, yes I do have an Nvidia graphics card and I'm using
Nouveau.

>  >> When it does eventually boot, the next lines are:
>  >> [48.862647] nouveau 0000:04:00.0: bus: MMIO write of 0000807f FAULT at 100c18
>  >> [48.945694] nouveau 0000:04:00.0: bus: MMIO write of 0000807e FAULT at 100c1c
>  >> ... and then normal boot.
>
> Unfortunately I don't have a fix to offer other than trying to switch to
> the proprietary Nvidia drivers to see if it helps. If you are already using
> the Nvidia drivers then you need to blacklist Nouveau, and maybe also add
> "nomodeset" to your kernel boot parameters.

Ok, at this rate I'm thinking of abandoning ship but good to know in
case that doesn't happen or in case I repurpose this machine for another use.
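
For the archive, here is a rough sketch of what blacklisting Nouveau and
adding "nomodeset" usually looks like on Debian. The file name
blacklist-nouveau.conf is only a conventional choice (an assumption, any
*.conf name under /etc/modprobe.d/ should do), and I have not tried this
on this laptop:

```
# /etc/modprobe.d/blacklist-nouveau.conf  (file name is an assumption)
# Stop the nouveau kernel module from being loaded automatically.
blacklist nouveau
options nouveau modeset=0

# /etc/default/grub
# Append nomodeset to the existing default kernel parameters.
GRUB_CMDLINE_LINUX_DEFAULT="quiet nomodeset"

# Afterwards, as root, rebuild the initramfs and the grub config:
#   update-initramfs -u
#   update-grub
```

For a one-off test, "nomodeset" can also be added by hand to the linux
line from the grub menu (press "e" on the boot entry), which changes
nothing on disk.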

> As Tomás said, trying a live USB system, one with different kernel and
> Nouveau versions, could help you pinpoint the origin.

Thanks a lot for the advice.

> Hope it helps.