Re: Heck, why did the server freeze?

On Fri 26/Oct/2018 11:27:36 +0200 Reco wrote:
> On Fri, Oct 26, 2018 at 11:23:39AM +0200, Alessandro Vesely wrote:
>> The problem is that the server froze.  I don't think that's what it is supposed
>> to do when a card fails.
> It's my impression too.

In general, it is hard to tell whether a link is good, at least from the
local side.  I have found nothing better than running pings from cron.
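
A minimal sketch of such a cron-driven check, for illustration only: it is not
the script running on this server.  The function name, the placeholder address
192.0.2.1 and the log message are assumptions, and the ping options are those
of the Linux iputils ping.

```python
#!/usr/bin/env python3
"""Cron-driven reachability check: ping a peer and log when it stops answering."""
import subprocess
import sys
import syslog


def link_is_up(host, count=3, timeout=2, run=subprocess.run):
    """Return True if ping exits with status 0, i.e. the peer answered."""
    # -c: number of echo requests, -W: seconds to wait for a reply (iputils ping)
    cmd = ["ping", "-c", str(count), "-W", str(timeout), host]
    result = run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def main(host):
    """Log an error to syslog when the peer is unreachable; return an exit code."""
    if link_is_up(host):
        return 0
    syslog.syslog(syslog.LOG_ERR, f"link check: no ping reply from {host}")
    return 1


# Only act when a host is given on the command line, e.g. from a crontab entry.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1]))
```

A crontab line such as `*/5 * * * * /usr/local/bin/linkcheck.py 192.0.2.1`
(path and interval are hypothetical) would run it every five minutes.  Note
that this only tells you the peer stopped answering; it cannot say whether the
card, the cable, or the far end is at fault.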

>> Contrast that with log lines about anything else, from non-redundant power
>> supplies to failed GPG signatures.  In part, the missing precise diagnosis must
>> be a shortcoming on the part of the card vendor.  However, how come the kernel
>> didn't realize that the link had to go down, log something, and just fail any
>> subsequent call on that interface, instead of freezing?  Or did it freeze for
>> an unrelated reason?
> I believe that it's impossible to answer this question. It's highly
> likely that it was a kernel panic. Whether or not it was related to the
> failed NIC is impossible to tell, since there's no kernel backtrace.

Right.  I should have tried Ctrl-Alt-F1 or some of the SysRq hacks[*], but I
was too upset by services not responding...

[*] https://www.kernel.org/doc/html/latest/admin-guide/sysrq.html
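
Whether those SysRq keys would have worked depends on the bitmask in
/proc/sys/kernel/sysrq.  A small sketch that decodes it, using the bit values
listed in the document linked above; the function names are made up for
illustration:

```python
# Bit values as listed in the kernel's admin-guide/sysrq documentation.
SYSRQ_BITS = {
    2: "control of console logging level",
    4: "control of keyboard (SAK, unraw)",
    8: "debugging dumps of processes etc.",
    16: "sync command",
    32: "remount read-only",
    64: "signalling of processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nicing of all RT tasks",
}


def describe_sysrq(value):
    """Return the list of SysRq function groups enabled by a bitmask value."""
    if value == 0:
        return ["SysRq disabled"]
    if value == 1:
        return ["all SysRq functions enabled"]
    return [name for bit, name in SYSRQ_BITS.items() if value & bit]


def read_sysrq(path="/proc/sys/kernel/sysrq"):
    """Read the current bitmask from procfs (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())
```

Calling `describe_sysrq(read_sysrq())` on the box lists what is allowed.  If
the "debugging dumps" group (8) is missing, the keys that dump tasks or
registers to the console are not available from the keyboard.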

> I'd install, say, kdump-tools for future incidents like this.

Just installed, thank you!  (I'll reboot when the new card arrives.)