Re: Heck, why did the server freeze?
- Date: Fri, 26 Oct 2018 19:38:56 +0200
- From: Alessandro Vesely <vesely@xxxxxxx>
- Subject: Re: Heck, why did the server freeze?
On Fri 26/Oct/2018 11:27:36 +0200 Reco wrote:
> On Fri, Oct 26, 2018 at 11:23:39AM +0200, Alessandro Vesely wrote:
>> The problem is that the server froze. I don't think that's what it is supposed
>> to do when a card fails.
> It's my impression too.
In general, it is too difficult to know if a link is good, at least on the
local side. I found nothing better than running pings by cron.
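For the record, a minimal sketch of such a cron-driven check. It assumes the Linux iputils `ping` (`-c` for the probe count, `-W` for the reply timeout in seconds); the names `link_is_up` and `check_and_log` are made up for illustration, and the peer address is whatever host sits at the far end of the link.

```python
#!/usr/bin/env python3
"""Ping a peer and leave a syslog line when no reply comes back."""
import subprocess
import syslog


def link_is_up(host, count=3, timeout=2, run=subprocess.run):
    """Return True if ping exits 0, i.e. it reports at least one reply."""
    # Assumed iputils flags: -c = number of probes, -W = reply timeout (s).
    cmd = ["ping", "-c", str(count), "-W", str(timeout), host]
    result = run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def check_and_log(host, log=syslog.syslog, run=subprocess.run):
    """Check the link once; log a warning if the peer did not answer."""
    if link_is_up(host, run=run):
        return True
    log(syslog.LOG_WARNING, f"link check: no reply from {host}")
    return False
```

A cron entry would run a small script that calls `check_and_log()` with the peer's address every few minutes. This only shows that the peer stopped answering. It cannot tell a dead card from a dead cable or a dead peer.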
>> Contrast that with log lines about anything else, from non-redundant power
>> supplies to failed GPG signatures. In part, the missing precise diagnosis must
>> be a shortcoming on the part of the card vendor. However, how come the kernel
>> didn't realize that the link had to go down, log something, and just fail any
>> subsequent call on that interface, instead of freezing? Or did it freeze for
>> an unrelated reason?
> I believe that it's impossible to answer this question. It's highly
> likely that it was a kernel panic. Whether it was related to the failed NIC
> or not is impossible to tell, since there's no kernel backtrace.
Right. I should have tried Ctrl-Alt-F1 or some of the SysRq hacks[*], but I
was too upset by services not responding...
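Whether the SysRq keys would have done anything depends on the `kernel.sysrq` setting. A small sketch that decodes it, assuming the bitmask meanings given in the kernel's sysrq documentation (0 = disabled, 1 = everything enabled, otherwise a sum of the bits below); `decode_sysrq` and `read_sysrq` are illustrative names, not an existing tool.

```python
# Bit values as listed in the kernel's sysrq documentation (assumed current).
SYSRQ_BITS = {
    2: "console log level control",
    4: "keyboard control (SAK, unraw)",
    8: "debugging dumps of processes",
    16: "sync",
    32: "remount read-only",
    64: "signalling of processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nicing of all RT tasks",
}


def decode_sysrq(value):
    """Return the list of SysRq function groups enabled by this value."""
    if value == 0:
        return []                          # SysRq disabled completely
    if value == 1:
        return list(SYSRQ_BITS.values())   # all functions enabled
    return [name for bit, name in SYSRQ_BITS.items() if value & bit]


def read_sysrq(path="/proc/sys/kernel/sysrq"):
    """Read the current setting from procfs (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())
```

Running `decode_sysrq(read_sysrq())` on the box shows in advance which key combinations are worth trying from the console.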
> I'd install, say, kdump-tools for future incidents like this.
Just installed, thank you! (I'll reboot when the new card arrives).