Re: Heck, why did the server freeze?


On Fri, Oct 26, 2018 at 11:23:39AM +0200, Alessandro Vesely wrote:
> On Thu 25/Oct/2018 20:30:27 +0200 Brian wrote:
> > On Thu 25 Oct 2018 at 19:53:26 +0200, Alessandro Vesely wrote:
> > 
> >> Hi all,
> >> early this morning a network card burned out.  A few hours later, the server
> >> was not responding on any network address, nor on the system console.  I had to
> >> power it down.
> >> 
> >> Upon rebooting, network errors were detected and I arranged the server to work
> >> with the available hardware.  The last line logged was an incoming email from a
> >> spammer in Brazil.  It shouldn't have triggered any severe damage.  I found no
> >> breakdown hint in the logs.
> >> 
> >> My theory is that the system didn't realize that the card was broken, didn't
> >> turn the interface down, and kept storing outgoing stuff until it blew off.  Is
> >> that reasonable or should I be more paranoid?
> > 
> > You have given an exact diagnosis of your problem - the network
> > card failed. What's your problem? Replace it instead of agonising
> > and theorising.
> The problem is that the server froze.  I don't think that's what it is supposed
> to do when a card fails.

It's my impression too.

> Contrast that with log lines about anything else, from non-redundant power
> supplies to failed GPG signatures.  In part, the missing precise diagnosis must
> be a shortcoming on the part of the card vendor.  However, how come the kernel
> didn't realize that the link had to go down, log something, and just fail any
> subsequent call on that interface, instead of freezing?  Or did it freeze for
> an unrelated reason?

I believe it's impossible to answer this question. It's highly likely
that it was a kernel panic. Whether or not it was related to the failed
NIC is impossible to tell, since there's no kernel backtrace.
I'd install, say, kdump-tools to capture future incidents like this.
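
After installing kdump-tools it's worth checking that a crash kernel is
actually armed, otherwise the next freeze will again leave no trace. Below
is a minimal sketch of such a check, not a definitive tool. It assumes a
Linux system where the reservation shows up as crashkernel= in
/proc/cmdline and where /sys/kernel/kexec_crash_loaded reads 1 once the
crash kernel has been loaded. The function names are my own, for
illustration only.

```python
import re
from pathlib import Path


def crashkernel_reserved(cmdline: str) -> bool:
    """True if the kernel command line carries a crashkernel= reservation."""
    return re.search(r"(?:^|\s)crashkernel=\S+", cmdline) is not None


def crash_kernel_loaded(flag: str) -> bool:
    """True if the content of /sys/kernel/kexec_crash_loaded is 1."""
    return flag.strip() == "1"


def read_text(path: str) -> str:
    """Read a file, returning an empty string if it is missing or unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return ""


if __name__ == "__main__":
    # Both paths are assumptions about the running system (see above).
    cmdline = read_text("/proc/cmdline")
    flag = read_text("/sys/kernel/kexec_crash_loaded")
    print("crashkernel= on command line:", crashkernel_reserved(cmdline))
    print("crash kernel loaded:", crash_kernel_loaded(flag))
```

If either line prints False, a panic would most likely not produce a dump,
and the kdump-tools configuration and the boot loader's kernel command
line are the first places I'd look.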