Re: How do I stop system hangs?




> Date: Thu, 2 Nov 2017 05:11:34 -0400
> From: Gene Heskett <gheskett@xxxxxxxxxxx>
>
>> What kernel or other settings can I set to let me keep control of my
>> computer during a runaway process? Basically, how do I tell Linux to
>> keep just enough resources free so I can drop into a shell terminal
>> and figure out what's going wrong?
>>
>> In context, this evening my computer hung for 30 minutes. The hard
>> drive activity light went solid and it took about 10 minutes after
>> hitting CTRL + ALT + F1 for a bash shell to appear. It didn't matter
>> anyway, though, since the login process timed out if I attempted to
>> log in.
>>
>> Unfortunately, there's a 30-minute gap in journalctl, so I can forget
>> about figuring out what caused the hang or filing a bug report to the
>> maintainers' satisfaction. Therefore, I'm more interested in keeping
>> control of my computer in future.
>>
>> With thanks,
>
> I haven't had that happen in ages. But because Linux is a time-sharing
> system, capable of running hundreds of tasks (168 at the moment), the
> first thing I start after a reboot is a root session of htop. I use a
> multi-tab shell on workspace 1, out of ten, with a tail -fn50 on the
> system log on the 2nd tab and a few tails on other background activities
> on more tabs of that shell. That way I can have a pretty close to
> real-time view of what's going on. And it doesn't normally have a huge
> lag in calling up that workspace and shell tab to see htop. Why a root
> session? Easy: so it has the rights to kill an errant process.

Decent advice. I may log into a basic shell. Of course, that assumes
that I can CTRL + ALT + F# into such a shell when my desktop
environment starts fritzing on me. In these lock-ups, I can't, which
brings me to my original question: how do I tell Linux not to dole out
so many resources that I can't drop into this shell when I need to?
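
For anyone who wants a lighter-weight version of the htop tab, here is a rough sketch of the same idea: list the processes holding the most resident memory so the errant one stands out. It assumes a Linux /proc filesystem; the function names parse_status and top_memory_users are my own, not from any tool mentioned above, and this is illustrative rather than a fix for the lock-up itself.

```python
import os
import re


def parse_status(text):
    """Extract the process name and resident memory (kB) from /proc/<pid>/status text."""
    name = re.search(r"^Name:\s+(.+)$", text, re.MULTILINE)
    rss = re.search(r"^VmRSS:\s+(\d+)\s+kB$", text, re.MULTILINE)
    # Kernel threads have no VmRSS line; count them as 0 kB.
    return (name.group(1) if name else "?", int(rss.group(1)) if rss else 0)


def top_memory_users(proc_root="/proc", count=5):
    """Return up to `count` (rss_kb, pid, name) tuples, largest first."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                name, rss_kb = parse_status(fh.read())
        except OSError:
            continue  # the process exited while we were scanning
        rows.append((rss_kb, int(entry), name))
    return sorted(rows, reverse=True)[:count]


if __name__ == "__main__":
    for rss_kb, pid, name in top_memory_users():
        print(f"{pid:>7} {rss_kb:>10} kB  {name}")
```

Run as root it sees every process, which matches Gene's reasoning for a root session. Like htop, it only helps if the shell it runs in is still responsive.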

> Date: Thu, 2 Nov 2017 14:24:43 +0500
> From: "Alexander V. Makartsev" <avbetev@xxxxxxxxx>
>
> This sounds like a hard drive problem. If your HDD hits a bad block, its
> firmware can get stuck in what is basically a
> "re-read/try-to-recover/mark-as-bad-sector" loop for quite some time.
> This will hang your system, especially if the drive holds the root
> partition. You can check whether this is the case with the "smartctl"
> utility from the "smartmontools" package. The SMART attribute table
> should provide answers. You can also check the surface of your HDD for
> bad sectors with the "badblocks" tool (part of e2fsprogs).
> If that is not the case, then post more information about your hardware
> and software setup.

I checked the hard drive using the manufacturer's diagnostics and it's
fine. Even if the drive were failing, I'm not sure how that diagnosis
would address my question. Shouldn't enough of the operating system be
resident in RAM (I have 4 GB of it and no swap) so that my computer
can switch to a terminal and let me log in without needing to access
the hard drive?
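
To put numbers on the RAM side of that question, below is a minimal sketch that reads /proc/meminfo and reports how much memory the kernel considers available and whether any swap is configured. It assumes a Linux kernel recent enough to expose the MemAvailable field; parse_meminfo and memory_summary are names I made up for illustration. One plausible factor, not a confirmed diagnosis: when available memory runs very low, the kernel can evict file-backed pages such as program code and re-read them from disk, which would look like a solid drive light.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo field name to its numeric value (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def memory_summary(fields):
    """Summarise available memory and swap from parsed meminfo fields."""
    total = fields.get("MemTotal", 0)
    available = fields.get("MemAvailable", 0)
    pct = 100.0 * available / total if total else 0.0
    return {
        "available_kb": available,
        "available_pct": round(pct, 1),
        "has_swap": fields.get("SwapTotal", 0) > 0,
    }


if __name__ == "__main__":
    with open("/proc/meminfo") as fh:
        print(memory_summary(parse_meminfo(fh.read())))
```

Logging this summary periodically to a file would at least leave a trail if journalctl goes quiet again.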

The situation you describe would seem like a perfect use case for the
solution I'm looking for: hard drive is going fritzy, so instead of
yanking the plug or letting Linux torture the drive to death, let the
user go into a shell and stabilise the system long enough to shut it
down, back up data or do something else.

So, back to my original question: How do I tell Linux to keep
resources available, ideally in RAM, so I can switch into a different
terminal if my desktop hangs and recover the system?

With thanks,