
Re: all files moved to lost+found




On 10/1/18 9:07 PM, Abdullah Ramazanoğlu wrote:
On Mon, 1 Oct 2018 21:31:27 -0300 Beco said:

I've done the bootable Seagate HD test. It came up 100% ok, all tests. The
HD is new, I wouldn't expect different, but it is always reassuring to do
the real test and see the results.

I am not familiar with SeaTools, but there are usually destructive and
non-destructive test suites in such tools. I hope you have run a destructive
one (which destroys data on disk) as some marginal errors can only be caught by
destructive tests.

The only destructive test I recall from SeaTools is writing zeros to the disk. Of course, you don't want to do that until the data recovery effort is complete. And, the OP seems to be ignoring recovery and still trying to use the drive (!).


Is the "swap" partition something that could cause that if turned off by
"swapoff"?

I don't think so.

+1


I read above from Abdullah that it is very unlikely to have such a major FS
failure, and it looks like it didn't "flush" the night I turned it off. This is
also my guess; it is the only thing that makes sense. Maybe the inode table
was in memory and got corrupted. But it is strange to think that,
since the filesystem is EXT4, and it is very stable nowadays.

I'm not sure how much isolation the various parts of the Linux kernel now have from each other. In the bad old days of monolithic kernels, everything inside the kernel could touch everything else in the kernel. A buggy driver in one subsystem could wreak havoc in any other subsystem.


But the journal is passing through the on-disk controller, too. If the drive is
mishandling its on-drive cache, then an FS corruption is still possible. Even
if journal writes are "direct", the drive could be ignoring that (i.e. caching
them nevertheless) without flushing them prior to power-off, corrupting the
journal as well.

If the FS survives through reboots, but falters when the laptop is power
cycled, then a cache flush issue is still probable.

Another test method might be hibernation. If resume from hibernation works,
then that rules out an on-disk caching problem.

Regards

Hypotheses require experiments to confirm or refute them.
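
One possible experiment for the cache-flush hypothesis, as a rough sketch only: write a known pattern to a scratch file, ask the kernel to flush it with fsync, power-cycle the machine, and check afterwards whether the file still matches. The function names, the seed, and the default size below are my own choices for illustration, not anything from SeaTools or from the OP's setup. Note that fsync only asks the drive to flush; a drive that ignores the request is exactly what a failed check would hint at, and a passed check proves nothing on its own. This should not be run on the damaged filesystem until the recovery effort is complete.

```python
import hashlib
import os


def write_probe(path, size=1 << 20, seed=b"flush-probe"):
    """Write a repeatable pattern to path, fsync it, and return its SHA-256."""
    # Derive a 32-byte block from the seed and repeat it up to the wanted size,
    # so the same digest can be recomputed on any machine.
    block = hashlib.sha256(seed).digest()
    data = (block * (size // len(block) + 1))[:size]

    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to push file data and metadata to the device.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Also fsync the containing directory so the new directory entry is flushed.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)

    return hashlib.sha256(data).hexdigest()


def verify_probe(path, expected_hex):
    """Return True if the file at path still hashes to expected_hex."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex
```

The idea would be to call write_probe, note the returned digest somewhere off the disk under test, power the laptop off right away, and call verify_probe with that digest after the next boot. Repeating the same run with a plain reboot instead of a power cycle gives the comparison described above.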


The chicken-and-egg problem is to get some solid data to start with. We'll see if the OP responds to my shopping list of requested commands.


David