
Re: SSD's and many edits of a single file




On Monday 09 April 2018 21:25:34 David Christensen wrote:

> On 04/09/18 07:30, Gene Heskett wrote:
> > On Monday 09 April 2018 09:51:37 Greg Wooledge wrote:
> >> On Mon, Apr 09, 2018 at 09:46:07AM -0400, rhkramer@xxxxxxxxx wrote:
> >>> To your original problem, have you tried going to a command line
> >>> and throwing in a couple =sync=s?  I would try that, maybe after
> >>> saving in your editor, and again maybe after open and / or saving
> >>> in the cnc program.
> >>
> >> As others have explained, the OS (Linux) keeps a cache of file
> >> contents that have been written by applications, but not yet
> >> committed to permanent storage.  If you "save" from within the text
> >> editor, then the saved contents should be immediately visible to
> >> other processes reading the file, regardless of whether it has been
> >> synced to disk. They'll simply get the cached version.
> >
> > Which is not happening after several hours and a hundred or more
> > edits. Which is why it's so intermittent.
>
> On 04/09/18 02:53, Gene Heskett wrote:
> > Lots of people seem to like gedit, but its saves are the cause of
> > important configuration files being written back to disk with the
> > line order totally trashed, as if you had thrown it on the floor in
> > 512 byte pieces, then picked it back up and reassembled it in random
> > order. Then try to recover a 1400 LOC configuration file...
> >
> > That's happened using gedit enough, on several different machines ...
>
> One editor failing would also make me suspect the editor.
>
>
> But two failing editors would make me suspect some common factor, such
> as a shared library and/or the kernel.
>
I'd buy that if the failures were similar.  They are not.
>
> Have you tested your hardware -- power supply, memory, and SSD?  Be
> sure to test the SSD before you touch any cables.  If it fails,
> re-seat and/or replace cables and test again.

I saw this once when the drive was a 2T piece of spinning rust. Way
overkill for that machine, but a 1T is being used for amanda here on
this machine, and it's around 90% full right now.  So at some
point "/amandatapes" is going to find another terabyte.
>
> Have you tested your SSD hypothesis?  Say, by removing the SSD,
> cloning the SSD to another device, installing the other device, and
> then testing for the bug while running the other device?

I have a twin to that one, but only one handy sata port. So it would be 
an extended project to clone it to the other drive.  And since I saw it 
on spinning rust too, I'm not inclined to sharpen that finger just yet.

That machine has also run about 7 cycles of memtest86 with no errors
since I noticed it the first time.

> Can you reproduce the bug on hardware with ECC memory with no memory
> errors reported?

I don't have anything with ECC memory in it. That usually runs the mobo 
cost up quite a ways.

> Can you reproduce the bug on RAID1 with no RAID errors reported?

That would also demand another handy sata port.  I'll see if I can cobble
something up if and when it does it again. That's possible if it has a
back panel sata connector, I think.
>
> If you want to make it possible for others to find the bug, I would
> suggest:
>
> 1.  Build a machine with the minimum hardware, software, and
> configuration required to demonstrate the bug.  Document everything.
>
> 2.  Write a program or script that invokes the bug every time it runs
> on the demonstration machine.  If a program, include a Makefile.
>
> 3.  Post the demonstrator document, program, script, Makefile, etc.,
> to the relevant support communities.
>
> David

Thanks David.
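
For step 2 in David's list, a demonstrator might look something like the
sketch below. It is only a starting point under stated assumptions: the
function names (make_payload, save_like_editor, run_edits), the file name
demo.ini, and the 1400-line payload are all made up for illustration, and
it overwrites the file in place, which may not be how gedit or any other
editor actually saves. The read-back goes through the kernel's page cache,
so it checks what another process would see right after a save, not what
has reached the SSD.

```python
import hashlib
import os
import tempfile


def make_payload(iteration, lines=1400):
    """Build deterministic, line-numbered text so scrambled order is detectable."""
    return "".join(
        f"line {n:05d} edit {iteration:05d}\n" for n in range(lines)
    ).encode("ascii")


def save_like_editor(path, data, sync=False):
    """Overwrite path in place (an assumed save strategy); optionally fsync."""
    with open(path, "wb") as f:
        f.write(data)
        if sync:
            f.flush()
            os.fsync(f.fileno())


def run_edits(path, edits=100, sync=False):
    """Rewrite path `edits` times; return iterations whose read-back differed."""
    mismatches = []
    for i in range(edits):
        data = make_payload(i)
        save_like_editor(path, data, sync=sync)
        # Re-open the file the way a second program would and compare hashes.
        with open(path, "rb") as f:
            seen = f.read()
        if hashlib.sha256(seen).digest() != hashlib.sha256(data).digest():
            mismatches.append(i)
    return mismatches


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "demo.ini")
        print("mismatched iterations:", run_edits(target, edits=100))
```

To be useful it would have to point at a file on the suspect drive rather
than a temporary directory, and run for a lot more than 100 edits. Running
it both with and without sync=True might show whether the explicit sync
suggested earlier in the thread makes any difference.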

-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page <http://geneslinuxbox.net:6309/gene>