Re: Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

>> Not completely sure if "you assume" or "you know" it to be the case.
> Sorry, I should have tried to be more clear -- sort of a digression, but I 
> came from an environment where anytime someone used the word assume, someone 
> else would point out what (they thought) that meant (it makes an ass out of 
> [yo]u and me).

[ I think I understand what you mean.  In French, "assume" means something
  quite different from the use we're discussing so I use "présume"
  instead, which also works in English (modulo the accent, obviously).  ]

> I thought "edit" would be pretty clear (implying a text editor / word
> processor), but, to get more specific, depending on the file I use
> kate, nedit, or kwrite, and I am starting to migrate toward using any
> scintilla based editor.

I don't know specifically what those do, but I do know that Emacs tries
to avoid "unnecessary" modifications when *re-reading* a file, yet makes
no such effort when writing it: it just blindly writes the 100MB on
top of the old content.

So I would not be surprised if those other text editors do likewise.

> To (try to) be clear, I am not sure whether only the changed part (or
> everything from the changed part of the file to the end) is written.  I
> can imagine that is reasonably possible -- I mean, the file is stored in
> blocks on the disk, and some of those blocks are not changed, so why
> rewrite them?

Indeed, it's definitely possible.  But there can be various reasons not
to do that:
- it's simpler to send the 100MB and forget about it than to
  first read (the beginning of) those 100MB to see which part was
  left unchanged.
- in order to make the save atomic, the editor may prefer to write the
  100MB to another file and only when that's done rename that file to
  overwrite the old file.
- ... probably other reasons ...
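
To make the two strategies above concrete, here is a minimal Python
sketch.  It is not what Emacs, kate or any other editor actually does;
the function names (common_prefix_len, save_from_first_change,
atomic_save) are made up for illustration, and it assumes the whole
file fits in memory.

```python
import os
import tempfile


def common_prefix_len(a, b):
    """Return how many leading bytes a and b share."""
    n = min(len(a), len(b))
    i = 0
    while i < n and a[i] == b[i]:
        i += 1
    return i


def save_from_first_change(path, data):
    """Rewrite path in place, starting at the first byte that differs.

    Returns the number of bytes passed to write().
    """
    with open(path, "r+b") as f:
        old = f.read()  # the extra read mentioned in the first bullet
        start = common_prefix_len(old, data)
        f.seek(start)
        f.write(data[start:])
        f.truncate(len(data))  # drop any leftover tail of the old content
    return len(data) - start


def atomic_save(path, data):
    """Write data to a temporary file, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-save-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)  # always the full content, changed or not
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temporary file if anything went wrong.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Note that the byte count returned by save_from_first_change only says
what the program asked for: how much actually reaches the disk is still
up to the filesystem and the page cache.  And atomic_save necessarily
writes every byte again, whatever the size of the change.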

> Well, with atsar or smartctl, I anticipate some experiments that might
> confirm that for me.

Sounds like a better approach, indeed (after all, you don't really care
about what your tool does so much as you care about the resulting amount
of writes that gets sent to the disk).
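
As a rough alternative to atsar or smartctl, and assuming a Linux
system, the kernel's per-device counters in /sys/block/<dev>/stat can
be sampled before and after a save.  The sketch below is only that: the
helper names are made up, and the device name has to be supplied by you.
The seventh field of that file is the number of sectors written, counted
in 512-byte units.

```python
SECTOR_BYTES = 512  # the stat file counts 512-byte sectors


def bytes_written_from_stat(stat_text):
    """Extract the bytes-written total from the text of a stat file."""
    fields = stat_text.split()
    # Field order starts: read I/Os, read merges, read sectors, read ticks,
    # write I/Os, write merges, write sectors, ...
    return int(fields[6]) * SECTOR_BYTES


def read_bytes_written(device):
    """Read the running bytes-written counter for a block device."""
    with open(f"/sys/block/{device}/stat") as f:
        return bytes_written_from_stat(f.read())
```

Typical use would be to call read_bytes_written("sda") (or whatever your
disk is called), save the file, run sync so that the page cache is
flushed, and call it again.  The difference includes writes from every
other process as well as filesystem metadata, so it is best done on an
otherwise idle machine and repeated a few times.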