
debmirror --checksums - when is this necessary?




debmirror has the --checksums option.

Running debmirror with --checksums on a full x86-64 Debian repo
mirror takes a few hours on my humble box.

Q: Does debmirror check the MD5 sum of a file as soon as it is
downloaded, either just before or just after it is written to disk?

This would seem the most logical thing to do - check the MD5 sum of
the file as written to disk, immediately after writing it - which
would spread the checksumming load over the download period, AND
would (in general) take advantage of the most recently downloaded and
saved .deb file still being in the disk/page cache.
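
To make the idea concrete, here is a rough Python sketch of what I
mean. It is NOT debmirror's code (debmirror is written in Perl, and I
have not checked what it actually does); the function names
save_and_verify and file_digest are made up for illustration, and the
"download" is just an iterable of byte chunks.

```python
import hashlib
import os


def file_digest(path, algo="md5", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`, read in chunks."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def save_and_verify(chunks, dest, expected_hex, algo="md5"):
    """Write `chunks` to `dest`, then re-read the file and check it.

    Returns True if the on-disk content matches `expected_hex`;
    otherwise removes the file and returns False.
    """
    with open(dest, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
    # Re-read right away: the data is most likely still in the page
    # cache, so this pass should rarely have to touch the disk.
    if file_digest(dest, algo) == expected_hex.lower():
        return True
    os.remove(dest)
    return False
```

The point is only that each file gets checked once, right after it
lands on disk, rather than in a separate pass over the whole mirror.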

The debmirror man page says inter alia the following:

--checksums
  Use checksums to determine if files on the local mirror that
  are the correct size actually have the correct content. Not
  enabled by default, because it is too paranoid, and too slow.
  
  When the state cache is used, debmirror will only check
  checksums during runs where the cache has expired or been
  invalidated, so it is worth considering to use these two
  options together.
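
As I read that, the check is "size first, content only if asked".
A rough Python sketch of that reading follows. Again, this is my own
illustration and not debmirror's implementation; parse_stanza and
check_local_file are hypothetical names, and the only thing taken
from Debian itself is the Packages index field names (Filename, Size,
MD5sum).

```python
import hashlib
import os


def parse_stanza(text):
    """Parse one 'Key: value' stanza into a dict.

    Continuation lines (those starting with whitespace) are ignored.
    """
    fields = {}
    for line in text.splitlines():
        if line and not line[0].isspace() and ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields


def check_local_file(mirror_root, stanza, use_checksums=False,
                     field="MD5sum", algo="md5"):
    """Return 'missing', 'bad-size', 'bad-checksum' or 'ok'."""
    path = os.path.join(mirror_root, stanza["Filename"])
    if not os.path.isfile(path):
        return "missing"
    # Cheap check: only stat() is needed, no file content is read.
    if os.path.getsize(path) != int(stanza["Size"]):
        return "bad-size"
    # Expensive check: reads the whole file, so it is opt-in.
    if use_checksums:
        h = hashlib.new(algo)
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        if h.hexdigest() != stanza[field].lower():
            return "bad-checksum"
    return "ok"
```

With use_checksums=False a file of the right size but wrong content
still comes back as 'ok', which is the case --checksums is meant to
catch.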


The reason this is relevant is that a competent filesystem such as
ZFS does disk-layer data checksumming anyway, so as long as the
checksum for a particular .deb is right when it is first written to
disk, debmirror has no need to repeat such checks thereafter (just
use e.g. zpool scrub to catch later on-disk corruption).

TIA,