Web lists-archives.com

Re: [PATCH 6/7] rev-list: let traversal die when --missing is not in use

On Thu, Apr 04, 2019 at 08:37:54PM -0700, Taylor Blau wrote:

>  3. have the traversal machinery communicate the failure to the caller,
>     so that it can decide how to proceed without re-evaluting the object
>     itself.
> Of those, I think (3) is probably the best path forward. However, this
> patch does none of them. In the name of expediently fixing the
> regression to a normal "rev-list --objects" that we use for connectivity
> checks, this simply restores the pre-7c0fe330d5 behavior of having the
> traversal die as soon as it fails to load a tree (when --missing is set
> to MA_ERROR, which is the default).

I think this is worth doing, as it restores the earlier behavior. But a
few general thoughts (which I've shared already with you, but for the
benefit of the list):

 - actually doing the "communicate failure to the caller" would probably
   not be too bad as a single-bit PARSE_FAILED flag in obj->flags. But
   it does require the caller understanding which objects the traversal
   would try to parse (i.e., rev-list would have to understand that it
   is on its own to check blobs, even if they don't have a PARSE_FAILED

 - speaking of blobs, this series does not help rev-list find a
   mis-typed or bit-rotted blob at all, because it never opens the
   blobs. Does that mean my expectations for rev-list are simply too
   high, and that we should be expecting fsck-like checks to catch
   these? I dunno.

   It would not be too expensive to convert the existing "do we have the
   blob" check in rev-list to "do we have it, and is its type correct?".
   But obviously finding bitrot would be super-expensive. Which leads me

 - there actually _is_ a --verify-objects option, which would check even
   blobs for bitrot. It was added long ago in 5a48d24012 (rev-list
   --verify-object, 2011-09-01) for use with check_connected(). But it
   was deemed too slow for normal use, and ripped out in d21c463d55
   (fetch/receive: remove over-pessimistic connectivity check,

That last one implies that we're OK relying on the incoming index-pack
to catch these cases (which is going to do a sha1 over each object).

It does seem like we should bother to notice failures when it's _free_
to do so, which is the case with these tree-loading failures. Which is
basically what this patch is doing.