Web lists-archives.com

Re: [PATCH] rev-list: allow cached objects in existence check




> This patch makes the minimal fix, which is to swap out a direct call to
> oid_object_info_extended(), minus the SKIP_CACHED flag, instead of
> calling has_object_file(). This is all that has_object_file() is doing
> under the hood. And there's little danger of unrelated fallout from
> other unexpected "cached" objects, since there's only one call site that
> ends such a cached object, and it's in git-blame.

Thanks - this patch looks good to me.

> I think we might also consider just having has_object_file() respect
> cached objects. The SKIP_CACHED flag comes from Jonathan Tan's
> e83e71c5e1 (sha1_file: refactor has_sha1_file_with_flags, 2017-06-21).
> But it was just matching the old behavior; it's not clear to me that we
> particularly care about that, and it wasn't simply that nobody bothered
> to put the cached-object check into has_sha1_file().

Making has_object_file() respect cached objects sounds good to me too.

> Some concerns/arguments against it:
> 
>   - we probably would want to make sure we do not short-cut
>     write_sha1_file(). I.e., we should still write it to disk when
>     somebody wants it. But I think that works, because that function
>     uses its own check-and-freshen infrastructure.

write_object_file() (formerly write_sha1_file()) indeed uses its own
check-and-freshen mechanism.

>   - some callers of has_sha1_file() might care about durability between
>     processes. Because it's baked in, the empty tree is safe for that
>     (whatever follow-on process runs, it will also be baked in there).
>     But that's not necessarily true for other "cached" objects. I'm not
>     really that worried about it because we use it sparingly (the only
>     call to pretend_sha1_file() is in git-blame, and if it ever did ask
>     "do we have this object", I actually think the right answer would be
>     "yes").
> 
>     But if this is a concern, we could perhaps have two levels of flags:
>     SKIP_CACHED and SKIP_INTERNAL.

Or teach git-blame to have its own pretend mechanism, and remove the
pretend mechanism from sha1-file.c.

The last time I deeply thought of this was during the partial clone
implementation, so I am probably not completely up-to-date, but it seems
to me that ideally, for reading, we would remove SKIP_CACHED completely
(and always consult the cache), and also remove completely the ability
to pretend (blame will have to do it by itself); and for writing, we
would write the empty tree whenever we do now (for backwards
compatibility with old versions of Git that read what we write). Both
the approach in this patch and making has_object_file() respect cached
objects are steps in that direction, so I'm OK with both.