Re: [PATCH v2 0/6] Partial clone part 1: object filtering

On Fri, 3 Nov 2017 14:34:39 -0400
Jeff Hostetler <git@xxxxxxxxxxxxxxxxx> wrote:

> > Assuming we eventually get promisor support working, would there be
> > any use case where "any missing is OK" mode would be useful in a
> > sense more reasonable than "because we could have such a mode" and
> > "it is not our business to prevent users from playing with fire"?
> > 
> For now, I'd like to keep my "any missing is OK" option.
> I do think it has value all by itself.
> We are essentially using something like that now with our GVFS
> users on the gigantic Windows repo and haven't had any issues.
> But yes, when we get promisor support working, we could revisit
> the need for this parameter.

Well, it's probably not a good idea to include this parameter, and then
subsequently remove it.

> However, I do have some scaling concerns here.  If for example,
> I have 100M missing blobs (because we did an only commits-and-trees
> clone), the cost to compute "promisor missing" vs "just missing"
> might be prohibitively expensive.  It could be something we want
> fsck/gc to be aware of, but other commands may want to just assume
> any missing objects are expected and continue.
> Hopefully, we won't have a scale problem, but we just don't know
> yet.

I can see some use for this parameter - for example, when doing a report
for statistical purposes (percentage of objects missing, for example) or
for a background task that downloads missing objects into a cache. Also,
power users who know what they're doing (or normal users in an
emergency) can use this option when they have no network connection if
they really need to find something out from the local repo.

In these cases, the promisor check (after detecting that the object is
missing) is indeed not so useful, I think. (Or we can do the
--exclude=missing and --exclude=promisor idea that Jeff mentioned -
--exclude=missing now, and --exclude=promisor after we add promisor
support.)

This is conceptually different from gc's use of
--exclude-promisor-objects (in my patch set), which does not intend to
touch promisor objects (objects that are known to be in the promisor
remote), whether they are present or not.
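
To make the distinction concrete, here is a toy Python model of the two
behaviours. It is not Git code: the `walk` function, its mode names, and
the object IDs are all made up for illustration. It assumes a repository
is just a dict of local objects plus a set of IDs known to be in the
promisor remote.

```python
# Toy model only: names and modes are hypothetical, not Git's implementation.

def walk(start, objects, promised, mode):
    """Traverse from `start`, returning (visited, missing).

    objects:  dict mapping object id -> list of referenced ids (local objects)
    promised: set of ids known to be in the promisor remote
    mode:     "allow-any-missing" - tolerate and report every missing object
              "exclude-promisor"  - never touch promisor objects, present or
                                    not; any other missing object is an error
    """
    visited, missing = [], []
    stack, seen = [start], set()
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        if mode == "exclude-promisor" and oid in promised:
            continue  # skipped whether or not it is present locally
        if oid not in objects:
            if mode == "allow-any-missing":
                missing.append(oid)  # no promisor lookup needed here
                continue
            raise LookupError(f"unexpectedly missing object: {oid}")
        visited.append(oid)
        stack.extend(objects[oid])
    return visited, missing


# Hypothetical repository: one commit, one tree, two blobs, one blob absent.
objects = {"c1": ["t1"], "t1": ["b1", "b2"], "b1": []}

print(walk("c1", objects, {"b2"}, "allow-any-missing"))
print(walk("c1", objects, {"b1", "b2"}, "exclude-promisor"))
```

In this sketch the "allow-any-missing" mode never consults `promised`,
which is the property that matters for the scaling concern above, while
the "exclude-promisor" mode skips `b1` even though it is present
locally.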

Having said that, I would be OK if we didn't have tolerance (and/or
reporting) of missing objects right now. As far as I know, for the
initial implementation of partial clone, only the server performs any
filtering, and we assume that the server possesses all objects (so it
does not need to filter out any missing objects).