Web lists-archives.com

Re: [PATCH 2/8] upload-pack: implement ref-in-want




On Wed, Jun 06 2018, Brandon Williams wrote:

> On 06/05, Ævar Arnfjörð Bjarmason wrote:
>>
>> On Tue, Jun 05 2018, Brandon Williams wrote:
>>
>> > +uploadpack.allowRefInWant::
>> > +	If this option is set, `upload-pack` will support the `ref-in-want`
>> > +	feature of the protocol version 2 `fetch` command.
>> > +
>>
>> I think it makes sense to elaborate a bit on what this is for. Having
>> read this series through, and to make sure I understood this, maybe
>> something like this:
>>
>>    This feature is intended for the benefit of load-balanced servers
>>    which may not have the same view of what SHA-1s their refs point to,
>>    but are guaranteed to never advertise a reference that another server
>>    serving the request doesn't know about.
>>
>> I.e. from what I can tell this gives no benefits for someone using a
>> monolithic git server, except insofar as there would be a slight
>> decrease in network traffic if the average length of refs is less than
>> the length of a SHA-1.
>
> Yeah I agree that the motivation should probably be spelled out more,
> thanks for the suggestion.
>
>>
>> That's fair enough, just something we should prominently say.
>>
>> It does have the "disadvantage", if you can call it that, that it's
>> introducing a race condition between when we read the ref advertisement
>> and are promised XYZ refs, but may actually get ABC, but I can't think
>> of a reason anyone would care about this in practice.
>>
>> The reason I'm saying "another server [...] doesn't know about" above is
>> that 2/8 has this:
>>
>> 	if (read_ref(arg, &oid))
>> 		die("unknown ref %s", arg);
>>
>> Doesn't that mean that if server A in your pool advertises master, next
>> & pu, and you then go and fetch from server B advertising master & next,
>> but not "pu" that the clone will die?
>>
>> Presumably at Google you either have something to ensure a consistent
>> view, e.g. only advertise refs by name older than N seconds, or globally
>> update ref name but not their contents, and don't allow deleting refs
>> (or give them the same treatment).
>>
>> But that, and again, I may have misunderstood this whole thing,
>> significantly reduces the utility of this feature for anyone "in the
>> wild" since nothing shipped with "git" gives you that feature.
>>
>> The naïve way to do slave mirroring with stock git is to have a
>> post-receive hook that pushes to your mirrors in a for-loop, or has them
>> fetch from the master in a loop, and then round-robin LB those
>> servers. Due to the "die on nonexisting" semantics in this extension
>> that'll result in failed clones.
>>
>> So I think we should either be really vocal about that caveat, or
>> perhaps think of how we could make that configurable, e.g. what happens
>> if the server says "sorry, don't know about that one", and carries on
>> with the rest it does know about?
>
> Jonathan actually pointed this out to me earlier and I think the best
> way to deal with this is to just ignore the refs that the server doesn't
> know about instead of dying here. I mean its no worse than what we
> already have and we shouldn't hit this case too often.  And that way the
> fetch can still proceed.
>
>>
>> Is there a way for client & server to gracefully recover from that?
>> E.g. send "master" & "next" now, and when I pull again in a few seconds
>> I get the new "pu"?
>
> I think in this case the client would just need to wait for some amount
> of replication delay and attempt fetching at a later point.
>
>>
>> Also, as a digression isn't that a problem shared with protocol v2 in
>> general? I.e. without this extension isn't it going to make another
>> connection to the naïve LB'd mirroring setup described above and find
>> that SHA-1s as well as refs don't match?
>
> This is actually an issue with fetch using either v2 or v0.  Unless I'm
> misunderstanding what you're asking here.

Isn't the whole dialog in v1 guaranteed to be with one server from
intial ref advertisement to the client saying have/want, or is that just
with ssh?

In any case the reason the above is an issue here is because you're
getting the advertisement from a different server than you're
negotiating the pack with, right?

>>
>> BREAK.
>>
>> Also is if this E-Mail wasn't long enough, on a completely different
>> topic, in an earlier discussion in
>> https://public-inbox.org/git/87inaje1uv.fsf@xxxxxxxxxxxxxxxxxxx/ I noted
>> that it would be neat-o to have optional wildmatch/pcre etc. matching
>> for the use case you're not caring about here (and I don't expect you
>> to, you're solving a different problem).
>>
>> But let's say I want to add that after this, and being unfamiliar with
>> the protocol v2 conventions. Would that be a whole new
>> ref-in-want-wildmatch-prefix capability with a new
>> want-ref-wildmatch-prefix verb, or is there some less verbose way we can
>> anticipate that use-case and internally version / advertise
>> sub-capabilities?
>>
>> I don't know if that makes any sense, and would be fine with just a
>> ref-in-want-wildmatch-prefix if that's the way to do it. I just think
>> it's inevitable that we'll have such a thing eventually, so it's worth
>> thinking about how such a future extension fits in.
>
> Yes back when introducing the server-side ref filtering in ls-refs we
> originally talked about included wildmatch or other forms of pattern
> matching.  We opted to not over complicate things and favored prefix
> matching because it didn't bake in some subset of globbing or regex and
> it was easier to compute on the server side.
>
> Anyway back to your question.  Yes if at some point in the future we
> wanted to add in wildmatch/pcre to the protocol for ls-refs or for
> ref-in-want then it could be added as a feature or capability.  I don't
> think it would require adding a whole new verb (it probably would for
> the ls-refs case since the verb used there is "ref-prefix") but the
> capability could mean that the "want-ref" verb now understands wildmatch
> patterns in addition to fully qualified refs.

Probably still makes sense to have it be a different verb since some
things in wildmatch / regex are metachars but may be valid in ref names.

Thanks!