Web lists-archives.com

Re: Fetch-hooks




On 02/09/2018 11:04 PM, Ævar Arnfjörð Bjarmason wrote:>>> You could also
have some intermediate step between these two, where
>>> e.g. your refspec for "origin" is
>>> "+refs/heads/*:refs/remotes/origin-untrusted/*" instead of the default
>>> "+refs/heads/*:refs/remotes/origin/*", you fetch all refs to that
>>> location, then you move them (with some alias/hook) to
>>> "refs/remotes/origin/*" once they're seen to be "OK".
>>
>> That is indeed another possibility, but then the idea is to make things
>> as transparent as possible for the end-user, not to completely change
>> their git workflow. As such, considering only signed commits to be part
>> of the upstream seems to make sense to me?
> 
> I mean this would be something that would be part of a post-fetch hook,
> so it would be as transparent as what you're doing to the user, with the
> difference that it doesn't need to involve changes to what you slurp
> down from the server.
> 
> I.e. we'd just fetch into refs/remotes/origin-untrusted/, then we run
> your post-fetch hook and you go over the new refs, and copy what you
> like (usually everything) to refs/remotes/origin/*.

Hmm... but that would then require a post-fetch hook, wouldn't it? And
about a post-fetch hook, if I understood correctly, Junio in [1] had a
quite nice argument against it:

    Although I do not deeply care between such a "trigger to only
    notify, no touching" hook and a full-blown "allow hook writers to
    easily lie about what happened in the fetch" hook, I was hoping that
    we would get this right and useful if we were spending our brain
    bandwidth on it. I am not very fond of an easier "trigger to only
    notify" hook because people are bound to misuse the interface and
    try updating the refs anyway, making it easy to introduce
    inconsistencies between refs and FETCH_HEAD that will confuse the
    later "merge" step.

Otherwise, if it doesn't require a post-fetch hook, then it would
require the end-user to first fetch, then run the
`copy-trusted-refs-over` script, which would add stuff to the user's
workflow.

Did I miss another possibility?

> [...]
> 
> One thing that's not discussed yet, and I know just enough about for it
> to tingle my spidey sense, but not enough to say for sure (CC'd Jeff &
> Brandon who know more) is that this feature once shipped might cause
> higher load on git hosting providers.
> 
> This is because people will inevitably use it in popular projects for
> some custom filtering, and because you're continually re-fetching and
> inspecting stuff what used to be a really cheap no-op "pull" most of the
> time is a more expensive negotiation every time before the client
> rejects the refs again, and worse for hosting providers because you have
> bespoke ref fetching strategies you have less odds of being able to
> cache both the negotiation and the pack you serve.
> 
> I.e. you want this for some security feature where 99.99% of the time
> you accept all refs, but most people will probably use this to implement
> dynamic Turing-complete refspecs.
> 
> Maybe that's worrying about nothing, but worth thinking about.

Well... First, I must say I didn't really understand your last paragraph
about Turing-complete refspecs.

But my understanding of how the fetch-hook patchset I sent this evening
works is that it first receives all the objects from the hosting
provider, then locally moves the refs, but never actually discards the
downloaded objects (well, until a `git gc` I guess).

So I don't think the network traffic with the provider would be any
different wrt. what it is now, even if a tweak-fetch hook rejects some
commits? Then again I don't know git's internals enough to be have even
a bit of certainty about what I'm saying right now, so...


[1] https://marc.info/?l=git&m=132480559712592&w=2