Re: How hard would it be to implement sparse fetching/pulling?
- Date: Fri, 1 Dec 2017 10:16:58 -0800
- From: Jonathan Nieder <jrnieder@xxxxxxxxx>
- Subject: Re: How hard would it be to implement sparse fetching/pulling?
Jeff Hostetler wrote:
> On 11/30/2017 3:03 PM, Jonathan Nieder wrote:
>> One piece of missing functionality that looks interesting to me: that
>> series batches fetches of the missing blobs involved in a "git
>> checkout" command:
>> But it doesn't batch fetches of the missing blobs involved in a "git
>> diff <commit> <commit>" command. That might be a good place to get
>> your hands dirty. :)
> Jonathan Tan added code in unpack-trees to bulk fetch missing blobs
> before a checkout. This is limited to the missing blobs needed for
> the target commit. We need this to make checkout seamless, but it
> does mean that checkout may need online access.
Just to clarify: other parts of the series already fetch all missing
blobs that are required for a command. What that bulk-fetch patch
does is to make that more efficient, by using a single fetch request
to grab all the blobs that are needed for checkout, instead of one
fetch per blob.
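
To make the difference concrete, here is a toy model in Python. It is
only a sketch, not the code from the series: FakeRemote,
fetch_one_by_one and fetch_batched are hypothetical names, and the
"remote" is just a dict that counts how many requests it receives.

```python
class FakeRemote:
    """Toy stand-in for a remote repository (hypothetical, not a git API)."""

    def __init__(self, objects):
        self.objects = dict(objects)
        self.requests = 0  # number of fetch requests received so far

    def fetch(self, oids):
        # One call models one fetch request, however many objects it asks for.
        self.requests += 1
        return {oid: self.objects[oid] for oid in oids}


def fetch_one_by_one(remote, local, needed):
    """Fetch each missing object with its own request."""
    for oid in needed:
        if oid not in local:
            local.update(remote.fetch([oid]))


def fetch_batched(remote, local, needed):
    """Collect every missing object first, then issue a single request."""
    missing = [oid for oid in needed if oid not in local]
    if missing:
        local.update(remote.fetch(missing))
```

With two blobs missing, the first function issues two requests and the
second issues one. Both end up with the same objects available locally.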
This doesn't change the online access requirement: online access is
needed if and only if you don't have the required objects already
available locally. For example, if at clone time you specified a
sparse checkout pattern and you haven't changed that sparse checkout
pattern, then online access is not needed for checkout.
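
Stated as a condition, again as a hedged Python sketch with made-up
names (needs_online_access is not a git function, and the object ids
are placeholders):

```python
def needs_online_access(required, local):
    """Return True if and only if some required object is not available locally."""
    return any(oid not in local for oid in required)


# Objects matching the unchanged sparse pattern were fetched at clone time.
local = {"blob-1", "blob-2"}

# Checking out within the same pattern needs nothing new.
same_pattern = needs_online_access({"blob-1", "blob-2"}, local)

# Widening the pattern pulls in an object we never fetched.
wider_pattern = needs_online_access({"blob-1", "blob-3"}, local)
```

Here same_pattern is False and wider_pattern is True.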
> I've also talked about a pre-fetch capability to bulk fetch missing
> blobs in advance of some operation. You could speed up the above
> diff command or back-fill all the blobs I might need before going
> offline for a while.
In particular, something like this seems like a very valuable thing to
have when changing the sparse checkout pattern.