Re: submodule support in git-bundle
- Date: Fri, 2 Nov 2018 19:34:45 +0100
- From: Duy Nguyen <pclouds@xxxxxxxxx>
- Subject: Re: submodule support in git-bundle
On Fri, Nov 2, 2018 at 6:09 PM Stefan Beller <sbeller@xxxxxxxxxx> wrote:
> On Fri, Nov 2, 2018 at 9:10 AM Duy Nguyen <pclouds@xxxxxxxxx> wrote:
> > I use git-bundle today and it occurs to me that if I want to use it to
> > transfer part of a history that involves submodule changes, things
> > aren't pretty. Has anybody given thought on how to do binary history
> > transfer that contains changes from submodules?
> > Since .bundle files are basically .pack files, i'm not sure if it's
> > easy to bundle multiple pack files (one per repo)...
> That is a really good discussion starter!
> As bundles are modeled after the fetch protocol, I would
> redirect the discussion there.
> The new fetch protocol could support sending more than
> one pack, which could be for both the superproject as
> well as the relevant submodule updates (i.e. what is recorded
> in the superproject) based on a new capability.
> We at Google have given this idea some thought, but from a
> different angle: As you may know currently Android uses the
> repo tool, which we want to replace with Gits native submodules
> eventually. The repo tool tests for each repository to clone if
> there is a bundle file for that repository, such that instead of
> cloning the repo, the bundle can be downloaded and then
> a catch-up fetch can be performed. (This helps the Git servers
> as well as the client, the bundle can be hosted on a CDN,
> which is faster and cheaper than a git server for us).
> So we've given some thought on extending the packfiles in the
> fetch protocol to have some redirection to a CDN possible,
> i.e. instead of sending bytes as is, you get more or less a "todo"
> list, which might be
> (a) take the following bytes as is (current pack format)
> (b) download these other bytes from $THERE
> (possibly with a checksum)
> once the stream of bytes is assembled, it will look like a regular
> packfile with deltas etc.
> This offloading-to-CDN (or "mostly resumable clone" in the
> sense that the communication with the server is minimal, and
> you get most of your data via resumable http range-requests)
> sounds like complete offtopic, but is one of the requirements
> for the repo to submodule migration, hence I came to speak of it.
Hm.. so what you're saying is, we could have a pack file that lists
other (real) pack files and for the bundle case they are all in the
same file. And "download from $THERE" in this case is "download at
this file offset"? That might actually work.
> Did you have other things in mind, on a higher level?
> e.g. querying the bundle and creating submodule bundles
> based off the superproject bundle? 'git bundle create' could
> learn the --recurse-submodules option, which then produces
> multiple bundle files without changing the file formats.
This is probably the simplest way to support submodules. I just
haven't really thought much about it (the problem just came up to me
like 2 hours ago). Two problems with this are convenience (I don't
want to handle multiple files) and submodule info (which pack should
be unbundled on which submodule?). But I suppose if "git bundle"
produces a tarball of these bundle files then you solve both.
But of course there may be other and better options like what you
described above. If in long term we have "pack with hyperlinks" anyway
for resumable clone and other fancy stuff then reusing the same
mechanism for bundles makes sense, less maintenance burden.