
Re: [RFC PATCH 0/3] Partial clone: promised blobs (formerly "missing blobs")

From: "Jonathan Tan" <jonathantanmy@xxxxxxxxxx>
Sent: Tuesday, July 11, 2017 8:48 PM
These patches are part of a set of patches implementing partial clone,
as you can see here:


In that branch, clone with batch checkout works, as you can see in the
README. The code and tests are generally done, but some patches are
still missing documentation and commit messages.

These 3 patches implement the foundational concept - formerly known as
"missing blobs" in the "missing blob manifest", I decided to call them
"promised blobs". The repo knows their object names and sizes. It also
does not have the blobs themselves, but can be configured to know how to
fetch them.
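
To make the "knows their object names and sizes" idea concrete, here is a minimal sketch in C. It assumes the promised-blob information is held as an in-memory table sorted by object name. The struct, the function name, and the table layout are hypothetical illustrations and are not taken from the patches themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OID_RAWSZ 20 /* a SHA-1 object name is 20 raw bytes */

/* Hypothetical record: a promised blob is known by name and size only. */
struct promised_entry {
	unsigned char oid[OID_RAWSZ];
	uint64_t size;
};

/*
 * Binary-search a table sorted by oid (assumed layout, see above).
 * Returns 1 and stores the size in *size (if non-NULL) when the blob
 * is promised, 0 otherwise.
 */
int lookup_promised(const struct promised_entry *table, size_t nr,
		    const unsigned char *oid, uint64_t *size)
{
	size_t lo = 0, hi = nr;

	assert(table || !nr);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = memcmp(oid, table[mid].oid, OID_RAWSZ);

		if (!cmp) {
			if (size)
				*size = table[mid].size;
			return 1;
		}
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return 0;
}
```

With a table like this, a caller could answer "is this blob promised, and how large is it?" without having the blob contents locally; fetching the contents would be a separate step.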

If I understand correctly, this method doesn't give any direct user visibility of missing blobs in the file system. Is that correct?

I was hoping that eventually the various 'on demand' approaches would still allow users to continue to work as they go off-line such that they can see directly (in the FS) where the missing blobs (and trees) are located, so that they can continue to commit new work on existing files.

I had felt that some sort of 'gitlink' should be present (human-readable) as a placeholder for the missing blob/tree, e.g. 'gitblob: 1234abcd' (showing the missing oid, just like submodules can do; it's no different really).
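
As a rough illustration of how a tool might recognise such a placeholder, here is a minimal sketch in C. The 'gitblob: <hex>' format is only the example given above, not an existing Git format, and the function name is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: if `line` has the form "gitblob: <hex>"
 * (optionally followed by one newline), copy the hex object name
 * into `out`, NUL-terminated, and return 1. Otherwise return 0.
 */
int parse_gitblob_placeholder(const char *line, char *out, size_t outsz)
{
	static const char prefix[] = "gitblob: ";
	size_t plen = sizeof(prefix) - 1;
	size_t n = 0;

	if (strncmp(line, prefix, plen))
		return 0;
	line += plen;
	while (isxdigit((unsigned char)line[n]))
		n++;
	/* Require a non-empty name that fits in the output buffer. */
	if (!n || n >= outsz)
		return 0;
	/* Nothing but an optional trailing newline may follow the name. */
	if (line[n] != '\0' && strcmp(line + n, "\n"))
		return 0;
	memcpy(out, line, n);
	out[n] = '\0';
	return 1;
}
```

A placeholder file containing 'gitblob: 1234abcd' would yield the name "1234abcd", which could then be used to request the real blob once back on-line.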

I'm concerned that the various GVFS extensions haven't fully achieved a separation of concerns surrounding the DVCS capability for on-line/off-line conversion as comms drop in and out. GVFS looks great for a fully networked, always-on environment, but it would be good to also have that separation for those who (will) have shallow/narrow clones that may also need to work with a local upstream that is also shallow/narrow.

I wanted to at least get my thoughts into the discussion before it all passes by.

An older version of these patches was sent as a single demonstration
patch in versions 1 to 3 of [1]. In there, Junio suggested that I have
only one file containing missing blob information. I have made that
suggested change in this version.

One thing remaining is to add a repository extension [2] so that older
versions of Git fail immediately instead of trying to read missing
blobs, but I thought I'd send these first in order to get some initial

[1] https://public-inbox.org/git/cover.1497035376.git.jonathantanmy@xxxxxxxxxx/
[2] Documentation/technical/repository-version.txt

Jonathan Tan (3):
 promised-blob, fsck: introduce promised blobs
 sha1-array: support appending unsigned char hash
 sha1_file: add promised blob hook support

Documentation/config.txt               |   8 ++
Documentation/gitrepository-layout.txt |   8 ++
Makefile                               |   1 +
builtin/cat-file.c                     |   9 ++
builtin/fsck.c                         |  13 +++
promised-blob.c                        | 170 +++++++++++++++++++++++++++++++++
promised-blob.h                        |  27 ++++++
sha1-array.c                           |   7 ++
sha1-array.h                           |   1 +
sha1_file.c                            |  44 ++++++---
t/t3907-promised-blob.sh               |  65 +++++++++++++
t/test-lib-functions.sh                |   6 ++
12 files changed, 345 insertions(+), 14 deletions(-)
create mode 100644 promised-blob.c
create mode 100644 promised-blob.h
create mode 100755 t/t3907-promised-blob.sh