Re: [PATCH 00/39] per-repository object store, part 1
- Date: Wed, 30 Aug 2017 16:07:55 -0700
- From: Brandon Williams <bmwill@xxxxxxxxxx>
- Subject: Re: [PATCH 00/39] per-repository object store, part 1
On 08/29, Jonathan Nieder wrote:
> Most of the credit for this series should go to Stefan Beller. I just
> decided to pull the trigger on sending out what we have so far.
> This series is about API. It makes no functional change yet.
> Today, when a git command wants to operate on some objects from another
> repository (e.g., a submodule), it has two choices:
> A. Use run_command to operate on that repository in a separate process.
> B. Use add_to_alternates_memory to pretend the repository is an
> alternate. This has a number of downsides. Aside from aesthetics,
> one particularly painful consequence is that as alternates
> accumulate, the number of packs git has to check for objects
> increases, which can cause significant slowdowns.
> Brandon Williams's recent work to introduce "struct repository" points
> to a better way. Encapsulating object access in struct repository
> would mean:
> i. The API for accessing objects in another repository becomes more
> simple and familiar (instead of using the CLI or abusing alternates).
> ii. Operations on one repository do not interfere with another,
> neither in semantics (e.g. replace objects do not work correctly
> with the approach (B) above) nor performance (already described
> above).
> iii. Resources associated with access to a repository could be freed
> when done with that repo.
> iv. Thread-safe multiple readers to a single repository also become
> straightforward, by using multiple repository objects for the same
> repository.
> This series is a small step in that direction.
> At the end of this series, sha1_loose_object_info takes a repository
> argument and can be independently called for multiple repositories.
> Not incredibly useful on its own, but a future series will do the same
> for sha1_object_info, which will be enough to migrate a caller in
> submodule.c (which uses the object store for commit existence checks).
> This series has a few phases:
> 1. Patch 1 is a cleanup that made some of the later patches easier.
> 2. Patches 2-6 create a struct object_store field inside struct
> repository and move some globals to it.
> 3. Patches 7-24 are mechanical changes that update some functions to
> accept a repository argument. The only goal is to make the later
> patches that teach these functions to actually handle a repository
> other than the_repository easier to review. The patches enforce
> at compile time that no caller passes a repository other than
> the_repository --- see patch 7 in particular for details on how
> that works.
> 4. Patches 25-39 update the implementations of those functions to
> handle a repository other than the_repository. This means the
> safety check introduced in phase 3 goes away completely --- all
> functions that gained a repository argument are safe to use with
> a repository argument other than the_repository.
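
For readers who have not opened patch 7: one way to get a compile-time
guarantee like the one described in phase 3 is preprocessor token
pasting. The sketch below is only an illustration under that
assumption, not code taken from the series. The names install_pack and
packs_installed and the simplified struct repository are hypothetical
stand-ins.

```c
#include <stdio.h>

/* Hypothetical, simplified stand-in for git's struct repository. */
struct repository {
	int packs_installed;
};

static struct repository main_repo;
static struct repository *the_repository = &main_repo;

/*
 * Phase 3 style: the function gains a "repository argument", but the
 * macro pastes the argument's spelling into the function name. Only
 * the literal token the_repository yields a name that exists.
 */
#define install_pack(r, n) install_pack_##r(n)

static int install_pack_the_repository(int n)
{
	/* Still hard-wired to the_repository at this stage. */
	the_repository->packs_installed += n;
	return the_repository->packs_installed;
}

/* Small demonstration of a call site that compiles. */
static void demo_install(void)
{
	printf("packs: %d\n", install_pack(the_repository, 2));
}
```

A call such as install_pack(other_repo, 1) would expand to
install_pack_other_repo(1), which is not declared anywhere, so the
build fails (at compile time or at link time, depending on how strict
the compiler is). In a scheme like this, phase 4 would drop the macro
and give the function a real struct repository * parameter.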
> Patches 2-6 and 25-39 should be the most interesting to review. I'd
> particularly appreciate it if people can look over 25-39 carefully. We
> were careful not to leave any calls to functions that assume they are
> operating on the_repository, but a triple-check is always welcome.
> Thanks as well to brian m. carlson, who showed us how such a long and
> potentially tedious series can be made bearable for reviewers.
> Thoughts of all kinds welcome, as always.
Just finished looking through the series. Thanks for keeping each
commit very short and to the point; it made reviewing much easier. I
couldn't see anything wrong with these transformations, and I am very
happy to see this work getting done.
One thing that needs to be noted is that currently the object_store is
only really being used by the_repository so this series didn't need to
create any object_store_init() or object_store_clear() type functions.
So these types of functions will need to be added once submodules are
using their own object store, in their own struct repository.
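
To make the point above concrete, here is a rough sketch of what such
a pair of functions might look like. It is an assumption-laden
illustration, not a proposal for the actual implementation: the struct
layouts are simplified, hypothetical stand-ins (object_store_stub,
packed_git_stub), and only the function names come from the paragraph
above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a pack list entry. */
struct packed_git_stub {
	struct packed_git_stub *next;
};

/* Hypothetical, simplified per-repository object store. */
struct object_store_stub {
	char *objectdir;                    /* owned copy of the path */
	struct packed_git_stub *packed_git; /* owned linked list */
};

/* Put the store into a known state; returns 0 on success, -1 on OOM. */
static int object_store_init(struct object_store_stub *o,
			     const char *objectdir)
{
	size_t len = strlen(objectdir) + 1;

	memset(o, 0, sizeof(*o));
	o->objectdir = malloc(len);
	if (!o->objectdir)
		return -1;
	memcpy(o->objectdir, objectdir, len);
	return 0;
}

/* Release everything the store owns and reset it to the empty state. */
static void object_store_clear(struct object_store_stub *o)
{
	struct packed_git_stub *p = o->packed_git;

	while (p) {
		struct packed_git_stub *next = p->next;
		free(p);
		p = next;
	}
	free(o->objectdir);
	memset(o, 0, sizeof(*o));
}
```

A submodule's struct repository would call the init function when it
is set up and the clear function when the caller is done with it,
which is what would make point (iii) of the cover letter (freeing
resources per repository) possible.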
> Jonathan Nieder (24):
> pack: make packed_git_mru global a value instead of a pointer
> object-store: move packed_git and packed_git_mru to object store
> pack: move prepare_packed_git_run_once to object store struct
> pack: move approximate object count to object store struct
> pack: add repository argument to install_packed_git
> pack: add repository argument to prepare_packed_git_one
> pack: add repository argument to rearrange_packed_git
> pack: add repository argument to prepare_packed_git_mru
> pack: add repository argument to prepare_packed_git
> pack: add repository argument to reprepare_packed_git
> pack: add repository argument to sha1_file_name
> pack: add repository argument to map_sha1_file
> pack: allow install_packed_git to handle arbitrary repositories
> pack: allow rearrange_packed_git to handle arbitrary repositories
> pack: allow prepare_packed_git_mru to handle arbitrary repositories
> pack: allow prepare_packed_git_one to handle arbitrary repositories
> pack: allow prepare_packed_git to handle arbitrary repositories
> pack: allow reprepare_packed_git to handle arbitrary repositories
> pack: allow sha1_file_name to handle arbitrary repositories
> pack: allow stat_sha1_file to handle arbitrary repositories
> pack: allow open_sha1_file to handle arbitrary repositories
> pack: allow map_sha1_file_1 to handle arbitrary repositories
> pack: allow map_sha1_file to handle arbitrary repositories
> pack: allow sha1_loose_object_info to handle arbitrary repositories
> Stefan Beller (15):
> repository: introduce object store field
> object-store: move alt_odb_list and alt_odb_tail to object store
> sha1_file: add repository argument to alt_odb_usable
> sha1_file: add repository argument to link_alt_odb_entry
> sha1_file: add repository argument to read_info_alternates
> sha1_file: add repository argument to link_alt_odb_entries
> sha1_file: add repository argument to stat_sha1_file
> sha1_file: add repository argument to open_sha1_file
> sha1_file: add repository argument to map_sha1_file_1
> sha1_file: add repository argument to sha1_loose_object_info
> object-store: add repository argument to prepare_alt_odb
> object-store: add repository argument to foreach_alt_odb
> sha1_file: allow alt_odb_usable to handle arbitrary repositories
> object-store: allow prepare_alt_odb to handle arbitrary repositories
> object-store: allow foreach_alt_odb to handle arbitrary repositories
> builtin/count-objects.c | 10 ++-
> builtin/fsck.c | 15 ++--
> builtin/gc.c | 8 +-
> builtin/index-pack.c | 1 +
> builtin/pack-objects.c | 23 +++--
> builtin/pack-redundant.c | 8 +-
> builtin/receive-pack.c | 4 +-
> builtin/submodule--helper.c | 4 +-
> bulk-checkin.c | 3 +-
> cache.h | 50 ++---------
> contrib/coccinelle/packed_git.cocci | 15 ++++
> fast-import.c | 10 ++-
> fetch-pack.c | 3 +-
> http-backend.c | 8 +-
> http-push.c | 1 +
> http-walker.c | 4 +-
> http.c | 9 +-
> mru.h | 1 +
> object-store.h | 71 ++++++++++++++++
> pack-bitmap.c | 6 +-
> pack-check.c | 1 +
> pack-revindex.c | 1 +
> packfile.c | 94 ++++++++++----------
> packfile.h | 6 +-
> reachable.c | 1 +
> repository.c | 4 +-
> repository.h | 7 ++
> server-info.c | 8 +-
> sha1_file.c | 165 ++++++++++++++++++++----------------
> sha1_name.c | 11 ++-
> streaming.c | 5 +-
> transport.c | 4 +-
> 32 files changed, 344 insertions(+), 217 deletions(-)
> create mode 100644 contrib/coccinelle/packed_git.cocci
> create mode 100644 object-store.h