Web lists-archives.com

[PATCH 00/39] per-repository object store, part 1




Hi,

Most of the credit for this series should go to Stefan Beller.  I just
decided to pull the trigger on sending out what we have so far.

This series is about API.  It makes no functional change yet.

Today, when a git command wants to operate on some objects from another
repository (e.g., a submodule), it has two choices:

 A. Use run_command to operate on that repository in a separate process.

 B. Use add_to_alternates_memory to pretend the repository is an
    alternate.  This has a number of downsides.  Aside from aesthetics,
    one particularly painful consequence is that as alternates
    accumulate, the number of packs git has to check for objects
    increases, which can cause significant slowdowns.

Brandon Williams's recent work to introduce "struct repository" points
to a better way.  Encapsulating object access in struct repository
would mean:

  i. The API for accessing objects in another repository becomes more
     simple and familiar (instead of using the CLI or abusing alternates).

  ii. Operations on one repository do not interfere with another,
      neither in semantics (e.g. replace objects do not work correctly
      with the approach (B) above) nor performance (already described
      above).

 iii. Resources associated with access to a repository could be freed
      when done with that repo.

  iv. Thread-safe multiple readers to a single repository also become
      straightforward, by using multiple repository objects for the same
      repo.

This series is a small step in that direction.

At the end of this series, sha1_loose_object_info takes a repository
argument and can be independently called for multiple repositories.
Not incredibly useful on its own, but a future series will do the same
for sha1_object_info, which will be enough to migrate a caller in
submodule.c (which uses the object store for commit existence checks).

This series has a few phases:

 1. Patch 1 is a cleanup that made some of the later patches easier.

 2. Patches 2-6 create a struct object_store field inside struct
    repository and move some globals to it.

 3. Patches 7-24 are mechanical changes that update some functions to
    accept a repository argument. The only goal is to make the later
    patches that teach these functions to actual handle a repository
    other than the_repository easier to review.  The patches enforce
    at compile time that no caller passes a repository other than
    the_repository --- see patch 7 in particular for details on how
    that works.

 4. Patches 25-39 update the implementations of those functions to
    handle a repository other than the_repository.  This means the
    safety check introduced in phase 3 goes away completely --- all
    functions that gained a repository argument are safe to use with
    a repository argument other than the_repository.

Patches 2-6 and 25-39 should be the most interesting to review.  I'd
particularly appreciate if people can look over 25-39 carefully.  We
were careful not to leave any calls to functions that assume they are
operating on the_repository, but a triple-check is always welcome.

Thanks as well to brian m. carlson, who showed us how such a long and
potentially tedius series can be made bearable for reviewers.

Thoughts of all kinds welcome, as always.

Thanks,
Jonathan Nieder (24):
  pack: make packed_git_mru global a value instead of a pointer
  object-store: move packed_git and packed_git_mru to object store
    struct
  pack: move prepare_packed_git_run_once to object store struct
  pack: move approximate object count to object store struct
  pack: add repository argument to install_packed_git
  pack: add repository argument to prepare_packed_git_one
  pack: add repository argument to rearrange_packed_git
  pack: add repository argument to prepare_packed_git_mru
  pack: add repository argument to prepare_packed_git
  pack: add repository argument to reprepare_packed_git
  pack: add repository argument to sha1_file_name
  pack: add repository argument to map_sha1_file
  pack: allow install_packed_git to handle arbitrary repositories
  pack: allow rearrange_packed_git to handle arbitrary repositories
  pack: allow prepare_packed_git_mru to handle arbitrary repositories
  pack: allow prepare_packed_git_one to handle arbitrary repositories
  pack: allow prepare_packed_git to handle arbitrary repositories
  pack: allow reprepare_packed_git to handle arbitrary repositories
  pack: allow sha1_file_name to handle arbitrary repositories
  pack: allow stat_sha1_file to handle arbitrary repositories
  pack: allow open_sha1_file to handle arbitrary repositories
  pack: allow map_sha1_file_1 to handle arbitrary repositories
  pack: allow map_sha1_file to handle arbitrary repositories
  pack: allow sha1_loose_object_info to handle arbitrary repositories

Stefan Beller (15):
  repository: introduce object store field
  object-store: move alt_odb_list and alt_odb_tail to object store
    struct
  sha1_file: add repository argument to alt_odb_usable
  sha1_file: add repository argument to link_alt_odb_entry
  sha1_file: add repository argument to read_info_alternates
  sha1_file: add repository argument to link_alt_odb_entries
  sha1_file: add repository argument to stat_sha1_file
  sha1_file: add repository argument to open_sha1_file
  sha1_file: add repository argument to map_sha1_file_1
  sha1_file: add repository argument to sha1_loose_object_info
  object-store: add repository argument to prepare_alt_odb
  object-store: add repository argument to foreach_alt_odb
  sha1_file: allow alt_odb_usable to handle arbitrary repositories
  object-store: allow prepare_alt_odb to handle arbitrary repositories
  object-store: allow foreach_alt_odb to handle arbitrary repositories

 builtin/count-objects.c             |  10 ++-
 builtin/fsck.c                      |  15 ++--
 builtin/gc.c                        |   8 +-
 builtin/index-pack.c                |   1 +
 builtin/pack-objects.c              |  23 +++--
 builtin/pack-redundant.c            |   8 +-
 builtin/receive-pack.c              |   4 +-
 builtin/submodule--helper.c         |   4 +-
 bulk-checkin.c                      |   3 +-
 cache.h                             |  50 ++---------
 contrib/coccinelle/packed_git.cocci |  15 ++++
 fast-import.c                       |  10 ++-
 fetch-pack.c                        |   3 +-
 http-backend.c                      |   8 +-
 http-push.c                         |   1 +
 http-walker.c                       |   4 +-
 http.c                              |   9 +-
 mru.h                               |   1 +
 object-store.h                      |  71 ++++++++++++++++
 pack-bitmap.c                       |   6 +-
 pack-check.c                        |   1 +
 pack-revindex.c                     |   1 +
 packfile.c                          |  94 ++++++++++----------
 packfile.h                          |   6 +-
 reachable.c                         |   1 +
 repository.c                        |   4 +-
 repository.h                        |   7 ++
 server-info.c                       |   8 +-
 sha1_file.c                         | 165 ++++++++++++++++++++----------------
 sha1_name.c                         |  11 ++-
 streaming.c                         |   5 +-
 transport.c                         |   4 +-
 32 files changed, 344 insertions(+), 217 deletions(-)
 create mode 100644 contrib/coccinelle/packed_git.cocci
 create mode 100644 object-store.h

-- 
2.14.1.581.gf28d330327