
Re: [PATCH 3/3] packfile: close_all_packs to close_object_store




Hi Stolee,

*really* minor nit: the commit subject probably wants to have a "rename"
after the colon ;-)

The patch looks sensible to me. Since Junio asked for a sanity check
whether all of the call sites of `close_all_packs()` actually want to
close the MIDX and the commit graph, too, I'll do the "speak out loud"
type of patch review here (spoiler: all of them check out):

On Fri, 17 May 2019, Derrick Stolee via GitGitGadget wrote:

> diff --git a/builtin/am.c b/builtin/am.c
> index 58a2aef28b..9315d32d2a 100644
> --- a/builtin/am.c
> +++ b/builtin/am.c
> @@ -1800,7 +1800,7 @@ static void am_run(struct am_state *state, int resume)
>  	 */
>  	if (!state->rebasing) {
>  		am_destroy(state);
> -		close_all_packs(the_repository->objects);
> +		close_object_store(the_repository->objects);
>  		run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);

Here, we run `git gc --auto`, so we obviously really want to close all
read handles.

Check.
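
For readers following along without the source open, here is a minimal
sketch of the idea being checked at each call site: before handing the
repository over to another process, drop every read handle, not only the
pack files. This is a toy model, not git's code. The struct, the
`toy_` function names and the use of `tmpfile()` as a stand-in for real
pack, MIDX and commit graph files are all assumptions made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_MAX_PACKS 4

/* Hypothetical stand-in for an object store; NOT git's raw_object_store. */
struct toy_object_store {
	FILE *packs[TOY_MAX_PACKS]; /* open pack files */
	size_t nr_packs;
	FILE *midx;                 /* multi-pack-index handle, may be NULL */
	FILE *commit_graph;         /* commit graph handle, may be NULL */
};

/* Close one handle if it is open and forget it; returns 1 if it was open. */
int toy_close_handle(FILE **f)
{
	if (!*f)
		return 0;
	fclose(*f);
	*f = NULL;
	return 1;
}

/* Simulate opened files with anonymous temporary files. */
void toy_open_object_store(struct toy_object_store *o, size_t nr_packs)
{
	size_t i;

	assert(o != NULL);
	if (nr_packs > TOY_MAX_PACKS)
		nr_packs = TOY_MAX_PACKS;
	for (i = 0; i < TOY_MAX_PACKS; i++)
		o->packs[i] = i < nr_packs ? tmpfile() : NULL;
	o->nr_packs = nr_packs;
	o->midx = tmpfile();
	o->commit_graph = tmpfile();
}

/*
 * Close the packs *and* the other handles. Returns how many handles
 * were actually open, so calling it a second time returns 0.
 */
int toy_close_object_store(struct toy_object_store *o)
{
	int closed = 0;
	size_t i;

	assert(o != NULL);
	for (i = 0; i < o->nr_packs; i++)
		closed += toy_close_handle(&o->packs[i]);
	o->nr_packs = 0;
	closed += toy_close_handle(&o->midx);
	closed += toy_close_handle(&o->commit_graph);
	return closed;
}
```

A caller in this toy model would run `toy_close_object_store(&store)`
right before spawning the child process, which mirrors the ordering in
the hunk above.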

>  	}
>  }
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 50bde99618..82ce682c80 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -1240,7 +1240,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	transport_disconnect(transport);
>
>  	if (option_dissociate) {
> -		close_all_packs(the_repository->objects);
> +		close_object_store(the_repository->objects);
>  		dissociate_from_references();

Here, we prepare for dissociating from the reference repository specified via
`git clone --reference <directory>`. Obviously, we need to let go of all
the handles we might have open there.

Check.

>  	}
>
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index b620fd54b4..3aec95608f 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1670,7 +1670,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>
>  	string_list_clear(&list, 0);
>
> -	close_all_packs(the_repository->objects);
> +	close_object_store(the_repository->objects);
>
>  	argv_array_pushl(&argv_gc_auto, "gc", "--auto", NULL);

Again, a `git gc --auto` that needs closing of all read handles to the
files that might be overwritten by the garbage collection.

Check.

>  	if (verbosity < 0)
> diff --git a/builtin/gc.c b/builtin/gc.c
> index df2573f124..20c8f1bfe8 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -632,7 +632,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>  	gc_before_repack();
>
>  	if (!repository_format_precious_objects) {
> -		close_all_packs(the_repository->objects);
> +		close_object_store(the_repository->objects);
>  		if (run_command_v_opt(repack.argv, RUN_GIT_CMD))

Here, we want to repack. AFAICT the only sane thing we can do is to
invalidate whatever we read from the object store into memory.

Check.

>  			die(FAILED_RUN, repack.argv[0]);
>
> @@ -660,7 +660,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>  	report_garbage = report_pack_garbage;
>  	reprepare_packed_git(the_repository);
>  	if (pack_garbage.nr > 0) {
> -		close_all_packs(the_repository->objects);
> +		close_object_store(the_repository->objects);
>  		clean_pack_garbage();

This wants to delete a number of files that are now obsolete, and it makes
sense to make sure that there are no open read handles to those anymore.
It is a bit unclear from just reading the code what types of files are
accumulated into the `pack_garbage` string list, but then, we're in the
last throes of a garbage collection, and *just* about to write a new
commit graph (if `gc.writeCommitGraph=true`), so I think it is quite okay
to close not only the packs here, but everything we opened from the object
store.

So I'd give this a check mark, too.

>  	}
>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index e47d77baee..72d7a7c909 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -449,7 +449,7 @@ static void finish(struct commit *head_commit,
>  			 * We ignore errors in 'gc --auto', since the
>  			 * user should see them.
>  			 */
> -			close_all_packs(the_repository->objects);
> +			close_object_store(the_repository->objects);
>  			run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);

Obviously yet another `git gc --auto`, so yes, we need to close the object
store handles we have.

Check.

>  		}
>  	}
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 7c7bc13e91..ed30fcd633 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -328,7 +328,7 @@ static int finish_rebase(struct rebase_options *opts)
>
>  	delete_ref(NULL, "REBASE_HEAD", NULL, REF_NO_DEREF);
>  	apply_autostash(opts);
> -	close_all_packs(the_repository->objects);
> +	close_object_store(the_repository->objects);
>  	/*
>  	 * We ignore errors in 'gc --auto', since the
>  	 * user should see them.

Yet another `git gc --auto`.

Check.

> diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> index d58b7750b6..92cd1f508c 100644
> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
> @@ -2032,7 +2032,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
>  			proc.git_cmd = 1;
>  			proc.argv = argv_gc_auto;
>
> -			close_all_packs(the_repository->objects);
> +			close_object_store(the_repository->objects);
>  			if (!start_command(&proc)) {

This `proc` refers to another `git gc --auto` (see a couple of lines above,
still within the hunk).

Check.

>  				if (use_sideband)
>  					copy_to_sideband(proc.err, -1, NULL);
> diff --git a/builtin/repack.c b/builtin/repack.c
> index 67f8978043..4de8b6600c 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -419,7 +419,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>  	if (!names.nr && !po_args.quiet)
>  		printf_ln(_("Nothing new to pack."));
>
> -	close_all_packs(the_repository->objects);
> +	close_object_store(the_repository->objects);
>
>  	/*
>  	 * Ok we have prepared all new packfiles.

Ah, the joys of un-dynamic patch review. What you, dear reader, cannot see
in this hunk is that the code comment at the end continues thusly:

         * First see if there are packs of the same name and if so
         * if we can move them out of the way (this can happen if we
         * repacked immediately after packing fully.
         */

Meaning: we're about to rename some pack files. So the pack file handles
need to be closed, all right, but what about the other object store
handles? There is no mention of the commit graph (more on that below), but
the loop following the code comment contains this:

                        if (!midx_cleared) {
                                clear_midx_file(the_repository);
                                midx_cleared = 1;
                        }

So yes, I would give this a check.

It does puzzle me, I have to admit, that there is no (opt-in) code block
to re-write the commit graph. After all, the commit graph references the
pack files, right? So if they are repacked, it would at least be
invalidated at this point...

> diff --git a/object.c b/object.c
> index e81d47a79c..cf1a2b7086 100644
> --- a/object.c
> +++ b/object.c
> @@ -517,7 +517,7 @@ void raw_object_store_clear(struct raw_object_store *o)
>  	o->loaded_alternates = 0;
>
>  	INIT_LIST_HEAD(&o->packed_git_mru);
> -	close_all_packs(o);
> +	close_object_store(o);

We're in the middle of a function called `raw_object_store_clear()`. So...

Check.

>  	o->packed_git = NULL;
>  }
>
> diff --git a/packfile.c b/packfile.c
> index ce12bffe3e..017046fcf9 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -337,7 +337,7 @@ void close_pack(struct packed_git *p)
>  	close_pack_index(p);
>  }
>
> -void close_all_packs(struct raw_object_store *o)
> +void close_object_store(struct raw_object_store *o)
>  {
>  	struct packed_git *p;
>
> diff --git a/packfile.h b/packfile.h
> index d70c6d9afb..e95e389eb8 100644
> --- a/packfile.h
> +++ b/packfile.h
> @@ -81,7 +81,7 @@ extern uint32_t get_pack_fanout(struct packed_git *p, uint32_t value);
>  extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *);
>  extern void close_pack_windows(struct packed_git *);
>  extern void close_pack(struct packed_git *);
> -extern void close_all_packs(struct raw_object_store *o);
> +extern void close_object_store(struct raw_object_store *o);
>  extern void unuse_pack(struct pack_window **);
>  extern void clear_delta_base_cache(void);
>  extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
> --
> gitgitgadget

And this concludes my review.

Thank you!
Dscho