Web lists-archives.com

Re: [PATCH 09/10] fetch: try fetching submodules if needed objects were not fetched




> But this default fetch is not sufficient, as a newly fetched commit in
> the superproject could point to a commit in the submodule that is not
> in the default refspec. This is common in workflows like Gerrit's.
> When fetching a Gerrit change under review (from refs/changes/??), the
> commits in that change likely point to submodule commits that have not
> been merged to a branch yet.
> 
> Try fetching a submodule by object id if the object id that the
> superproject points to, cannot be found.

I see that these suggestions of mine (from [1]) was implemented, but not
others. If you disagree, that's fine, but I think they should be
discussed.

[1] https://public-inbox.org/git/20181018003954.139498-1-jonathantanmy@xxxxxxxxxx/

> The try does not happen when the "git fetch" done at the
> superproject is not storing the fetched results in remote
> tracking branches (i.e. instead just recording them to
> FETCH_HEAD) in this step. A later patch will fix this.

E.g. here, I said that there was no remote tracking branch involved.

> -		if ((recurse_submodules != RECURSE_SUBMODULES_OFF) &&
> -		    (recurse_submodules != RECURSE_SUBMODULES_ON))
> +		if (recurse_submodules != RECURSE_SUBMODULES_OFF)

I think the next patch should be squashed into this patch. Then you can
say that these are now redundant and can be removed.

> @@ -1218,8 +1218,12 @@ struct submodule_parallel_fetch {
>  	int result;
>  
>  	struct string_list changed_submodule_names;
> +	struct get_next_submodule_task **fetch_specific_oids;
> +	int fetch_specific_oids_nr, fetch_specific_oids_alloc;
>  };

Add documentation for fetch_specific_oids. Also, it might be better to
call these oid_fetch_tasks and the struct, "struct fetch_task".

Here, struct get_next_submodule_task is used for 2 different things:
 (1) After the first round fetch, fetch_finish() uses it to determine if
     a second round is needed.
 (2) In submodule_parallel_fetch.fetch_specific_oids, to tell the
     parallel runner (through get_next_submodule_task()) to do this
     fetch.

I think that it's better to have 2 different structs for these 2
different uses. (Note that task_cb can be NULL for the second round.
Unless we plan to recheck the OIDs? Currently we recheck them, but we
don't do anything either way.)

> +static const struct submodule *get_default_submodule(const char *path)
> +{
> +	struct submodule *ret = NULL;
> +	const char *name = default_name_or_path(path);
> +
> +	if (!name)
> +		return NULL;
> +
> +	ret = xmalloc(sizeof(*ret));
> +	memset(ret, 0, sizeof(*ret));
> +	ret->path = name;
> +	ret->name = name;
> +
> +	return (const struct submodule *) ret;
> +}

I think that this is best described as the submodule that has no entry
in .gitmodules? Maybe call it "get_non_gitmodules_submodule" and
document it with a similar comment as in get_submodule_repo_for().

> +
> +static struct get_next_submodule_task *get_next_submodule_task_create(
> +	struct repository *r, const char *path)
> +{
> +	struct get_next_submodule_task *task = xmalloc(sizeof(*task));
> +	memset(task, 0, sizeof(*task));
> +
> +	task->sub = submodule_from_path(r, &null_oid, path);
> +	if (!task->sub) {
> +		task->sub = get_default_submodule(path);
> +		task->free_sub = 1;
> +	}
> +
> +	return task;
> +}

Clearer to me would be to make get_next_submodule_task_create() return
NULL if submodule_from_path() returns NULL.

> +	if (spf->fetch_specific_oids_nr) {
> +		struct get_next_submodule_task *task = spf->fetch_specific_oids[spf->fetch_specific_oids_nr - 1];

Break lines to 80.

> +		argv_array_pushv(&cp->args, spf->args.argv);
> +		argv_array_push(&cp->args, "on-demand");

Same comment about "on-demand" as in my previous e-mail.

> +		argv_array_push(&cp->args, "--submodule-prefix");
> +		argv_array_push(&cp->args, submodule_prefix.buf);
> +
> +		/* NEEDSWORK: have get_default_remote from s--h */

Same comment about "s--h" as in my previous e-mail.

> +	commits = it->util;
> +	oid_array_filter(commits,
> +			 commit_exists_in_sub,
> +			 task->repo);
> +
> +	/* Are there commits that do not exist? */
> +	if (commits->nr) {
> +		/* We already tried fetching them, do not try again. */
> +		if (task->commits)
> +			return 0;

Same comment about "task->commits" as in my previous e-mail.

> diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
> index 6c2f9b2ba2..5a75b57852 100755

One more thing to test is the case where a submodule doesn't have a
.gitmodules entry.