Web lists-archives.com

Re: [PATCH v2 2/6] run-command: prepare command before forking


Brandon Williams wrote:

> In order to avoid allocation between 'fork()' and 'exec()' the argv
> array used in the exec call is prepared prior to forking the process.

nit: s/(the argv array.*) is prepared/prepare \1/

Git's commit messages are in the imperative mood, as if they are
ordering the code or the computer to do something.

More importantly, the commit message is a good place to explain some
of the motivation behind the patch so that people can understand what
the patch is for by reading it without having to dig into mailing list
archives and get the discussion there.

E.g. this could say

- that grep tests are intermittently failing in configurations using
  some versions of tcmalloc

- that the cause is interaction between fork and threads: malloc holds
  a lock that those versions of tcmalloc doesn't release in a
  pthread_atfork handler

- that according to [1] we need to only call async-signal-safe
  operations between fork and exec.  Using malloc to build the argv
  array isn't async-signal-safe

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/fork.html

> In addition to this, the function used to exec is changed from
> 'execvp()' to 'execv()' as the (p) variant of exec has the potential to
> call malloc during the path resolution it performs.

*puzzled* is execvp actually allowed to call malloc?

Could this part go in a separate patch?  That would make it easier to

> +++ b/run-command.c
> +	/*
> +	 * If there are no '/' characters in the command then perform a path
> +	 * lookup and use the resolved path as the command to exec.  If there
> +	 * are no '/' characters or if the command wasn't found in the path,
> +	 * have exec attempt to invoke the command directly.
> +	 */
> +	if (!strchr(out->argv[0], '/')) {
> +		char *program = locate_in_PATH(out->argv[0]);
> +		if (program) {
> +			free((char *)out->argv[0]);
> +			out->argv[0] = program;
> +		}
> +	}

This does one half of what execvp does but leaves out the other half.

  There are two distinct ways in which the contents of the process
  image file may cause the execution to fail, distinguished by the
  setting of errno to either [ENOEXEC] or [EINVAL] (see the ERRORS
  section). In the cases where the other members of the exec family of
  functions would fail and set errno to [ENOEXEC], the execlp() and
  execvp() functions shall execute a command interpreter and the
  environment of the executed command shall be as if the process
  invoked the sh utility using execl() as follows:

  execl(<shell path>, arg0, file, arg1, ..., (char *)0);

I think this is what the first patch in the series was about.  Do we
want to drop that support?

I think we need to keep it, since it is easy for authors of e.g.
credential helpers to accidentally rely on it.

> @@ -437,12 +458,9 @@ int start_command(struct child_process *cmd)
>  					unsetenv(*cmd->env);
>  			}
>  		}
> -		if (cmd->git_cmd)
> -			execv_git_cmd(cmd->argv);
> -		else if (cmd->use_shell)
> -			execv_shell_cmd(cmd->argv);
> -		else
> -			sane_execvp(cmd->argv[0], (char *const*) cmd->argv);
> +
> +		execv(argv.argv[0], (char *const *) argv.argv);

What happens in the case sane_execvp was trying to handle?  Does
prepare_cmd need error handling for when the command isn't found?

Sorry this got so fussy.

Thanks and hope that helps,