Web lists-archives.com

Re: [PATCH v2] checkout files in-place




On Mon, Jun 11 2018, Clemens Buchacher wrote:

> When replacing files with new content during checkout, we do not write
> to them in place. Instead we unlink and recreate the files in order to
> let the system figure out ownership and permissions for the new file,
> taking umask into account.

Both this summary...

> It is safe to do this on Linux file systems, even if open file handles
> still exist, because unlink only removes the directory reference to the
> file. On Windows, however, a file cannot be deleted until all handles to
> it are closed. If a file cannot be deleted, its name cannot be reused.
>
> This causes files to be deleted, but not checked out when switching
> branches. This is frequently an issue with Qt Creator, which
> continuously opens files in the work tree, as reported here:
> https://github.com/git-for-windows/git/issues/1653
>
> This change adds the core.checkoutInPlace option. If enabled, checkout
> will open files for writing the new content in place. This fixes the
> issue, but with this approach the system will not update file
> permissions according to umask. Only essential updates of write and
> executable permissions are performed.
>
> The in-place checkout is therefore optional. It could be enabled by Git
> installers on Windows, where umask is irrelevant.
>
> Signed-off-by: Clemens Buchacher <drizzd@xxxxxxx>
> Reviewed-by: Orgad Shaneh <orgads@xxxxxxxxx>
> Reviewed-by: "brian m. carlson" <sandals@xxxxxxxxxxxxxxxxxxxx>
> Reviewed-by: Duy Nguyen <pclouds@xxxxxxxxx>
> Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>
> ---
>
> Tested on Windows with Git-for-Windows and with Windows Subsystem for
> Linux.
>
>  Documentation/config.txt    | 11 ++++++
>  cache.h                     |  2 ++
>  config.c                    |  5 +++
>  entry.c                     | 18 ++++++++--
>  environment.c               |  1 +
>  t/t2031-checkout-inplace.sh | 82 +++++++++++++++++++++++++++++++++++++++++++++
>  6 files changed, 116 insertions(+), 3 deletions(-)
>  create mode 100755 t/t2031-checkout-inplace.sh
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index ab641bf..0860a81 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -912,6 +912,17 @@ core.sparseCheckout::
>  	Enable "sparse checkout" feature. See section "Sparse checkout" in
>  	linkgit:git-read-tree[1] for more information.
>
> +core.checkoutInPlace::
> +	Check out file contents in place. By default Git checkout removes existing
> +	work tree files before it replaces them with different content. If this
> +	option is enabled, Git will overwrite the contents of existing files in
> +	place. This is useful on Windows, where open file handles to a removed file
> +	prevent creating new files at the same path.

...And this seems to conflict with what Junio's summarized in
xmqqvaapb3r1.fsf@xxxxxxxxxxxxxxxxxxxxxxxxx. I.e. (if I'm reading it
correctly) it's not correct to say that we unlink the existing file,
then replace it, don't we create a new one, and then rename it in-place?

I don't know enough about this part of the code to know, but whatever it
is we should get it right here.

Also, as Junio notes that pattern will not result in a potentially
corrupt checkout where you've written 1/2 of a file, you note in
20180611174818.GA8395@Sonnenschein.localdomain that there are "no
guarantees", but I've never seen a case where being out of disk space
would cause a rename to fail, since that can happen in-place.

Whereas we definitely will end up in states where we've written 1MB of a
2MB file with this when the disk fills up, and thus when that's fixed
"git status" will show local changes, so we should note that caveat for
the user.

> +	Note that the current implementation of in-place checkout makes no effort
> +	to update read/write permissions according to umask. Permissions are
> +	however modified to enable write access and to update executable
> +	permissions.

I think we should have a paragraph break there before "Note...".

>  core.abbrev::
>  	Set the length object names are abbreviated to.  If
>  	unspecified or set to "auto", an appropriate value is
> diff --git a/cache.h b/cache.h
> index 89a107a..5b8c4d6 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -815,6 +815,7 @@ extern int fsync_object_files;
>  extern int core_preload_index;
>  extern int core_commit_graph;
>  extern int core_apply_sparse_checkout;
> +extern int checkout_inplace;
>  extern int precomposed_unicode;
>  extern int protect_hfs;
>  extern int protect_ntfs;
> @@ -1518,6 +1519,7 @@ struct checkout {
>  	unsigned force:1,
>  		 quiet:1,
>  		 not_new:1,
> +		 inplace:1,
>  		 refresh_cache:1;
>  };
>  #define CHECKOUT_INIT { NULL, "" }
> diff --git a/config.c b/config.c
> index fbbf0f8..8b35ecd 100644
> --- a/config.c
> +++ b/config.c
> @@ -1318,6 +1318,11 @@ static int git_default_core_config(const char *var, const char *value)
>  		return 0;
>  	}
>
> +	if (!strcmp(var, "core.checkoutinplace")) {
> +		checkout_inplace = git_config_bool(var, value);
> +		return 0;
> +	}
> +
>  	if (!strcmp(var, "core.precomposeunicode")) {
>  		precomposed_unicode = git_config_bool(var, value);
>  		return 0;
> diff --git a/entry.c b/entry.c
> index 2101201..a599fc1 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -78,8 +78,13 @@ static void remove_subtree(struct strbuf *path)
>
>  static int create_file(const char *path, unsigned int mode)
>  {
> +	int flags;
> +	if (checkout_inplace)
> +		flags = O_WRONLY | O_CREAT | O_TRUNC;
> +	else
> +		flags = O_WRONLY | O_CREAT | O_EXCL;

I'd find this sort of thing easier to read as:

	int flags = O_WRONLY | O_CREAT;
	if (checkout_inplace)
		flags |= O_TRUNC;
	else
		flags |= O_EXCL;

Or even:

	int flags = O_WRONLY | O_CREAT;
	flags |= checkout_inplace ? O_TRUNC : O_EXCL;

I.e. less eyeballing the two lines to see if they're the same.

>  	mode = (mode & 0100) ? 0777 : 0666;
> -	return open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
> +	return open(path, flags, mode);
>  }
>
>  static void *read_blob_entry(const struct cache_entry *ce, unsigned long *size)
> @@ -467,8 +472,15 @@ int checkout_entry(struct cache_entry *ce,
>  			if (!state->force)
>  				return error("%s is a directory", path.buf);
>  			remove_subtree(&path);
> -		} else if (unlink(path.buf))
> -			return error_errno("unable to unlink old '%s'", path.buf);
> +		} else if (checkout_inplace) {
> +			if (!(st.st_mode & 0200) ||
> +			    (trust_executable_bit && (st.st_mode & 0100) != (ce->ce_mode & 0100)))
> +				if (chmod(path.buf, (ce->ce_mode & 0100) ? 0777 : 0666))
> +					return error_errno(_("unable to change mode of '%s'"), path.buf);
> +		} else {
> +			if (unlink(path.buf))
> +				return error_errno(_("unable to unlink old '%s'"), path.buf);
> +		}
>  	} else if (state->not_new)
>  		return 0;
>
> diff --git a/environment.c b/environment.c
> index 2a6de23..5b91f30 100644
> --- a/environment.c
> +++ b/environment.c
> @@ -68,6 +68,7 @@ char *notes_ref_name;
>  int grafts_replace_parents = 1;
>  int core_commit_graph;
>  int core_apply_sparse_checkout;
> +int checkout_inplace;
>  int merge_log_config = -1;
>  int precomposed_unicode = -1; /* see probe_utf8_pathname_composition() */
>  unsigned long pack_size_limit_cfg;
> diff --git a/t/t2031-checkout-inplace.sh b/t/t2031-checkout-inplace.sh
> new file mode 100755
> index 0000000..d70ecc4
> --- /dev/null
> +++ b/t/t2031-checkout-inplace.sh
> @@ -0,0 +1,82 @@
> +#!/bin/sh
> +
> +test_description='in-place checkout'
> +. ./test-lib.sh
> +
> +test_expect_success 'setup' '
> +
> +	test_commit hello world &&
> +	git branch other &&
> +	test_commit hello-again world
> +'
> +
> +test_expect_success 'in-place checkout overwrites open file' '
> +
> +	git config core.checkoutInPlace true &&
> +	git checkout -f master &&
> +	exec 8<world &&
> +	git checkout other &&
> +	exec 8<&- &&
> +	echo hello >expect &&
> +	test_cmp expect world
> +'
> +
> +test_expect_success 'in-place checkout overwrites read-only file' '
> +
> +	git config core.checkoutInPlace true &&
> +	git checkout -f master &&
> +	chmod -w world &&
> +	git checkout other &&
> +	echo hello >expect &&
> +	test_cmp expect world
> +'
> +
> +test_expect_success 'in-place checkout updates executable permission' '
> +
> +	git config core.checkoutInPlace true &&
> +	git checkout -f master^0 &&
> +	test_chmod +x world &&
> +	git commit -m executable &&
> +	git checkout other &&
> +	test ! -x world

Worth testing switching branches here again & re-testing, since this
only tests +x -> -x, but not -x -> +x when we go back.

> +'
> +
> +test_expect_success POSIXPERM 'regular checkout respects umask' '
> +
> +	git config core.checkoutInPlace false &&
> +	git checkout -f master &&
> +	chmod 0660 world &&
> +	umask 0022 &&
> +	git checkout other &&
> +	actual=$(ls -l world) &&
> +	case "$actual" in
> +	-rw-r--r--*)
> +		: happy
> +		;;
> +	*)
> +		echo Oops, world is not 0644 but $actual
> +		false
> +		;;
> +	esac

Is that "ls" parsing portable? And also couldn't this be accomplished
with something like "stat --format"? I'm not sure how portable that is,
we just have one use of it in the test suite (on Cygwin only).

> +'
> +
> +test_expect_success POSIXPERM 'in-place checkout ignores umask' '
> +
> +	git config core.checkoutInPlace true &&
> +	git checkout -f master &&
> +	chmod 0660 world &&
> +	umask 0022 &&
> +	git checkout other &&
> +	actual=$(ls -l world) &&
> +	case "$actual" in
> +	-rw-rw----*)
> +		: happy
> +		;;
> +	*)
> +		echo Oops, world is not 0660 but $actual
> +		false
> +		;;
> +	esac
> +'
> +
> +test_done