Re: Line ending normalization doesn't work as expected




Junio C Hamano <gitster@xxxxxxxxx> writes:

> Both this and its "git read-tree --empty" cousin share a grave
> issue.  The "git add ." step would mean that before doing these
> commands, your working tree must be truly clean, i.e. the paths
> in the filesystem known to the index must match what is in the
> index (modulo the line-ending gotcha you are trying to correct), 
> *AND* there must be *NO* untracked paths you do not want to add
> in the working tree.
>
> That is a reason why we should solve it differently.  Perhaps adding
> a new option "git add --rehash" to tell Git "Hey, you may think some
> paths in the index and in the working tree are identical and no need
> to re-register, but you are WRONG.  For each path in the index,
> remove it and then register the object by hashing the contents from
> the filesystem afresh!" would be the best way to go.

Here is a patch just to illustrate the direction I was heading in
above.  It has not even been compile-tested, and I make no guarantee
about what corner cases remain.

In real production code, we shouldn't be using a string-list with
two loops, but in a mere illustration patch I didn't want to spend
more brain cycles worrying about removing from the list and then
adding to it, both inside a single loop that iterates over it.
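
The two-loop shape can be shown outside Git with a toy sorted array.
This is only a sketch under stated assumptions: "struct toy_index",
toy_find(), toy_remove(), toy_add() and toy_rehash_all() are made-up
names standing in for the index and for remove_file_from_cache() /
add_file_to_cache(); none of them is a Git API.  The first loop
snapshots the names, and the second is then free to remove and add
entries, because it never walks the array that is being modified.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX 16

/* Hypothetical stand-in for the index: entries are kept sorted by name. */
struct toy_index {
	struct {
		const char *name;
		int generation;	/* bumped each time the path is re-added */
	} e[TOY_MAX];
	int nr;
};

/* Return the position of "name", or -1 if it is not present. */
static int toy_find(const struct toy_index *idx, const char *name)
{
	int i;

	for (i = 0; i < idx->nr; i++)
		if (!strcmp(idx->e[i].name, name))
			return i;
	return -1;
}

/* Remove "name", shifting later entries down; return its old generation. */
static int toy_remove(struct toy_index *idx, const char *name)
{
	int pos = toy_find(idx, name), gen;

	if (pos < 0)
		return -1;
	gen = idx->e[pos].generation;
	memmove(&idx->e[pos], &idx->e[pos + 1],
		(idx->nr - pos - 1) * sizeof(idx->e[0]));
	idx->nr--;
	return gen;
}

/* Insert "name" at its sorted position. */
static void toy_add(struct toy_index *idx, const char *name, int generation)
{
	int pos = 0;

	assert(idx->nr < TOY_MAX);
	while (pos < idx->nr && strcmp(idx->e[pos].name, name) < 0)
		pos++;
	memmove(&idx->e[pos + 1], &idx->e[pos],
		(idx->nr - pos) * sizeof(idx->e[0]));
	idx->e[pos].name = name;
	idx->e[pos].generation = generation;
	idx->nr++;
}

/* Remove and re-add every entry; return how many were touched. */
static int toy_rehash_all(struct toy_index *idx)
{
	const char *snapshot[TOY_MAX];
	int i, nr = 0;

	/* Loop 1: only read the index, remembering which paths to redo. */
	for (i = 0; i < idx->nr; i++)
		snapshot[nr++] = idx->e[i].name;

	/* Loop 2: mutate freely; we iterate the snapshot, not the index. */
	for (i = 0; i < nr; i++) {
		int gen = toy_remove(idx, snapshot[i]);
		toy_add(idx, snapshot[i], gen + 1);
	}
	return nr;
}
```

The toy keeps borrowed pointers to the names, which is fine here only
because the caller owns the strings.  In the patch the names belong to
the cache entries being removed, which is why the list there is
initialized with STRING_LIST_INIT_DUP.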

The second loop uses a simple "remove then add", but I think it
should rather be a "mark ce so that it will _never_ match anything
in the working tree" followed by "add_file_to_cache()".  Currently we
do not have a "mark ce that it never matches" operation that lets
us bypass the comparison with the current cache entry (where the
safecrlf check interferes), but we can afford to use an (in-core
only) bit in the ce flags word to represent this and plumb it through.
That way, we will still preserve the executable bit from the
original entry, hopefully ;-)
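
A rough sketch of that alternative, again with made-up names:
TOY_NEVER_MATCH, "struct toy_entry", "struct toy_file", toy_matches()
and toy_refresh() are hypothetical and do not exist in Git.  The toy
assumes the cheap "is it unchanged?" check looks only at cached stat
data.  The point is merely that a flag bit kept in core can force that
check to say "no", so the entry is updated in place and keeps the mode
already recorded in it.

```c
#include <assert.h>
#include <string.h>

#define TOY_NEVER_MATCH (1u << 0)	/* hypothetical in-core-only flag */

struct toy_entry {
	unsigned int mode;	/* e.g. 0100755 for an executable blob */
	unsigned int flags;	/* never written to disk in this sketch */
	long mtime;		/* cached stat data for the cheap comparison */
	char blob[32];		/* stands in for the registered object */
};

struct toy_file {
	long mtime;
	const char *normalized;	/* what hashing the file afresh would yield */
};

/* Would the entry be considered up to date with the file? */
static int toy_matches(const struct toy_entry *ce, const struct toy_file *f)
{
	if (ce->flags & TOY_NEVER_MATCH)
		return 0;	/* bypass the comparison entirely */
	return ce->mtime == f->mtime;
}

/*
 * Re-register the entry from the file unless it already "matches".
 * Only the content and stat data are replaced; the mode is kept.
 * Returns 1 if the entry was rewritten, 0 if it was left alone.
 */
static int toy_refresh(struct toy_entry *ce, const struct toy_file *f)
{
	if (toy_matches(ce, f))
		return 0;
	assert(strlen(f->normalized) < sizeof(ce->blob));
	strcpy(ce->blob, f->normalized);
	ce->mtime = f->mtime;
	ce->flags &= ~TOY_NEVER_MATCH;	/* the mark is one-shot */
	return 1;
}
```

With matching stat data and the flag clear, toy_refresh() leaves a
CR-contaminated blob alone; once the flag is set, the same call
rewrites the blob while the mode field stays untouched.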


 builtin/add.c | 42 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/builtin/add.c b/builtin/add.c
index 5d5773d5cd..264f84dbe7 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -26,6 +26,7 @@ static const char * const builtin_add_usage[] = {
 };
 static int patch_interactive, add_interactive, edit_interactive;
 static int take_worktree_changes;
+static int rehash;
 
 struct update_callback_data {
 	int flags;
@@ -121,6 +122,41 @@ int add_files_to_cache(const char *prefix,
 	return !!data.add_errors;
 }
 
+static int rehash_tracked_files(const char *prefix, const struct pathspec *pathspec,
+				int flags)
+{
+	struct string_list paths = STRING_LIST_INIT_DUP;
+	struct string_list_item *path;
+	int i, retval = 0;
+
+	for (i = 0; i < active_nr; i++) {
+		struct cache_entry *ce = active_cache[i];
+
+		if (ce_stage(ce))
+			continue; /* do not touch unmerged paths */
+		if (!S_ISREG(ce->ce_mode) && !S_ISLNK(ce->ce_mode))
+			continue; /* do not touch non blobs */
+		if (pathspec && !ce_path_match(ce, pathspec, NULL))
+			continue;
+		string_list_append(&paths, ce->name);
+	}
+
+	for_each_string_list_item(path, &paths) {
+		/*
+		 * Having a blob contaminated with CR will trigger the
+		 * safe-crlf kludge, avoidance of which is the primary
+		 * reason this helper function exists.  Remove it and
+		 * then re-add it.  Note that this may lose the executable
+		 * bit on a filesystem that lacks it.
+		 */
+		remove_file_from_cache(path->string);
+		retval |= !!add_file_to_cache(path->string, flags);
+	}
+
+	string_list_clear(&paths, 0);
+	return retval;
+}
+
 static char *prune_directory(struct dir_struct *dir, struct pathspec *pathspec, int prefix)
 {
 	char *seen;
@@ -274,6 +310,7 @@ static struct option builtin_add_options[] = {
 	OPT_BOOL('e', "edit", &edit_interactive, N_("edit current diff and apply")),
 	OPT__FORCE(&ignored_too, N_("allow adding otherwise ignored files")),
 	OPT_BOOL('u', "update", &take_worktree_changes, N_("update tracked files")),
+	OPT_BOOL(0, "rehash", &rehash, N_("really update tracked files")),
 	OPT_BOOL('N', "intent-to-add", &intent_to_add, N_("record only the fact that the path will be added later")),
 	OPT_BOOL('A', "all", &addremove_explicit, N_("add changes from all tracked and untracked files")),
 	{ OPTION_CALLBACK, 0, "ignore-removal", &addremove_explicit,
@@ -498,7 +535,10 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 
 	plug_bulk_checkin();
 
-	exit_status |= add_files_to_cache(prefix, &pathspec, flags);
+	if (rehash)
+		exit_status |= rehash_tracked_files(prefix, &pathspec, flags);
+	else
+		exit_status |= add_files_to_cache(prefix, &pathspec, flags);
 
 	if (add_new_files)
 		exit_status |= add_files(&dir, flags);