Web lists-archives.com

Re: [PATCH 4/5] gc: don't run "reflog expire" when keeping reflogs




On Thu, Mar 14, 2019 at 01:51:35PM +0900, Junio C Hamano wrote:

> Jeff King <peff@xxxxxxxx> writes:
> 
> > Seeing "--stale-fix" again reminded me: that may be the "oops, we can
> > spend tons of CPU walking history" bit that I mentioned in the other
> > part of the thread. But I don't think git-gc would ever run with that
> > option.
> 
> The option was a purely transitional measure to recover from a bad
> reflog state that could have been left by earlier versions of
> "prune" and "repack" that did not pay attention to the reflog.
> Perhaps we should plan to deprecate and remove it by now?

True, though I have definitely used it over the years to clear out
broken reflog entries from corrupted repositories. In most cases we
should be able to do that much more simply these days, though. Since we
try to keep whole segments of history reachable from an otherwise
unreachable object, you should in general be able to just prune entries
for which we are missing the actual object mentioned in the log.

Of course when you are talking about corruption, all rules go out the
window. So it's possible you'd still need --stale-fix to cover really
broken cases.

I think these days I'd more often just delete the whole reflog in such a
case (once upon a time GitHub tried to use never-expiring reflogs as a
sort of audit trail, but it had all sorts of complications; these days
we log to a separate file).

So I wouldn't be terribly sad to see --stale-fix go away.

All that said, I do think --expire-unreachable still has to traverse to
find out what's reachable. And I think it does it under lock. If I
instrument the tempfile code like the patch below and run:

  GIT_TRACE_TEMPFILE=1 git reflog expire --expire-unreachable=now HEAD

on a copy of my git.git repo, I get:

  15:22:12.163725 tempfile.c:127          activating tempfile '/home/peff/compile/foo/.git/HEAD.lock'
  15:22:12.163769 tempfile.c:127          activating tempfile '/home/peff/compile/foo/.git/logs/HEAD.lock'
  15:22:13.053404 tempfile.c:312          renaming tempfile '/home/peff/compile/foo/.git/logs/HEAD.lock' to '/home/peff/compile/foo/.git/logs/HEAD'
  15:22:13.053416 tempfile.c:327          deleting tempfile '/home/peff/compile/foo/.git/HEAD.lock'

We hold HEAD.lock for almost an entire second.

(Actually, it just occurred to me that "strace -tt git ... 2>&1 | grep
HEAD.lock" would produce basically the same results, no patch needed).

-Peff

---
diff --git a/tempfile.c b/tempfile.c
index d43ad8c191..f7e999d3ca 100644
--- a/tempfile.c
+++ b/tempfile.c
@@ -53,6 +53,9 @@
 #include "cache.h"
 #include "tempfile.h"
 #include "sigchain.h"
+#include "trace.h"
+
+static struct trace_key trace_tempfile = TRACE_KEY_INIT(TEMPFILE);
 
 static VOLATILE_LIST_HEAD(tempfile_list);
 
@@ -119,6 +122,9 @@ static void activate_tempfile(struct tempfile *tempfile)
 	volatile_list_add(&tempfile->list, &tempfile_list);
 	tempfile->owner = getpid();
 	tempfile->active = 1;
+
+	trace_printf_key(&trace_tempfile, "activating tempfile '%s'",
+			 tempfile->filename.buf);
 }
 
 static void deactivate_tempfile(struct tempfile *tempfile)
@@ -302,6 +308,9 @@ int rename_tempfile(struct tempfile **tempfile_p, const char *path)
 		return -1;
 	}
 
+	trace_printf_key(&trace_tempfile, "renaming tempfile '%s' to '%s'",
+			 tempfile->filename.buf, path);
+
 	deactivate_tempfile(tempfile);
 	*tempfile_p = NULL;
 	return 0;
@@ -314,6 +323,9 @@ void delete_tempfile(struct tempfile **tempfile_p)
 	if (!is_tempfile_active(tempfile))
 		return;
 
+	trace_printf_key(&trace_tempfile, "deleting tempfile '%s'",
+			 tempfile->filename.buf);
+
 	close_tempfile_gently(tempfile);
 	unlink_or_warn(tempfile->filename.buf);
 	deactivate_tempfile(tempfile);