Web lists-archives.com

[PATCH v2 05/13] midx: check both pack and index names for containment




A midx file (and the struct we parse from it) contains a list of all of
the covered packfiles, mentioned by their ".idx" names (e.g.,
"pack-1234.idx", etc). And thus calls to midx_contains_pack() expect
callers to provide the idx name.

This works for most of the calls, but the one in open_packed_git_1()
tries to feed a packed_git->pack_name, which is the ".pack" name,
meaning we'll never find a match (even if the pack is covered by the
midx).

We can fix this by converting the ".pack" to ".idx" in the caller.
However, that requires allocating a new string. Instead, let's make
midx_contains_pack() a bit friendlier, and allow it take _either_ the
.pack or .idx variant.

All cleverness in the matching code is credited to René. Bugs are mine.

There's no test here, because while this does fix _a_ bug, it's masked
by another bug in that same caller. That will be covered (with a test)
in the next patch.

Helped-by: René Scharfe <l.s.r@xxxxxx>
Signed-off-by: Jeff King <peff@xxxxxxxx>
---
I was tempted to suggest that the midx struct just store the base name
without ".idx" at all, but having callers pass that is no less tricky
than passing ".idx" (they still have to allocate a new string).

 midx.c | 36 ++++++++++++++++++++++++++++++++++--
 midx.h |  2 +-
 2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/midx.c b/midx.c
index 8a505fd423..0ceca1938f 100644
--- a/midx.c
+++ b/midx.c
@@ -307,7 +307,39 @@ int fill_midx_entry(const struct object_id *oid, struct pack_entry *e, struct mu
 	return nth_midxed_pack_entry(m, e, pos);
 }
 
-int midx_contains_pack(struct multi_pack_index *m, const char *idx_name)
+/* Match "foo.idx" against either "foo.pack" _or_ "foo.idx". */
+static int cmp_idx_or_pack_name(const char *idx_or_pack_name,
+				const char *idx_name)
+{
+	/* Skip past any initial matching prefix. */
+	while (*idx_name && *idx_name == *idx_or_pack_name) {
+		idx_name++;
+		idx_or_pack_name++;
+	}
+
+	/*
+	 * If we didn't match completely, we may have matched "pack-1234." and
+	 * be left with "idx" and "pack" respectively, which is also OK. We do
+	 * not have to check for "idx" and "idx", because that would have been
+	 * a complete match (and in that case these strcmps will be false, but
+	 * we'll correctly return 0 from the final strcmp() below.
+	 *
+	 * Technically this matches "fooidx" and "foopack", but we'd never have
+	 * such names in the first place.
+	 */
+	if (!strcmp(idx_name, "idx") && !strcmp(idx_or_pack_name, "pack"))
+		return 0;
+
+	/*
+	 * This not only checks for a complete match, but also orders based on
+	 * the first non-identical character, which means our ordering will
+	 * match a raw strcmp(). That makes it OK to use this to binary search
+	 * a naively-sorted list.
+	 */
+	return strcmp(idx_or_pack_name, idx_name);
+}
+
+int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name)
 {
 	uint32_t first = 0, last = m->num_packs;
 
@@ -317,7 +349,7 @@ int midx_contains_pack(struct multi_pack_index *m, const char *idx_name)
 		int cmp;
 
 		current = m->pack_names[mid];
-		cmp = strcmp(idx_name, current);
+		cmp = cmp_idx_or_pack_name(idx_or_pack_name, current);
 		if (!cmp)
 			return 1;
 		if (cmp > 0) {
diff --git a/midx.h b/midx.h
index 774f652530..26dd042d63 100644
--- a/midx.h
+++ b/midx.h
@@ -43,7 +43,7 @@ struct object_id *nth_midxed_object_oid(struct object_id *oid,
 					struct multi_pack_index *m,
 					uint32_t n);
 int fill_midx_entry(const struct object_id *oid, struct pack_entry *e, struct multi_pack_index *m);
-int midx_contains_pack(struct multi_pack_index *m, const char *idx_name);
+int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name);
 int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
 
 int write_midx_file(const char *object_dir);
-- 
2.21.0.729.g7d31bf3764