RE: [PATCH] read-cache: fix index corruption with index v4
- Date: Tue, 5 Sep 2017 19:37:28 +0000
- From: Kevin Willford <kewillf@xxxxxxxxxxxxx>
- Subject: RE: [PATCH] read-cache: fix index corruption with index v4
> From: Thomas Gummerer [mailto:t.gummerer@xxxxxxxxx]
> Sent: Monday, September 4, 2017 4:58 PM
> ce012deb98 ("read-cache: avoid allocating every ondisk entry when
> writing", 2017-08-21) changed the way cache entries are written to the
> index file. While previously it wrote the name to an struct that was
> allocated using xcalloc(), it now uses ce_write() directly. Previously
> ce_namelen - common bytes were written to the cache entry, which would
> automatically make it nul terminated, as it was allocated using calloc.
> Now we are writing ce_namelen - common + 1 bytes directly from the
> ce->name to the index. As ce->name is however not nul terminated, and
> index-v4 needs the nul terminator to split between one index entry and
> the next, this would end up in a corrupted index.
> Fix that by only writing ce_namelen - common bytes directly from
> ce->name to the index, and adding the nul terminator in an extra call to
> This bug was turned up by setting TEST_GIT_INDEX_VERSION = 4 in
> config.mak and running the test suite (t1700 specifically broke).
> Signed-off-by: Thomas Gummerer <t.gummerer@xxxxxxxxx>
> I unfortunately didn't have more time to dig so
> > As ce->name is however not nul terminated
> just comes from my memory and from the patch below actually fixing the
> corruption, so it's really the most likely cause. Would be great if
> someone who can remember more about the index could confirm that this
> is indeed the case.
Digging into this and ce->name IS nul terminated. The issue comes in when
the CE_STRIP_NAME is set, which is only set when using a split index.
This sets the ce->ce_namelen = 0 without changing the actual ce->name buffer.
When writing the entry for the split index version 4 it was using the first character
in the ce->name buffer because of the + 1, which obviously isn't correct. Before
it was using a newly allocated name buffer from the ondisk struct which was
allocated based on the ce_namelen of zero.
> read-cache.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
> diff --git a/read-cache.c b/read-cache.c
> index 40da87ea71..80830ddcfc 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -2103,7 +2103,9 @@ static int ce_write_entry(git_SHA_CTX *c, int fd, struct
> cache_entry *ce,
> if (!result)
> result = ce_write(c, fd, to_remove_vi, prefix_size);
> if (!result)
> - result = ce_write(c, fd, ce->name + common,
> ce_namelen(ce) - common + 1);
> + result = ce_write(c, fd, ce->name + common,
> ce_namelen(ce) - common);
> + if (!result)
> + result = ce_write(c, fd, "\0", 1);
You could use the padding variable here as well which is used in the < version 4
> strbuf_splice(previous_name, common, to_remove,
> ce->name + common, ce_namelen(ce) - common);
While looking at the code I was wondering if we could get around the
whole setting ce->ce_namelen to zero bit but that would be much bigger
patch and possibly introduce other bugs so this seems the appropriate
fix for now.
Thanks for finding this!