Web lists-archives.com

Re: [PATCH V2 1/2] Fix delta integer overflows




Martin Koegler <martin.koegler@xxxxxxxxx> writes:

> From: Martin Koegler <martin.koegler@xxxxxxxxx>

Just a nitpick on the patch title.  As "git shortlog --no-merges"
output would tell you, we try to prefix the title with a short name
of the area of codebase we are touching, followed by a colon and a
space and then remainder without extra capitalization.  Perhaps

    Subject: delta: fix enconding size larger than an "uint" can hold

> The current delta code produces incorrect pack objects for files > 4GB.
>
> Signed-off-by: Martin Koegler <martin.koegler@xxxxxxxxx>

I am a bit torn on this change.

The original is indeed bad in that the code does not guarantee that
an intermediate variable like 'l' is not large enough to hold the
true size we know in index->src_size, and in that sense this change
is an improvement.

Given that this is not merely a local storage format but it also is
an interchange format, we would probably want to make sure that the
receiving end (e.g. get_delta_hdr_size() that is used at the
beginning of patch_delta()) on a platform whose size_t is smaller
than that of a platform that produced the delta stream with this
code behaves "sensibly".

If we replaced ulong we use in create/patch delta codepaths with
uint32_t, that would be safer, just because the encoder would not be
able to emit varint that is larger than the receivers to handle.
But that defeats the whole point of using varint() to encode the
sizes in the first place.  It was partly done for space saving, but
more for allowing larger sizes and larger ulong in the future
without having to change the file format.

Perhaps we should teach the receiving end to notice that the varint
data it reads encodes a size that is too large for it to grok and
die.  With that, we can safely move forward with whatever size_t
each platform uses.

Thanks.

> ---
> For next.
>
>  diff-delta.c | 24 +++++++++++++-----------
>  1 file changed, 13 insertions(+), 11 deletions(-)
>
> diff --git a/diff-delta.c b/diff-delta.c
> index 3797ce6..cd238c8 100644
> --- a/diff-delta.c
> +++ b/diff-delta.c
> @@ -319,7 +319,9 @@ create_delta(const struct delta_index *index,
>  	     const void *trg_buf, unsigned long trg_size,
>  	     unsigned long *delta_size, unsigned long max_size)
>  {
> -	unsigned int i, outpos, outsize, moff, msize, val;
> +	unsigned int i, val;
> +	off_t outpos, moff;
> +	size_t l, outsize, msize;
>  	int inscnt;
>  	const unsigned char *ref_data, *ref_top, *data, *top;
>  	unsigned char *out;
> @@ -336,20 +338,20 @@ create_delta(const struct delta_index *index,
>  		return NULL;
>  
>  	/* store reference buffer size */
> -	i = index->src_size;
> -	while (i >= 0x80) {
> -		out[outpos++] = i | 0x80;
> -		i >>= 7;
> +	l = index->src_size;
> +	while (l >= 0x80) {
> +		out[outpos++] = l | 0x80;
> +		l >>= 7;
>  	}
> -	out[outpos++] = i;
> +	out[outpos++] = l;
>  
>  	/* store target buffer size */
> -	i = trg_size;
> -	while (i >= 0x80) {
> -		out[outpos++] = i | 0x80;
> -		i >>= 7;
> +	l = trg_size;
> +	while (l >= 0x80) {
> +		out[outpos++] = l | 0x80;
> +		l >>= 7;
>  	}
> -	out[outpos++] = i;
> +	out[outpos++] = l;
>  
>  	ref_data = index->src_buf;
>  	ref_top = ref_data + index->src_size;