Web lists-archives.com

Re: [PATCH v10 9/9] convert: add round trip check based on 'core.checkRoundtripEncoding'




> On 07 Mar 2018, at 20:59, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> 
> lars.schneider@xxxxxxxxxxxx writes:
> 
>> +static int check_roundtrip(const char* enc_name)
> 
> The asterisk sticks to the variable, not type.

Argh. I need to put this check into Travis CI ;-)


>> +{
>> +	/*
>> +	 * check_roundtrip_encoding contains a string of space and/or
>> +	 * comma separated encodings (eg. "UTF-16, ASCII, CP1125").
>> +	 * Search for the given encoding in that string.
>> +	 */
>> +	const char *found = strcasestr(check_roundtrip_encoding, enc_name);
>> +	const char *next;
>> +	int len;
>> +	if (!found)
>> +		return 0;
>> +	next = found + strlen(enc_name);
>> +	len = strlen(check_roundtrip_encoding);
>> +	return (found && (
>> +			/*
>> +			 * check that the found encoding is at the
>> +			 * beginning of check_roundtrip_encoding or
>> +			 * that it is prefixed with a space or comma
>> +			 */
>> +			found == check_roundtrip_encoding || (
>> +				found > check_roundtrip_encoding &&
>> +				(*(found-1) == ' ' || *(found-1) == ',')
>> +			)
> 
> The second line is unneeded, as we know a non-NULL found either
> points at check_roundtrip_encoding or somewhere to the right, and
> the first test already checked the "points exactly at" case.

Eric objected that too [1], but noted the documentation value:

    Although the 'found > check_roundtrip_encoding' expression is
    effectively dead code in this case, it does document that the
    programmer didn't just blindly use '*(found-1)' without taking
    boundary conditions into consideration.

    (It's dead code because, at this point, we know that strcasestr()
    didn't return NULL and we know that 'found' doesn't equal
    'check_roundtrip_encoding', therefore it _must_ be greater than
    'check_roundtrip_encoding'.)

[1] https://public-inbox.org/git/CAPig+cR81J3fTOtrgAumAs=RC5hqYFfSmeb-ru-Yf_ahFuBiew@xxxxxxxxxxxxxx/

Although the line is unnecessary, I felt it is safer/easier to 
understand and maintain. Since both of you tripped over it, I will
remove it though.


> This is defined to be a comma separated list, so it is unnecessary
> to accept <cre,en> == <"FOO, SHIFT-JIS, BAR", "SHIFT-JIS">; if you
> allow SP, perhaps "isspace(found[-1]) || found[-1] == ','" to also
> allow HT may also be appropriate.

I don't think HT makes too much sense. However, isspace() is nice
and I will use it. Being more permissive on the inputs should hurt.


>  I think "comma or whitespace
> separated list" is fine; in any case, the comment near the beginning
> of this function does not match new text in Documentation/config.txt
> added by this patch.

Nice catch. Will fix.


Thanks,
Lars