Web lists-archives.com

Re: [RFC][PATCH] index-pack: add testcases found using AFL




On 12/03/2017 19:14, Junio C Hamano wrote:
Jeff King <peff@xxxxxxxx> writes:

One further devil's advocate:

If people really _do_ care about coverage, arguably the AFL tests are a
pollution of that concept. Because they are running the code, but doing
a very perfunctory job of testing it. IOW, our coverage of "code that
doesn't segfault or trigger ASAN" is improved, but our coverage of "code
that has been tested to be correct" is not (and since the tests are
lumped together, it's hard to get anything but one number).

So I dunno. I remain on the fence about the patch.

Yeah, I have been disturbed by your earlier remark "binary test
cases that nobody, not even the author, understands", and the above
summarizes it more clearly.

Continuously running fuzzer tests on the codebase would have value,
but how exactly are these fuzzballs generated?  Don't they depend on
the code being tested?  IOW, how effective is a set of fuzzballs
that forces the code to take more branches in the current codepath
for the purpose of testing new code that updates the codepath,
changing the structure of the codeflow?  Unless a new set of
fuzzballs to match the updated codeflow is generated, wouldn't the
test coverage with these fuzzballs erode over time, making them less
and less useful baggage we carry around, without nobody noticing that
they no longer are effective to help test coverage?

Yes, the testcases are generated based on the feedback from the
instrumented code, so they are very clearly shaped by the code itself.

However, I think it's more useful to think of these testcases not as
"binary test that nobody knows what they are doing", but as "(sometimes
invalid) packfiles which tickle interesting code paths in the packfile
parser".

With this perspective it becomes clearer that while they were generated
from the code, they also in a sense describe the packfile format itself.

I did a few experiments in changing the code of the packfile reader in
various small ways (e.g. deleting a check, reordering some code) to see
the effects of the testcases found by fuzzing, and I have to admit it
was fairly disappointing. The testcases I added did not catch a single
buggy change, whereas the other testcases did catch many of them.

So I guess you are right -- these testcases are maybe really not that
useful as part of the test suite without explicit comparison between the
expected and actual results.

Forget about the patch, I will just put the testcases in a separate repo
instead. Maybe I will convert some of them into "real" tests.

Thanks, and sorry for the noise.


Vegard