Web lists-archives.com

Re: [PATCH 0/10] towards clean leak-checker output




On 5 September 2017 at 15:01, Jeff King <peff@xxxxxxxx> wrote:
> Using leak-checking tools like valgrind or LSAN is usually not all that
> productive with Git, because the output is far from clean. And _most_ of
> these are just not interesting, as they're either:
>
>   1. Real leaks, but ones that only be triggered a small, set number of
>      times per program.
>
>   2. Intentional leaks of variables as the main cmd_* functions go out
>      of scope (as opposed to manually cleaning).
>
> I focused here on getting t0000 and t0001 to run clean against a
> leak-checking tool. That's a fraction of the total test suite, but my
> goal was have a tractable sample which could let us see if the path
> seems sane.
>
> I feel positive overall about the approach this series takes. The
> suppressions aren't too terrible to write, and I found some real (albeit
> minor) leaks in these few tests. I hope it can serve as a base for an
> ongoing effort to get the whole test suite clean.
>
> A few things to note:
>
>   - I switched from valgrind to ASAN/LSAN early on. It's just way
>     faster, which makes the compile-test-fix cycle a lot more pleasant.
>     With these patches, you should be able to do:
>
>       make SANITIZE=leak && (cd t && ./t1234-* -v -i)
>
>     and get a leak report for a single script dumped to stderr.
>
>     If you want to try it with t0000, you'll need the lock-file series I
>     just posted at:
>
>       https://public-inbox.org/git/20170905121353.62zg3mtextmq5zrs@xxxxxxxxxxxxxxxxxxxxx/
>
>     and the fix from Martin at:
>
>       https://public-inbox.org/git/CAN0heSqn59sFF3A-eQ593G+ZDWnO9pKM5F=sgiSQk+prUr-nSQ@xxxxxxxxxxxxxx/
>
>     to get a clean run.
>
>   - I'm using LSAN instead of the full-on ASAN because it's faster. The
>     docs warn that it's a bit experimental, and I did notice a few funny
>     behaviors (like truncated backtraces), but overall it seems fine.
>     You can also do:
>
>       make SANITIZE=address &&
>         (cd t && ASAN_OPTIONS=abort_on_error=1 ./t1234-* -v -i)
>
>     to do a full ASAN run (the extra environment setting is necessary to
>     override test-lib's defaults).
>
>   - gcc's leak-checker doesn't seem to handle reachability correctly (or
>     maybe I'm holding it wrong). As a simple case, building with ASAN
>     and running git-init complains:
>
>       $ make SANITIZE=address && ./git init foo
>       ...
>       Direct leak of 2 byte(s) in 1 object(s) allocated from:
>           #0 0x7f011dc699e0 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xd99e0)
>           #1 0x558eeedbdab5 in do_xmalloc /home/peff/compile/git/wrapper.c:60
>           #2 0x558eeedbdbab in do_xmallocz /home/peff/compile/git/wrapper.c:100
>           #3 0x558eeedbdd0d in xmallocz /home/peff/compile/git/wrapper.c:108
>           #4 0x558eeedbdd0d in xmemdupz /home/peff/compile/git/wrapper.c:124
>           #5 0x558eeedbdd0d in xstrndup /home/peff/compile/git/wrapper.c:130
>           #6 0x558eeea0507a in main /home/peff/compile/git/common-main.c:39
>           #7 0x7f011cf612e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)
>
>     That line is the setting of argv0_path, which is a global (and thus
>     shouldn't be marked as leaking). Interestingly, it only happens with
>     -O2. Compiling with -O0 works fine. I'm not sure if it's a bug or
>     what.
>
>     I did most of my testing with clang-6.0, which gets this case right.

Hmmm, I got the same wrong results (IMHO) from Valgrind, which
classified this as "definitely lost". Like you I found that -O0 helped.
And yes, that was with gcc. Maybe gcc with optimization somehow manages
to hide the pointers from these tools. I know too little about the
technical details to have any real ideas, though. My searches did not
bring up anything useful. (gcc 5.4.0)

I guess clang will be my next attempt. And asan/lsan. Valgrind is slow..

I'll look through this series and get back if I have any input.

>   - I have no idea who close or far this is to covering the whole suite.
>     Only 98 test scripts complete with no leaks. But many of those
>     failures will be hitting the same leaks over and over. It looks like
>     running "git show' hits a few, which is going to affect a lot of
>     scripts. But bear in mind when reading the patches that this might
>     be the first step on a successful road, or it could be a dead end. :)
>
> Most of this is actual leak fixes. The interesting part, I think, is the
> UNLEAK annotation in patch 10.
>
>   [01/10]: test-lib: --valgrind should not override --verbose-log
>   [02/10]: test-lib: set LSAN_OPTIONS to abort by default
>   [03/10]: add: free leaked pathspec after add_files_to_cache()
>   [04/10]: update-index: fix cache entry leak in add_one_file()
>   [05/10]: config: plug user_config leak
>   [06/10]: reset: make tree counting less confusing
>   [07/10]: reset: free allocated tree buffers
>   [08/10]: repository: free fields before overwriting them
>   [09/10]: set_git_dir: handle feeding gitdir to itself
>   [10/10]: add UNLEAK annotation for reducing leak false positives
>
>  Makefile               |  3 +++
>  builtin/add.c          |  3 +++
>  builtin/commit.c       |  1 +
>  builtin/config.c       | 11 +++++++++--
>  builtin/init-db.c      |  2 ++
>  builtin/ls-files.c     |  1 +
>  builtin/reset.c        | 24 +++++++++++++++++-------
>  builtin/update-index.c |  4 +++-
>  builtin/worktree.c     |  2 ++
>  environment.c          |  4 +++-
>  git-compat-util.h      |  7 +++++++
>  repository.c           | 14 +++++++-------
>  setup.c                |  5 -----
>  t/test-lib.sh          |  7 ++++++-
>  usage.c                | 13 +++++++++++++
>  15 files changed, 77 insertions(+), 24 deletions(-)
>
> -Peff