Re: [PATCH 10/10] add UNLEAK annotation for reducing leak false positives




On Tue, Sep 5, 2017 at 6:05 AM, Jeff King <peff@xxxxxxxx> wrote:

>   int main(void)

nit of the day:
  s/void/int argc, char **argv/ or, in case we do not
  want to emphasize the argument list, s/void//
  as that adds nothing uninteresting.


>
> In other words, you can do:
>
>   int main(void)
>   {
>         char *p = some_function();
>         printf("%s", p);
>         UNLEAK(p);
>         return 0;
>   }
>
> to annotate "p" and suppress the leak report.

This sounds really cool so far.

Having had a sneak peek at the implementation:
it is O(1) in runtime for each added element, and the
space complexity is O(well).
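For readers who have not opened the patch, here is a minimal sketch of
the shape I have in mind when I say that. It is an assumption about the
general approach (copy the variable's bytes into a heap node and push
that node onto a global list), not the patch's actual code; the names
leak_root, leak_roots and count_leak_roots are made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node: holds a byte-for-byte copy of an annotated variable. */
struct leak_root {
	struct leak_root *next;
	size_t len;
	unsigned char data[]; /* C99 flexible array member */
};

/* Global list head; it stays reachable until exit, and so do the copies. */
static struct leak_root *leak_roots;

/* Constant work per call for a fixed-size variable: allocate, copy, push. */
void unleak_memory(const void *ptr, size_t len)
{
	struct leak_root *root = malloc(sizeof(*root) + len);

	if (!root)
		return; /* sketch only: give up quietly on allocation failure */
	root->len = len;
	memcpy(root->data, ptr, len);
	root->next = leak_roots;
	leak_roots = root;
}

/* Illustration helper: how many roots were recorded so far. */
size_t count_leak_roots(void)
{
	size_t n = 0;
	const struct leak_root *r;

	for (r = leak_roots; r; r = r->next)
		n++;
	return n;
}
```

Nothing is ever removed from the list in this sketch, so the space used
grows with every call, including repeated calls on the same variable.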

> But wait, couldn't we just say "free(p)"? In this toy
> example, yes. But using UNLEAK() has several advantages over
> actually freeing the memory:

This is indeed the big question that I had.

>
>   1. It can be compiled conditionally. There's no need in
>      normal runs to do this free(), and it just wastes time.
>      By using a macro, we can get the benefit for leak-check
>      builds with zero cost for normal builds (this patch
>      uses a compile-time check, though we could clearly also
>      make it a run-time check at very low cost).
>
>      Of course one could also hide free() behind a macro, so
>      this is really just arguing for having UNLEAK(), not
>      for its particular implementation.

This is only a real argument in combination with (2), or in other
words you seem to hint at situations like these:

  struct foo *foo = obtain_new_foo();
  ...
  #if FREE_ANNOTATED_LEAKS
    /* special free() */
    release_foo(foo);
  #endif

With UNLEAK this situation works out nicely as we just
copy over all memory, ignoring elements allocated inside
foo, but for free() we'd have issues combining the preprocessor
magic with the special free implementation.

So how would we use syntactic sugar to make this
more comfortable? Roughly like

    MAYBE(release_foo(foo))

  #if (FREE_ANNOTATED_LEAKS)
  /* we rely on strict text substitution */
  /* as the function signature may change */
  #define MAYBE(fn) fn;
  #else
  #define MAYBE(fn)
  #endif
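Spelled out as something that compiles, the MAYBE idea could look like
the sketch below. Everything except the macro itself is hypothetical:
struct foo, obtain_new_foo(), release_foo() and the release_calls
counter exist only so the example is self-contained.

```c
#include <stdlib.h>
#include <string.h>

/* Set to 0 to compile the release call away entirely. */
#define FREE_ANNOTATED_LEAKS 1

#if FREE_ANNOTATED_LEAKS
/* strict text substitution; the macro supplies the semicolon */
#define MAYBE(fn) fn;
#else
#define MAYBE(fn)
#endif

/* Hypothetical struct with one allocated member. */
struct foo {
	char *name;
};

/* Counts release_foo() calls so the effect of MAYBE is observable. */
int release_calls;

struct foo *obtain_new_foo(void)
{
	struct foo *foo = malloc(sizeof(*foo));

	if (!foo)
		return NULL;
	foo->name = malloc(4);
	if (!foo->name) {
		free(foo);
		return NULL;
	}
	memcpy(foo->name, "foo", 4); /* copies the trailing NUL as well */
	return foo;
}

/* The "special free()": knows how to release the members, too. */
void release_foo(struct foo *foo)
{
	free(foo->name);
	free(foo);
	release_calls++;
}

int use_foo(void)
{
	struct foo *foo = obtain_new_foo();
	int len;

	if (!foo)
		return -1;
	len = (int)strlen(foo->name);
	MAYBE(release_foo(foo))
	return len;
}
```

Note that this only works when a release_foo() exists for the struct in
question and when foo is known to be heap-allocated.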

My regurgitating this first argument is just a long way of saying
that reading only the first argument put me off even more.
Maybe move this argument after the current second
argument, so the reader is guided better?

>   2. It's recursive across structures. In many cases our "p"
>      is not just a pointer, but a complex struct whose
>      fields may have been allocated by a sub-function. And
>      in some cases (e.g., dir_struct) we don't even have a
>      function which knows how to free all of the struct
>      members.
>
>      By marking the struct itself as reachable, that confers
>      reachability on any pointers it contains (including those
>      found in embedded structs, or reachable by walking
>      heap blocks recursively).
>
>   3. It works on cases where we're not sure if the value is
>      allocated or not. For example:
>
>        char *p = argc > 1 ? argv[1] : some_function();
>
>      It's safe to use UNLEAK(p) here, because it's not
>      freeing any memory. In the case that we're pointing to
>      argv here, the reachability checker will just ignore
>      our bytes.

This argument demonstrates why the MAYBE above is
inferior.

>
>   4. Because it's not actually freeing memory, you can
>      UNLEAK() before we are finished accessing the variable.
>      This is helpful in cases like this:
>
>        char *p = some_function();
>        return another_function(p);
>
>      Writing this with free() requires:
>
>        int ret;
>        char *p = some_function();
>        ret = another_function(p);
>        free(p);
>        return ret;
>
>      But with unleak we can just write:
>
>        char *p = some_function();
>        UNLEAK(p);
>        return another_function(p);

  5. It's not just that we need not worry about where we call
     UNLEAK once (as in 4); we also do not have to worry about
     calling it twice, or recursively. (This argument can be bad
     for cargo cult programmers, but we don't have these ;-)



> +#ifdef SUPPRESS_ANNOTATED_LEAKS
> +extern void unleak_memory(const void *ptr, size_t len);
> +#define UNLEAK(var) unleak_memory(&(var), sizeof(var));

As always with macros we have to be careful about their arguments.

  UNLEAK(a++)
  UNLEAK(baz())

won't work as intended.
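
To make that concrete: with the definition quoted above, both calls
fail to compile, because the macro takes the address of its argument
and neither a++ nor baz() is an lvalue. A minimal sketch of the
workaround is to name the value first. The unleak_memory() below is a
stand-in that only records the requested length, and baz() and
annotate_result() are hypothetical helpers.

```c
#include <stddef.h>

/* Stand-in for the real function: only remembers the last length. */
static size_t last_len;

void unleak_memory(const void *ptr, size_t len)
{
	(void)ptr; /* unused in this stand-in */
	last_len = len;
}

/* Same definition as in the quoted hunk, trailing semicolon included. */
#define UNLEAK(var) unleak_memory(&(var), sizeof(var));

static char buffer[] = "hello";

/* Hypothetical function returning a pointer. */
char *baz(void)
{
	return buffer;
}

size_t annotate_result(void)
{
	/*
	 * UNLEAK(baz());  does not compile: a call result is not an lvalue
	 * UNLEAK(a++);    does not compile: a++ is not an lvalue either
	 */
	char *p = baz(); /* name the value first ... */

	UNLEAK(p);       /* ... then annotate the named variable */
	return last_len; /* sizeof(p), i.e. the size of the pointer */
}
```

So the macro wants a plain variable name; anything else should be
assigned to a local first.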