Re: git archive generates tar with malformed pax extended attribute




On Sat, May 25, 2019 at 03:26:53PM +0200, René Scharfe wrote:

> We could truncate symlink targets at the first NUL as well in git
> archive -- but that would be a bit sad, as the archive formats allow
> storing the "real" target from the repo, with NUL and all.  We could
> make git fsck report such symlinks.

This is a little tricky, because fsck generally looks at individual
objects, and the bad pattern is a combination of a tree and a blob
together. I think you could make it work by reusing some of the code and
patterns from 9e84a6d758 (Merge branch 'jk/submodule-fsck-loose' into
maint, 2018-05-22).
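
To make the tree-plus-blob point concrete, here is a minimal Python sketch of
the check itself. It is not fsck code: the function name, the
(mode, path, oid) tuple layout, and the read_blob callback are all
hypothetical, chosen only to show that the mode comes from the tree entry
while the target bytes come from the blob.

```python
import stat

def symlinks_with_nul(tree_entries, read_blob):
    """Return (path, target) for symlink entries whose target contains NUL.

    tree_entries: iterable of (mode, path, oid) tuples (hypothetical layout).
    read_blob: callable mapping an oid to the blob's raw bytes (hypothetical).
    """
    bad = []
    for mode, path, oid in tree_entries:
        # Only the tree entry knows the blob is a symlink (mode 120000).
        if not stat.S_ISLNK(mode):
            continue
        # Only the blob knows what the link target actually is.
        target = read_blob(oid)
        if b"\0" in target:
            bad.append((path, target))
    return bad

# Made-up data standing in for one tree and its blobs.
blobs = {
    "a1": b"target",
    "b2": b"real\0hidden",
    "c3": b"has\0nul",
}
entries = [
    (0o120000, "ok-link", "a1"),
    (0o120000, "bad-link", "b2"),
    (0o100644, "file.bin", "c3"),  # regular file: NUL is fine here
]
bad = symlinks_with_nul(entries, blobs.__getitem__)
```

Neither object is suspicious when looked at alone, which is why a
per-object check cannot flag this without some extra bookkeeping.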

> Can Unicode symlink targets contain NULs?  We wouldn't want to damage
> them even if we decide to truncate.

On Windows, I suppose, where pathnames can be UTF-16? I don't know how
any of that works with Git. I guess we'd always have to assume the
filenames in Git are UTF-8 or at least some ASCII-superset, since we
cannot encode NULs; and presumably that would extend to link
destinations, too. So I doubt it's a problem in practice. Personally,
I'd wait until somebody with such a system cares enough to suggest a new
behavior, rather than trying to guess. :)
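
As a small illustration of why the encoding matters, the Python sketch below
encodes the same made-up target both ways and truncates at the first NUL.
The truncate_at_nul helper is hypothetical and only stands in for the
truncation idea discussed above; it is not anything Git does today.

```python
def truncate_at_nul(data):
    """Keep only the bytes before the first NUL (hypothetical helper)."""
    return data.split(b"\0", 1)[0]

target = "dir/file"  # made-up link target

utf8 = target.encode("utf-8")
utf16 = target.encode("utf-16-le")

# UTF-8 only emits a zero byte for U+0000 itself, so ordinary text,
# including non-ASCII text, survives truncation untouched.
utf8_has_nul = b"\0" in utf8
utf8_non_ascii_has_nul = b"\0" in "\u00e9t\u00e9".encode("utf-8")
utf8_truncated = truncate_at_nul(utf8)

# UTF-16 pads every ASCII character with a zero byte, so truncating
# at the first NUL would cut the target down to a single byte.
utf16_has_nul = b"\0" in utf16
utf16_truncated = truncate_at_nul(utf16)
```

So truncation would only damage targets stored in something like UTF-16,
which Git could not represent in a pathname anyway.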

Likewise, with Keegan's original report, I think at this point Git is
doing something reasonable with a lousy input. Unless something
interesting comes out of the golang/go bug report discussion (thank you
for opening that!), it's probably not worth chasing hypotheticals.

-Peff