RE: [Question] Signature calculation ignoring parts of binary files
- Date: Wed, 12 Sep 2018 16:53:43 -0400
- From: "Randall S. Becker" <rsbecker@xxxxxxxxxxxxx>
- Subject: RE: [Question] Signature calculation ignoring parts of binary files
> -----Original Message-----
> From: git-owner@xxxxxxxxxxxxxxx <git-owner@xxxxxxxxxxxxxxx> On Behalf
> Of Johannes Sixt
> Sent: September 12, 2018 4:48 PM
> To: Randall S. Becker <rsbecker@xxxxxxxxxxxxx>
> Cc: git@xxxxxxxxxxxxxxx
> Subject: Re: [Question] Signature calculation ignoring parts of binary files
> Am 12.09.18 um 21:16 schrieb Randall S. Becker:
> > I feel really bad asking this, and I should know the answer, and yet.
> > I have a binary file that needs to go into a repo intact (unchanged).
> > I also have a program that interprets the contents, like a textconv,
> > that can output the relevant portions of the file in whatever format I
> > like - used for diff typically, dumps in 1K chunks by file section.
> > What I'm looking for is to have the SHA1 signature calculated with
> > just the relevant portions of the file so that two actually different
> > files will be considered the same by git during a commit or status. In
> > real terms, I'm trying to ignore the Creator metadata of a JPG because
> > it is mutable and irrelevant to my repo contents.
> > I'm sorry to ask, but I thought this was in .gitattributes but I can't
> > confirm the SHA1 behaviour.
> You are looking for a clean filter. See the 'filter' attribute in gitattributes(5).
> Your clean filter program or script should strip the unwanted metadata or set
> it to a constant known-good value.
> (You shouldn't need a smudge filter.)
> -- Hannes
Thanks Hannes. I thought about the clean filter, but I don't want to modify the file as it goes into git; I only want to change what the SHA calculation covers. I need to keep some origin metadata that might change with subsequent copies, so simply cleaning out the origin is not going to work: knowing the original author is important to our process. My objective is to keep the original file 100% exact as supplied, and then ignore any changes to metadata I don't care about (like Creator) if the remainder of the file is the same.
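To make "the remainder of the file is the same" concrete, here is a rough sketch of the comparison I have in mind, done outside git. It hashes a JPEG while skipping the segments that usually carry metadata. The function names are made up for illustration, and treating every APPn and COM segment as irrelevant is an assumption on my part; a real tool would have to be more selective, since some APPn segments affect how the image decodes. It also does not change how git computes the object ID, which always covers the full blob content.

```python
import hashlib
import struct

SOI = b"\xff\xd8"   # start-of-image marker
SOS = 0xDA          # start-of-scan marker; compressed image data follows


def is_metadata(marker: int) -> bool:
    # Assumption: all APP0..APP15 (0xE0-0xEF) and COM (0xFE) segments are
    # treated as ignorable metadata (EXIF, XMP, comments, and so on).
    return 0xE0 <= marker <= 0xEF or marker == 0xFE


def content_sha1(data: bytes) -> str:
    """SHA-1 over a JPEG byte string with APPn/COM segments left out."""
    if data[:2] != SOI:
        raise ValueError("not a JPEG: missing SOI marker")
    digest = hashlib.sha1()
    digest.update(SOI)
    pos = 2
    while pos < len(data):
        if pos + 4 > len(data) or data[pos] != 0xFF:
            raise ValueError(f"malformed segment at offset {pos}")
        marker = data[pos + 1]
        if marker == SOS:
            # Everything from the scan header to the end of the file is
            # image data, so hash it all and stop walking segments.
            digest.update(data[pos:])
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        end = pos + 2 + length
        if length < 2 or end > len(data):
            raise ValueError(f"bad segment length at offset {pos}")
        if not is_metadata(marker):
            digest.update(data[pos:end])
        pos = end
    return digest.hexdigest()
```

Two copies of a file that differ only in Creator (or any other APPn/COM content) would then give the same content_sha1() value, while the stored file stays byte-for-byte as supplied.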