
Re: Finer timestamps and serialization in git




Hi Eric,

On 19/05/2019 05:09, Eric S. Raymond wrote:
Philip Oakley <philipoakley@xxxxxxx>:
But I don't quite understand your claim that there's no format
breakage here, unless you're implying to me that timestamps are already
stored in the git file system as variable-length strings.  Do they
really never get translated into time_t?  Good news if so.
Maybe just take some of the object ID bits as being the fractional part of
the timestamp. They are effectively random, so should do a reasonable job of
distinguishing commits in a repeatable manner, even with full round tripping
via older git versions (as long as the sha1 replicates...)
Huh.  That's an interesting idea.  Doesn't absolutely guarantee uniqueness,
but even with birthday effect the probability of collisions could be pulled
arbitrarily low.
depends how many bits are in the 'nano-second' resolution long word ;-)
see also

As I understand it, the commit timestamp is actually free text within the
commit object (try `git cat-file -p <commit_object>`), so the issue is
whether the particular git version is ready to accept the additional 'dot'
fractional time notation (future versions could be extended, but I think old
ones would reject them, if I understand the test upthread, which would
compromise backward compatibility and round tripping).
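
To make the "free text" point concrete, here is a minimal Python sketch that pulls the timestamp field out of text shaped like the output of `git cat-file -p <commit_object>`. The sample commit (names, emails, parent hash) and the helper names `parse_idents` and `is_whole_seconds` are made up for illustration; it only shows that the field is decimal text in a header line, and says nothing about what any particular git version accepts.

```python
import re

# Sample text in the shape printed by `git cat-file -p <commit_object>`.
# Names, emails, and the parent hash are invented for this example.
SAMPLE_COMMIT = """\
tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
parent 1111111111111111111111111111111111111111
author A U Thor <author@example.com> 1558238940 +0100
committer C O Mitter <committer@example.com> 1558238945 +0100

Example subject line
"""

# Header shape: role, name, <email>, timestamp text, timezone offset.
IDENT_RE = re.compile(
    r"^(?P<role>author|committer) (?P<name>.*) <(?P<email>[^>]*)> "
    r"(?P<when>\S+) (?P<tz>[+-]\d{4})$"
)


def parse_idents(commit_text):
    """Return {role: (timestamp_text, tz_text)} from the commit headers."""
    headers = commit_text.split("\n\n", 1)[0]  # headers end at first blank line
    result = {}
    for line in headers.splitlines():
        match = IDENT_RE.match(line)
        if match:
            result[match.group("role")] = (match.group("when"), match.group("tz"))
    return result


def is_whole_seconds(when):
    """True if the timestamp text is plain ASCII digits (no 'dot' fraction)."""
    return when.isascii() and when.isdigit()


idents = parse_idents(SAMPLE_COMMIT)
print(idents["committer"])
```

A string such as `1558238945.123456789` would fit in the same text field as far as this sketch is concerned; the open question in the thread is whether existing git versions would accept it.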
Nobody seems to want to grapple with the fact that changing hash formats is
as large or larger a problem in exactly the same way.

I'm not saying that changing the timestamp granularity justifies a format
break.  I'm saying that *since you're going to have one anyway*, the option
to increase timestamp precision at the same time should not be missed.
It is probably the round-tripping issue with a non-fixed format (for the time string) that will scupper the idea, plus the focus being primarily on the DAG as the fundamental lineage (which only gives a partial order, which can be an issue for other VCS systems that are based on incremental changes rather than snapshots).

The hash transition is well underway; see thread: https://public-inbox.org/git/20190212012256.1005924-1-sandals@xxxxxxxxxxxxxxxxxxxx/ for a patch series.

The plan is at: https://github.com/git/git/blob/master/Documentation/technical/hash-function-transition.txt <https://github.com/git/git/blob/v2.19.0-rc0/Documentation/technical/hash-function-transition.txt>, with some discussion in the thread: https://public-inbox.org/git/878t4xfaes.fsf@xxxxxxxxxxxxxxxxxxx/ etc.

The timestamp problem is known; see yesterday's thread: https://public-inbox.org/git/20190518005412.n45pj5p2rrtm2bfj@xxxxxxxxxxxx/

Given that the object ID should be immutable for a round trip, using 64 bits from the sha1 oid as a notional 'nano-second' time does give a reasonable birthday attack resistance of ~32 bits (i.e. >1M commits with identical whole-second timestamps). [or choose the sha-256 once the transition is well underway]
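
As a rough illustration of the arithmetic, here is a Python sketch. The helper names `oid_fraction_bits` and `collision_probability` are hypothetical, taking the leading bits of the hex object ID is just one possible choice, and the estimate is the usual birthday approximation for uniformly random values, not anything git implements.

```python
import math


def oid_fraction_bits(oid_hex, bits=64):
    """Take the leading `bits` bits of a hex object ID as an integer.

    Assumption for illustration: the leading bits are used; any fixed
    slice of the ID would serve equally well as a tie-breaker.
    """
    if bits % 4 != 0 or not 0 < bits <= len(oid_hex) * 4:
        raise ValueError("bits must be a multiple of 4 within the ID length")
    return int(oid_hex[: bits // 4], 16)


def collision_probability(n, bits=64):
    """Birthday approximation for n random `bits`-bit values.

    p ~= 1 - exp(-n * (n - 1) / 2**(bits + 1))
    """
    return -math.expm1(-n * (n - 1) / 2.0 ** (bits + 1))


def sort_key(seconds, oid_hex):
    """Order commits by whole seconds, then by the ID-derived fraction."""
    return (seconds, oid_fraction_bits(oid_hex))


# One million commits sharing the same whole-second timestamp.
print(collision_probability(10**6))
# Around 2**32 such commits, the '~32 bits' birthday bound.
print(collision_probability(2**32))
```

Under these assumptions, a million commits in the same second still leave the collision probability on the order of 1e-8, and it only becomes substantial around 2**32 such commits.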
--
Philip