Re: Finer timestamps and serialization in git




On Wed, May 15 2019, Derrick Stolee wrote:

> On 5/15/2019 4:28 PM, Jason Pyeron wrote:
>> (please don’t cc me)
>
> Ok. I'll "To" you.

I'm a rebel!

>> and we follow the rule that:
>>
>> 1. any trailing zero after the decimal point MUST be omitted
>> 2. if there are no digits after the decimal point, it MUST be omitted
>>
>> This would allow:
>>
>> committer Name <user@domain> 1557948240 -0400
>> committer Name <user@domain> 1557948240.12 -0400
>
> This kind of change would probably break old clients trying to read
> commits from new clients. Ævar's suggestion [1] of additional headers
> should not create incompatibilities.

Yes, exactly. Patching git to do this is rather easy. Here's an
initial try:

    diff --git a/date.c b/date.c
    index 8126146c50..0a97e1d877 100644
    --- a/date.c
    +++ b/date.c
    @@ -762,3 +762,3 @@ static void date_string(timestamp_t date, int offset, struct strbuf *buf)
            }
    -       strbuf_addf(buf, "%"PRItime" %c%02d%02d", date, sign, offset / 60, offset % 60);
    +       strbuf_addf(buf, "%"PRItime".12345 %c%02d%02d", date, sign, offset / 60, offset % 60);
     }
    diff --git a/usage.c b/usage.c
    index 2fdb20086b..7760b78cb6 100644
    --- a/usage.c
    +++ b/usage.c
    @@ -267,2 +267,3 @@ NORETURN void BUG_fl(const char *file, int line, const char *fmt, ...)
            va_list ap;
    +       return;
            va_start(ap, fmt);

We don't need BUG(), right? :)

Now let's commit with that git. That gives me a commit object with a
sub-second timestamp like:

    $ git cat-file -p HEAD
    tree 4d5fcadc293a348e88f777dc0920f11e7d71441c
    author Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> 1557955656.12345 +0200
    committer Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> 1557955656.12345 +0200

Works so far, yay!

And now fsck fails:

    error in commit 31b3e9b88c36f75b3375471d9f5b449165c9ff93: badDate: invalid author/committer line - bad date

And any sane git hosting site will refuse this, e.g. trying to push this
to GitHub:

    remote: error: object 31b3e9b88c36f75b3375471d9f5b449165c9ff93: badDate: invalid author/committer line - bad date
    remote: fatal: fsck error in packed object
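
To illustrate why a reader that expects whole seconds trips over the new format, here is a minimal sketch of a strict date check. It is an approximation written for this example, not git's actual fsck code; the function name `date_is_valid` and the exact pattern are assumptions.

```python
import re

# Assumed strict shape of the date part of an author/committer line:
# integer epoch seconds, one space, then a sign and a four-digit offset.
STRICT_DATE = re.compile(r"[0-9]+ [+-][0-9]{4}")


def date_is_valid(ident_line: str) -> bool:
    """Return True if the text after the closing '>' is a whole-second date."""
    _, sep, date_part = ident_line.rpartition("> ")
    if not sep:
        return False  # no "<email> " section found at all
    return STRICT_DATE.fullmatch(date_part) is not None


old_style = "committer Name <user@domain> 1557948240 -0400"
new_style = "committer Name <user@domain> 1557948240.12 -0400"
print(date_is_valid(old_style), date_is_valid(new_style))
```

Any implementation that validates the date field this strictly will reject the fractional form, whichever client or library it lives in.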

And that's *just* dealing with the git.git client. Any such format
change also needs to consider what happens to jgit, libgit2 and so on.

Once you make such changes to the format you've created your own
version-control system. It's no longer git.

>> By following these rules, all previous commits' hashes are unchanged. Future commits made on the top of the second will look like old commit formats. Commits coming from "older" tools will produce valid and mergeable objects. The loss of precision has frustrated us several times as well.
>
> What problem are you trying to solve where commit date is important?
> The only use I have for them is "how long has it been since someone
> made this change?" A question like "when was this change introduced?"
> is much less important than "in which version was this first released?"
> This "in which version" is a graph reachability question, not a date
> question.
>
> I think any attempt to understand Git commits using commit date without
> using the underlying graph topology (commit->parent relationships) is
> fundamentally broken and won't scale to even moderately-sized teams.
> I don't even use "git log" without a "--topo-order" or "--graph" option
> because using a date order puts unrelated changes next to each other.
> --topo-order guarantees that a path of commits with only one parent
> and only one child appears in consecutive order.
>
> Thanks,
> -Stolee
>
> P.S. All of my (overly strong) opinions on using commit date are made
> more valid when you realize anyone can set GIT_COMMITTER_DATE to get
> an arbitrary commit date.
>
> [1] https://public-inbox.org/git/871s0zwjv0.fsf@xxxxxxxxxxxxxxxxxxx/T/#t
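
The point in the quoted message that "in which version was this first released?" is a reachability question rather than a date question can be sketched as follows. This is a toy model, assuming a commit graph stored as a dict from commit name to parent names and a list of releases in release order; `reachable` and `first_release_containing` are hypothetical helpers, not git APIs.

```python
def reachable(parents: dict, tip: str) -> set:
    """Return every commit reachable from tip by following parent links."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return seen


def first_release_containing(parents: dict, releases: list, commit: str):
    """Return the first release (in the given order) whose history has commit."""
    for name, tip in releases:
        if commit in reachable(parents, tip):
            return name
    return None


# D sits on a side branch that is only merged in E, so it first ships in
# v2.0 no matter what date D carries.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"], "E": ["C", "D"]}
releases = [("v1.0", "C"), ("v2.0", "E")]
print(first_release_containing(parents, releases, "D"))
```

No timestamp is consulted anywhere in the sketch; in git itself, `git tag --contains <commit>` answers the same kind of question from the graph.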