Re: reftable [v6]: new ref storage format
- Date: Tue, 8 Aug 2017 15:30:28 -0700
- From: Shawn Pearce <spearce@xxxxxxxxxxx>
- Subject: Re: reftable [v6]: new ref storage format
On Tue, Aug 8, 2017 at 12:25 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Shawn Pearce <spearce@xxxxxxxxxxx> writes:
>> For `log_type = 0x4..0x7` the `log_chained` section is used instead to
>> compress information that already appeared in a prior log record. The
>> `log_chained` always includes `old_id` for this record, as `new_id` is
>> implied by the prior (by file order, more recent) record's `old_id`.
>> The `not_same_committer` block appears if `log_type & 0x1` is true,
>> `not_same_message` block appears if `log_type & 0x2` is true. When
>> one of these blocks is missing, its values are implied by the prior
>> (more recent) log record.
> Two comments.
> * not-same-committer would be what I would use when I switch
> timezones, even if I stay to be me, right?
Correct. This is based on the theory that the timezone in a reflog is
actually the system timezone, not your timezone. If you push to a
remote system, that system's reflog will be using that system's
timezone, not your timezone. So you aren't really that different, and
we can compress the timezone part away. Also, if you do move
timezones, you are likely to remain in that timezone for some period
of time, and as such we can compress many log records again with the
same timezone.
It's ancient history from my research with "pack v4", but people don't
really change timezones very often in the Git committer data. I
suspect it's even more true with reflog data.
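
To make the chaining rules quoted above concrete, here is a minimal
sketch of how a reader might expand a `log_chained` record from the
prior (more recent) one. The `LogRecord` type, the `resolve_chained`
name, and the single `committer` string are hypothetical and only for
illustration; timestamps are left out, and this is not the JGit code.

```python
from typing import NamedTuple, Optional

# Flag bits as given in the description quoted above.
NOT_SAME_COMMITTER = 0x1  # not_same_committer block is present
NOT_SAME_MESSAGE = 0x2    # not_same_message block is present


class LogRecord(NamedTuple):
    # Hypothetical in-memory form of one reflog entry (no timestamp here).
    old_id: str
    new_id: str
    committer: str  # committer information other than the timestamp
    message: str


def resolve_chained(log_type: int, prior: LogRecord, old_id: str,
                    committer: Optional[str] = None,
                    message: Optional[str] = None) -> LogRecord:
    """Expand a log_chained record using the prior (more recent) record."""
    if not 0x4 <= log_type <= 0x7:
        raise ValueError("not a chained log_type")
    if log_type & NOT_SAME_COMMITTER:
        if committer is None:
            raise ValueError("not_same_committer block expected")
    else:
        committer = prior.committer  # implied by the prior record
    if log_type & NOT_SAME_MESSAGE:
        if message is None:
            raise ValueError("not_same_message block expected")
    else:
        message = prior.message  # implied by the prior record
    # new_id is never stored: it is the prior record's old_id.
    return LogRecord(old_id, prior.old_id, committer, message)
```

With `log_type = 0x4` only `old_id` is read and everything else comes
from the prior record, which is the common case when the committer
information and message repeat.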
> I am just wondering
> if it is clear to everybody that "committer" in that phrase is a
> short-hand for "committer information other than the timestamp".
Maybe not. I will try to come up with another shorthand name for this.
> * Should the set of entries that are allowed to use the "chained"
> log be related to the set of entries that appear in the restart
> table in any way? For a reader that scans starting at a restart
> point, it would be very cumbersome if the entry were chained from
> the previous entry, as it would force it to backtrack entries to
> find the first non-chained log entry. A simple "log_chained must
> not be used for an entry that appears in the restart table" rule
> would solve that, but I didn't see it in the document.
Good catch! This is implemented as you described in JGit (for the
reasons you described), but not documented. I'll fix it.
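
A minimal sketch of that rule from the writer's side, under stated
assumptions: `Rec`, `chained_log_type`, and the `is_restart` flag are
hypothetical names, and the extra check that `new_id` matches the prior
record's `old_id` is my own assumption about when chaining is possible.
It returns a chained `log_type` (0x4..0x7) or `None` when the record
has to be written in full.

```python
from typing import NamedTuple, Optional


class Rec(NamedTuple):
    # Hypothetical in-memory form of one reflog entry (no timestamp here).
    old_id: str
    new_id: str
    committer: str  # committer information other than the timestamp
    message: str


def chained_log_type(is_restart: bool, prior: Optional[Rec],
                     record: Rec) -> Optional[int]:
    """Pick a chained log_type, or None if a full record is required."""
    # An entry listed in the restart table must never be chained, so a
    # reader seeking to a restart point does not have to backtrack.
    if is_restart or prior is None:
        return None
    # Assumption: chaining only works when new_id can be implied.
    if record.new_id != prior.old_id:
        return None
    log_type = 0x4
    if record.committer != prior.committer:
        log_type |= 0x1  # emit the not_same_committer block
    if record.message != prior.message:
        log_type |= 0x2  # emit the not_same_message block
    return log_type
```

A reader that starts at a restart point then always finds a full
record first and can expand any chained records that follow it.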