
Re: [PATCH v2 7/7] trace2: make SIDs more unique






On 3/29/2019 6:12 PM, Ævar Arnfjörð Bjarmason wrote:

On Fri, Mar 29 2019, Jeff Hostetler via GitGitGadget wrote:

From: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx>

Update SID component construction to use the current UTC datetime
and a portion of the SHA1 of the hostname.

Use a simplified date/time format to make it easier to use the
SID component as a logfile filename.
[...]
+static void tr2_sid_append_my_sid_component(void)
+{
+	const struct git_hash_algo *algo = &hash_algos[GIT_HASH_SHA1];
+	struct tr2_tbuf tb_now;
+	git_hash_ctx ctx;
+	unsigned char hash[GIT_MAX_RAWSZ + 1];
+	char hex[GIT_MAX_HEXSZ + 1];
+	char hostname[HOST_NAME_MAX + 1];
+
+	tr2_tbuf_utc_datetime_for_filename(&tb_now);
+	strbuf_addstr(&tr2sid_buf, tb_now.buf);
+	strbuf_addch(&tr2sid_buf, '-');
+
+	if (xgethostname(hostname, sizeof(hostname)))
+		xsnprintf(hostname, sizeof(hostname), "localhost");
+
+	algo->init_fn(&ctx);
+	algo->update_fn(&ctx, hostname, strlen(hostname));
+	algo->final_fn(hash, &ctx);
+	hash_to_hex_algop_r(hex, hash, algo);
+	strbuf_add(&tr2sid_buf, hex, 8);
+
+	strbuf_addch(&tr2sid_buf, '-');
+	strbuf_addf(&tr2sid_buf, "%06"PRIuMAX, (uintmax_t)getpid());
+}
+

Thanks for turning my shitty half-formed idea into a patch :)

I wrote this on top to bikeshed this a bit further, wonder what you
think:
https://github.com/gitgitgadget/git/compare/pr-169/jeffhostetler/core-tr2-startup-and-sysenv-v2...avar:pr-169/jeffhostetler/core-tr2-startup-and-sysenv-v2

So e.g.:

     Before: 20190329-220413-446441-c2f5b994-018702
     After:  20190329T220431.244562Z-Hc2f5b994-P19812F

I.e:

  * Using <date>T<time>, as is the ISO 8601 convention, and easier to read

  * <datetime>.<microseconds>Z, so separating with "." to indicate it's
    the same value, plus a trailing "Z" for "it's UTC". I'm least sure
    about the ".". Is that going to cause issues on Windows these days
    (the rest being the "extension"...)?

I had a version that did just that.  I checked the various ISO standards
and RFCs, and it seems that fractional seconds are usually allowed between
the whole seconds and the "Z".

I've not seen any problems with that format.

I think I got spooked by your original suggestion to put the fraction in
a term after the whole "<date>T<time>Z" term.

I'll convert it back to match your suggestion.
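
For readers following along, the timestamp layout discussed above can be
sketched roughly as below. This is only an illustration, not the code from
either branch: the helper name format_sid_timestamp() is hypothetical, and it
assumes a POSIX system that provides gmtime_r().

```c
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper (not part of the patch): write a UTC timestamp as
 * "YYYYMMDDTHHMMSS.uuuuuuZ", i.e. the fraction sits between the whole
 * seconds and the trailing "Z".
 * Returns 0 on success, -1 on conversion failure or if `out` is too small.
 */
static int format_sid_timestamp(char *out, size_t outlen,
				time_t secs, long usecs)
{
	struct tm tm;
	int n;

	/* Break the epoch seconds down into UTC calendar fields. */
	if (!gmtime_r(&secs, &tm))
		return -1;

	n = snprintf(out, outlen, "%04d%02d%02dT%02d%02d%02d.%06ldZ",
		     tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
		     tm.tm_hour, tm.tm_min, tm.tm_sec, usecs);

	/* Treat truncation as an error rather than emit a partial value. */
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

A caller would typically take `secs` and `usecs` from gettimeofday() and needs
a buffer of at least 24 bytes for the 23-character result plus the NUL.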


  * I changed the hostname discovery so that if xgethostname() fails we'll
    print "-H00000000-" for that part, instead of "-H<first 8 chars of
    the sha1 for 'localhost'>-". Also prefix with "H" for "Host".

I like that.


  * Wrap pids to 0xffff, prefix with "P" (Pid), and trail with either "F"
    = Full or "W" = Wrapped (not the real PID).

I could see the "P".  I'm not sure about the hex -- I sometimes want to
do a "ps" or watch the processes go by in TaskManager and friends and
they all print the pid in decimal.  But it's not that big a deal.
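
As an illustration of the "P" prefix and "F"/"W" suffix idea, here is a
minimal sketch. It is not taken from either branch: format_sid_pid() is a
hypothetical name, "wrap" is assumed to mean masking with 0xffff, and the pid
is printed in decimal (the radix is exactly what is left open above).

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the patch): format a pid as
 * "P<pid><flag>", where the flag is 'F' when the full pid fits in
 * 0xffff and 'W' when it had to be wrapped.
 * Assumptions: "wrapping" masks with 0xffff; the pid is printed in
 * decimal so it can be matched against "ps" output.
 * Returns 0 on success, -1 if `out` is too small.
 */
static int format_sid_pid(char *out, size_t outlen, uintmax_t pid)
{
	const uintmax_t limit = 0xffff;
	char flag = (pid <= limit) ? 'F' : 'W';
	int n = snprintf(out, outlen, "P%ju%c", pid & limit, flag);

	/* Treat truncation as an error rather than emit a partial value. */
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With these assumptions a pid of 19812 comes out as "P19812F", while a pid of
70000 is reduced to 4464 and reported as "P4464W".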


  * I didn't add "T<datetime>" like "H" and "P" for the rest, since it's
    obvious what sort of value it is.

Maybe this is going a bit overboard, but I think it's easier for humans
to read at a glance, and since it's meant to be opaque to machines
anyway and the length is similar enough not to matter, I figure it's
worth it.


I'll re-roll.
Thanks
Jeff