Re: [PATCH v3 5/5] fast-export: do automatic reencoding of commit messages only if requested




On Sat, May 11, 2019 at 2:07 PM Torsten Bögershausen <tboegi@xxxxxx> wrote:
> On Fri, May 10, 2019 at 01:53:35PM -0700, Elijah Newren wrote:

> This one is good:
> > +     if (unset || !strcmp(arg, "abort"))
> > +             reencode_mode = REENCODE_ABORT;
>
> But here: does it make sense to use REENCODE_YES/NO to be more consistent?
> > +     else if (!strcmp(arg, "yes"))
> > +             reencode_mode = REENCODE_PLEASE;
> > +     else if (!strcmp(arg, "no"))
> > +             reencode_mode = REENCODE_NEVER;

I didn't realize there was any such convention, and I have trouble
finding it with grep (CONTAINS_{YES,NO} appears to be the only example
I can find), but the alternate wording seems fine; I'm happy to adopt
it.

> > +             case REENCODE_ABORT:
> > +                     die("Encountered commit-specific encoding %s in commit "
> > +                         "%s; use --reencode=<mode> to handle it",
> Should we be more helpful and say "use --reencode=[yes|no] to handle it"?

Sounds like a good idea; I'll adjust it.


> > +     sed "s/wer/i18n-no-recoding/" iso-8859-7.fi |
> > +             (cd new &&
> > +              git fast-import &&
> > +              # The commit object, if not re-encoded, is 240 bytes.
> > +              # Removing the "encoding iso-8859-7\n" header would drop 20
> > +              # bytes.  Re-encoding the Pi character from \xF0 in
> > +              # iso-8859-7 to \xCF\x80 in utf-8 would add a byte.  I would
> > +              # grep for the specific bytes, but Windows lamely does not
> This is somewhat unclear to me. What does Windows not allow ?
> > +              # allow that, so just search for the expected size.
> > +              test 240 -eq "$(git cat-file -s i18n-no-recoding)" &&
> > +              # Also make sure the commit has the "encoding" header
> > +              git cat-file commit i18n-no-recoding >actual &&
> > +              grep ^encoding actual)
> > +'

Windows does not allow specifying the bytes I want to grep for on the
command line; it munges the command line, so grep ends up searching
for something other than what I intended and returns the wrong
answer.  See
https://public-inbox.org/git/f8eb246f-a936-e9df-4bb4-068b86a62baf@xxxxxxxx/
and https://public-inbox.org/git/nycvar.QRO.7.76.6.1905101551110.44@xxxxxxxxxxxxxxxxx/.
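
For reference, the size arithmetic in the quoted test comment can be
checked with a short shell sketch.  The variable names are illustrative
and not part of the patch; only the byte values (240, the header text,
\xF0 and \xCF\x80) come from the comment itself.  The bytes are written
as printf octal escapes, so no raw non-ASCII byte appears on a command
line.

```shell
#!/bin/sh
# Sketch only: variable names are hypothetical, not taken from the patch.

# Size of the commit object when it is left in iso-8859-7 (from the test).
original_size=240

# Bytes removed if the "encoding iso-8859-7\n" header is dropped.
header_len=$(( $(printf 'encoding iso-8859-7\n' | wc -c) ))

# Pi is one byte (\xF0, octal 360) in iso-8859-7 ...
pi_iso_len=$(( $(printf '\360' | wc -c) ))
# ... and two bytes (\xCF\x80, octal 317 200) in UTF-8.
pi_utf8_len=$(( $(printf '\317\200' | wc -c) ))

# Expected size if the commit were re-encoded to UTF-8.
reencoded_size=$(( original_size - header_len + pi_utf8_len - pi_iso_len ))

echo "header: $header_len bytes"
echo "re-encoded commit: $reencoded_size bytes"
```

Under these assumptions a re-encoded commit would come to 221 bytes,
while the no-reencoding test quoted above expects the unchanged 240.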

My comment was already pretty long because it looks like a crazy way
to run the test and thus it feels like I need to explain it.  And the
craziness is based on how Windows behaves; it seems insane to me that
Windows decides to munge user data (in the form of the command line
provided), so much so that it makes me wonder if I really understood
Hannes' and Dscho's explanations of what it is doing.  (How could
anyone have thought munging user data was a good idea?)  Anyway, long
story short, I'm not sure how to explain it correctly and succinctly.
Any suggestions?


Elijah