Re: Design of multiple hash support
- Date: Mon, 5 Nov 2018 10:03:21 -0800
- From: Stefan Beller <sbeller@xxxxxxxxxx>
On Sun, Nov 4, 2018 at 6:36 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
> "brian m. carlson" <sandals@xxxxxxxxxxxxxxxxxxxx> writes:
> > I'm currently working on getting Git to support multiple hash algorithms
> > in the same binary (SHA-1 and SHA-256). In order to have a fully
> > functional binary, we'll need to have some way of indicating to certain
> > commands (such as init and show-index) that they should assume a certain
> > hash algorithm.
> > There are basically two approaches I can take. The first is to provide
> > each command that needs to learn about this with its own --hash
> > argument. So we'd have:
> > git init --hash=sha256
> > git show-index --hash=sha256 <some-file
> > The other alternative is that we provide a global option to git, which
> > is parsed by all programs, like so:
> > git --hash=sha256 init
> > git --hash=sha256 show-index <some-file
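To make the two spellings concrete, here is a minimal sketch in Python (not Git's actual option parser) that accepts `--hash=<name>` either before or after the command name and yields the same result for both. The function name `split_hash_option` and the `sha1` default are assumptions made only for this illustration.

```python
# Hypothetical sketch: accept --hash=<name> as a global or a per-command option.
SUPPORTED = {"sha1", "sha256"}


def split_hash_option(argv, default="sha1"):
    """Return (hash_name, remaining_args) for an argument list without 'git'."""
    chosen = None
    rest = []
    for arg in argv:
        if arg.startswith("--hash="):
            name = arg[len("--hash="):]
            if name not in SUPPORTED:
                raise ValueError(f"unknown hash algorithm: {name}")
            if chosen is not None and chosen != name:
                raise ValueError("conflicting --hash options")
            chosen = name
        else:
            rest.append(arg)
    # Fall back to the assumed default when no --hash option was given.
    return (chosen or default), rest


# Both spellings from the proposal resolve to the same algorithm.
print(split_hash_option(["--hash=sha256", "init"]))
print(split_hash_option(["init", "--hash=sha256"]))
```

The parsing cost is the same either way; the design question is whether every command must learn the option or whether it is handled once before dispatch.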
> I am assuming that "show-index" above is a typo for something like
Actually both seem plausible, as both do not require
RUN_SETUP, which means they cannot rely on the
With a global setting, would that override the configured
object format extension in a repository, or do we error out?
git -c extensions.objectFormat=sha256 init
is the way to go, for now? (Are repository format extensions parsed
just like normal config, such that non-RUN_SETUP commands
can rely on the (non-)existence to determine whether to use
the default or the given hash function?)
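The override-or-error question can be written down as a small decision function. This is only a sketch of the two possible policies, not a description of what Git does; `resolve_object_format`, its parameters, and the `sha1` default are hypothetical names chosen for illustration.

```python
# Hypothetical sketch: reconcile a global --hash option with the object
# format configured in a repository (None means "not set").
def resolve_object_format(global_opt, repo_format, on_conflict="error",
                          default="sha1"):
    if global_opt is None:
        # No global option: use the repository setting, else the default.
        return repo_format or default
    if repo_format is None or repo_format == global_opt:
        # Nothing to reconcile (e.g. a command run outside any repository).
        return global_opt
    if on_conflict == "override":
        return global_opt
    raise ValueError(
        f"--hash={global_opt} conflicts with repository format {repo_format}"
    )


print(resolve_object_format("sha256", None))               # outside a repository
print(resolve_object_format("sha256", "sha1", "override"))  # "override" policy
```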
> It is hard to answer the question without knowing what exactly
> "(to) support multiple hash algorithms" means. For example, inside
> today's repository, what should this command do?
> git --hash=sha256 cat-file commit HEAD
There is a section "Object names on the command line"
and I assume that this is before the "dark launch"
phase, so I would expect the latter to work (no error
but conversion/translation on the fly) eventually as a goal.
But the former might be in scope of one series.
> It can work this way:
> - read HEAD, discover that I am on 'master' branch, read refs/heads/master
> to learn the object name in 40-hex, realize that it cannot be
> sha256 and report "corrupt ref".
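The first behaviour amounts to a length check on the object name: a SHA-1 name is 40 hex digits and a SHA-256 name is 64. A minimal sketch, assuming a hypothetical helper `check_ref_value` that is not part of Git:

```python
import re

# Hex digits in a full object name for each algorithm.
HEX_LENGTH = {"sha1": 40, "sha256": 64}


def check_ref_value(value, hash_name):
    """Return the object name if it fits the algorithm, else raise."""
    value = value.strip()
    expected = HEX_LENGTH[hash_name]
    if len(value) != expected or not re.fullmatch(r"[0-9a-f]+", value):
        raise ValueError("corrupt ref")
    return value


# A 40-hex name read from refs/heads/master passes under sha1 ...
name = "a" * 40
print(check_ref_value(name + "\n", "sha1"))
# ... but would be rejected as "corrupt ref" under --hash=sha256.
```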
> Or it can work this way:
> - read repository format, realize it is a good old sha1 repository.
> - do the usual thing to get to read_object() to read the commit
> object data for the commit at HEAD, doing all of it in sha1.
> - in the commit object data, locate references to other objects
> that use sha1 name.
> - replace these sha1 references with their sha256 counterparts and
> show the result.
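The last two steps of the second behaviour could look roughly like the sketch below. It assumes a translation table from SHA-1 names to SHA-256 names is already available as a plain dictionary, and it only rewrites the `tree` and `parent` header lines of a commit; both the function name and the table are hypothetical, and the object names used are placeholders rather than real hashes.

```python
# Hypothetical sketch: rewrite object references in commit data from
# SHA-1 names to their SHA-256 counterparts using a lookup table.
def translate_commit_headers(commit_text, sha1_to_sha256):
    # Headers and message are separated by the first blank line.
    header, sep, body = commit_text.partition("\n\n")
    out = []
    for line in header.split("\n"):
        key, _, value = line.partition(" ")
        if key in ("tree", "parent"):
            # KeyError here means the table has no counterpart for the name.
            line = f"{key} {sha1_to_sha256[value]}"
        out.append(line)
    return "\n".join(out) + sep + body


table = {"a" * 40: "1" * 64, "b" * 40: "2" * 64}
commit = (
    f"tree {'a' * 40}\n"
    f"parent {'b' * 40}\n"
    "author A U Thor <author@example.com> 0 +0000\n"
    "\n"
    "subject line\n"
)
print(translate_commit_headers(commit, table))
```

Author lines and the commit message pass through unchanged, because only the header lines that name other objects need translating.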
> I am guessing that you are doing the former as a good first step, in
> which case, as an option that changes/affects the behaviour of git
> globally, I think "git --hash=sha256" would make sense, like other
> global options like --literal-pathspecs and --no-replace-objects.