Re: Optimizing writes to unchanged files during merges?




On Fri, Apr 13, 2018 at 10:14 AM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, Apr 13, 2018 at 12:02 AM, Elijah Newren <newren@xxxxxxxxx> wrote:

>> However, it turns out we have this awesome function called
>> "was_tracked(const char *path)" that was intended for answering this
>> exact question.  So, assuming was_tracked() isn't buggy, the correct
>> patch for this problem would look like:
>
> Apparently that causes problems, for some odd reason.
>
> I like the notion of checking the index, but it's not clear that the
> index is reliable in the presence of renames either.

Yes, precisely.  Checking the *current* index is not reliable in the
presence of renames.

Trying to use the current index as a proxy for what was in the index
before the merge started is a problem.  But we had a copy of the index
before the merge started; we just discarded it at the end of
unpack_trees().  We could keep it around instead.  That would also
have the benefit of making the was_dirty() checks more accurate, as
using the mtimes in the current index as a proxy for what was in the
original index has the potential for the same kinds of problems.
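
As a rough illustration of the idea (not git's actual code), the sketch
below uses a hypothetical toy_index that is just a list of paths, and a
hypothetical toy_merge that keeps a snapshot of the pre-merge index next
to the one being mutated.  All toy_* names are made up for this example;
the only point is that asking "was this path tracked?" against the
current index gives the wrong answer once a rename has been applied,
while asking the snapshot does not.

```c
#include <string.h>

#define TOY_MAX_ENTRIES 64

/* Hypothetical stand-in for an index: only a list of tracked paths. */
struct toy_index {
	const char *paths[TOY_MAX_ENTRIES];
	int nr;
};

/* Hypothetical merge state: the snapshot is kept instead of discarded. */
struct toy_merge {
	struct toy_index orig;    /* index as it was before the merge started */
	struct toy_index current; /* index as the merge mutates it */
};

static int toy_index_add(struct toy_index *idx, const char *path)
{
	if (idx->nr >= TOY_MAX_ENTRIES)
		return -1;
	idx->paths[idx->nr++] = path;
	return 0;
}

static int toy_index_has(const struct toy_index *idx, const char *path)
{
	for (int i = 0; i < idx->nr; i++)
		if (!strcmp(idx->paths[i], path))
			return 1;
	return 0;
}

/* Simulate the merge machinery recording a rename in the current index. */
static void toy_index_rename(struct toy_index *idx,
			     const char *from, const char *to)
{
	for (int i = 0; i < idx->nr; i++)
		if (!strcmp(idx->paths[i], from))
			idx->paths[i] = to;
}

static void toy_merge_start(struct toy_merge *m, const struct toy_index *start)
{
	m->orig = *start;    /* struct copy: remember the pre-merge state */
	m->current = *start;
}

/* Proxy check: consults the index the merge has already modified. */
static int toy_was_tracked_current(const struct toy_merge *m, const char *path)
{
	return toy_index_has(&m->current, path);
}

/* Direct check: consults the snapshot taken before the merge. */
static int toy_was_tracked_orig(const struct toy_merge *m, const char *path)
{
	return toy_index_has(&m->orig, path);
}
```

If "old.c" is tracked when the merge starts and is then renamed to
"new.c" in the current index, toy_was_tracked_current() claims "new.c"
was tracked, whereas toy_was_tracked_orig() correctly reports that only
"old.c" was.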

>>   A big series
>> including that patch was merged to master two days ago, but
>> unfortunately that exact patch was the one that caused some
>> impressively awful fireworks[1].
>
> Yeah, so this code is fragile.
>
> How about we take a completely different approach? Instead of relying
> on fragile (but clever) tests, why not rely on stupid brute force?
>
> Yeah, yeah, it's bad to be stupid, but sometimes simple and stupid
> really does work.
>
<snip>
> Comments? Because considering the problems this code has had, maybe
> "stupid" really is the right approach...

It's certainly tempting as an interim solution.  I just posted an
alternative interim solution at [2]; I think it explains well why the
code here has been fragile, and it directly computes what we want to
know rather than approximating it.  But I can see the draw of just
gutting the code and replacing it with simple brute force.  Long term,
I think the correct solution is still Junio's suggested rewrite[1].
My alternative is slightly closer to that end-state, so I favor it
over simple brute force, but if others have strong preferences here,
I can be happy with either.
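
The quoted proposal is snipped above, so its details are not reproduced
here.  Purely as a generic illustration of what a brute-force check can
look like, the sketch below compares a file's existing bytes with the
bytes about to be written and skips the write when they match.  The
function write_if_changed() is a hypothetical name for this example, not
something from git, and it ignores concerns such as file modes, symlinks,
and atomic replacement.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write buf to path only if the file does not
 * already hold exactly those bytes.
 * Returns 1 if the file was written, 0 if it was left alone, -1 on error.
 */
static int write_if_changed(const char *path, const void *buf, size_t len)
{
	const unsigned char *want = buf;
	FILE *f = fopen(path, "rb");

	if (f) {
		unsigned char chunk[4096];
		size_t off = 0, n;
		int same = 1;

		/* Compare the existing contents chunk by chunk. */
		while ((n = fread(chunk, 1, sizeof(chunk), f)) > 0) {
			if (off + n > len || memcmp(chunk, want + off, n)) {
				same = 0;
				break;
			}
			off += n;
		}
		/* A read error or a shorter file also counts as different. */
		if (same && (ferror(f) || off != len))
			same = 0;
		fclose(f);
		if (same)
			return 0; /* identical: leave the file and its mtime alone */
	}

	/* Missing or different: (re)write the whole file. */
	f = fopen(path, "wb");
	if (!f)
		return -1;
	if (fwrite(buf, 1, len, f) != len) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 1 : -1;
}
```

The trade-off is an extra read of every candidate file in exchange for
not having to reason about index state at all.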


Elijah

[1] https://public-inbox.org/git/xmqqd147kpdm.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxx/
[2] https://public-inbox.org/git/20180413195607.18091-1-newren@xxxxxxxxx/