Web lists-archives.com

Re: [PATCH 3/3] reset.c: update files when using sparse to avoid data loss.

On Wed, Apr 12, 2017 at 10:37 PM, Kevin Willford <kewillf@xxxxxxxxxxxxx> wrote:
>> -----Original Message-----
>> From: git-owner@xxxxxxxxxxxxxxx [mailto:git-owner@xxxxxxxxxxxxxxx] On
>> Behalf Of Duy Nguyen
>> Sent: Wednesday, April 12, 2017 7:21 AM
>> To: Kevin Willford <kewillf@xxxxxxxxxxxxx>
>> Cc: Kevin Willford <kcwillford@xxxxxxxxx>; git@xxxxxxxxxxxxxxx;
>> gitster@xxxxxxxxx; peff@xxxxxxxx
>> Subject: Re: [PATCH 3/3] reset.c: update files when using sparse to avoid
>> data loss.
>> On Wed, Apr 12, 2017 at 5:30 AM, Kevin Willford <kewillf@xxxxxxxxxxxxx>
>> wrote:
>> > The loss of the skip-worktree bits is part of the problem if you are
>> > talking about modified files.  The other issue that I was having is
>> > when running a reset and there were files added in the commit that is
>> > being reset, there will not be an entry in the index and not a file on
>> > disk so the data for that file is completely lost at that point.
>> > "status" also doesn't include anything about this loss of data.  On
>> > modified files status will at least have the file as deleted since
>> > there is still an index entry but again the previous version of the file and it's
>> data is lost.
>> Well, we could have "deleted" index entries, those marked with
>> CE_REMOVE. They are never written down to file though, so 'status'
>> won't benefit from that. Hopefully we can restore the file before the index
>> file is written down and we really lose skip-worktree bits?
> Because this is a reset --mixed it will never run through unpack_trees and
> The entries are never marked with CE_REMOVE.

I know. But in my view, it should. All updates from a tree object to
the index should happen through unpack_trees().

>> > To me this is totally unexpected behavior, for example if I am on a
>> > commit where there are only files that where added and run a reset
>> > HEAD~1 and then a status, it will show a clean working directory.
>> > Regardless of skip-worktree bits the user needs to have the data in
>> > the working directory after the reset or data is lost which is always bad.
>> I agree we no longer have a place to save the skip-worktree bit, we have to
>> restore the state back as if skip-worktree bit does not exist.
>> It would be good if we could keep the logic in unpack_trees() though.
>> For example, if the file is present on disk even if skip-worktree bit is on,
>> unpack_trees() would abort instead of silently overwriting it.
>> This is a difference between skip-worktree and assume-unchanged bits.
>> If you do explicit checkout_entry() you might have to add more checks to
>> keep behavior consistent.
>> --
>> Duy
> Because this is a reset --mixed it will follow the code path calling read_from_tree
> and ends up calling update_index_from_diff in the format_callback of the diff,
> so unpack_trees() is never called in the --mixed case.  This code change also only applies
> when the file does not exist and the skip-worktree bit is on and the updated
> index entry either will be missing (covers the added scenario) or was not missing
> before (covers the modified scenario).  If there is a better way to get the previous
> index entry to disk than what I am doing, I am happy to implement it correctly.

I think it's ok to just look at the diff (from update_index_from_diff)
and restore the on-disk version for now. I'd like to make --mixed use
unpack_trees() too but I haven't studied  this code long enough to see
why it went with "diff" instead of "read-tree" (which translates
directly to unpack_trees). Maybe there is some subtle reason for that.
Though it looks like it was more convenient to do "diff" in the
git-reset.sh version, and that got translated literally to C when the
command was rewritten.