Web lists-archives.com

Re: [PATCHv2 12/20] submodule.c: convert show_submodule_summary to use emit_line_fmt




On Wed, May 17, 2017 at 8:25 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Stefan Beller <sbeller@xxxxxxxxxx> writes:
>
>>>> +static void show_submodule_header(struct diff_options *o, const char *path,
>>>>               struct object_id *one, struct object_id *two,
>>>>               unsigned dirty_submodule, const char *meta,
>>>>               const char *reset,
>>> ...
>>> How does capturing these lines help moved line detection, by the
>>> way?  These must never be matched with any other added or removed
>>> line in the real patch output.
>>
>> Why?
>
> What are buffered are not patch text, but informational text like
> "Submodule X contains untracked content", etc.  When a text file is
> modified elsewhere and lost a line that happened to say the same
> contents, we do not want to consider that such a line was moved to
> where a submodule had an untracked file.
>
> I have a suspicion that the two-pass buffering is done at too high a
> level in this series.  Doesn't the code (I haven't reached the end
> of the series) update emit_line() to buffer the patch text and these
> non-patch text with all the coloring and resetting sequences?
> Because the "ah, this old line removed corresponds to that new line
> that appears elsewhere?" logic do not want to see these color/reset
> sequence, the buffering code needs to become quite specific to how
> the current diff code is colored (e.g. a line must be painted in a
> single color and have reset at the end) and makes future change to
> color things differently almost impossible (e.g. imagine how you
> would add a "feature" that paints certain words on added lines
> differently?).

That could be added in ws.c:ws_check_emit, as these certain words
are similar to coloring whitespace.

It depends on the precedence of such a future feature, is the move
detection or the word highlighting more important to keep its color?

> Ahh, yes, I see NEEDSWORK comment in patch 19/20.
>
> Yes, I agree that this code really should be working in terms of
> offsets into pre/post images when finding matching changes, which
> probably should happen without letting fn_out_consume produce fully
> colored textual diff output in the first pass.  For the purpose of
> "moved lines detection", the logic to match a stretch of preimage
> lines with postimage lines do not want to bother with "--- a/$path"
> headers, and it does not want to care if a line that begins with '+'
> needs to be added by calling emit_add_line() that knows how to check
> ws errors or the payload needs to be painted in green.  After the
> first pass determines which added lines are not true addition but
> merely moved, the second pass would know how that '+' line needs to
> be painted a lot better (e.g. it may not be painted in green).
> Letting fn_out_consume() call emit_add_line() only to compute
> information (e.g. "'+'? ok, green") that the first pass does not
> want to see and the second pass will compute better is probably not
> a good longer term direction to go in.
>
> Having said that, we need to start somewhere, and I think it is a
> reasonable first-cut attempt to work on top of the textual output
> like this series does (IOW, while I do agree with the NEEDSWORK and
> the way this series currently does things must be revamped in the
> longer term, I do not think we should wait until that happens to
> start playing with this topic).

Ok. I share a similar reaction to submodule diffs that we discuss above
and word coloring, that Jonathan Tan brought up off list.

Both of them are broken in this implementation, but the NEEDSWORK
would hint at how to fix them.

For word coloring, we'd invent a new state BPL_EMIT_WITH_BRACES,
that would only store the word and at output time we'd have to add
sign, braces, colors. Then a block movement detection is possible.
(and this would also work with offset/len into the files longer term)

For the submodule diffs, I am really looking forward how Brandons
current work is coming along to have a repository struct such that we
can process submodules in the same process. For this diffing the
repo object would need to learn about the attribute system of
the submodules, such that we can obtain the whitespace coloring
rules, as well as the config (submodule may be configured to use
different colors for diffs).