Web lists-archives.com

Re: [PATCH] specify encoding for sed command




> I did a little more digging into this issue today.

> The issue isn't actually caused by `sed`: it's caused by the way that
> the `set` builtin outputs characters in the Unicode Private Use Area
> (PUA) in the build of Bash 3.2.57 that macOS Sierra ships with.
> 
> Powerline uses several PUA code points to make some of its pretty text
> UI elements:
> 
> Code point (hex value): description
> U+E0A0 (0xEE82A0): Version control branch
> U+E0A1 (0xEE82A1): LN (line) symbol
> U+E0A2 (0xEE82A2): Closed padlock
> U+E0B0 (0xEE82B0): Rightwards black arrowhead
> U+E0B1 (0xEE82B1): Rightwards arrowhead
> U+E0B2 (0xEE82B2): Leftwards black arrowhead
> U+E0B3 (0xEE82B3): Leftwards arrowhead
> 
> macOS Bash 3.2.57's `set` builtin has garbled output where Powerline's
> special symbols should be in the PS1 variable, but Bash 4.4.19
> (installed on macOS via homebrew) and Bash 4.3.38 (Ubuntu 16.04) both
> display it correctly in the output of `set`. `echo $PS1` does display
> the symbols correctly on these versions of Bash.

Thanks for digging.

> I think the best way to move forward with this is a new patch that uses
> `awk` instead of `sed`: I tested several `awk` variants and the command
> was portable without requiring any changes to LANG or LC_ALL.
> 
> Does that sound like a good plan?

No ;)
Could you please give the patch below a try?

I intended it as a fix for a minor performance regression introduced in
the same commit that triggered this issue, but if it can work around
this issue, too, all the better.

I'm just not sure whether we should burden this otherwise nice and short
commit message with the details of this Powerline-macOS-Bash-sed issue...


  -- >8 --

Subject: [PATCH] completion: reduce overhead of clearing cached --options

To get the names of all '$__git_builtin_*' variables caching --options
of builtin commands in order to unset them, 8b0eaa41f2 (completion:
clear cached --options when sourcing the completion script,
2018-03-22) runs a 'set |sed s///' pipeline.  This works both in Bash
and in ZSH, but has a higher than necessasry overhead with the extra
processes.

In Bash we can do better: run the 'compgen -v __gitcomp_builtin_'
builtin command, which lists the same variables, but without a
pipeline and 'sed' it can do so with lower overhead.

Signed-off-by: SZEDER Gábor <szeder.dev@xxxxxxxxx>

TODO: as an unexpected bonus, this might work around the issue with
'sed' and invalid UTF-8 characters in the environment as well, at least
in Bash.

  https://public-inbox.org/git/0102016293c8dca7-6626fcde-548d-476e-b61f-c83ecdeedfe1-000000@xxxxxxxxxxxxxxxxxxxxxxx/
---
 contrib/completion/git-completion.bash | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
index b09c8a2362..4ef59a51be 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -282,7 +282,11 @@ __gitcomp ()
 
 # Clear the variables caching builtins' options when (re-)sourcing
 # the completion script.
-unset $(set |sed -ne 's/^\(__gitcomp_builtin_[a-zA-Z0-9_][a-zA-Z0-9_]*\)=.*/\1/p') 2>/dev/null
+if [[ -n ${ZSH_VERSION-} ]]; then
+	unset $(set |sed -ne 's/^\(__gitcomp_builtin_[a-zA-Z0-9_][a-zA-Z0-9_]*\)=.*/\1/p') 2>/dev/null
+else
+	unset $(compgen -v __gitcomp_builtin_)
+fi
 
 # This function is equivalent to
 #
-- 
2.17.0.366.gbe216a3084