gitweb: HTML output is not always encoded in UTF-8 when using --fastcgi.

An old (2011) description of the problem is here:

Basically, gitweb's HTML output is not always encoded in UTF-8
when using --fastcgi.

gitweb v2.18.0
perl   v5.28.0

| echo Système >test.git/description

According to the 2011 problem report,
the problem only appears when using gitweb.cgi --fastcgi,
not when gitweb.cgi is spawned by fcgiwrap.

Apparently, the bug also only shows up when every character of the text
can be represented in ISO-8859-1. As soon as the text contains a single
character that cannot, the output does get encoded as UTF-8
(not sure by what), which made this bug harder to spot.
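
To make the ISO-8859-1 condition concrete, here is a small sketch of the two byte sequences involved. It only uses the Encode module and does not touch gitweb or FCGI. The variable names and the second test string (with a euro sign) are illustrative assumptions, not taken from the 2011 report.

```perl
use strict;
use warnings;
use utf8;                       # string literals below are character strings
use Encode qw(encode);

my $latin_ok  = "Système";      # every character is <= U+00FF
my $latin_bad = "Système €";    # U+20AC has no ISO-8859-1 code point

# Hex dump of a byte string, one two-digit value per byte.
sub hex_bytes { return join ' ', map { sprintf '%02x', ord } split //, shift }

my $as_latin1 = encode('ISO-8859-1', $latin_ok);
my $as_utf8   = encode('UTF-8',      $latin_ok);

print hex_bytes($as_latin1), "\n";   # 53 79 73 74 e8 6d 65
print hex_bytes($as_utf8),   "\n";   # 53 79 73 74 c3 a8 6d 65

# FB_CROAK makes encode() die instead of substituting a replacement character.
my $fits = eval {
    encode('ISO-8859-1', $latin_bad, Encode::FB_CROAK | Encode::LEAVE_SRC);
    1;
};
print $fits ? "fits in ISO-8859-1\n" : "does not fit in ISO-8859-1\n";
```

A page that declares UTF-8 but carries the single byte e8 for "è" is what the broken output looks like. The second string is of the kind that hides the problem.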

According to Christian Hansen (chansen), the problem is that:
> FCGI streams are implemented using the older stream API,
> TIEHANDLE. Applying PerlIO layers using binmode() has no effect.

> FCGI.pm isn't Unicode aware,
> only characters within the range 0x00-0xFF are supported.

But, as stated in gitweb's to_utf8():
> gitweb writes out in utf-8
> thanks to "binmode STDOUT, ':utf8'" at beginning
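
The two statements above can be reconciled with a small experiment. The sketch below uses a hypothetical tied-handle class, CaptureStream, as a stand-in for FCGI::Stream. It is not the FCGI.pm code, and it assumes only what the quote says: that binmode() on a TIEHANDLE stream installs no PerlIO layer.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical stand-in for FCGI::Stream: a tied handle that records
# exactly what its PRINT method receives.
package CaptureStream;
sub TIEHANDLE { my ($class) = @_; return bless { chunks => [] }, $class }
sub PRINT     { my $self = shift; push @{ $self->{chunks} }, @_; return 1 }
sub BINMODE   { return 1 }      # accepts the call, installs no layer

package main;

my $stream = tie *OUT, 'CaptureStream';
binmode OUT, ':utf8';           # dispatched to CaptureStream::BINMODE
print OUT "Système";            # dispatched to CaptureStream::PRINT

my $seen = $stream->{chunks}[0];
# PRINT gets the unencoded character string: "è" is still U+00E8,
# not the two bytes c3 a8.
printf "PRINT saw %d characters; character 5 is U+%04X\n",
    length($seen), ord(substr($seen, 4, 1));
```

So whatever sits behind PRINT has to do the encoding itself, which is what the hotpatch further down does.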

Christian Hansen suggested that:
"The proper solution would be to encode your data before outputting it,
but if that's not an option I can offer this hotpatch:"

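| use Encode ();  # gitweb already loads Encode; repeated for completeness
| my $enc = Encode::find_encoding('UTF-8');
| my $org = \&FCGI::Stream::PRINT;
| no warnings 'redefine';
| # Wrap FCGI::Stream::PRINT so that its arguments are UTF-8 encoded first.
| local *FCGI::Stream::PRINT = sub {
|     my @OUTPUT = @_;
|     # $_[0] is the stream object itself; only the data arguments are encoded.
|     for (my $i = 1; $i < @_; $i++) {
|         $OUTPUT[$i] = $enc->encode($_[$i], Encode::FB_CROAK|Encode::LEAVE_SRC);
|     }
|     @_ = @OUTPUT;
|     goto $org;  # call the original PRINT with the encoded arguments
| };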

As a quick workaround, this hotpatch can even be put in $GITWEB_CONFIG
by removing the `local` before `*FCGI::Stream::PRINT`
(with `local`, the override would be undone as soon as the config file
has finished loading).
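
For comparison, the "proper solution" mentioned above (encoding the data before outputting it) could look roughly like the sketch below. The helper print_utf8 is hypothetical and not part of gitweb, and an in-memory filehandle stands in for the FastCGI stream, so this only illustrates the idea.

```perl
use strict;
use warnings;
use utf8;
use Encode ();

my $utf8 = Encode::find_encoding('UTF-8');

# Hypothetical helper: turn character strings into UTF-8 bytes before they
# reach a handle that ignores PerlIO layers.
sub print_utf8 {
    my ($fh, @chunks) = @_;
    my @bytes = map { $utf8->encode($_, Encode::FB_CROAK | Encode::LEAVE_SRC) } @chunks;
    return print {$fh} @bytes;
}

# In-memory handle used here instead of a real FastCGI stream.
open my $fh, '>', \my $buffer or die "cannot open in-memory handle: $!";
print_utf8($fh, "Système\n");
close $fh;

printf "%d bytes written\n", length $buffer;   # 9 bytes: "è" takes two
```

Doing this in gitweb itself would mean routing every print through such a helper, which is presumably why the hotpatch is the more practical short-term option.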