[PATCH] http-backend: treat empty CONTENT_LENGTH as zero

As discussed in v2.19.0-rc0~45^2~2 (http-backend: respect
CONTENT_LENGTH as specified by rfc3875, 2018-06-10), HTTP servers such
as IIS do not close a CGI script's standard input at the end of a
request, instead expecting CGI scripts to stop reading after
CONTENT_LENGTH bytes.  That commit taught http-backend to respect this
convention except when CONTENT_LENGTH is unset, in which case it
preserved the previous behavior of reading until EOF.
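
To illustrate the convention, here is a rough sketch of reading exactly
CONTENT_LENGTH bytes and then stopping, without waiting for EOF. It
assumes a POSIX system; read_exactly is a hypothetical helper, not the
read_request_fixed_len() function in http-backend.c.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to `len` bytes from `fd` into `buf`,
 * looping over short reads. It never reads past `len`, so it returns
 * even if the peer keeps the descriptor open after the request body.
 * Returns the number of bytes read (less than `len` only on premature
 * EOF), or -1 on a read error.
 */
static ssize_t read_exactly(int fd, unsigned char *buf, size_t len)
{
	size_t got = 0;

	while (got < len) {
		ssize_t n = read(fd, buf + got, len - got);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		if (n == 0)
			break;		/* premature EOF */
		got += (size_t)n;
	}
	return (ssize_t)got;
}
```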

RFC 3875 (the CGI specification) explains:

   The CONTENT_LENGTH variable contains the size of the message-body
   attached to the request, if any, in decimal number of octets.  If no
   data is attached, then NULL (or unset).

      CONTENT_LENGTH = "" | 1*digit


   This specification does not distinguish between zero-length (NULL)
   values and missing values.

But that specification was written before HTTP/1.1 and chunked
encoding.  With chunked encoding, the length of a request is not known
early and it is useful to start a CGI script to process it anyway, so
Apache and many other servers violate the spec: they leave
CONTENT_LENGTH unset and rely on EOF to indicate the end of request.
This is reproducible using t5510-fetch.sh, which hangs if http-backend
is patched to treat a missing CONTENT_LENGTH as zero.

So we are in a bind: to support HTTP servers that don't produce EOF,
http-backend should treat an unset or empty CONTENT_LENGTH as zero,
and to support chunked encoding, http-backend should treat an unset
CONTENT_LENGTH as "read until EOF".

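Fortunately, there's a way out.  Use the HTTP_TRANSFER_ENCODING
environment variable to distinguish the two cases: when CONTENT_LENGTH
is unset or empty, read until EOF if HTTP_TRANSFER_ENCODING is
"chunked", and treat the request body as empty otherwise.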
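
A minimal standalone sketch of that decision, assuming a POSIX
environment. The names body_length_from_env, BODY_UNTIL_EOF and
BODY_MALFORMED are hypothetical and do not exist in git; the actual
change is the diff to get_content_length() below, which uses
git_parse_ssize_t() and die() instead of the plain libc parsing shown
here.

```c
#define _POSIX_C_SOURCE 200809L	/* for ssize_t and SSIZE_MAX */
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical sentinels, chosen for this sketch only. */
#define BODY_UNTIL_EOF ((ssize_t)-1)	/* length unknown, read until EOF */
#define BODY_MALFORMED ((ssize_t)-2)	/* CONTENT_LENGTH is not 1*digit */

/*
 * Decide how many request-body bytes to expect, based on the CGI
 * environment:
 *   - CONTENT_LENGTH unset or empty, HTTP_TRANSFER_ENCODING "chunked":
 *     BODY_UNTIL_EOF
 *   - CONTENT_LENGTH unset or empty otherwise: 0 (no body)
 *   - CONTENT_LENGTH a decimal number: that number
 */
static ssize_t body_length_from_env(void)
{
	const char *len = getenv("CONTENT_LENGTH");
	const char *p;
	long long val;

	if (!len || !*len) {
		const char *enc = getenv("HTTP_TRANSFER_ENCODING");

		if (enc && !strcmp(enc, "chunked"))
			return BODY_UNTIL_EOF;
		return 0;
	}

	/* RFC 3875 allows only decimal digits here. */
	for (p = len; *p; p++)
		if (!isdigit((unsigned char)*p))
			return BODY_MALFORMED;

	errno = 0;
	val = strtoll(len, NULL, 10);
	if (errno || val > SSIZE_MAX)
		return BODY_MALFORMED;	/* out of range */
	return (ssize_t)val;
}
```

A caller would read until EOF on BODY_UNTIL_EOF, skip reading entirely
on 0, and otherwise read exactly the returned number of bytes.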

Reported-by: Jeff King <peff@xxxxxxxx>
Helped-by: Max Kirillov <max@xxxxxxxxxx>
Signed-off-by: Jonathan Nieder <jrnieder@xxxxxxxxx>
---
How about this?

 http-backend.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/http-backend.c b/http-backend.c
index 458642ef72..7902eeb0b3 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -350,10 +350,25 @@ static ssize_t read_request_fixed_len(int fd, ssize_t req_len, unsigned char **o
 static ssize_t get_content_length(void)
 {
-	ssize_t val = -1;
+	ssize_t val;
 	const char *str = getenv("CONTENT_LENGTH");
-	if (str && *str && !git_parse_ssize_t(str, &val))
+	if (!str || !*str) {
+		/*
+		 * According to RFC 3875, an empty or missing
+		 * CONTENT_LENGTH means "no body", but RFC 3875
+		 * precedes HTTP/1.1 and chunked encoding. Apache and
+		 * its imitators leave CONTENT_LENGTH unset for
+		 * chunked requests, for which we should use EOF to
+		 * detect the end of the request.
+		 */
+		str = getenv("HTTP_TRANSFER_ENCODING");
+		if (str && !strcmp(str, "chunked"))
+			return -1;
+		return 0;
+	}
+	if (!git_parse_ssize_t(str, &val))
 		die("failed to parse CONTENT_LENGTH: %s", str);
 	return val;
 }