[WIP RFC 2/5] Documentation: add Packfile URIs design doc
- Date: Mon, 3 Dec 2018 15:37:35 -0800
- From: Jonathan Tan <jonathantanmy@xxxxxxxxxx>
- Subject: [WIP RFC 2/5] Documentation: add Packfile URIs design doc
Signed-off-by: Jonathan Tan <jonathantanmy@xxxxxxxxxx>
Documentation/technical/packfile-uri.txt | 83 ++++++++++++++++++++++++
Documentation/technical/protocol-v2.txt | 6 +-
2 files changed, 88 insertions(+), 1 deletion(-)
create mode 100644 Documentation/technical/packfile-uri.txt
diff --git a/Documentation/technical/packfile-uri.txt b/Documentation/technical/packfile-uri.txt
new file mode 100644
@@ -0,0 +1,83 @@
+This feature allows servers to serve part of their packfile response as URIs.
+This allows server designs that improve scalability in bandwidth and CPU usage
+(for example, by serving some data through a CDN), and (in the future) provides
+some measure of resumability to clients.
+This feature is available only in protocol version 2.
+The server advertises `packfile-uris`.
+If the client replies with the following arguments:
+ * packfile-uris
+ * thin-pack
+ * ofs-delta
+when the server sends the packfile, it MAY send a `packfile-uris` section
+directly before the `packfile` section (right after `wanted-refs` if it is
+sent) containing HTTP(S) URIs. See protocol-v2.txt for the documentation of
+this section.
+Clients should then understand that the returned packfile could be incomplete,
+and that they need to download all the given URIs before the fetch or clone is
+complete. Each URI should point to a Git packfile (which may be a thin pack and
+which may contain offset deltas).
+The server can be trivially made compatible with the proposed protocol by
+having it advertise `packfile-uris`, tolerating the client sending
+`packfile-uris`, and never sending any `packfile-uris` section. But we should
+include some sort of non-trivial implementation in the Minimum Viable Product,
+at least so that we can test the client.
+This is the implementation: a feature, marked experimental, that allows the
+server to be configured by one or more `uploadpack.blobPackfileUri=<sha1>
+<uri>` entries. Whenever the list of objects to be sent is assembled, a blob
+with the given sha1 can be replaced by the given URI. This allows, for example,
+servers to delegate serving of large blobs to CDNs.
+While fetching, the client needs to remember the list of URIs and cannot
+declare that the fetch is complete until all URIs have been downloaded as
+packfiles.
+The division of work (initial fetch + additional URIs) introduces convenient
+points for resumption of an interrupted clone - such resumption can be done
+after the Minimum Viable Product (see "Future work").
+The client can inhibit this feature (i.e. refrain from sending the
+`packfile-uris` parameter) by passing --no-packfile-urls to `git fetch`.
+The protocol design allows some evolution of the server and client without any
+need for protocol changes, so only a small-scoped design is included here to
+form the MVP. For example, the following can be done:
+ * On the server, a long-running process that takes in entire requests and
+ outputs a list of URIs and the corresponding inclusion and exclusion sets of
+ objects. This allows, e.g., signed URIs to be used and packfiles for common
+ requests to be cached.
+ * On the client, resumption of clone. If a clone is interrupted, information
+ could be recorded in the repository's config and a "clone-resume" command
+ can resume the clone in progress. (Resumption of subsequent fetches is more
+ difficult because that must deal with the user wanting to use the repository
+ even after the fetch was interrupted.)
+There are some possible features that will require a change in protocol:
+ * Additional HTTP headers (e.g. authentication)
+ * Byte range support
+ * Different file formats referenced by URIs (e.g. raw object)
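Editorial sketch (not part of the patch): the server-side behaviour described in the design doc, where a configured blob is replaced by a URI while the list of objects to send is assembled, might look roughly like the following. The function names, the use of a list of `<sha1> <uri>` strings to stand in for `uploadpack.blobPackfileUri` entries, and the example object IDs and URL are all hypothetical; the patch series itself implements this in C and that code is not shown here.

```python
def parse_blob_packfile_uri(value):
    """Parse one hypothetical '<sha1> <uri>' config value into (sha1, uri)."""
    sha1, uri = value.split(" ", 1)
    return sha1, uri.strip()


def split_objects(objects_to_send, config_values):
    """Split the object list into objects kept in the inline packfile
    and URIs to list in the `packfile-uris` section."""
    uri_by_blob = dict(parse_blob_packfile_uri(v) for v in config_values)
    inline, uris = [], []
    for oid in objects_to_send:
        if oid in uri_by_blob:
            # This blob is delegated: advertise its URI instead of packing it.
            uris.append(uri_by_blob[oid])
        else:
            inline.append(oid)
    return inline, uris


# Made-up object IDs and URL, for illustration only.
BIG_BLOB = "a" * 40
SMALL_BLOB = "b" * 40
config = [BIG_BLOB + " https://cdn.example.com/big-blob.pack"]
inline_objects, advertised_uris = split_objects([BIG_BLOB, SMALL_BLOB], config)
```

In this sketch the client would receive `SMALL_BLOB` in the inline packfile and would have to download the advertised URI before the fetch could be considered complete.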
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 345c00e08c..2cb1c41742 100644
@@ -313,7 +313,8 @@ header. Most sections are sent only when the packfile is sent.
output = acknowledgements flush-pkt |
[acknowledgments delim-pkt] [shallow-info delim-pkt]
- [wanted-refs delim-pkt] packfile flush-pkt
+ [wanted-refs delim-pkt] [packfile-uris delim-pkt]
+ packfile flush-pkt
acknowledgments = PKT-LINE("acknowledgments" LF)
(nak | *ack)
@@ -331,6 +332,9 @@ header. Most sections are sent only when the packfile is sent.
wanted-ref = obj-id SP refname
+ packfile-uris = PKT-LINE("packfile-uris" LF) *packfile-uri
+ packfile-uri = PKT-LINE("uri" SP *%x20-ff LF)
packfile = PKT-LINE("packfile" LF)
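Editorial sketch (not part of the patch): the `packfile-uris` grammar added above can be illustrated with a small encoder and decoder. It relies on the standard pkt-line framing (a four-digit hexadecimal length that counts itself, followed by the payload) and on `0001` as the delim-pkt that ends the section. The function names, the choice of UTF-8 for the URI bytes, and the example URL are assumptions made for illustration; this is not the code from the patch series.

```python
def pkt_line(payload: bytes) -> bytes:
    """Frame a payload as a pkt-line: 4 hex digits of total length, then data."""
    return b"%04x" % (len(payload) + 4) + payload


DELIM_PKT = b"0001"


def encode_packfile_uris(uris):
    """Encode a `packfile-uris` section followed by its delim-pkt."""
    out = pkt_line(b"packfile-uris\n")
    for uri in uris:
        out += pkt_line(b"uri " + uri.encode("utf-8") + b"\n")
    return out + DELIM_PKT


def decode_packfile_uris(data: bytes):
    """Return the URIs from an encoded section, stopping at the delim-pkt."""
    uris = []
    pos = 0
    header_seen = False
    while pos < len(data):
        length = int(data[pos:pos + 4], 16)
        if length == 1:  # delim-pkt: end of this section
            break
        payload = data[pos + 4:pos + length]
        pos += length
        if not header_seen:
            if payload != b"packfile-uris\n":
                raise ValueError("expected packfile-uris section header")
            header_seen = True
        elif payload.startswith(b"uri "):
            uris.append(payload[4:].rstrip(b"\n").decode("utf-8"))
        else:
            raise ValueError("unexpected line in packfile-uris section")
    return uris


# Hypothetical URL, for illustration only.
section = encode_packfile_uris(["https://cdn.example.com/big-blob.pack"])
decoded = decode_packfile_uris(section)
```

A client following the design doc would keep the decoded list and treat the fetch as incomplete until every listed URI has been downloaded.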