
.deb format: let's use 0.939, zstd, drop bzip2

I've recently done some research on how we can improve the speed of unpacking
packages.  There are a lot of other stages that can be improved, but let's
talk about the .deb format.

First, the 0.939 format, as described in "man deb-old".  While still being
accepted by dpkg, it had been superseded before even the very first stable
release.  Why?  It has at least two upsides over 2.0:

* there's no 10¹⁰ byte (~9.31GiB) limit
  While no package this big is in the archive _yet_ (max being 1⎖652⎖244⎖360
  bytes), both storage sizes and software bloat grow pretty fast, thus we'll
  break this barrier in a few years.  And there's a world outside the
  official archive -- I bet someone already has been burned by this limit.
* it's faster by a small but non-negligible factor.  The task "unpack all
  packages in a default XFCE GUI install" is completed by stock dpkg (after
  repacking everything as gzip) 3% faster.
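
For reference, the limit comes from the ar container used by format 2.0:
each member header stores the size as a 10-character decimal field.  A small
sketch of that arithmetic, assuming the common 60-byte ar header layout (the
helper names are made up for illustration):

```python
import struct

# Common ar member header: name, mtime, uid, gid, mode, size, end magic.
AR_HEADER = struct.Struct("=16s12s6s6s8s10s2s")
SIZE_DIGITS = 10  # the size is stored as ASCII decimal digits


def max_member_size(digits=SIZE_DIGITS):
    """Largest size that fits in a decimal field of the given width."""
    return 10 ** digits - 1


def pack_header(name, size):
    """Build a header for one member; refuse sizes the field cannot hold."""
    if size > max_member_size():
        raise ValueError(f"{name}: {size} bytes does not fit in {SIZE_DIGITS} digits")
    fields = (name, "0", "0", "0", "100644", str(size))
    widths = (16, 12, 6, 6, 8, 10)
    packed = [f.ljust(w).encode("ascii") for f, w in zip(fields, widths)]
    return AR_HEADER.pack(*packed, b"`\n")


print(AR_HEADER.size)                        # 60
print(max_member_size())                     # 9999999999
print(round(max_member_size() / 2**30, 2))   # 9.31 (GiB)
```

The largest package mentioned above (1652244360 bytes) still fits; anything
of 10¹⁰ bytes or more makes pack_header() raise.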

Obviously, 3% is not worth fighting for, but as the size limit needs fixing
anyway, it comes as a bonus.

Alas, while current dpkg handles 0.939 archives well, it supports only two
compressors: gzip and cat.  Neither of them is adequate these days.  Thus,
we'd need to enable others -- which means not being able to unpack new .debs
with old dpkg.  Barring ugly versioned pre-depends on dpkg, that'd require
waiting two release cycles.
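
For illustration, the old layout as "man deb-old" describes it is simple
enough to sketch: a version line, a line with the decimal length of the
control tarball, then the two tarballs back to back.  The helper names below
are made up, and the payloads are placeholder bytes rather than real gzipped
tarballs:

```python
import io

AR_MAGIC = b"!<arch>\n"        # format 2.0 is an ar archive
OLD_VERSION = b"0.939000\n"    # first line of the old format


def build_old_deb(control_tgz, data_tgz):
    """Concatenate the two header lines and the two tarballs."""
    return OLD_VERSION + b"%d\n" % len(control_tgz) + control_tgz + data_tgz


def split_old_deb(blob):
    """Return the (control, data) members of an old-format archive."""
    stream = io.BytesIO(blob)
    if stream.readline() != OLD_VERSION:
        raise ValueError("not a 0.939 archive")
    control_len = int(stream.readline())  # plain decimal string, no fixed width
    control = stream.read(control_len)
    data = stream.read()                  # everything that follows is the data member
    return control, data


def deb_format(blob):
    """Tell the two container formats apart by their first bytes."""
    if blob.startswith(AR_MAGIC):
        return "2.0"
    if blob.startswith(OLD_VERSION):
        return "0.939"
    return "unknown"


blob = build_old_deb(b"CTRL", b"DATA-PAYLOAD")
print(deb_format(blob), split_old_deb(blob))
```

Note that the length of the data member is not recorded at all, which is why
no size limit applies to it.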

So let's pick compressors to enable.  For compression ratio, xz still wins
(at least among popular compressors).  But there's something to be said for
zstd.  Unpacking firefox.deb (the zstd variant compressed with -19) takes:
* 2.644s .xz, stock dpkg
* 2.532s .xz, my tool (libarchive based)
* 0.290s .zst, my tool
* 0.738s .gz, stock dpkg
* 0.729s .gz 0.939, stock dpkg
File sizes in bytes: 60628216 gz, 47959544 zstd, 44506304 xz.

XFCE install total: 723M xz, 773M zstd, 963M gzip.
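
To put these numbers in relative terms, here is a quick calculation that uses
only the figures quoted in this mail (the dictionary keys and function names
are just labels chosen for the sketch):

```python
# Unpack times in seconds for firefox.deb, as quoted above.
times = {
    "xz, stock dpkg": 2.644,
    "xz, my tool": 2.532,
    "zst, my tool": 0.290,
    "gz, stock dpkg": 0.738,
    "gz 0.939, stock dpkg": 0.729,
}
# firefox.deb sizes in bytes and XFCE install totals in MB, as quoted above.
firefox_bytes = {"gz": 60628216, "zstd": 47959544, "xz": 44506304}
xfce_mb = {"xz": 723, "zstd": 773, "gzip": 963}


def speedup(slow, fast):
    """How many times faster the 'fast' case unpacks than the 'slow' one."""
    return times[slow] / times[fast]


def overhead_pct(table, name, baseline="xz"):
    """Size increase of 'name' over the baseline, in percent."""
    return 100.0 * (table[name] / table[baseline] - 1)


print(f"zstd vs xz, same tool: {speedup('xz, my tool', 'zst, my tool'):.1f}x faster")
print(f"zstd vs gz:            {speedup('gz, stock dpkg', 'zst, my tool'):.1f}x faster")
print(f"firefox zstd vs xz:    +{overhead_pct(firefox_bytes, 'zstd'):.1f}% size")
print(f"XFCE zstd vs xz:       +{overhead_pct(xfce_mb, 'zstd'):.1f}% size")
print(f"XFCE gzip vs xz:       +{overhead_pct(xfce_mb, 'gzip'):.1f}% size")
```

Roughly: zstd costs 7-8% in size compared to xz and unpacks almost nine times
faster in the same tool.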

Thus, even though we'd want to stick with xz for the official archive, speed
gains from zstd are so massive that it's tempting to add support for it,
at least for non-official uses, possibly also for common Build-Depends.
The usual objection, "we don't want to bloat the Essential set", doesn't hold
water because 1. libzstd is already part of the Required set in Buster, and
2. a non-default compressor can be dlopened.


But the dlopen idea also points to a potential victim: bzip2.  Let's kill it.

Stats for Buster's packages:

.deb format:
2.0:    100%

control.tar:
gz      11671
xz      45210

data.tar:
gz      966
xz      55915

With not a single package in the archive still using bz2, removing support
would be reasonable.  It'd be okay to give a clear error message telling the
user to install libbz2-1.0 (dlopen) or bzip2 (pipe) -- so folks can still
unpack historic .debs if need be.
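
A rough sketch of the "dlopen, or tell the user what to install" behaviour,
written with Python's ctypes rather than as dpkg code.  The soname
libbz2.so.1.0 is an assumption about how Debian ships the library, and
load_compressor() is a hypothetical helper, not an existing dpkg function:

```python
import ctypes


def load_compressor(soname, package):
    """dlopen a compression library on demand, or explain what to install."""
    try:
        return ctypes.CDLL(soname)
    except OSError as exc:
        raise RuntimeError(
            f"cannot unpack: {soname} is not available; "
            f"install the {package} package and retry"
        ) from exc


def bzip2_version():
    """Load libbz2 only when a bz2 member is actually encountered."""
    lib = load_compressor("libbz2.so.1.0", "libbz2-1.0")  # soname is an assumption
    lib.BZ2_bzlibVersion.restype = ctypes.c_char_p
    return lib.BZ2_bzlibVersion().decode("ascii")


try:
    print("libbz2", bzip2_version())
except RuntimeError as err:
    print(err)
```

Packages that never use bz2 pay nothing for this, and the failure case gives
an actionable message instead of a missing-symbol error.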

⢀⣴⠾⠻⢶⣦⠀ .globl _start↵.data↵rc: .ascii "/etc/init.d/rcS\0"↵.text↵_start
⣾⠁⢰⠒⠀⣿⡁ mov $57,%rax↵syscall↵cmp $0,%rax↵jne child↵parent:↵mov $61,%rax
⢿⡄⠘⠷⠚⠋⠀ mov $-1,%rdi↵xor %rsi,%rsi↵xor %rdx,%rdx↵syscall↵jmp parent↵child:
⠈⠳⣄⠀⠀⠀⠀ mov $59,%rax↵mov $rc,%rdi↵xor %rsi,%rsi↵xor %rdx,%rdx↵syscall