Re: .deb format: let's use 0.939, zstd, drop bzip2
- Date: Thu, 09 May 2019 09:22:56 +0200
- From: Ansgar <ansgar@xxxxxxxxxx>
Adam Borowski writes:
> On Wed, May 08, 2019 at 10:35:58PM +0200, Ansgar wrote:
>> Adam Borowski writes:
>> > I recently did some research on how we can improve the speed of unpacking
>> > packages. There are a lot of other stages that can be improved, but let's
>> > talk about the .deb format.
>> > First, the 0.939 format, as described in "man deb-old". While still being
>> > accepted by dpkg, it had been superseded before even the very first stable
>> > release. Why? It has at least two upsides over 2.0:
>> Switching to a different binary format will break various tools.
> The 0.939 format is already supported by most tools.
> No one sane digs into the insides of the format; everyone uses a small
> number of low-level tools, thus we can reuse it with little effort.
> Of course, adding a new compressor _does_ break compat, but we added four
> compressors to 2.0 over the years already, and the sky didn't fall.
Well, it causes minor breakage which is fairly easy to fix. A different
container format instead of tar would require more involved changes in
tools, so I'm not 100% convinced of my idea myself ;-) The thread just
looked like the right time to consider such changes.
>> If we want to do this, I wonder if we shouldn't take the chance to move
>> away from tar?
> Any seekable format significantly reduces compression, although this can
> be reduced by managing split points.
Well, depending on how much splitting you do, the loss in compression
should be small enough to not care about?
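
To get a feeling for the trade-off, here is a minimal sketch that compresses
the same data once as a single stream and once as independently compressed
chunks (which is what a seekable format needs). It is only an illustration:
zlib stands in for whatever compressor would really be used, the sample data
is synthetic, and compressed_sizes() is a made-up helper name. Compressors
with larger windows (xz, zstd) will behave differently.

```python
import zlib


def compressed_sizes(data: bytes, chunk_size: int) -> tuple[int, int]:
    """Return (size as one stream, total size of independently compressed chunks)."""
    whole = len(zlib.compress(data, 9))
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # Each chunk starts with an empty history, so it can be decompressed on
    # its own, but it cannot reuse matches from the chunks before it.
    split = sum(len(zlib.compress(chunk, 9)) for chunk in chunks)
    return whole, split


# Synthetic, repetitive sample; real package contents will give other ratios.
sample = b"".join(
    b"/usr/share/doc/package-%d/changelog.Debian.gz\n" % i for i in range(20000)
)

if __name__ == "__main__":
    for chunk_size in (4096, 65536, 1 << 20):
        whole, split = compressed_sizes(sample, chunk_size)
        print(f"chunk={chunk_size:>8}  single={whole}  split={split}")
```

Running this with different chunk sizes shows how the overhead changes with
the split granularity, which is the knob "managing split points" refers to.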
>> We have various applications that only want to extract single members of
>> the package (changelog, NEWS, copyright, ...); tar is a really bad
>> format for such an operation. Other formats (zip, 7z, ...) are more
>> suited for them.
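
The difference can be sketched in Python with in-memory archives. This is
only an illustration with made-up member names and helper functions, not how
dpkg or any existing tool works: a compressed tarball has to be read and
decompressed from the start until the wanted member shows up, while zip has
a central directory that allows jumping straight to it.

```python
import io
import tarfile
import zipfile


def make_archives(members: dict[str, bytes]) -> tuple[bytes, bytes]:
    """Build a .tar.gz and a .zip holding the same members, in the same order."""
    tar_buf, zip_buf = io.BytesIO(), io.BytesIO()
    with tarfile.open(fileobj=tar_buf, mode="w:gz") as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    with zipfile.ZipFile(zip_buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in members.items():
            zf.writestr(name, data)
    return tar_buf.getvalue(), zip_buf.getvalue()


def read_from_tar(blob: bytes, wanted: str) -> tuple[bytes, int]:
    """Return (content, number of members visited) using a sequential scan."""
    visited = 0
    # "r|gz" is tarfile's streaming mode: no seeking, members come in order.
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r|gz") as tar:
        for info in tar:
            visited += 1
            if info.name == wanted:
                return tar.extractfile(info).read(), visited
    raise KeyError(wanted)


def read_from_zip(blob: bytes, wanted: str) -> bytes:
    """Return the content of one member via the zip central directory."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return zf.read(wanted)
```

If the wanted file happens to be the last member, read_from_tar() visits
every member before it; read_from_zip() only decompresses the one entry.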
> Perhaps such files could be considered metadata and moved to the control
> tarball? Or merely just moved forward -- remember that tarballs are
I don't think that is a good idea: if someone wants to use another file
in a similar way, he couldn't and would have to fall back to the worse
As an example: I have a config-diff script which compares conffiles with
the pristine version included in the *.deb; it wants to access /etc/*.
(Though ideally dpkg would keep the pristine version accessible below
/usr; that would also be useful for other uses.)
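
For illustration, such a comparison could look roughly like the sketch
below. It is not the script mentioned above; diff_against_pristine() and its
parameters are hypothetical. It assumes the data tarball is available as a
file object (for example the output of "dpkg-deb --fsys-tarfile") and that
members are named with a leading "./", which is the usual layout.

```python
import difflib
import io
import tarfile
from pathlib import Path


def diff_against_pristine(data_tar, conffile: str, root: Path = Path("/")) -> str:
    """Unified diff between the packaged conffile and the installed one.

    data_tar: binary file object containing the package's data tarball.
    conffile: absolute path such as "/etc/foo.conf".
    root:     where the installed system lives (assumption: "/" by default).
    """
    member = "." + conffile  # assumed member naming: "./etc/foo.conf"
    # "r|*" reads the tarball as a stream, so every member before the
    # wanted one has to be read first.
    with tarfile.open(fileobj=data_tar, mode="r|*") as tar:
        for info in tar:
            if info.name == member and info.isfile():
                pristine = tar.extractfile(info).read().decode(errors="replace")
                break
        else:
            raise FileNotFoundError(member)
    installed = (root / conffile.lstrip("/")).read_text(errors="replace")
    return "".join(
        difflib.unified_diff(
            pristine.splitlines(keepends=True),
            installed.splitlines(keepends=True),
            fromfile="pristine" + conffile,
            tofile="installed" + conffile,
        )
    )
```

The sequential scan in the loop is exactly the cost discussed earlier in
this thread: the further back /etc/* sits in the tarball, the more data has
to be decompressed just to reach it.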
Also, dpkg keeps metadata in /var, but changelogs, NEWS and copyright
documentation aren't variable state data and should be below /usr... The
same is really true for lists of files and maintainer scripts, though.