
Re: .deb format: let's use 0.939, zstd, drop bzip2




Adam Borowski writes:
> On Wed, May 08, 2019 at 10:35:58PM +0200, Ansgar wrote:
>> Adam Borowski writes:
>> > I've recently done some research on how we can improve the speed of unpacking
>> > packages.  There's a lot of other stages that can be improved, but let's
>> > talk about the .deb format.
>> >
>> > First, the 0.939 format, as described in "man deb-old".  While still being
>> > accepted by dpkg, it had been superseded before even the very first stable
>> > release.  Why?  It has at least two upsides over 2.0:
>> 
>> Switching to a different binary format will break various tools.
>
> The 0.939 format is already supported by most tools.
>
> No one sane digs into the insides of the format; everyone uses a small number
> of low-level tools, thus we can reuse it with little effort.
>
> Of course, adding a new compressor _does_ break compat, but we added four
> compressors to 2.0 over the years already, and the sky didn't fall.

Well, it causes minor breakage which is fairly easy to fix.  A different
container format instead of tar would require more involved changes in
tools, so I'm not 100% convinced of my idea myself ;-)  The thread just
looked like the right time to consider such changes.

>> If we want to do this, I wonder if we shouldn't take the chance to move
>> away from tar?
>
> Any seekable format significantly reduces compression, although this can
> be reduced by managing split points.

Well, depending on how much splitting you do, the loss in compression
should be small enough to not care about?
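
To put a rough number on that trade-off, here is a minimal sketch that
compresses the same payload once as a single stream and once as independently
compressed chunks (the "split points").  The payload is synthetic and the
names compressed_sizes and sample_payload are made up for illustration; real
package contents will behave differently, so treat the result as a way to
measure, not as a measurement.

```python
import lzma


def sample_payload(lines=8000):
    """Build a synthetic, repetitive payload (an assumption, not real .deb data)."""
    return b"".join(
        b"/usr/share/doc/pkg%05d/changelog.gz mode=0644 owner=root\n" % (i % 997)
        for i in range(lines)
    )


def compressed_sizes(data, chunk_size):
    """Return (size as one xz stream, total size of independently compressed chunks)."""
    whole = len(lzma.compress(data))
    split = sum(
        len(lzma.compress(data[i:i + chunk_size]))
        for i in range(0, len(data), chunk_size)
    )
    return whole, split


if __name__ == "__main__":
    payload = sample_payload()
    for chunk_size in (64 * 1024, 4 * 1024):
        whole, split = compressed_sizes(payload, chunk_size)
        print(f"chunk={chunk_size:6d}  whole={whole}  split={split}")
```

Each chunk starts with an empty dictionary and carries its own container
overhead, so fewer, larger chunks should lose less; how much is acceptable
depends on the actual data.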

>> We have various applications that only want to extract single members of
>> the package (changelog, NEWS, copyright, ...); tar is a really bad
>> format for such an operation.  Other formats (zip, 7z, ...) are more
>> suited for them.
>
> Perhaps such files could be considered metadata and moved to the control
> tarball?  Or merely just moved forward -- remember that tarballs are
> unordered.

I don't think that is a good idea: if someone wants to use another file
in a similar way, he couldn't and would have to fall back to the worse
solution.

As an example: I have a config-diff script which compares conffiles with
the pristine version included in the *.deb; it wants to access /etc/*.
(Though ideally dpkg would keep the pristine version accessible below
/usr; that would also be useful for other uses.)
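
A minimal sketch of that kind of lookup, assuming the data tarball is
available as a stream (in practice it could come from something like
"dpkg-deb --fsys-tarfile"; here it is built in memory).  The names
read_member and conffile_diff and the file etc/demo.conf are hypothetical,
and this is not the actual script.  Note that the tar stream has to be read
front to back until the wanted member turns up, which is the cost being
discussed.

```python
import difflib
import io
import tarfile


def _norm(path):
    """Normalise './etc/x' and '/etc/x' to 'etc/x' for comparison."""
    while path.startswith("./"):
        path = path[2:]
    return path.lstrip("/")


def read_member(tar_stream, name):
    """Scan a tar stream sequentially; return one regular file's bytes or None."""
    wanted = _norm(name)
    with tarfile.open(fileobj=tar_stream, mode="r|*") as tar:
        for info in tar:  # every earlier member is read past first
            if info.isfile() and _norm(info.name) == wanted:
                return tar.extractfile(info).read()
    return None


def conffile_diff(pristine, current, name):
    """Unified diff between the packaged and the installed version."""
    a = pristine.decode("utf-8", "replace").splitlines(keepends=True)
    b = current.decode("utf-8", "replace").splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(a, b, "pristine/" + name, "installed/" + name)
    )


def build_demo_tar():
    """Build a tiny in-memory tarball standing in for a package's data member."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path, data in (
            ("./usr/share/doc/demo/copyright", b"demo licence text\n"),
            ("./etc/demo.conf", b"colour=blue\nsize=1\n"),
        ):
            info = tarfile.TarInfo(path)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    pristine = read_member(build_demo_tar(), "/etc/demo.conf")
    installed = b"colour=red\nsize=1\n"  # stands in for the file on disk
    print(conffile_diff(pristine, installed, "etc/demo.conf"), end="")
```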

Also dpkg keeps metadata in /var, but changelogs, NEWS and copyright
documentation aren't variable state data and should be below /usr...  The
same is really true for lists of files and maintainer scripts though.

Ansgar