[Bug 1710911] Re: apt-ftparchive does not correctly cache filesizes for packages > 4GB
Julian Andres Klode
julian.klode at gmail.com
Thu Aug 17 14:13:10 UTC 2017
Removed the patch tag here and there, as the patch is incomplete.
** Tags removed: patch
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to apt in Ubuntu.
https://bugs.launchpad.net/bugs/1710911
Title:
apt-ftparchive does not correctly cache filesizes for packages > 4GB
Status in apt package in Ubuntu:
New
Status in apt package in Debian:
Unknown
Bug description:
Release: 16.04
Version: 1.2.19
apt-ftparchive is a utility for, among other things, generating a
Packages file from a set of .deb packages.
Because generating Packages files for a large directory tree of .deb
packages is expensive, it can cache the properties of .deb packages it
has already inspected in a Berkeley database file.
Historically, apt-ftparchive stored the size of a .deb package as a
32-bit unsigned integer in network byte order in this cache. This
field was later enlarged to 64 bits, which caused other problems;
see LP #1274466.
However, even after this integer field was enlarged, apt-ftparchive
continued to use the htonl() and ntohl() libc functions to convert
file sizes to and from network byte order when reading and writing
its cache. These functions take and return 32-bit unsigned integers
(uint32_t), so any wider argument is truncated, which means that
apt-ftparchive remained unable to correctly record the file sizes of
packages whose size does not fit in 32 bits (i.e. 4 GiB or larger).
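
To illustrate the truncation described above, here is a minimal
sketch in Python that models the C behaviour. It is not the apt
code: c_htonl(), c_ntohl() and cached_size_roundtrip() are
hypothetical helper names, and a little-endian host is assumed.

```python
MASK32 = 0xFFFFFFFF  # htonl()/ntohl() operate on uint32_t only


def c_htonl(value: int) -> int:
    """Model C htonl() on a little-endian host (assumption):
    truncate to 32 bits, then swap the four bytes."""
    return int.from_bytes((value & MASK32).to_bytes(4, "big"), "little")


def c_ntohl(value: int) -> int:
    """Model C ntohl(); on a little-endian host it is the same byte swap."""
    return int.from_bytes((value & MASK32).to_bytes(4, "big"), "little")


def cached_size_roundtrip(size: int) -> int:
    """Size that comes back after a write followed by a read of the cache."""
    return c_ntohl(c_htonl(size))


if __name__ == "__main__":
    for size in (1000, 2**32 - 1, 2**32, 7162161474):
        print(size, "->", cached_size_roundtrip(size))
```

Sizes below 2^32 survive the round trip; anything larger comes back
reduced modulo 2^32.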
Consequently, if apt-ftparchive is asked (with caching enabled) to
generate a Packages file for a .deb package larger than 4GB, it will
produce a Packages file with the correct Size: field the first time,
but with incorrect Size: fields subsequently.
I have developed a small patch which replaces the use of the ntohl()
family of functions with suitable replacements from <endian.h>. This
produces correct output on new installations.
However, caution is necessary: the existing code is incorrectly
storing the 32 least significant bits of a 64-bit number in the upper
32-bits of a 64-bit field, in big-endian byte order. The application
of this patch will cause new values to be stored correctly, but in a
binary-incompatible way with existing caches.
For example, a package of size 7162161474 bytes will today have the
following sequence of bytes stored in its cache entry:
\xaa\xe5\xe9\x42\x00\x00\x00\x00
(When re-read, this will produce a file-size value of 7162161474 mod
2^32, i.e. 2867194178.)
With this patch applied, apt-ftparchive will correctly store this
entry:
\x00\x00\x00\x01\xaa\xe5\xe9\x42
However, this correct cache entry, when interpreted by the current
broken code, will return a file-size of 1 byte. Worse, the existing
broken entry will be interpreted by my fixed code as containing the
value 12314505225791602688.
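
The four interpretations above can be reproduced with a short Python
sketch. The two byte strings are copied from this report; the
function names read_current() and read_patched() are hypothetical,
and read_current() models the existing code on a little-endian host
(an assumption), not the actual apt source.

```python
import struct

# Byte sequences quoted in this report for a 7162161474-byte package.
OLD_ENTRY = b"\xaa\xe5\xe9\x42\x00\x00\x00\x00"  # written by current code
NEW_ENTRY = b"\x00\x00\x00\x01\xaa\xe5\xe9\x42"  # written by patched code


def read_patched(raw: bytes) -> int:
    """Patched reader: the field is a 64-bit big-endian integer."""
    return struct.unpack(">Q", raw)[0]


def read_current(raw: bytes) -> int:
    """Model of the current reader on a little-endian host (assumption):
    load the field in host order, then apply a 32-bit ntohl()."""
    host_value = struct.unpack("<Q", raw)[0]
    low32 = host_value & 0xFFFFFFFF          # ntohl() sees only 32 bits
    return int.from_bytes(low32.to_bytes(4, "little"), "big")


if __name__ == "__main__":
    print("current code, old entry:", read_current(OLD_ENTRY))
    print("current code, new entry:", read_current(NEW_ENTRY))
    print("patched code, new entry:", read_patched(NEW_ENTRY))
    print("patched code, old entry:", read_patched(OLD_ENTRY))
```

Only the combination of patched writer and patched reader yields the
true size; every mixed combination silently returns a wrong value.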
It would be good to have this patch, or some derivative of it, applied
to the main APT code-base. Before this can happen, however, some
mechanism to detect and correct broken cache entries will be needed if
we are to avoid a repeat of LP #1274466.
I would suggest this could be done by checking the trailing four bytes
of the 64-bit filesize field: if they are all zero, then the cache
entry is broken, and should be rewritten.
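
A minimal sketch of that check, in Python for brevity. The names
looks_broken() and load_size() are hypothetical, and the sketch
assumes broken entries have the little-endian layout quoted earlier
in this report. A correctly stored size that is an exact multiple of
2^32 (including 0) would also match, so under this sketch such an
entry would simply be treated as stale and regenerated.

```python
import struct
from typing import Optional


def looks_broken(raw: bytes) -> bool:
    """Heuristic from the report: trailing four bytes of the 8-byte
    size field are all zero."""
    return len(raw) == 8 and raw[4:] == b"\x00\x00\x00\x00"


def load_size(raw: bytes) -> Optional[int]:
    """Return the cached size, or None if the entry should be
    discarded and rewritten from the .deb file itself."""
    if looks_broken(raw):
        return None
    return struct.unpack(">Q", raw)[0]  # 64-bit big-endian, as in the patch


if __name__ == "__main__":
    print(load_size(b"\xaa\xe5\xe9\x42\x00\x00\x00\x00"))  # broken entry
    print(load_size(b"\x00\x00\x00\x01\xaa\xe5\xe9\x42"))  # correct entry
```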