[BUG?] export to tar.gz produces strange name for inner tar
John Arbash Meinel
john at arbash-meinel.com
Thu Apr 5 18:39:15 BST 2007
Martin Pool wrote:
> https://bugs.launchpad.net/bzr/+bug/102234
>
> I think this is a bug, or at least a misbehavior that we can avoid.
> Python's tarfile.open has a parameter to set the file name separately
> from the file object, and that's probably what goes into the headers.
> We should set it to just the basename I think.
The only place that "filename" is set seems to be in tar.gz. It doesn't
exist in tar.bz2, nor does it exist in plain tar.
It seems to be done in tarfile._Stream._init_write_gz().
The code there seems to be:
    self.__write("\037\213\010\010%s\002\377" % timestamp)
    if self.name.endswith(".gz"):
        self.name = self.name[:-3]
    self.__write(self.name + NUL)
And it is possible that it should be:
    self.__write("\037\213\010\010%s\002\377" % timestamp)
    if self.name.endswith(".gz"):
        self.name = self.name[:-3]
    self.__write(os.path.basename(self.name) + NUL)
Passing a different name to the TarFile object doesn't really seem
recommended, though. I'm not positive, but I see some safety checks that
depend on it (like making sure you don't add the tar file to itself).
We could work around this by doing:
    fileobj = open(export_path, 'wb')
    tar = tarfile.open(os.path.basename(export_path), 'w:gz',
                       fileobj=fileobj)
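A slightly fuller sketch of that workaround, written for a current Python
rather than the one discussed here. The helper name export_tgz and its
members argument are made up for illustration; only tarfile.open() and
TarFile.add() are real library calls.

```python
import os
import tarfile


def export_tgz(export_path, members):
    """Write a .tar.gz, handing tarfile only the basename of the archive.

    `members` is a hypothetical list of (arcname, path_on_disk) pairs.
    """
    with open(export_path, "wb") as fileobj:
        # The name argument is what ends up in the gzip header; the data
        # itself goes to fileobj, which was opened with the full path.
        tar = tarfile.open(os.path.basename(export_path), "w:gz",
                           fileobj=fileobj)
        try:
            for arcname, path in members:
                tar.add(path, arcname=arcname)
        finally:
            tar.close()
```

One caveat: since TarFile only sees the basename, any check it does against
its own name (such as refusing to add the archive to itself) may no longer
match the real output path.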
I don't see any specific bugs filed on this, so it is probably something
that should be brought up as a Python bug.
It would seem that the filename is an optional gzip header field (enabled
by setting a flag). At least, this is the code in gzip.py:
    def _write_gzip_header(self):
        self.fileobj.write('\037\213')  # magic header
        self.fileobj.write('\010')      # compression method
        fname = self.filename[:-3]
        flags = 0
        if fname:
            flags = FNAME
        self.fileobj.write(chr(flags))
        write32u(self.fileobj, long(time.time()))
        self.fileobj.write('\002')
        self.fileobj.write('\377')
        if fname:
            self.fileobj.write(fname + '\000')
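To see what name actually landed in an exported file, the header can be
read back by hand. This is a minimal sketch for a current Python, assuming
the header layout written above (10 fixed bytes, then the optional fields).
The function name read_gzip_fname is made up, and the FEXTRA/FNAME flag
values are the ones from the gzip format spec (RFC 1952), not taken from
the code quoted here.

```python
import struct

FEXTRA = 0x04  # flag bit: an "extra" field precedes the name
FNAME = 0x08   # flag bit: a NUL-terminated original file name is present


def read_gzip_fname(data):
    """Return the FNAME field of a gzip byte string, or None if absent."""
    if data[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    flags = data[3]
    # Fixed part: magic(2) method(1) flags(1) mtime(4) xfl(1) os(1)
    pos = 10
    if flags & FEXTRA:
        # Skip the 2-byte little-endian length and the extra field itself.
        (xlen,) = struct.unpack_from("<H", data, pos)
        pos += 2 + xlen
    if not flags & FNAME:
        return None
    end = data.index(b"\x00", pos)
    return data[pos:end].decode("latin-1")
```

Calling it on the first few hundred bytes of an exported .tar.gz should show
whether the stored name still carries directory components.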
John
=:->