Merged wrong patch

John Arbash Meinel john at arbash-meinel.com
Sat Sep 23 21:36:00 BST 2006


Alexander Belchenko wrote:
> John Arbash Meinel wrote:
> 
> Your patch is wrong:
> 

...

> Again, I think that quickly merging your patch for the zip exporter is a
> bad idea.
> 
> -- 
> Alexander


Obviously the test suite was incomplete, which allowed me to make a
mistake like this. (You can also just do 'cd $bzr.dev; bzr export
foo.zip; unzip foo.zip', and it will fail.)

So I'm submitting the attached patch, which fixes zipfile support so
that we create correct files, at least for ASCII filenames. It includes a
test case, so that we don't have regressions like this again.

Now, this works with plain 'unzip', but it still won't work with
Windows, because win32 expects other bits to be set in the
'external_file_attributes' section. Basically it expects:

#define S_IFDIR 0040000
#define ZIP_DIRECTORY_BIT       (1 << 4)
external_file_attributes = S_IFDIR | ZIP_DIRECTORY_BIT

I once wrote a library for parsing zipfiles, and I found that without
those bits set, Windows is really hopeless.
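
As a rough illustration of the bits above (a sketch, not the code in the
attached patch): Python's zipfile module exposes this field as
ZipInfo.external_attr. The sketch assumes the Info-ZIP layout, where the
Unix mode sits in the upper 16 bits and the MS-DOS attributes in the low
byte, so S_IFDIR is shifted left by 16 here. The add_directory helper and
the 0755 permissions are made up for the example.

```python
import io
import stat
import zipfile

# MS-DOS "directory" attribute, stored in the low byte of external_attr.
ZIP_DIRECTORY_BIT = 1 << 4


def add_directory(zf, name):
    """Add an explicit directory entry called `name` (hypothetical helper)."""
    info = zipfile.ZipInfo(name.rstrip("/") + "/")
    # Assumption: Unix mode in the high 16 bits, DOS attributes in the low byte.
    info.external_attr = ((stat.S_IFDIR | 0o755) << 16) | ZIP_DIRECTORY_BIT
    zf.writestr(info, b"")


# Build a small archive in memory with one directory and one file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    add_directory(zf, "foo")
    zf.writestr("foo/hello.txt", "hello\n")

# Read it back and look at the attributes stored for the directory entry.
with zipfile.ZipFile(buf) as zf:
    entry = zf.getinfo("foo/")
    names = zf.namelist()
```

Whether a given Windows extractor needs both halves is something to check
against that extractor; the sketch only shows where the bits go.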

And having just tested now, I'm not sure how you were using zip files. I
think you are on Win2000, so you are probably using some third-party
software (like WinZip). But WinXP has native support, and it didn't like
the zip files that we used to generate.

But I tested this one. And while for some reason WinXP is much slower at
extracting the contents than any of the other methods (7zip, cygwin
unzip, etc), it works.

Because UTF-8 is a strict superset of ASCII, zipfiles will always work
on all platforms as long as the filenames are ASCII.

The problem is that if we switch to OEM encoding, then the files only work
on Windows machines that use the same encoding. So if you create a zipfile
with Russian filenames, I will get different filenames on my English
Windows. And unzipping under Linux certainly won't create the right names
either. (Linux is more accepting of plain bytestreams, so it is
possible to tell it to interpret the paths differently.)
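
To make the mismatch concrete, here is a small sketch using only Python's
codecs, no zipfile involved. It assumes cp866 as the OEM code page of a
Russian Windows machine and cp437 for an English one; the filenames are
invented for the example.

```python
ascii_name = "README.txt"
russian_name = "привет.txt"

# ASCII-only names encode to the same bytes in UTF-8 and in both code pages.
ascii_bytes = {enc: ascii_name.encode(enc) for enc in ("utf-8", "cp866", "cp437")}

# Bytes as they would be stored on the Russian machine (assumed cp866) ...
stored = russian_name.encode("cp866")

# ... and the name an English machine (assumed cp437) would show for them.
seen_elsewhere = stored.decode("cp437")

print(ascii_bytes)
print(repr(russian_name), "->", repr(seen_elsewhere))
```

The ASCII name is byte-for-byte identical everywhere, while the Russian
name only round-trips if the reader guesses the writer's code page.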

Which means that if you are trying to export to .zip to share with other
people, you are pretty much universally screwed one way or another if
you have non-ASCII filenames. I can understand wanting to use OEM
because then it at least works for the local machine, and for people who
have similarly configured machines. I personally prefer to make an
attempt at cross-platform compatibility and being able to store any
valid Unicode filenames, not just ones that fit in the current OEM
code page. (I can write bågfors.txt and حوجو.txt next to each other, so I
feel I should be able to put them into a zipfile together.)
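
A quick way to see the point about the two filenames, again as a sketch:
check whether one encoding can represent every name in a list. The
can_encode_all helper is made up for this example, and cp437 and cp866
stand in for "the current OEM code page".

```python
names = ["bågfors.txt", "حوجو.txt"]


def can_encode_all(names, encoding):
    """Return True if every name can be encoded with `encoding`."""
    try:
        for name in names:
            name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


for encoding in ("utf-8", "cp437", "cp866"):
    print(encoding, can_encode_all(names, encoding))
```

UTF-8 handles both names at once; a single-byte code page covers at most
the scripts it was designed for.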

I tested this on linux and windows and it works on both platforms.

So I'm not completely opposed to using OEM encoding on Windows. But I
think we need to think about it a bit. For now, I prefer being consistent.

John
=:->

PS> Not supporting directories is significant enough, and we are trying
to do rc1 by tomorrow, so I went ahead and submitted this to the pqm.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: zip-directories.patch
Type: text/x-patch
Size: 18328 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060923/2839d75a/attachment.bin 