[Preview/RFC] win32 fake symlinks
Alexander Belchenko
bialix at ukr.net
Sat Nov 3 15:09:14 GMT 2007
John Arbash Meinel wrote:
> Alexander Belchenko wrote:
>> Hi,
>
>
> ...
>
>> 3) To allow bzr to use fake symlinks on win32 I am monkeypatching the os module. I'm firmly convinced that
>> monkeypatching is the right thing here, because no Linux developer will bother to use
>> osutils.[symlink|readlink|lstat] instead of os.[symlink|readlink|lstat]. And I don't want to trail
>> behind you guys forever as a street-cleaner and fix all the places where you write win32-incompatible code.
>
>> About speed.
>
>
> One thing I noticed:
>
> +def check_fake_symlink(path):
> +    """Return True if file is fake symlink"""
> +    if GetFileAttributes(path) & FILE_ATTRIBUTE_SYSTEM:
> +        f = file(path, 'rb')
> +        data = f.read()
> +        f.close()
> +        if data.startswith('!<symlink>') and data.endswith('\0'):
> +            return True
> +    return False
>
> a) you should still use try/finally. It isn't huge when there is only 1
> command, but it is good to be in the habit.
>
> b) You read the entire content of every file that you stat. So if there is a
> 10MB file, you read 10MB just to find out it isn't a symlink. I realize you
> don't really want an arbitrary limit on the symlink path. Because most systems
> read in pages anyway, you could probably make this a 4k read. 2k or 1k would
> still be longer than Windows really supports anyway.
No, I disagree here. I read the entire file *only if* it has the SYSTEM file attribute.
That is a very important point. I also suspect this approach may be slightly incorrect,
because the SYSTEM attribute is intended for files that belong to the OS.
Maybe this is the reason why Cygwin now creates only Windows shortcuts for symlinks.
But it still supports old-style symlinks, probably for backward compatibility.
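To make points a) and b) concrete, here is a minimal sketch of the check with the SYSTEM attribute as the first gate, a bounded read, and a with-block in place of try/finally. It is written for current Python rather than as a patch against bzr. The 4096-byte cap and the names has_symlink_magic, is_fake_symlink and MAX_LINK_BYTES are assumptions made for illustration; only the '!<symlink>' prefix, the trailing '\0' and the attribute test come from the patch quoted above.

```python
import os

FILE_ATTRIBUTE_SYSTEM = 0x4      # Win32 file attribute bit
SYMLINK_MAGIC = b'!<symlink>'    # prefix checked in the quoted patch
MAX_LINK_BYTES = 4096            # assumed cap, not part of the patch


def has_symlink_magic(path, limit=MAX_LINK_BYTES):
    """Check the content marker, reading at most limit + 1 bytes."""
    # The with-block closes the file even if read() raises,
    # which is what try/finally would do by hand.
    with open(path, 'rb') as f:
        data = f.read(limit + 1)
    if len(data) > limit:
        return False  # longer than any link this sketch accepts
    return data.startswith(SYMLINK_MAGIC) and data.endswith(b'\0')


def is_fake_symlink(path, st=None):
    """Return True if path has the SYSTEM attribute and the marker."""
    if st is None:
        st = os.lstat(path)
    # st_file_attributes only exists on Windows; elsewhere treat it as 0.
    attrs = getattr(st, 'st_file_attributes', 0)
    if not attrs & FILE_ATTRIBUTE_SYSTEM:
        return False  # main gate: the file is never opened
    return has_symlink_magic(path)
```

With this shape a 10MB file without the SYSTEM attribute is never opened, and one that has the attribute costs at most one read of 4097 bytes.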
> c) Is there an encoding for these paths? How do you set a link to a Unicode
> path? I would probably recommend using UTF-8. Though on Win32 using MBCS might
> be reasonable (since that is the path encoding anyway). But that puts '\0'
> characters in your strings, and you are using that as the EOF.
> I'm guessing your current implementation is ending up using OEM encoding. But
> that only works for the subset that is in your code page.
I don't know how Cygwin deals with Unicode. As far as I understand,
it just doesn't support Unicode at all. I know we prefer UTF-8 everywhere, so I'm OK
with using it. But if someone is familiar with Cygwin's internals, I'd like to know
how Cygwin supports Unicode in symlinks.
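If we go with UTF-8, the '\0' terminator stays unambiguous, because UTF-8 only produces a zero byte for the NUL character itself (a UTF-16 encoding would embed zero bytes in ordinary ASCII names). A rough sketch of writing and parsing the link content under that assumption follows. The function names are hypothetical, and this is not a claim about what Cygwin itself stores.

```python
SYMLINK_MAGIC = b'!<symlink>'


def encode_fake_symlink(target):
    """Build the file content for a link pointing at target (a str)."""
    if '\0' in target:
        raise ValueError('link target must not contain NUL')
    return SYMLINK_MAGIC + target.encode('utf-8') + b'\0'


def decode_fake_symlink(data):
    """Return the link target, or None if data is not a fake symlink."""
    if not (data.startswith(SYMLINK_MAGIC) and data.endswith(b'\0')):
        return None
    raw = data[len(SYMLINK_MAGIC):-1]  # strip the marker and the NUL
    try:
        return raw.decode('utf-8')
    except UnicodeDecodeError:
        return None  # not valid UTF-8, so not one of our links
```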
> d) You could use the size of the file to filter out files that could not
> possibly be symlinks. (Too big or too small.)
Probably. But as I said above, the main gate is the SYSTEM attribute.
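For completeness, the size filter from d) could look roughly like this. The size comes from the same stat call, so it costs no extra disk access. The upper bound of 4096 bytes is an assumption, and the function names are made up for the example.

```python
import os

SYMLINK_MAGIC = b'!<symlink>'
MIN_LINK_SIZE = len(SYMLINK_MAGIC) + 1   # marker plus the trailing NUL
MAX_LINK_SIZE = 4096                     # assumed cap, not from the patch


def size_allows_symlink(size):
    """Return True if a file of this size could hold a fake symlink."""
    return MIN_LINK_SIZE <= size <= MAX_LINK_SIZE


def may_be_fake_symlink(path):
    """Cheap pre-check based only on the size reported by lstat."""
    return size_allows_symlink(os.lstat(path).st_size)
```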
> e) You could instead just read for "!<symlink>" at the beginning, which is a
> very fixed (and short) length. However, reading the content of files usually
> means making the HD heads move to a different location on disk. The metadata is
> usually stored separately from the actual content. I'm curious if you'd see big
> differences between FAT32 and NTFS for this. Just because they lay things out
> differently on disk.
Yeah, that is probably true. When I showed the numbers I didn't say which filesystem was used.
The numbers for the IDE disk are for FAT32. That probably explains why it is so slow.
> f) I would probably prefer it if this functionality was optional. (Provided by
> a plugin.) It doesn't make a lot of sense to penalize all the normal Win32
> users by checking every file if it is a symlink. Especially considering that
> most Win32 projects won't/can't use them.
I believe a C reimplementation of lstat will be fast enough. The builtin os.lstat
already reads the file attributes, so I would not need to read them twice. And as I said
above, the file attributes are the first and main gate here.
An implementation of lstat_fake_symlink in C should be fast.
Another reason: if I install a plugin, I usually have it loaded every time, even when it
is not needed, and pay the penalty on every operation.
But having this functionality in a plugin means that you'll get bug reports about
"module os doesn't have symlink" all the time, again and again.
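To show what I mean by monkeypatching, here is a small sketch of the "fill in only what is missing" part. The helper install_fallbacks is hypothetical and not code from the patch; it adds a function to a module only when the platform does not already provide that name. Replacing an existing function such as lstat would have to be done unconditionally and is not covered here.

```python
import types


def install_fallbacks(module, fallbacks):
    """Add each fallback to module only if the name is missing.

    Returns the list of names that were actually installed.
    """
    installed = []
    for name, func in fallbacks.items():
        if not hasattr(module, name):
            setattr(module, name, func)
            installed.append(name)
    return installed


# Demonstration on a stand-in namespace rather than the real os module.
stand_in = types.SimpleNamespace(lstat=lambda path: 'native lstat')
names = install_fallbacks(stand_in, {
    'symlink': lambda src, dst: 'fake symlink',
    'lstat': lambda path: 'fake lstat',
})
```

In the demonstration only symlink is installed, because the stand-in already has lstat. Code that calls os.symlink directly would then keep working without knowing about osutils.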
> I would actually prefer if our TreeTransform code just knew whether it could
> create symlinks, and if it got them, just have it either fail with a clean
> message, delete them automatically (a bit ugly), or have a way to set those
> files as "hidden". Considered part of the tree, but not present on disk.
>
> The last is my favorite, but we need a way to store it. Which means a format bump.
I don't have enough time or motivation to go that far.
>
> John
> =:->