[MERGE] Implement hard-link support for branch and checkout
John Arbash Meinel
john at arbash-meinel.com
Wed Jan 23 16:04:02 GMT 2008
Stefan Ring wrote:
> I'm sorry if this has been discussed before but it didn't immediately
> stick out when I searched the list.
>
> Anyway, wouldn't it make sense to hard-link the repository as well? I
> would very much enjoy to have this feature. For Mercurial, the
> official way to clone a directory is "cp -al". I love that! I tried it
> with bzr when I started playing around with it (0.90) but it seemed to
> corrupt the repositories. Also, with the knitpacks format being more
> or less append-only, it should be fairly safe.
>
> Or is it supported already maybe? I might have missed this one.
The problem with knits is that you end up with race conditions. We lock
at the Repository layer, so two hard-linked repositories could each take
their own lock at the same time and end up writing to the same files at
the same time.
Mercurial gets around this by always breaking the hard-link whenever it
is going to update a revlog file. However, you can only reliably detect
that on the local filesystem. And we support accessing branches directly
over sftp and ftp. (There is no stat that returns the number of
hardlinks on those transports.)
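To make "breaking the hard-link" concrete, here is a minimal Python
sketch of the idea. It is not Mercurial's or bzr's actual code; the
helper names break_hardlink and append_safely are made up for
illustration, and it assumes a local filesystem where stat() reports a
link count, which is exactly what sftp and ftp don't give us.

```python
import os
import shutil
import tempfile


def break_hardlink(path):
    """Give `path` a private copy if other names share its inode.

    Hypothetical helper: only meaningful on a local filesystem.
    Returns True if a copy was made.
    """
    if os.stat(path).st_nlink <= 1:
        return False  # no other name points at this inode
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        shutil.copy2(path, tmp)  # copy contents and metadata
        os.replace(tmp, path)    # swap the private copy into place
    except BaseException:
        os.unlink(tmp)
        raise
    return True


def append_safely(path, data):
    """Append to `path` without touching other hard-linked names."""
    break_hardlink(path)
    with open(path, "ab") as f:
        f.write(data)
```

Note that the check and the copy are not atomic with respect to another
process doing the same thing, so this still needs a lock that covers
every name of the file.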
So we chose not to support hard-linked repositories for Knits.
We have shared repositories, which work a whole lot better in the
long term anyway. With Mercurial's hard links, as you commit more and
more, your branches diverge more and more rather than continuing to
share the same storage space. Also, branching from a remote into a
shared repository gets the same storage benefits, rather than requiring
a local "cp -al" followed by a pull to make your new branch match the
remote.
On that note, if you *are* using a shared repository, you could probably
do "cp -al branch1 branch2" since it isn't hard-linking the repository.
I'm not sure if you would end up with weird issues with the
"branch.conf" files, as I haven't tested it at all. But I'm sure the
rest of the files are atomically overwritten, so they effectively
"break hardlinks on write".
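As a rough illustration of why an atomic overwrite breaks the hard link
(a sketch only, not bzr's actual transport code; atomic_write is a
made-up name): the new content goes into a fresh temporary file, which
is then renamed over the target. The rename makes the target name point
at a new inode, so any other hard-linked name keeps the old content.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the contents of `path` by writing a temp file and renaming.

    Hypothetical helper: other hard links to the old file are left
    pointing at the old, unmodified inode.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # make sure the data is on disk first
        os.replace(tmp, path)     # atomic rename over the old name
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

So after "cp -al branch1 branch2", calling something like
atomic_write("branch2/somefile", ...) would leave branch1's copy alone.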
As for the new knitpack/--pack-0.92 format: those repositories are
perfectly capable of being hardlinked, since the repository files are
write-once. However, you end up suffering the same divergence effect.
When one branch creates new data, the other branch won't see it.
Merging between the branches will start duplicating your data.
Branching from another upstream won't share your local storage.
Honestly, shared repositories solve the problem a lot better. It is
possible we will support a "bzr branch --hardlink" style flag for pack
repositories because we can, and it shows up on benchmarks when people
haven't taken the time to set up a shared repository. It isn't terribly
high on the list of requirements, though.
So if you want to try using "cp -al" with standalone pack branches, it
should be fine. And I think we'll be responsive to any bug reports you
file on it.
John
=:->