[Request] Remote operation need to be cached
James Blackwell
jblack at merconline.com
Tue Oct 4 09:55:08 BST 2005
On Tue, Oct 04, 2005 at 09:39:44AM +0200, Jan Hudec wrote:
> On Tue, Oct 04, 2005 at 03:25:26 -0400, James Blackwell wrote:
> > > Alexander Belchenko wrote:
> > > >I'm trying to use bzr pull because rsyncing for this BIG repository is
> > > >very long operation.
> >
> > On Mon, Oct 03, 2005 at 09:15:54AM -0500, John A Meinel wrote:
> > > rsyncing is going to be *way* faster than bzr pull. Usually 5-10x. Like
> > > rsyncing the bzr.dev tree takes < 3 min, but bzr pull it takes about
> > > 17min for me.
> > >
> > > There is some discussion about CentralizedStorage which would be sort of
> > > a local cache.
> >
> > The last time I heard this issue seriously discussed, centralized storage
> > was practically a given. Other than Doing It, the thing standing in the
> > way is defining clearly how to prune out less valuable data from the
> > cache.
>
> AIUI centralized storage is not a cache, nor anything that even
> remotely resembles one. It also can't prune less valuable data, because
> it can't know which data that would be.
The understanding I came away with was that these two things are so
similar that they may as well be treated as the same thing. Both of them
are a collection of a pile of changes.
In all fairness, they aren't identical. I can see these differences
between a "caching patch pool" (CPP) and a "centralized storage" (CS).
* In CS, the patches are valuable and should be retained. In CPP they are
disposable.
* In CS, the patches are intentionally there. In CPP, they are there by
happenstance.
* Related to the above, CS will be less entropic than CPP.
* In CPP one has an expectation of automatic pruning. In CS automatic
pruning gets developers pruned ("Think you can delete my code? Well,
take THAT").
* CPP is generally going to be closer than CS, which in turn will be
closer than remote.
That said, the similarities between CS and CPP are too strong to ignore.
* Both are a collection of patches for branches.
* Both serve as caches of a sort. Both will likely have similar, if not
identical, storage formats.
* Both are likely (CPP certainly, CS frequently, depending upon
implementation) to effectively be caches of more authoritative data
elsewhere.
* CS can be implemented within a CPP framework by setting no expiry.
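
To make the last point concrete, here is a minimal sketch of one pool class covering both cases. Everything in it (the PatchPool name, the max_age parameter, the in-memory dict) is made up for illustration and is not anything bzr actually provides. With max_age set it behaves like a CPP; with max_age=None it never prunes, which is the CS case.

```python
import time
from typing import Dict, Optional, Tuple


class PatchPool:
    """Hypothetical pool of patches keyed by revision id (not a bzr API)."""

    def __init__(self, max_age: Optional[float] = None) -> None:
        # max_age=None means "no expiry", i.e. behave like centralized storage.
        self.max_age = max_age
        self._patches: Dict[str, Tuple[float, bytes]] = {}

    def add(self, rev_id: str, patch: bytes, now: Optional[float] = None) -> None:
        # Record the patch together with the time it entered the pool.
        stamp = time.time() if now is None else now
        self._patches[rev_id] = (stamp, patch)

    def get(self, rev_id: str) -> Optional[bytes]:
        entry = self._patches.get(rev_id)
        return None if entry is None else entry[1]

    def prune(self, now: Optional[float] = None) -> int:
        """Drop patches older than max_age; return how many were dropped."""
        if self.max_age is None:
            return 0  # CS mode: nothing is ever disposable
        current = time.time() if now is None else now
        stale = [rev for rev, (stamp, _) in self._patches.items()
                 if current - stamp > self.max_age]
        for rev in stale:
            del self._patches[rev]
        return len(stale)
```

Age is only a stand-in here for "less valuable"; how to decide what is worth pruning is exactly the open question mentioned earlier in the thread.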
> Actually, it could know, using some smart tricks with hardlinks, but it
> would only work if all working copies using that centralized storage
> are on the same partition and that partition supports hardlinks.
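
One way to read the hardlink trick, sketched under assumptions: if every working copy hardlinks the files it uses out of the store, then a store file whose link count is 1 is referenced by nobody else. The directory layout and the unreferenced() and demo() helpers below are hypothetical; only os.stat, os.link and st_nlink are real.

```python
import os
import tempfile
from typing import List


def unreferenced(store_dir: str) -> List[str]:
    """Return names of files in store_dir that have no other hard links."""
    names = []
    for name in sorted(os.listdir(store_dir)):
        path = os.path.join(store_dir, name)
        # st_nlink == 1 means the store holds the only link to this file.
        if os.path.isfile(path) and os.stat(path).st_nlink == 1:
            names.append(name)
    return names


def demo() -> List[str]:
    """Build a throwaway store plus one working tree and report orphans."""
    with tempfile.TemporaryDirectory() as root:
        store = os.path.join(root, "store")
        tree = os.path.join(root, "tree")
        os.mkdir(store)
        os.mkdir(tree)
        for name in ("rev-a", "rev-b"):
            with open(os.path.join(store, name), "w") as handle:
                handle.write(name)
        # The working tree uses rev-a by hardlinking it out of the store.
        os.link(os.path.join(store, "rev-a"), os.path.join(tree, "rev-a"))
        return unreferenced(store)
```

As noted in the quoted text, this only works when the store and all working copies sit on one partition that supports hardlinks.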
> > > I'm not sure how it would work with the new weaves, though.
> >
> > Excellent question.
>
> Why shouldn't it? It should work just as the in-tree storage in .bzr,
> except it would allow multiple heads (revisions with no descendants).
My thought on this is that a file-storage mechanism is well suited to
keeping previous "revisions" apart. In a weave, these revisions are put
together in a way that makes them more difficult to tease apart. I suspect
that in practice the amount of effort to tease a revision out of a weave
will be higher than just storing the entire branch.
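
A toy model of what I mean, and only a toy: here a "weave" is a list of lines, each tagged with the set of revisions that contain it. That is a simplification I am assuming for illustration, not the actual bzr weave format. The point is that pulling one revision out means walking every line of every revision ever woven in.

```python
from typing import FrozenSet, List, Tuple

# One weave entry: the line text plus the revisions in which it is present.
WeaveLine = Tuple[str, FrozenSet[str]]


def extract(weave: List[WeaveLine], rev: str) -> List[str]:
    """Rebuild one revision by scanning the whole weave."""
    return [text for text, revs in weave if rev in revs]


# Two revisions of one file, interleaved in a single structure.
EXAMPLE: List[WeaveLine] = [
    ("hello", frozenset({"r1", "r2"})),
    ("old line", frozenset({"r1"})),
    ("new line", frozenset({"r2"})),
    ("bye", frozenset({"r1", "r2"})),
]
```

With per-revision files, removing r1 is a single delete. In this model it means rewriting the whole list to drop "old line" and strip r1 from every tag.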
Not too long ago I discussed at length the idea of having caches conflate
older data (the idea would _certainly_ work for CPP, probably not for CS).
I was able to get the idea across to some, but I wasn't able to convince
everyone. I think the failure here wasn't in the idea itself, but in how I
described it.