[BUG] (trivial) duplicate text in "bzr pull --help"
Robert Collins
robertc at robertcollins.net
Wed Jun 28 20:45:27 BST 2006
On Wed, 2006-06-28 at 14:18 -0400, Aaron Bentley wrote:
> Robert Collins wrote:
> > The only knits for which caching is relevant during fetch
> > is the inventory knit - the revision knit does not have anything other
> > than the index retrieved until the revisions are fetched.
>
> According to my log, inventory and revision knits both use the API
> inefficiently:
...
from fetch.py, you should be seeing:
# we fetch only the referenced inventories because we do not
# know for unselected inventories whether all their required
# texts are present in the other repository - it could be
# corrupt.
to_weave.join(from_weave, pb=child_pb, msg='merge inventory',
              version_ids=revs)
for revisions, KnitRepoFetcher does:
    def _fetch_revision_texts(self, revs):
        # may need to be a InterRevisionStore call here.
        from_transaction = self.from_repository.get_transaction()
        to_transaction = self.to_repository.get_transaction()
        to_sf = self.to_repository._revision_store.get_signature_file(
            to_transaction)
        from_sf = self.from_repository._revision_store.get_signature_file(
            from_transaction)
        to_sf.join(from_sf, version_ids=revs, ignore_missing=True)
        to_rf = self.to_repository._revision_store.get_revision_file(
            to_transaction)
        from_rf = self.from_repository._revision_store.get_revision_file(
            from_transaction)
        to_rf.join(from_rf, version_ids=revs)
That is, it does two join() calls: one for the signature file and one
for the revision file.
For each of these three joins (the inventory join plus these two), the
InterKnit code path should kick in (knit.py registers this at import).
The InterKnit code path does:
    for (version_id, raw_data), \
        (version_id2, options, parents) in \
        izip(self.source._data.read_records_iter_raw(copy_queue_records),
             copy_queue):
        assert version_id == version_id2, 'logic error, inconsistent results'
        count = count + 1
        pb.update("Joining knit", count, total)
        raw_records.append((version_id, options, parents, len(raw_data)))
        raw_datum.append(raw_data)
    self.target._add_raw_records(raw_records, ''.join(raw_datum))
Which is designed to allow a single readv and writev.
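
To make the intent concrete, here is a rough standalone sketch of that
shape: queue up every record position first, issue one vectored read for
the whole queue, then append all the raw bytes with one write. It is not
bzrlib code. toy_readv, copy_records and the index layout of
version_id -> (offset, length) are all made up for illustration.

```python
import io


def toy_readv(fileobj, ranges):
    """Toy stand-in for a vectored read (hypothetical helper).

    Yields the bytes for each (offset, length) pair, in request order.
    """
    for offset, length in ranges:
        fileobj.seek(offset)
        yield fileobj.read(length)


def copy_records(source, target, index, version_ids):
    """Copy the named records using one read request and one write."""
    # index maps version_id -> (offset, length); an assumed layout.
    queue = [(version_id, index[version_id]) for version_id in version_ids]
    ranges = [position for _, position in queue]
    raw_records = []
    raw_datum = []
    # A single vectored read request covers every queued record.
    for (version_id, _), raw_data in zip(queue, toy_readv(source, ranges)):
        raw_records.append((version_id, len(raw_data)))
        raw_datum.append(raw_data)
    # A single write appends all of the raw bytes at once.
    target.write(b"".join(raw_datum))
    return raw_records


source = io.BytesIO(b"aaaabbbbbbcc")
index = {"rev-1": (0, 4), "rev-2": (4, 6), "rev-3": (10, 2)}
target = io.BytesIO()
records = copy_records(source, target, index, ["rev-1", "rev-3"])
```

The point is only that the number of read requests does not grow with
the number of records, as long as the whole queue is handed over in one
go.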
I am guessing that your earlier optimisation work has pessimised the
readv queries that are generated, leading to you seeing what you are
seeing.
> >>But I don't think we have a cheap way of finding out
> >>whether the shortcut of copying knit records will create different
> >>annotations from those that would be produced by rediffing.
> >
> >
> > I think annotations should be considered a cache, and possibly invalid -
> > simply because different diff() routines will generate different
> > annotations, so if a specific annotation style is needed one should
> > reannotate.
>
> I think that would have some nasty side effects. We would always have
> to reannotate before performing a knit merge, because we can't expect
> good results if the annotations are invalid.
By invalid I don't mean 'wrong', I mean 'not as good as it could be'.
Consider, for instance, the change in diff algorithm if/when we switch
the knit sequence matching over to patience diff.
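
A small standalone illustration of what I mean, using Python's difflib
rather than anything in bzrlib. The annotate function and the
suffix_only_blocks matcher are invented for this example, and the second
matcher is a deliberately crude stand-in, not patience diff. Both
matchers give a legitimate diff of the same two texts, yet they
attribute the surviving line to different revisions.

```python
from difflib import SequenceMatcher


def annotate(old_annotated, new_lines, new_rev, get_blocks):
    """Carry annotations forward across matching blocks.

    old_annotated is a list of (revision, line) pairs. Lines that the
    matcher pairs up keep their old origin; the rest go to new_rev.
    """
    old_lines = [line for _, line in old_annotated]
    result = [(new_rev, line) for line in new_lines]
    for i, j, n in get_blocks(old_lines, new_lines):
        for k in range(n):
            result[j + k] = old_annotated[i + k]
    return result


def difflib_blocks(a, b):
    """Matching blocks as computed by difflib.SequenceMatcher."""
    return SequenceMatcher(None, a, b).get_matching_blocks()


def suffix_only_blocks(a, b):
    """Crude hypothetical matcher: pair up only the common suffix."""
    n = 0
    while n < len(a) and n < len(b) and a[-1 - n] == b[-1 - n]:
        n += 1
    return [(len(a) - n, len(b) - n, n)] if n else []


# Two identical lines introduced by different revisions; one is deleted.
old = [("rev-1", "return\n"), ("rev-2", "return\n")]
new = ["return\n"]

by_difflib = annotate(old, new, "rev-3", difflib_blocks)
by_suffix = annotate(old, new, "rev-3", suffix_only_blocks)
```

Neither answer is wrong as a record of where the text could have come
from. They just disagree, which is why I would treat stored annotations
as a cache tied to the diff routine that produced them.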
Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.