[Fwd: Re: newbie question: using bzrlib api]

John Arbash Meinel john at arbash-meinel.com
Fri Mar 7 12:03:31 GMT 2008



James Westby wrote:
> Hi,
> 
> Could someone answer Rohit's questions in this mail please?
> 
> Thanks,
> 
> James
> 
> -------- Forwarded Message --------
> From: Rohit Nayak <kharifcrop at gmail.com>
> To: James Westby <jw+debian at jameswestby.net>
> Subject: Re: newbie question: using bzrlib api
> Date: Thu, 6 Mar 2008 23:26:59 +0530
> 
> Hi James,
> 
> Thanks a lot for your detailed reply. I thought I would give you some
> detail on what I am trying to achieve so you could advise me.
> 
> I am building a collaboration application. I wanted to do version
> control on the individual text/html documents (each is like a wiki
> page: editable by people who have write access to that page). The
> version control would be a bit like google docs: versioning on
> auto-save. We won't be versioning as often as google docs though ...
> 
> When someone wants to view previous versions I want to show a table of
> previous versions and allow them to browse through each version.
> Later I will use the diff/annotate/label features as well.
> 
> So essentially I am interested only in individual file versioning:
> each revision will affect only one file.
> 
> For a given file I need to get all affected revisions and the ability
> to pull the version corresponding to a revision. I tried
> log.find_touching_revisions(b, fileid), but for some reason the
> generator seems extremely slow: several seconds for each yield.
> I guess I can write my own logformatter to append to a list.
> 
> Do you think bazaar is appropriate for this kind of use, where I am
> versioning a large number of small individual documents?
> 

Bazaar is certainly capable of handling it, but it is not designed
around versioning each file separately. I think a lot of our lower-level
plumbing would actually handle it well, but the 'bzr' program itself
wants to look at things as whole-tree snapshots.

The scaling problem you would start to encounter is that changing a file
and committing it 10 times effectively changes the whole project 10
times. Do that for 1000 files and you suddenly have 10,000 whole-project
changes.

In general, all of the DVCSes work in the same way.

The problem you are running into with "log" comes from a feature
request about how it should work. Specifically, if you make a change
and then merge it, we wanted both the revision that made the change and
the revision that merged it to show up in the final "bzr log". If all
you want is the revisions that actually changed the file, we can be
quite a bit faster.
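The difference between the two questions can be sketched in plain
Python, without bzrlib. The toy revision graph, its revision names, and
both helper functions below are invented for illustration; the point is
that the "also show the merge" behaviour requires walking ancestry,
while the fast question is a simple lookup:

```python
# Toy revision graph: each revision lists its parents and the set of
# files it actually modified. rev-D merges rev-B into the mainline.
graph = {
    'rev-A': {'parents': [], 'modified': {'page.txt'}},
    'rev-B': {'parents': ['rev-A'], 'modified': {'page.txt'}},
    'rev-C': {'parents': ['rev-A'], 'modified': {'other.txt'}},
    'rev-D': {'parents': ['rev-C', 'rev-B'], 'modified': set()},
}

def ancestors(graph, rev):
    """All ancestors of a revision, found by walking parent links."""
    seen = set()
    stack = [rev]
    while stack:
        for parent in graph[stack.pop()]['parents']:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def direct_modifiers(graph, path):
    """The fast question: which revisions actually changed the file?"""
    return {rid for rid, info in graph.items() if path in info['modified']}

def modifiers_and_merges(graph, path):
    """The expensive question: also include merge revisions that pulled
    in a change to the file through a non-first parent."""
    changed = direct_modifiers(graph, path)
    result = set(changed)
    for rid, info in graph.items():
        for parent in info['parents'][1:]:
            if parent in changed or ancestors(graph, parent) & changed:
                result.add(rid)
    return result

print(sorted(direct_modifiers(graph, 'page.txt')))
# -> ['rev-A', 'rev-B']
print(sorted(modifiers_and_merges(graph, 'page.txt')))
# -> ['rev-A', 'rev-B', 'rev-D']
```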

To start with, you probably want to make sure your objects are locked
for the duration of your operations. Many objects keep an in-memory
cache that we flush when unlocked.

I think at the moment it would have to be something like:

  tree.lock_read()
  try:
    repo = tree.branch.repository
    file_id = tree.path2id('path/to/file')
    vf = repo.weave_store.get_weave(file_id, repo.get_transaction())
    # If you care about all of the revisions that touched this file:
    all_revision_ids = vf.versions()

    revisions = repo.get_revisions(all_revision_ids)
    # At this point you have the Revision objects with commit messages.
    # Alternatively, you might be able to use just the revision_ids
    # with the existing log formatters, etc.
  finally:
    tree.unlock()
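Once you have the Revision objects, building the "table of previous
versions" described above is mostly formatting. This plain-Python
sketch uses a namedtuple as a stand-in for bzrlib's Revision (the
attribute names revision_id, timestamp, and message mirror bzrlib's,
but the stand-in itself and the sample data are invented):

```python
import time
from collections import namedtuple

# Stand-in for bzrlib's Revision objects; the real ones carry (at
# least) a revision id, a timestamp, and a commit message.
Revision = namedtuple('Revision', ['revision_id', 'timestamp', 'message'])

def version_table(revisions):
    """Render Revision-like objects, oldest first, as rows of
    (revision_id, UTC date, first line of the commit message)."""
    rows = []
    for rev in sorted(revisions, key=lambda r: r.timestamp):
        date = time.strftime('%Y-%m-%d %H:%M', time.gmtime(rev.timestamp))
        summary = rev.message.splitlines()[0] if rev.message else ''
        rows.append((rev.revision_id, date, summary))
    return rows

# Sample data, newest first, to show the sort:
revs = [
    Revision('rev-2', 1204887600.0, 'auto-save\nsecond line'),
    Revision('rev-1', 1204801200.0, 'initial version'),
]
for row in version_table(revs):
    print('%-8s %s  %s' % row)
```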

John
=:->


