Extra revisions in a branch

Wed Nov 16 16:05:47 GMT 2005

Aaron Bentley <aaron.bentley at utoronto.ca> writes:

> Matthieu Moy wrote:
>> diff -r branch:XXX
>> 
>> Also fetches the revisions.
>
> Yes.  Since this discussion started off with the assertion that we
> should stop supporting branch:XXX, I thought it would be circular logic
> to use that example.

But replacing the syntax with a --branch option would probably lead to
the same result, so ...

> I agree.  We have the goals that
> 1. we want to make bzr easy for beginners, so we don't require initial
> setup of a cache directory
> 2. we want to make bzr fast, so we need to cache any data that might
> come from remote sources
> 3. it would be nice if the cache were persistent, so that repeated
> operations using the same data (e.g. pull, then merge) could be fast
> 4. it would be nice if successful merges or pulls did not leave behind
> extra cached data.

Would it be acceptable to say

- By default, everything goes in the current branch (as it does now)

- But the user can specify a cache directory, and then, this directory
  is used instead (in particular, this would allow some kind of
  centralized storage for the cache, even for someone using individual
  branches).

?

With a cache directory set, this would also allow running

  bzr diff -r branch:/path/to/branch..branch:/path/to/other/branch

from outside a branch (I'm interested in this, since I'd like to
implement a kind of "browser" in DVC that could work for remote
branches).
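
As a rough sketch of the fallback rule I have in mind (the function
and parameter names are made up for illustration, this is not bzr's
API, and I am assuming the default location is the branch's .bzr
directory):

```python
import os


def resolve_cache_dir(branch_root=None, cache_dir=None):
    """Pick the directory where fetched revisions are stored.

    Hypothetical helper: both parameters are assumptions, not bzr options.
    """
    # A user-specified cache directory always wins.
    if cache_dir is not None:
        return os.path.abspath(cache_dir)
    # Otherwise fall back to the current branch, as happens now.
    if branch_root is not None:
        return os.path.join(os.path.abspath(branch_root), ".bzr")
    # Outside a branch there is nowhere to store data by default.
    raise ValueError("outside a branch, a cache directory must be set")
```

The last case is what would make the diff above work from outside a
branch: with no branch to fall back on, the cache directory is the
only place the fetched revisions can go.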

> Oh, but then it wouldn't be append-only...
> I understand the motivation, and such a thing could be done.  But
> realistically, the extra data is only significant when you fetch from an
> unrelated project, or when you accidentally commit confidential data
> (e.g. nuclear launch codes) to a public branch.

My concern is that I like to keep "important data" and other data
clearly distinct, because 1) disk space is cheap, but reliable backups
are expensive, and 2) I often work with important data on a
not-so-fast NFS server and unimportant data on a fast local disk.

The current approach will accumulate data from accidental accesses to
remote branches in my .bzr directory, with no way to get this disk
space back. If I accidentally diff against an unrelated remote branch
once, then I'll have this information forever in my .bzr/ directory.

I agree that those accidental accesses will not happen often, but if
you can't undo them, then I'm afraid the probability that such a thing
happens at least once in the life of a project is close to 1.
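
To put a rough number on it (both figures below are made up for
illustration, they are not measurements): if each remote operation
independently has probability p of hitting an unrelated branch, the
chance of at least one accident in n operations is 1 - (1 - p)^n.

```python
def p_at_least_once(p, n):
    """Probability of at least one accident in n independent operations."""
    return 1.0 - (1.0 - p) ** n


# Assumed figures: 1% chance per operation, 500 operations in a project's life.
print(round(p_at_least_once(0.01, 500), 2))  # prints 0.99
```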

> Actually, with knits, it would be possible to have a second set of
> directories in the repository containing unmerged knit hunks.  You'd
> need to combine the 'base' knit with the extra hunks to reconstruct the
> text.  But it would always be safe to destroy the cache directory, then.

I like this solution a lot (but I don't know how hard it is to
implement).
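
Here is how I picture it, as a toy model only. This is not the real
knit format: I assume a revision's text is simply its parent's text
plus some appended lines, and every name below is hypothetical.

```python
class OverlayKnit:
    """Toy model of a 'base' store plus a disposable store of unmerged hunks."""

    def __init__(self):
        self.base = {}   # rev_id -> (parent_id or None, lines): merged data
        self.cache = {}  # same shape: fetched but unmerged data

    def add_committed(self, rev_id, parent_id, lines):
        # Base hunks may only build on base hunks, so the base never
        # depends on anything stored in the cache.
        if parent_id is not None and parent_id not in self.base:
            raise ValueError("parent of a committed revision must be in base")
        self.base[rev_id] = (parent_id, list(lines))

    def add_fetched(self, rev_id, parent_id, lines):
        self.cache[rev_id] = (parent_id, list(lines))

    def text(self, rev_id):
        # Walk the parent chain through both stores, then replay the hunks.
        chain = []
        while rev_id is not None:
            store = self.base if rev_id in self.base else self.cache
            parent_id, lines = store[rev_id]
            chain.append(lines)
            rev_id = parent_id
        return [line for lines in reversed(chain) for line in lines]

    def promote(self, rev_id):
        # On a successful merge, move the revision and any cached
        # ancestors into the base.
        while rev_id is not None and rev_id not in self.base:
            record = self.cache.pop(rev_id)
            self.base[rev_id] = record
            rev_id = record[0]

    def drop_cache(self):
        # Always safe: nothing in base refers to a cached hunk.
        self.cache.clear()
```

With this layout, an accidental fetch only ever lands in the cache
store, so dropping the cache gets the disk space back without
touching anything that was merged.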

"push" could also take advantage of this structure to avoid pushing
useless data (but I realize that it is already clever enough to do
this with the current bzr).

-- 
Matthieu