Large Binary Files

Stephen J. Turnbull stephen at xemacs.org
Thu Oct 14 09:47:06 BST 2010


Kip Warner writes:

 > The artist might make a change to the audio and submit a new one, in
 > which case I don't need the old one but only the new one. The repository
 > would of course have both in there in the history. This is the problem.
 > =(

Well, git at least allows multiple object databases ("isn't that what
'distributed' means?"), so you could back up the old objects to (slow,
remote-mounted) filesystems, keeping only the history DAG local.
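
As a rough sketch of the multiple-object-database idea: git consults the
file objects/info/alternates inside a repository's git directory, which
lists extra object directories one per line. The helper name
add_alternate and the example path below are made up for illustration,
and the helper only edits that file; it does not move any objects.

```python
from pathlib import Path


def add_alternate(git_dir, object_dir):
    """Register object_dir as an extra object store for the repo at git_dir.

    Hypothetical helper: it only appends to objects/info/alternates,
    the file git reads to find additional object directories.
    Returns the list of entries now recorded in that file.
    """
    alternates = Path(git_dir) / "objects" / "info" / "alternates"
    alternates.parent.mkdir(parents=True, exist_ok=True)
    existing = alternates.read_text().splitlines() if alternates.exists() else []
    entry = str(object_dir)
    if entry not in existing:
        # Record each alternate only once, one path per line.
        existing.append(entry)
        alternates.write_text("\n".join(existing) + "\n")
    return existing
```

Something like add_alternate(".git", "/mnt/slow/objects") would be the
intended use, with the old objects moved to the slow filesystem by
whatever means is convenient.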

Also, in git you can use git-filter-branch to rewrite history so that
all record of the old versions of certain files is removed, and the
files appear as if they had just been added in the latest commit.
(*However*, git-filter-branch is not something that you really want
to mess with on a regular basis if you can possibly avoid it.)
The old objects would then be garbage-collected (eventually), and
would not be transferred on cloning.
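
For illustration only, here is one way such a command line might be
assembled. The helper filter_branch_argv is hypothetical and runs
nothing; it just builds the arguments for the commonly used
--index-filter form, which drops a path from every commit on all refs.
Since this rewrites history, try it on a throwaway clone first.

```python
import shlex


def filter_branch_argv(path):
    """Build a `git filter-branch` command that drops `path` from all history.

    Hypothetical helper: it only constructs the argument list.
    """
    # The filter runs once per commit against the index; --ignore-unmatch
    # keeps it from failing on commits that never contained the file.
    index_filter = "git rm --cached --ignore-unmatch -- " + shlex.quote(path)
    return ["git", "filter-branch", "--index-filter", index_filter, "--", "--all"]
```

The resulting list can be handed to subprocess.run() from inside the
repository. filter-branch keeps the pre-rewrite refs under
refs/original/, so the old objects stay reachable until those refs and
the reflog entries are gone and git gc has run.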

I'm not sure how easily these ideas could be implemented in bzr.  But
for example the basic object storage in bzr is done in "packfiles"
(note, plural).  These could be searched for on a (hypothetical)
BZR_ODB_PATH.  It might even be possible to have those stored remotely
(i.e., the BZR_ODB_PATH would contain base URLs for any transport bzr
groks) and downloaded only when they are actually needed.
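
A minimal sketch of what the lookup might look like, to make the idea
concrete. Everything here is an assumption: BZR_ODB_PATH does not exist
in bzr, the function names are invented, and the whitespace separator
is chosen only because ':' would clash with URLs. Remote entries are
skipped rather than fetched.

```python
import os


def parse_odb_path(value):
    """Split a hypothetical BZR_ODB_PATH value into entries.

    Assumption: entries are whitespace-separated, since a ':' separator
    would clash with the ':' in URLs.
    """
    return value.split()


def find_pack(pack_name, entries):
    """Return the first local path holding pack_name, or None.

    Entries that look like URLs are skipped; fetching them through a
    transport is left out of this sketch.
    """
    for entry in entries:
        if "://" in entry:
            continue
        candidate = os.path.join(entry, pack_name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

Entries are tried in order, so fast local directories would go first
and the slow or remote ones last.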



