Managing a photo library with bzr
David Ingamells
david.ingamells at mapscape.eu
Tue Jul 29 07:02:08 BST 2008
This sounds more like an application for a database (e.g. MySQL) than a VCS.
You could store thumbnails, metadata (e.g. EXIF data and checksums) and the
on-disk location of each full-size picture file in the database.
Most languages have good interfaces to MySQL, so I'm sure Python has an
excellent one.
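
To make the idea concrete, here is a rough sketch of such a catalogue. It is
only an illustration under my own assumptions: it uses SQLite through Python's
standard sqlite3 module rather than MySQL so that it runs without a server,
and the table layout and the names open_catalogue, index_photo and
find_duplicates are made up for this example. The picture files stay on disk;
the database only holds the path, a SHA-1 checksum, the size and an optional
thumbnail.

```python
import hashlib
import sqlite3
from pathlib import Path


def file_sha1(path, chunk_size=1 << 20):
    """Hash a file in chunks so a large picture is never fully in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def open_catalogue(db_path=":memory:"):
    """Open (or create) the catalogue. The schema is hypothetical."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS photos (
               path       TEXT PRIMARY KEY,
               sha1       TEXT NOT NULL,
               size_bytes INTEGER NOT NULL,
               thumbnail  BLOB
           )"""
    )
    conn.execute("CREATE INDEX IF NOT EXISTS photos_sha1 ON photos (sha1)")
    return conn


def index_photo(conn, path, thumbnail=None):
    """Record one picture: where it lives, its checksum and its size."""
    path = Path(path)
    conn.execute(
        "INSERT OR REPLACE INTO photos (path, sha1, size_bytes, thumbnail) "
        "VALUES (?, ?, ?, ?)",
        (str(path), file_sha1(path), path.stat().st_size, thumbnail),
    )
    conn.commit()


def find_duplicates(conn):
    """Return {sha1: [paths]} for checksums shared by more than one file."""
    rows = conn.execute(
        "SELECT sha1, path FROM photos WHERE sha1 IN "
        "(SELECT sha1 FROM photos GROUP BY sha1 HAVING COUNT(*) > 1) "
        "ORDER BY sha1, path"
    )
    duplicates = {}
    for sha1, path in rows:
        duplicates.setdefault(sha1, []).append(path)
    return duplicates
```

Calling index_photo() on every file and then find_duplicates() would list the
pictures whose contents are byte-for-byte identical. EXIF fields could go in
extra columns, and a "low disk space" copy of the library would then just be
the database file without the originals.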
John Arbash Meinel wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Sébastien Barthélemy wrote:
> | Hello everybody,
> |
> | I'm in the process of writing a bunch of python scripts to manage my
> | photo library.
> |
> | I would like it to be versioned, and thus I'm wondering if bzr would
> | be suited for that.
> | To give you an idea, I currently have 6100 pictures, "weighing" 7.2 GB.
>
> So, there are a few questions about why you want it to be versioned,
> etc. Are you modifying the pictures, such that you want to be able to
> get back to an older version of them?
>
> Is this *really* better than a real backup system?
>
> For starters, bzr is tuned around working as a version control system
> for source code (mainly text files). It probably would work under what
> you need, but you would probably have to work around bits here and there
> because of how we've tuned the system.
>
> For example, we generally keep at least 1 full-copy of the contents of a
> file in memory while doing operations. (We try to limit it to 3 copies,
> but stuff like merge needs 3 copies to work from. And there are
> certainly "bugs" where we might hold more than that.)
>
> Our storage layer generally doesn't do very well optimizing the size of
> binary files. (Though to be fair, if you are using PNG or JPG files as
> your source, those aren't going to do very well with just about
> anything, because they are already compressed.)
>
> You probably won't be able to commit all 7GB of files in one pass. You
> probably could commit a few at a time, until all 7GB was versioned. I
> know some versions of bzr would also have problems copying this between
> repositories, as it would tend to buffer the stream before sending. We
> are trying to remove those code paths, but I would be surprised if we
> buffered less than 1 text content at a time.
>
>
> |
> | So what do you think about it? Could bzr handle that much data on a
> | regular computer? Let's say regular=mine ;) : 1 GB RAM and a Core Duo
> | processor.
> |
> | Another thing I would like to do is to store somewhere the md5sum of
> | each (version of each) picture in order to ease duplicate detection.
> | Are there versioned properties à la svn in bzr?
>
> Revisions can hold arbitrary metadata, but not really like you have in svn.
>
> However, we already store the sha1 sum of every version of every file,
> so you could just hook into the inventory logic if all you are wanting
> is to check hash collisions.
>
> |
> | There will also be some need to handle/fix the EXIF (and similar)
> | metadata of the pictures at some point. Do you think it would be wise
> | to implement this as a bzr merge plugin?
>
> Well, if you could export it to a file, we would merge it for you.
> Otherwise, yes, you would need a custom merge algorithm.
>
> |
> | Finally, it would be great to have the ability to check out a "low disk
> | space" version of the library, with only the metadata and low
> | resolution pictures, for instance. While it seems quite out of the
> | scope of bzr, maybe someone has a clean solution for this too.
>
> This could be done with a layering approach. So that you actually have 2
> branches/repositories. One with the full versions, and one with the
> thumbnails.
>
> |
> | That's it, feel free to criticize if the idea sounds bad to you.
> |
> | Cheers
> |
>
> I don't think bzr is specifically well suited to the task. If you have
> some development ability in you (as you are at least writing python
> scripts to manage it), we would be open to patches which make things
> work better for you. (Subject to the standard sort of constraints: they
> are of good quality and don't reduce test coverage or code clarity.)
>
> John
> =:->
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.9 (Cygwin)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iEYEARECAAYFAkiOCpgACgkQJdeBCYSNAAO2cQCfdpPJeH9Wz2DsppCsNSnJI3VD
> fu8AnjQegx8ltVRRgb/ukpxrvF4U7Oky
> =qZnZ
> -----END PGP SIGNATURE-----
>
>