Google Summer of Code: Encrypted branch/repository format status

Bogdano Arendartchuk debogdano at gmail.com
Mon Jul 16 17:49:01 BST 2007


Hello,

I'm working on the encrypted repository and branch format for Bazaar.

Currently I'm coding a repository format that is intended to write all
the data to disk slightly scrambled. This is a prototype and nothing is
encrypted at all; the objective is to get to know the Bazaar code/design
better and to plan what can be reused and what should be reimplemented in
order to fit the application's needs.

What's still missing to finish this prototype format:

- one scrambled branch format, apparently easy to achieve;
- extend BzrDir to have more configuration files (we will need some to
  hold encryption parameters [more below]);
- extend VersionedFileStore, actually the TransportStore part, to generate
  scrambled file names, as we don't want to leak revision-ids.
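
To illustrate the last item, here is a minimal sketch of one way scrambled
file names could be derived. It is not what the prototype does: the use of
HMAC-SHA256, the function name scrambled_name and the example key are all
assumptions made only for illustration.

```python
import hashlib
import hmac


def scrambled_name(secret_key: bytes, identifier: str, suffix: str = "") -> str:
    """Map an identifier to an opaque but deterministic file name.

    Hypothetical helper: a keyed hash means the same identifier always
    maps to the same name, while the name alone does not reveal it.
    """
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest() + suffix


# Example values only; a real key would come from the key file.
key = b"example-secret-not-for-real-use"
name = scrambled_name(key, "some-revision-id", ".knit")
```

Because the mapping is deterministic, the store can still locate a file
from its identifier as long as it holds the key.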

And one open question at the moment is how much I should rely on _KnitIndex
and _KnitData methods (I'm extending these classes). I'm worried because
of those beautiful underscores at the beginning of the class names and of
the methods I'm extending.

For example, the recent change in knit.py to use pyrex-generated code for
knit indexes broke my code because I was extending _KnitIndex._load_data,
and that method is now a module-level function. I patched knit.py (and the
tests) to bring back a _load_data method, which gives me a place to hook
into _KnitIndex. The patch is attached; I can resubmit it if it seems
reasonable.
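
A toy sketch of the hook this gives me, without importing bzrlib at all:
ToyKnitIndex, ScrambledToyKnitIndex and the rot13 "scrambling" are
stand-ins invented for this example, and the index line format is made up.
Only the shape (a method that delegates to a module-level loader) mirrors
the patch.

```python
import codecs
import io


def _index_load_data(index, fp):
    """Stand-in for the module-level loader: one cache entry per line."""
    for line in fp:
        line = line.strip()
        if line:
            index._cache.append(line)


class ToyKnitIndex:
    """Hypothetical stand-in for _KnitIndex."""

    def __init__(self, fp):
        self._cache = []
        self._load_data(fp)

    def _load_data(self, fp):
        # Thin method wrapper, so that subclasses have something to override.
        _index_load_data(self, fp)


class ScrambledToyKnitIndex(ToyKnitIndex):
    """Unscrambles the stored text before handing it to the normal loader."""

    def _load_data(self, fp):
        plain = codecs.decode(fp.read(), "rot_13")
        super()._load_data(io.StringIO(plain))


# The index as it would sit on disk: scrambled, not encrypted.
stored = codecs.encode("rev-1 fulltext\nrev-2 line-delta\n", "rot_13")
index = ScrambledToyKnitIndex(io.StringIO(stored))
```

With the loader only reachable as a module-level function, the subclass
would have no method to override and would have to replace the function
for every index in the process.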

So the real question is: should I fork/branch/etc. KnitRepository in order
to not depend on implementation details of the upstream knit code and on
the pace at which Bazaar changes?

Regarding the encryption itself, the only detail defined at the moment is
that there will be one file (at the bzrdir level) containing the symmetric
key(s) used to decrypt all the real branch and repository data. This file
would be encrypted/decrypted using the user's GnuPG keypair. I really need
to discuss this topic in depth while finishing the prototype, as I think
the great experience of the folks on the list will be really helpful :-)
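
Just to make the idea concrete, a rough sketch of how such a key file
could be produced by calling the gpg command-line tool. Nothing here is
decided: the function names, the 32-byte key size and the idea of piping
the key to gpg on stdin are assumptions for illustration only.

```python
import secrets
import subprocess


def generate_symmetric_key(num_bytes: int = 32) -> bytes:
    """Return fresh random bytes to use as a symmetric key (size assumed)."""
    return secrets.token_bytes(num_bytes)


def gpg_encrypt_command(recipient: str, output_path: str) -> list:
    """Build a gpg invocation that encrypts stdin to the recipient's key."""
    return [
        "gpg", "--batch", "--yes",
        "--encrypt", "--recipient", recipient,
        "--output", output_path,
    ]


def write_key_file(key: bytes, recipient: str, output_path: str) -> None:
    """Encrypt the symmetric key with GnuPG and write it to output_path.

    Requires gpg to be installed and the recipient's public key to be in
    the keyring; raises CalledProcessError if gpg fails.
    """
    subprocess.run(gpg_encrypt_command(recipient, output_path),
                   input=key, check=True)
```

A call would look like write_key_file(generate_symmetric_key(),
"user@example.com", "keys.gpg"), where both the recipient and the file
name are placeholders.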

-- 
Bogdano Arendartchuk
-------------- next part --------------
# Bazaar revision bundle v0.9
#
# message:
#   Bring back _KnitIndex._load_data so that it can be specialized by
#   knit-based versioned file formats.
#   
# committer: Bogdano Arendartchuk <debogdano at gmail.com>
# date: Sun 2007-07-15 17:31:09.519999981 -0300

=== modified file bzrlib/knit.py
--- bzrlib/knit.py
+++ bzrlib/knit.py
@@ -1150,7 +1150,7 @@
             try:
                 # _load_data may raise NoSuchFile if the target knit is
                 # completely empty.
-                _load_data(self, fp)
+                self._load_data(fp)
             finally:
                 fp.close()
         except NoSuchFile:
@@ -1162,6 +1162,10 @@
                 self._transport.put_bytes_non_atomic(
                     self._filename, self.HEADER, mode=self._file_mode)
 
+    def _load_data(self, fp):
+        """Read in a Knit index."""
+        _index_load_data(self, fp)
+
     def get_graph(self):
         return [(vid, idx[4]) for vid, idx in self._cache.iteritems()]
 
@@ -1900,6 +1904,6 @@
 
 
 try:
-    from bzrlib._knit_load_data_c import _load_data_c as _load_data
+    from bzrlib._knit_load_data_c import _load_data_c as _index_load_data
 except ImportError:
-    from bzrlib._knit_load_data_py import _load_data_py as _load_data
+    from bzrlib._knit_load_data_py import _load_data_py as _index_load_data

=== modified file bzrlib/tests/test_knit.py
--- bzrlib/tests/test_knit.py
+++ bzrlib/tests/test_knit.py
@@ -259,12 +259,12 @@
 class LowLevelKnitIndexTests(TestCase):
 
     def get_knit_index(self, *args, **kwargs):
-        orig = knit._load_data
+        orig = knit._index_load_data
         def reset():
-            knit._load_data = orig
+            knit._index_load_data = orig
         self.addCleanup(reset)
         from bzrlib._knit_load_data_py import _load_data_py
-        knit._load_data = _load_data_py
+        knit._index_load_data = _load_data_py
         return _KnitIndex(*args, **kwargs)
 
     def test_no_such_file(self):
@@ -832,12 +832,12 @@
     _test_needs_features = [CompiledKnitFeature]
 
     def get_knit_index(self, *args, **kwargs):
-        orig = knit._load_data
+        orig = knit._index_load_data
         def reset():
-            knit._load_data = orig
+            knit._index_load_data = orig
         self.addCleanup(reset)
         from bzrlib._knit_load_data_c import _load_data_c
-        knit._load_data = _load_data_c
+        knit._index_load_data = _load_data_c
         return _KnitIndex(*args, **kwargs)
 
 

=== modified directory  // last-changed:debogdano at gmail.com-20070715203109-g6ro
... uufdr1p5g9sm
# revision id: debogdano at gmail.com-20070715203109-g6rouufdr1p5g9sm
# sha1: 457a8a99d75add1f69a5b5a8510714294805adec
# inventory sha1: d3dd1559a248e1bf0c3a4b0407218abc7dfcb7e2
# parent ids:
#   pqm at pqm.ubuntu.com-20070713074627-93zxs9uh528y0fki
# base id: pqm at pqm.ubuntu.com-20070713074627-93zxs9uh528y0fki
# properties:
#   branch-nick: knit_index__load_data
