Google Summer of Code: Encrypted branch/repository format status

Sun Jul 22 02:14:40 BST 2007

On Fri, 2007-07-20 at 11:07 -0700, Martin Pool wrote:
> On 7/19/07, Robert Collins <robertc at robertcollins.net> wrote:
> 
> > > > The other question in my mind is whether this really needs to come in
> > > > at the knit level.  Could you instead interpose an encrypting
> > > > transport for access to some files?  I realize random access may be a
> > > > bit hard to get just right, but that's probably no worse than for
> > > > doing it in a knit... The transport interface is pretty stable.
> >
> > Doing it in a transport layer would essentially involve creating an
> > encrypted VFS that works smoothly over FTP etc. I think that's a harder
> > problem than working within our storage layer. (It has to be safe
> > against concurrent writers because it's outside the locking logic).
> 
> I don't follow you.  It's quite reasonable to have a Transport that
> doesn't support multiple concurrent writers to a file

I did not mean within a single bzr process; I meant across bzr
instances. Different bzr processes need a common enough view of what's
going on to implement lock_write/unlock safely.

> , or even that
> refuses to layer on top of ftp. 

I should have said (s)ftp before. Working in this situation is one of
our very nice features; it would be sad to discard that when adding
another unique feature.

>  This Transport could be created by
> the Repository or even the Knit to do the encryption, and not be
> generally accessible through get_transport.

There's a disconnect between the layers here that I don't believe will
work well with the interface we have available: block ciphers with no
conceptual increase in size will still write more data to the knit
content files than actually requested, because of rounding to the block
size. Then 'append' will actually rewrite data, and that's something we
try hard not to do. If append does not rewrite data, then parsing the
kndx files is going to be more complex, and we will also waste a great
deal of index space.
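
To make the rounding concrete, here is a minimal sketch of the
arithmetic. It assumes a 16-byte block size and PKCS#7-style padding;
neither is specified above, and the function names are purely
illustrative rather than anything in bzr.

```python
BLOCK_SIZE = 16  # assumption: cipher block size in bytes, not stated in the mail


def padded_length(n, block_size=BLOCK_SIZE):
    """Bytes on disk for n plaintext bytes under PKCS#7-style padding.

    This padding always adds between 1 and block_size bytes.
    """
    return (n // block_size + 1) * block_size


def padding_waste(n, block_size=BLOCK_SIZE):
    """Extra bytes written if each append is padded on its own."""
    return padded_length(n, block_size) - n


def bytes_rewritten_by_append(existing, block_size=BLOCK_SIZE):
    """Already-written plaintext bytes sitting in a trailing partial block.

    If the file is instead ciphered as one continuous run of blocks,
    an append has to re-encrypt (rewrite) this many existing bytes.
    """
    return existing % block_size


# A 100-byte record: padded to 112 bytes if padded per append, or its
# last 4 bytes get rewritten by the next append if it is not.
print(padded_length(100), padding_waste(100), bytes_rewritten_by_append(100))
```

Either way the byte counts seen by the layer above stop matching what
is actually on disk, which is the mismatch described above.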

Encrypted block devices within Linux work well because they are expected
to rewrite blocks as single bits change - the underlying layer and the
layer offered by the encrypted device match. Doing an encryption
facility within a filesystem can also match layers because it's able to
make operations like append be done with new blocks on disk, thus
atomically preserving the old data in the event of failure. Neither of
these options is readily available to us.

OTOH if one takes the ciphering up into the knit layer it becomes fairly
trivial to invoke a (de)cryption routine around each read/append
operation.
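
As a rough illustration of that idea (not bzr's knit code), the sketch
below ciphers each record independently and indexes the ciphertext's
offset and length, so an append never touches earlier bytes. The class
name, its methods and toy_cipher are all hypothetical, and toy_cipher
is an XOR stand-in that provides no security at all.

```python
import io


class EncryptedRecordStore:
    """Hypothetical append-only store that ciphers each record on its own."""

    def __init__(self, data_file, encrypt, decrypt):
        self._file = data_file
        self._encrypt = encrypt
        self._decrypt = decrypt
        self._index = {}  # record key -> (offset, length) of the ciphertext

    def append(self, key, plaintext):
        ciphertext = self._encrypt(plaintext)
        self._file.seek(0, io.SEEK_END)  # only ever write past the end
        offset = self._file.tell()
        self._file.write(ciphertext)
        self._index[key] = (offset, len(ciphertext))

    def read(self, key):
        offset, length = self._index[key]
        self._file.seek(offset)
        return self._decrypt(self._file.read(length))


def toy_cipher(data, key=0x5A):
    """NOT encryption: a self-inverse XOR placeholder for a real cipher."""
    return bytes(b ^ key for b in data)


store = EncryptedRecordStore(io.BytesIO(), toy_cipher, toy_cipher)
store.append("rev-1", b"first text")
store.append("rev-2", b"second text")
assert store.read("rev-1") == b"first text"
```

Because the index stores ciphertext lengths, any per-record padding a
real cipher adds stays hidden from callers of read().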

Possibly I'm missing something that would make it trivial to do this
differently - while being no less safe than our unencrypted format, nor
hugely slower.

Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.