Smart server plans for 1.7: effort tests, single RPC for branch opening, server-side autopacking.

John Arbash Meinel john at arbash-meinel.com
Wed Aug 6 23:44:08 BST 2008



Andrew Bennetts wrote:
| John Arbash Meinel wrote:
| [...]
|> I was actually thinking we could do a memory-bound test now that we have
|> "bzr -Dmemory". My idea would be to create, say, a 50MB file, commit it,
|> and ensure that memory doesn't go above 200MB (or whatever is reasonable),
|> and then ratchet down that upper limit as we get rid of extra copies. The
|> idea is to use a number large enough that copying the text shows up in
|> memory, but that is immune to the little allocations for everything else
|> we might do.
|>
|> Or say, hardlink 10 of these files, etc. I wouldn't want to be abusive
|> on disk space for the test, but I think we could really use some amount
|> of memory consumption testing.
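
Concretely, such a test might look something like the sketch below. The
bzrlib test helpers are from memory and the 200MB ceiling is just the
number above, so treat it as illustrative rather than a real test:

import resource

from bzrlib.tests import TestCaseWithTransport


class TestCommitMemory(TestCaseWithTransport):

    def test_commit_50MB_file(self):
        tree = self.make_branch_and_tree('.')
        # 50MB of data: big enough that an extra copy of the text
        # stands out against all the small allocations.
        self.build_tree_contents([('big', 'x' * (50 * 1024 * 1024))])
        tree.add(['big'])
        tree.commit('add big file')
        # ru_maxrss is reported in KB on Linux (bytes on OS X).
        peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        self.assertTrue(peak_kb < 200 * 1024,
            'commit peaked at %d KB, wanted < 200MB' % peak_kb)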
|
| Well, if you make a sparse file (“f = open('foo', 'w'); f.seek(50 * 1024 * 1024);
| f.write('x')”), then you won't actually consume significant disk space either.
| It's not completely portable (but you're already talking about using hardlinks);
| on at least Linux and ext3 that doesn't actually allocate 50MB of space.
|
| The unwritten, unallocated bytes are assumed to be 0s.  Obviously that only
| helps certain test cases, but it's a start.
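
For reference, the sparse-file trick spelled out (Linux behaviour;
st_blocks is in 512-byte units, and exactly how much gets allocated
depends on the filesystem):

import os

def make_sparse_file(path, size):
    # Seek past the end and write a single byte; the skipped range
    # becomes a "hole" that reads back as 0s but allocates no blocks
    # on filesystems that support sparse files (e.g. ext3).
    f = open(path, 'wb')
    try:
        f.seek(size - 1)
        f.write('x')
    finally:
        f.close()

make_sparse_file('foo', 50 * 1024 * 1024)
st = os.stat('foo')
print st.st_size           # 52428800 bytes of apparent size
print st.st_blocks * 512   # only a few KB actually allocated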

Yeah, that would help some cases and fail for others. 50MB of all
0s compresses *really* well. I think gzip has an upper bound on its
compression ratio, but I remember doing tests with gzip & bzip2, and you
could compress 2GB+ of 0s down to something very small with bzip2.

% dd if=/dev/zero count=1024 bs=1024 | bzip2 -v > test.bz2
23301.689:1,  0.000 bits/byte, 100.00% saved, 1048576 in, 45 out.
% dd if=/dev/zero count=10240 bs=1024 | bzip2 -v > test.bz2
213995.102:1,  0.000 bits/byte, 100.00% saved, 10485760 in, 49 out.
% dd if=/dev/zero count=102400 bs=1024 | bzip2 -v >! test.bz2
927943.363:1,  0.000 bits/byte, 100.00% saved, 104857600 in, 113 out.

So 100MB of 0s compresses down to just 113 bytes.

It is different for gzip:
% dd if=/dev/zero count=1024 bs=1024 | gzip >! test.gz && ll test.gz
1051 Aug  6 17:39 test.gz
% dd if=/dev/zero count=10240 bs=1024 | gzip >! test.gz && ll test.gz
10208 Aug  6 17:40 test.gz
% dd if=/dev/zero count=102400 bs=1024 | gzip >! test.gz && ll test.gz
101791 Aug  6 17:41 test.gz

1MB => 1K, 10MB => 10K, 100MB => 100K

Or a max of ~1000:1 (deflate's theoretical ceiling is about 1032:1).

(Interestingly, this doesn't change with gzip -9).
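
The same comparison from Python, for anyone who wants it in a test
(exact byte counts will vary a little by library version):

import bz2, zlib

data = '\0' * (10 * 1024 * 1024)   # 10MB of zeros
print len(zlib.compress(data, 9))  # ~10K: deflate can't beat ~1032:1
print len(bz2.compress(data, 9))   # a few dozen bytes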


|
| I suppose we could just skip hardlinks and sparse files altogether and just
| implement a custom transport.  E.g. one that isn't backed by a filesystem at
| all, and just dynamically creates file contents based on name (so
| fakefiles:///100000 could generate a 100000 byte file when read, without
| necessarily buffering 100000 bytes in memory to do so).

We could, except most of the working-tree code goes directly to disk. We
could do it with a MemoryTree, though that is backed only by memory:///.
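
The core of such a transport is just a file-like object that
manufactures bytes on demand. A rough sketch of that piece (this class
is made up for illustration, not the real Transport interface):

class FakeFile(object):
    """Read-only file-like object yielding `size` zero bytes on demand."""

    _template = '\0' * 65536

    def __init__(self, size):
        self._size = size
        self._pos = 0

    def read(self, count=-1):
        remaining = self._size - self._pos
        if count < 0 or count > remaining:
            count = remaining
        self._pos += count
        # Build the result from a fixed 64K template rather than
        # keeping `size` bytes around.
        n, extra = divmod(count, len(self._template))
        return self._template * n + self._template[:extra]

A fakefiles:///100000 transport's get() could then just return
FakeFile(int(relpath)).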

|
|> Any thoughts on how to do it tastefully?
|
| Not really, well not beyond what's written above :)
|
| I agree that it would be valuable.
|
| -Andrew.
|
|

John
=:->


