Rev 41: Play around a bit. in http://bazaar.launchpad.net/%7Ejameinel/bzr-groupcompress/experimental
John Arbash Meinel
john at arbash-meinel.com
Thu Feb 19 20:55:49 GMT 2009
At http://bazaar.launchpad.net/%7Ejameinel/bzr-groupcompress/experimental
------------------------------------------------------------
revno: 41
revision-id: john at arbash-meinel.com-20090219205517-drw89424koe6h1da
parent: john at arbash-meinel.com-20090219204834-27ltrakcvdmlpqa8
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: experimental
timestamp: Thu 2009-02-19 14:55:17 -0600
message:
Play around a bit.
1) Empty texts are inserted as a no-op, so we never try to match against their text.
2) If we find a new file-id and the compressor is more than half full, we go
ahead and start a new compressor.
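
A minimal sketch of the rule in point 2, for illustration only. The class GroupWriter and its attribute names are hypothetical stand-ins, not the plugin's API. The 10 MiB threshold is taken from the diff below (basis_end > 1024 * 1024 * 10). The sketch assumes keys are tuples whose first element is the file-id prefix.

```python
# Hypothetical stand-in for the insert loop; not the bzr-groupcompress API.
GROUP_SOFT_LIMIT = 10 * 1024 * 1024  # threshold used in the diff


class GroupWriter(object):
    """Tracks how full the current group is and which prefixes it holds."""

    def __init__(self, soft_limit=GROUP_SOFT_LIMIT):
        self.soft_limit = soft_limit
        self.present_prefixes = set()  # file-id prefixes already in this group
        self.basis_end = 0             # bytes accumulated in this group
        self.groups = 1                # number of groups started so far

    def should_start_new_group(self, key):
        # Only multi-element keys carry a file-id prefix.
        if len(key) <= 1:
            return False
        # A prefix the group has not seen cannot delta against earlier texts,
        # so once the group is past the threshold, start a fresh one.
        return (key[0] not in self.present_prefixes
                and self.basis_end > self.soft_limit)

    def add(self, key, text):
        if self.should_start_new_group(key):
            self.present_prefixes = set()
            self.basis_end = 0
            self.groups += 1
        if len(key) > 1:
            self.present_prefixes.add(key[0])
        self.basis_end += len(text)
        return self.groups


writer = GroupWriter(soft_limit=10)        # tiny limit, for demonstration
writer.add(('file-a', 'rev-1'), b'x' * 8)  # first text of file-a
writer.add(('file-a', 'rev-2'), b'x' * 8)  # known prefix: stays in the group
writer.add(('file-b', 'rev-1'), b'y')      # new prefix past the limit: new group
```

The point of the check is that a text with an unseen prefix is likely to be stored as full lines anyway, so moving it to a new group costs little compression.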
-------------- next part --------------
=== modified file 'groupcompress.py'
--- a/groupcompress.py 2009-02-19 20:48:34 +0000
+++ b/groupcompress.py 2009-02-19 20:55:17 +0000
@@ -204,6 +204,10 @@
key = key[:-1] + ('sha1:' + sha1,)
label = '\x00'.join(key)
# setup good encoding for trailing \n support.
+ if not lines:
+ lines_is_empty = True
+ else:
+ lines_is_empty = False
if not lines or lines[-1].endswith('\n'):
lines.append('\n')
else:
@@ -218,7 +222,11 @@
flush_range = self.flush_range
copy_ends = None
blocks = None
- if len(key) > 1:
+ if lines_is_empty:
+ # Empty texts are given a simple 'i1\n\n' insertion instruction.
+ # This prevents us from trying to match against an empty text.
+ blocks = [(0, len(lines), 0)]
+ if blocks is None and len(key) > 1:
prefix = key[0]
if prefix not in self._present_prefixes:
self._present_prefixes.add(prefix)
@@ -642,6 +650,20 @@
bytes = adapter.get_bytes(record,
record.get_bytes_as(record.storage_kind))
lines = osutils.split_lines(bytes)
+ if len(record.key) > 1:
+ prefix = record.key[0]
+ if (prefix not in self._compressor._present_prefixes
+ and basis_end > 1024 * 1024 * 10):
+ # This is a new file id we are inserting.
+ # And the file is already more than half full. This record
+ # would be added as full lines, so go ahead and start a new
+ # group
+ flush()
+ self._compressor = GroupCompressor(self._delta)
+ self._unadded_refs = {}
+ keys_to_add = []
+ basis_end = 0
+ groups += 1
found_sha1, end_point = self._compressor.compress(record.key,
lines, record.sha1)
if record.key[-1] is None:
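
The empty-text change above sets blocks to a single zero-length entry, (0, len(lines), 0). The following sketch shows why that yields a pure insertion. It assumes matching blocks are (old_start, new_start, length) tuples ending in a zero-length sentinel, as in difflib-style matchers. The function instructions_from_blocks and its ('insert', ...) / ('copy', ...) tuples are hypothetical and are not the plugin's on-disk instruction format.

```python
def instructions_from_blocks(blocks):
    """Turn matching blocks into a list of insert/copy operations.

    blocks: (old_start, new_start, length) tuples in order of new_start,
    ending with a zero-length sentinel at the end of the new text.
    """
    ops = []
    pos = 0  # next line of the new text that is not yet covered
    for old_start, new_start, length in blocks:
        if new_start > pos:
            # Lines between matches have no source to copy from.
            ops.append(('insert', pos, new_start - pos))
        if length:
            ops.append(('copy', old_start, length))
        pos = new_start + length
    return ops


# An empty text becomes lines == ['\n'] after the trailing-newline setup,
# so the only block is the sentinel and the whole text is one insertion.
lines = ['\n']
empty_ops = instructions_from_blocks([(0, len(lines), 0)])
```

Because no copy operation is produced, the matcher is never consulted for the empty text.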