Rev 3773: (jam) Add a hidden 'dump-btree' command for getting the raw info out in file:///home/pqm/archives/thelove/bzr/%2Btrunk/
Canonical.com Patch Queue Manager
pqm at pqm.ubuntu.com
Fri Oct 10 21:13:52 BST 2008
At file:///home/pqm/archives/thelove/bzr/%2Btrunk/
------------------------------------------------------------
revno: 3773
revision-id: pqm at pqm.ubuntu.com-20081010201349-ccw3kwu9fe7iaw77
parent: pqm at pqm.ubuntu.com-20081010194144-0hujuzlipigm8pbs
parent: john at arbash-meinel.com-20081010191519-jrqt2sf7jw4u392o
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Fri 2008-10-10 21:13:49 +0100
message:
(jam) Add a hidden 'dump-btree' command for getting the raw info out
of a btree index.
added:
bzrlib/tests/blackbox/test_dump_btree.py test_dump_btree.py-20081008203335-zkpcq230b6vubszz-1
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/builtins.py builtins.py-20050830033751-fc01482b9ca23183
bzrlib/tests/blackbox/__init__.py __init__.py-20051128053524-eba30d8255e08dc3
------------------------------------------------------------
revno: 3770.1.5
revision-id: john at arbash-meinel.com-20081010191519-jrqt2sf7jw4u392o
parent: john at arbash-meinel.com-20081010185341-bbrdlq1ydy2ovnv7
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dump_btree
timestamp: Fri 2008-10-10 14:15:19 -0500
message:
Add a trailing period for the option '--raw'
modified:
bzrlib/builtins.py builtins.py-20050830033751-fc01482b9ca23183
------------------------------------------------------------
revno: 3770.1.4
revision-id: john at arbash-meinel.com-20081010185341-bbrdlq1ydy2ovnv7
parent: john at arbash-meinel.com-20081008215612-y9v94tqxreqoangx
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dump_btree
timestamp: Fri 2008-10-10 13:53:41 -0500
message:
Clarify the help text a bit.
modified:
bzrlib/builtins.py builtins.py-20050830033751-fc01482b9ca23183
------------------------------------------------------------
revno: 3770.1.3
revision-id: john at arbash-meinel.com-20081008215612-y9v94tqxreqoangx
parent: john at arbash-meinel.com-20081008215137-wu18nhhorncyon50
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dump_btree
timestamp: Wed 2008-10-08 16:56:12 -0500
message:
Simplify the --raw mode.
I didn't realize, but the only node that is special cased is the 'root' node,
and to read it, you actually have to parse it directly, because the
compressed bytes start immediately after the end of the header, rather than
having any padding before the zlib bytes.
modified:
bzrlib/builtins.py builtins.py-20050830033751-fc01482b9ca23183
bzrlib/tests/blackbox/test_dump_btree.py test_dump_btree.py-20081008203335-zkpcq230b6vubszz-1
------------------------------------------------------------
revno: 3770.1.2
revision-id: john at arbash-meinel.com-20081008215137-wu18nhhorncyon50
parent: john at arbash-meinel.com-20081008204023-z1u32sjby509wl12
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dump_btree
timestamp: Wed 2008-10-08 16:51:37 -0500
message:
Add a --raw output for dump-btree.
This does the minimum it can, so that we can dump out the
raw bytes in a meaningful manner.
modified:
bzrlib/builtins.py builtins.py-20050830033751-fc01482b9ca23183
bzrlib/tests/blackbox/test_dump_btree.py test_dump_btree.py-20081008203335-zkpcq230b6vubszz-1
------------------------------------------------------------
revno: 3770.1.1
revision-id: john at arbash-meinel.com-20081008204023-z1u32sjby509wl12
parent: pqm at pqm.ubuntu.com-20081008020104-e68hyxx45qo19nzx
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dump_btree
timestamp: Wed 2008-10-08 15:40:23 -0500
message:
First draft of a basic dump-btree command.
Does enough for what I need with pack-names files, but I'd like it to be a
bit more 'raw'.
added:
bzrlib/tests/blackbox/test_dump_btree.py test_dump_btree.py-20081008203335-zkpcq230b6vubszz-1
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/builtins.py builtins.py-20050830033751-fc01482b9ca23183
bzrlib/tests/blackbox/__init__.py __init__.py-20051128053524-eba30d8255e08dc3
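
The root-node layout described in the 3770.1.3 message (a plain-text header followed immediately by the zlib bytes, with no padding in between) can be sketched in isolation. This is only an illustration in modern Python, not the bzrlib implementation: `build_root_page` and `split_root_page` are hypothetical helpers, the header text is copied from the expected `--raw` output in the test, and the five-line header length is an assumption based on that same output.

```python
import zlib

# Header text copied from the expected --raw output in test_dump_btree.py.
HEADER = (
    b"B+Tree Graph Index 2\n"
    b"node_ref_lists=1\n"
    b"key_elements=2\n"
    b"len=3\n"
    b"row_lengths=1\n"
)
LEAF = b"type=leaf\ntest\x00key1\x00ref\x00entry\x00value\n"


def build_root_page(header, body):
    """Hypothetical helper: header followed directly by the zlib bytes."""
    return header + zlib.compress(body)


def split_root_page(page, header_lines=5):
    """Hypothetical helper: return (header, decompressed body).

    Assumes the header is exactly `header_lines` newline-terminated lines.
    """
    pos = 0
    for _ in range(header_lines):
        pos = page.index(b"\n", pos) + 1
    return page[:pos], zlib.decompress(page[pos:])


page = build_root_page(HEADER, LEAF)
header, body = split_root_page(page)
```

Because the compressed stream begins right where the header ends, the header has to be parsed to find that offset; the page cannot simply be handed to `zlib.decompress` from a fixed position.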
=== modified file 'NEWS'
--- a/NEWS 2008-10-08 01:28:40 +0000
+++ b/NEWS 2008-10-08 20:40:23 +0000
@@ -7,6 +7,12 @@
IN DEVELOPMENT
--------------
+ IMPROVEMENTS:
+
+ * ``bzr dump-btree`` is a hidden command introduced to allow dumping
+ the contents of a compressed btree file. (John Arbash Meinel)
+
+
bzr 1.8rc1 2008-10-07
---------------------
=== modified file 'bzrlib/builtins.py'
--- a/bzrlib/builtins.py 2008-10-02 17:28:44 +0000
+++ b/bzrlib/builtins.py 2008-10-10 19:15:19 +0000
@@ -29,6 +29,7 @@
 from bzrlib import (
     bugtracker,
     bundle,
+    btree_index,
     bzrdir,
     delta,
     config,
@@ -255,7 +256,81 @@
' revision.')
rev_id = rev.as_revision_id(b)
self.outf.write(b.repository.get_revision_xml(rev_id).decode('utf-8'))
-
+
+
+class cmd_dump_btree(Command):
+    """Dump the contents of a btree index file to stdout.
+
+    PATH is a btree index file, it can be any URL. This includes things like
+    .bzr/repository/pack-names, or .bzr/repository/indices/a34b3a...ca4a4.iix
+
+    By default, the tuples stored in the index file will be displayed. With
+    --raw, we will uncompress the pages, but otherwise display the raw bytes
+    stored in the index.
+    """
+
+    # TODO: Do we want to dump the internal nodes as well?
+    # TODO: It would be nice to be able to dump the un-parsed information,
+    #       rather than only going through iter_all_entries. However, this is
+    #       good enough for a start
+    hidden = True
+    encoding_type = 'exact'
+    takes_args = ['path']
+    takes_options = [Option('raw', help='Write the uncompressed bytes out,'
+                                        ' rather than the parsed tuples.'),
+                    ]
+
+    def run(self, path, raw=False):
+        dirname, basename = osutils.split(path)
+        t = transport.get_transport(dirname)
+        if raw:
+            self._dump_raw_bytes(t, basename)
+        else:
+            self._dump_entries(t, basename)
+
+    def _get_index_and_bytes(self, trans, basename):
+        """Create a BTreeGraphIndex and raw bytes."""
+        bt = btree_index.BTreeGraphIndex(trans, basename, None)
+        bytes = trans.get_bytes(basename)
+        bt._file = cStringIO.StringIO(bytes)
+        bt._size = len(bytes)
+        return bt, bytes
+
+    def _dump_raw_bytes(self, trans, basename):
+        import zlib
+
+        # We need to parse at least the root node.
+        # This is because the first page of every row starts with an
+        # uncompressed header.
+        bt, bytes = self._get_index_and_bytes(trans, basename)
+        for page_idx, page_start in enumerate(xrange(0, len(bytes),
+                                                     btree_index._PAGE_SIZE)):
+            page_end = min(page_start + btree_index._PAGE_SIZE, len(bytes))
+            page_bytes = bytes[page_start:page_end]
+            if page_idx == 0:
+                self.outf.write('Root node:\n')
+                header_end, data = bt._parse_header_from_bytes(page_bytes)
+                self.outf.write(page_bytes[:header_end])
+                page_bytes = data
+            self.outf.write('\nPage %d\n' % (page_idx,))
+            decomp_bytes = zlib.decompress(page_bytes)
+            self.outf.write(decomp_bytes)
+            self.outf.write('\n')
+
+    def _dump_entries(self, trans, basename):
+        try:
+            st = trans.stat(basename)
+        except errors.TransportNotPossible:
+            # We can't stat, so we'll fake it because we have to do the 'get()'
+            # anyway.
+            bt, _ = self._get_index_and_bytes(trans, basename)
+        else:
+            bt = btree_index.BTreeGraphIndex(trans, basename, st.st_size)
+        for node in bt.iter_all_entries():
+            # Node is made up of:
+            # (index, key, value, [references])
+            self.outf.write('%s\n' % (node[1:],))
+
class cmd_remove_tree(Command):
"""Remove the working tree from a given branch/checkout.
=== modified file 'bzrlib/tests/blackbox/__init__.py'
--- a/bzrlib/tests/blackbox/__init__.py 2008-06-05 16:27:16 +0000
+++ b/bzrlib/tests/blackbox/__init__.py 2008-10-08 20:40:23 +0000
@@ -62,6 +62,7 @@
'bzrlib.tests.blackbox.test_conflicts',
'bzrlib.tests.blackbox.test_debug',
'bzrlib.tests.blackbox.test_diff',
+ 'bzrlib.tests.blackbox.test_dump_btree',
'bzrlib.tests.blackbox.test_exceptions',
'bzrlib.tests.blackbox.test_export',
'bzrlib.tests.blackbox.test_find_merge_base',
=== added file 'bzrlib/tests/blackbox/test_dump_btree.py'
--- a/bzrlib/tests/blackbox/test_dump_btree.py 1970-01-01 00:00:00 +0000
+++ b/bzrlib/tests/blackbox/test_dump_btree.py 2008-10-08 21:56:12 +0000
@@ -0,0 +1,80 @@
+# Copyright (C) 2008 Canonical Ltd
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+#
+
+"""Tests of the 'bzr dump-btree' command."""
+
+from bzrlib import (
+    btree_index,
+    tests,
+    )
+from bzrlib.tests import (
+    http_server,
+    )
+
+
+class TestDumpBtree(tests.TestCaseWithTransport):
+
+    def create_sample_btree_index(self):
+        builder = btree_index.BTreeBuilder(
+            reference_lists=1, key_elements=2)
+        builder.add_node(('test', 'key1'), 'value', ((('ref', 'entry'),),))
+        builder.add_node(('test', 'key2'), 'value2', ((('ref', 'entry2'),),))
+        builder.add_node(('test2', 'key3'), 'value3', ((('ref', 'entry3'),),))
+        out_f = builder.finish()
+        try:
+            self.build_tree_contents([('test.btree', out_f.read())])
+        finally:
+            out_f.close()
+
+    def test_dump_btree_smoke(self):
+        self.create_sample_btree_index()
+        out, err = self.run_bzr('dump-btree test.btree')
+        self.assertEqualDiff(
+            "(('test', 'key1'), 'value', ((('ref', 'entry'),),))\n"
+            "(('test', 'key2'), 'value2', ((('ref', 'entry2'),),))\n"
+            "(('test2', 'key3'), 'value3', ((('ref', 'entry3'),),))\n",
+            out)
+
+    def test_dump_btree_http_smoke(self):
+        self.transport_readonly_server = http_server.HttpServer
+        self.create_sample_btree_index()
+        url = self.get_readonly_url('test.btree')
+        out, err = self.run_bzr(['dump-btree', url])
+        self.assertEqualDiff(
+            "(('test', 'key1'), 'value', ((('ref', 'entry'),),))\n"
+            "(('test', 'key2'), 'value2', ((('ref', 'entry2'),),))\n"
+            "(('test2', 'key3'), 'value3', ((('ref', 'entry3'),),))\n",
+            out)
+
+    def test_dump_btree_raw_smoke(self):
+        self.create_sample_btree_index()
+        out, err = self.run_bzr('dump-btree test.btree --raw')
+        self.assertEqualDiff(
+            'Root node:\n'
+            'B+Tree Graph Index 2\n'
+            'node_ref_lists=1\n'
+            'key_elements=2\n'
+            'len=3\n'
+            'row_lengths=1\n'
+            '\n'
+            'Page 0\n'
+            'type=leaf\n'
+            'test\0key1\0ref\0entry\0value\n'
+            'test\0key2\0ref\0entry2\0value2\n'
+            'test2\0key3\0ref\0entry3\0value3\n'
+            '\n',
+            out)
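
The two smoke tests above show the same three entries twice: once as parsed tuples and once as raw leaf lines. As a rough sketch of how the two views relate, the function below turns one raw line into the tuple form. It is written in modern Python rather than the Python 2 of the diff, and `parse_leaf_line` is a hypothetical name, not a bzrlib function. It assumes exactly one reference list holding exactly one reference, which is all the sample index contains; a real index can hold more, and this sketch does not handle that.

```python
def parse_leaf_line(line, key_elements=2):
    """Hypothetical parser for one decompressed leaf line.

    Assumes one reference list with a single reference, as in the
    sample index built by create_sample_btree_index().
    """
    fields = line.rstrip("\n").split("\0")
    key = tuple(fields[:key_elements])            # leading fields form the key
    reference = tuple(fields[key_elements:-1])    # middle fields: the one reference
    value = fields[-1]                            # last field is the value
    return (key, value, ((reference,),))


raw_lines = [
    "test\0key1\0ref\0entry\0value\n",
    "test\0key2\0ref\0entry2\0value2\n",
    "test2\0key3\0ref\0entry3\0value3\n",
]
parsed = [parse_leaf_line(line) for line in raw_lines]
for entry in parsed:
    print(entry)
```

For the sample data, the printed tuples have the same shape as the lines expected by `test_dump_btree_smoke`.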