Rev 3816: Merge in the latest brisbane-core, including the hash_search_key changes. in http://bzr.arbash-meinel.com/branches/bzr/brisbane/inv_as_lines

John Arbash Meinel john at arbash-meinel.com
Thu Feb 12 22:36:26 GMT 2009


At http://bzr.arbash-meinel.com/branches/bzr/brisbane/inv_as_lines

------------------------------------------------------------
revno: 3816
revision-id: john at arbash-meinel.com-20090212223553-i8x5whzol4eq1x5d
parent: john at arbash-meinel.com-20090212223304-aw6qqypkpqzc3rj6
parent: john at arbash-meinel.com-20090212223418-0srirubvn4ybz180
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: inv_as_lines
timestamp: Thu 2009-02-12 16:35:53 -0600
message:
  Merge in the latest brisbane-core, including the hash_search_key changes.
modified:
  bzrlib/bzrdir.py               bzrdir.py-20060131065624-156dfea39c4387cb
  bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
  bzrlib/chk_serializer.py       chk_serializer.py-20081002064345-2tofdfj2eqq01h4b-1
  bzrlib/inventory.py            inventory.py-20050309040759-6648b84ca2005b37
  bzrlib/repofmt/pack_repo.py    pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
  bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
  bzrlib/tests/test_chk_map.py   test_chk_map.py-20081001014447-ue6kkuhofvdecvxa-2
  bzrlib/tests/test_inv.py       testinv.py-20050722220913-1dc326138d1a5892
    ------------------------------------------------------------
    revno: 3814.1.1
    revision-id: john at arbash-meinel.com-20090212223418-0srirubvn4ybz180
    parent: john at arbash-meinel.com-20090212211000-msisyrtb3o4mawln
    parent: john at arbash-meinel.com-20090212212016-7fs1tvb30hf8omcg
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: brisbane-core
    timestamp: Thu 2009-02-12 16:34:18 -0600
    message:
      Merge the hash_search_key branch.
    modified:
      bzrlib/bzrdir.py               bzrdir.py-20060131065624-156dfea39c4387cb
      bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
      bzrlib/chk_serializer.py       chk_serializer.py-20081002064345-2tofdfj2eqq01h4b-1
      bzrlib/inventory.py            inventory.py-20050309040759-6648b84ca2005b37
      bzrlib/repofmt/pack_repo.py    pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
      bzrlib/tests/test_chk_map.py   test_chk_map.py-20081001014447-ue6kkuhofvdecvxa-2
      bzrlib/tests/test_inv.py       testinv.py-20050722220913-1dc326138d1a5892
    ------------------------------------------------------------
    revno: 3809.1.11
    revision-id: john at arbash-meinel.com-20090212212016-7fs1tvb30hf8omcg
    parent: john at arbash-meinel.com-20090212210741-1azx1cuyvrdl2lfy
    parent: john at arbash-meinel.com-20090212211000-msisyrtb3o4mawln
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: hash_search_key
    timestamp: Thu 2009-02-12 15:20:16 -0600
    message:
      Bring in the latest brisbane-core, and all associated bzr.dev
    added:
      bzrlib/plugins/launchpad/test_lp_open.py test_lp_open.py-20090125174355-hxrsxh3sj84225qu-1
      bzrlib/tests/blackbox/test_filesystem_cicp.py test_filesystem_cicp-20081028010456-vclkg401m81keaxc-1
      bzrlib/tests/branch_implementations/test_dotted_revno_to_revision_id.py test_dotted_revno_to-20090121014844-6x7d9jtri5sspg1o-1
      bzrlib/tests/branch_implementations/test_iter_merge_sorted_revisions.py test_merge_sorted_re-20090121004847-to3gvjwigstu93eh-1
      bzrlib/tests/branch_implementations/test_revision_id_to_dotted_revno.py test_revision_id_to_-20090122052032-g3czslif6sdqfkh3-1
      bzrlib/tests/test_smart_request.py test_smart_request.p-20090211070731-o38wayv3asm25d6a-1
      doc/developers/case-insensitive-file-systems.txt caseinsensitivefiles-20081117224243-p84xpmqnsa1p8k91-1
      doc/developers/colocated-branches.txt colocatedbranches.tx-20090209183539-wv9upczfd8ryyfn1-1
      doc/news-template.txt          newstemplate.txt-20090113030949-kn6dn0xcj1rd6vmn-1
    modified:
      NEWS                           NEWS-20050323055033-4e00b5db738777ff
      bzr                            bzr.py-20050313053754-5485f144c7006fa6
      bzrlib/__init__.py             __init__.py-20050309040759-33e65acf91bbcd5d
      bzrlib/add.py                  add.py-20050323030017-3a77d63feda58e33
      bzrlib/annotate.py             annotate.py-20050922133147-7c60541d2614f022
      bzrlib/branch.py               branch.py-20050309040759-e4baf4e0d046576e
      bzrlib/builtins.py             builtins.py-20050830033751-fc01482b9ca23183
      bzrlib/bundle/__init__.py      changeset.py-20050513021216-b02ab57fb9738913
      bzrlib/bzrdir.py               bzrdir.py-20060131065624-156dfea39c4387cb
      bzrlib/commands.py             bzr.py-20050309040720-d10f4714595cf8c3
      bzrlib/debug.py                debug.py-20061102062349-vdhrw9qdpck8cl35-1
      bzrlib/delta.py                delta.py-20050729221636-54cf14ef94783d0a
      bzrlib/dirstate.py             dirstate.py-20060728012006-d6mvoihjb3je9peu-1
      bzrlib/errors.py               errors.py-20050309040759-20512168c4e14fbd
      bzrlib/fetch.py                fetch.py-20050818234941-26fea6105696365d
      bzrlib/foreign.py              foreign.py-20081112170002-olsxmandkk8qyfuq-1
      bzrlib/graph.py                graph_walker.py-20070525030359-y852guab65d4wtn0-1
      bzrlib/help_topics/__init__.py help_topics.py-20060920210027-rnim90q9e0bwxvy4-1
      bzrlib/help_topics/en/hooks.txt hooks.txt-20070830033044-xxu2rced13f72dka-1
      bzrlib/help_topics/en/rules.txt rules.txt-20080516063844-ghr5l6pvvrhiycun-1
      bzrlib/knit.py                 knit.py-20051212171256-f056ac8f0fbe1bd9
      bzrlib/log.py                  log.py-20050505065812-c40ce11702fe5fb1
      bzrlib/merge.py                merge.py-20050513021216-953b65a438527106
      bzrlib/missing.py              missing.py-20050812153334-097f7097e2a8bcd1
      bzrlib/msgeditor.py            msgeditor.py-20050901111708-ef6d8de98f5d8f2f
      bzrlib/mutabletree.py          mutabletree.py-20060906023413-4wlkalbdpsxi2r4y-2
      bzrlib/osutils.py              osutils.py-20050309040759-eeaff12fbf77ac86
      bzrlib/patches.py              patches.py-20050727183609-378c1cc5972ce908
      bzrlib/plugins/launchpad/__init__.py __init__.py-20060315182712-2d5feebd2a1032dc
      bzrlib/plugins/launchpad/lp_registration.py lp_registration.py-20060315190948-daa617eafe3a8d48
      bzrlib/plugins/launchpad/test_lp_directory.py test_lp_indirect.py-20070126002743-oyle362tzv9cd8mi-1
      bzrlib/plugins/launchpad/test_lp_service.py test_lp_service.py-20080213034527-drf0ucr2x1js3onb-1
      bzrlib/progress.py             progress.py-20050610070202-df9faaab791964c0
      bzrlib/registry.py             lazy_factory.py-20060809213415-2gfvqadtvdn0phtg-1
      bzrlib/remote.py               remote.py-20060720103555-yeeg2x51vn0rbtdp-1
      bzrlib/revisionspec.py         revisionspec.py-20050907152633-17567659fd5c0ddb
      bzrlib/rules.py                properties.py-20080506032617-9k06uqalkf09ck0z-1
      bzrlib/shelf.py                prepare_shelf.py-20081005181341-n74qe6gu1e65ad4v-1
      bzrlib/shelf_ui.py             shelver.py-20081005210102-33worgzwrtdw0yrm-1
      bzrlib/smart/client.py         client.py-20061116014825-2k6ada6xgulslami-1
      bzrlib/smart/message.py        message.py-20080222013625-ncqmh3nrxjkxab87-1
      bzrlib/smart/protocol.py       protocol.py-20061108035435-ot0lstk2590yqhzr-1
      bzrlib/smart/request.py        request.py-20061108095550-gunadhxmzkdjfeek-1
      bzrlib/status.py               status.py-20050505062338-431bfa63ec9b19e6
      bzrlib/tests/__init__.py       selftest.py-20050531073622-8d0e3c8845c97a64
      bzrlib/tests/blackbox/__init__.py __init__.py-20051128053524-eba30d8255e08dc3
      bzrlib/tests/blackbox/test_add.py test_add.py-20060518072250-857e4f86f54a30b2
      bzrlib/tests/blackbox/test_annotate.py testannotate.py-20051013044000-457f44801bfa9d39
      bzrlib/tests/blackbox/test_branch.py test_branch.py-20060524161337-noms9gmcwqqrfi8y-1
      bzrlib/tests/blackbox/test_breakin.py test_breakin.py-20070424043903-qyy6zm4pj3h4sbp3-1
      bzrlib/tests/blackbox/test_init.py test_init.py-20060309032856-a292116204d86eb7
      bzrlib/tests/blackbox/test_log.py test_log.py-20060112090212-78f6ea560c868e24
      bzrlib/tests/blackbox/test_missing.py test_missing.py-20051211212735-a2cf4c1840bb84c4
      bzrlib/tests/blackbox/test_send.py test_bundle.py-20060616222707-c21c8b7ea5ef57b1
      bzrlib/tests/blackbox/test_serve.py test_serve.py-20060913064329-8t2pvmsikl4s3xhl-1
      bzrlib/tests/blackbox/test_shelve.py test_ls_shelf.py-20081202053526-thlo8yt0pi1cgor1-1
      bzrlib/tests/blackbox/test_status.py teststatus.py-20050712014354-508855eb9f29f7dc
      bzrlib/tests/blackbox/test_upgrade.py test_upgrade.py-20060120060132-b41e5ed2f886ad28
      bzrlib/tests/branch_implementations/__init__.py __init__.py-20060123013057-b12a52c3f361daf4
      bzrlib/tests/bzrdir_implementations/test_bzrdir.py test_bzrdir.py-20060131065642-0ebeca5e30e30866
      bzrlib/tests/https_server.py   https_server.py-20071121173708-aj8zczi0ziwbwz21-1
      bzrlib/tests/per_repository_chk/test_supported.py test_supported.py-20080925063728-k65ry0n2rhta6t34-1
      bzrlib/tests/test_bzrdir.py    test_bzrdir.py-20060131065654-deba40eef51cf220
      bzrlib/tests/test_commands.py  test_command.py-20051019190109-3b17be0f52eaa7a8
      bzrlib/tests/test_delta.py     test_delta.py-20070110134455-sqpd1y7mbjndelxf-1
      bzrlib/tests/test_errors.py    test_errors.py-20060210110251-41aba2deddf936a8
      bzrlib/tests/test_foreign.py   test_foreign.py-20081125004048-ywb901edgp9lluxo-1
      bzrlib/tests/test_graph.py     test_graph_walker.py-20070525030405-enq4r60hhi9xrujc-1
      bzrlib/tests/test_http.py      testhttp.py-20051018020158-b2eef6e867c514d9
      bzrlib/tests/test_info.py      test_info.py-20070320150933-m0xxm1g7xi9v6noe-1
      bzrlib/tests/test_knit.py      test_knit.py-20051212171302-95d4c00dd5f11f2b
      bzrlib/tests/test_log.py       testlog.py-20050728115707-1a514809d7d49309
      bzrlib/tests/test_merge.py     testmerge.py-20050905070950-c1b5aa49ff911024
      bzrlib/tests/test_missing.py   test_missing.py-20051212000028-694fa4f658a81f48
      bzrlib/tests/test_msgeditor.py test_msgeditor.py-20051202041359-920315ec6011ee51
      bzrlib/tests/test_osutils.py   test_osutils.py-20051201224856-e48ee24c12182989
      bzrlib/tests/test_patches.py   test_patches.py-20051231203844-f4974d20f6aea09c
      bzrlib/tests/test_progress.py  test_progress.py-20060308160359-978c397bc79b7fda
      bzrlib/tests/test_read_bundle.py test_read_bundle.py-20060615211421-ud8cwr1ulgd914zf-1
      bzrlib/tests/test_remote.py    test_remote.py-20060720103555-yeeg2x51vn0rbtdp-2
      bzrlib/tests/test_revisionspec.py testrevisionnamespaces.py-20050711050225-8b4af89e6b1efe84
      bzrlib/tests/test_rules.py     test_properties.py-20080506033501-3p9kmuob25dho8xl-1
      bzrlib/tests/test_selftest.py  test_selftest.py-20051202044319-c110a115d8c0456a
      bzrlib/tests/test_sftp_transport.py testsftp.py-20051027032739-247570325fec7e7e
      bzrlib/tests/test_shelf.py     test_prepare_shelf.p-20081005181341-n74qe6gu1e65ad4v-2
      bzrlib/tests/test_shelf_ui.py  test_shelf_ui.py-20081027155203-wtcuazg85wp9u4fv-1
      bzrlib/tests/test_smart_add.py test_smart_add.py-20050824235919-c60dcdb0c8e999ce
      bzrlib/tests/test_smart_transport.py test_ssh_transport.py-20060608202016-c25gvf1ob7ypbus6-2
      bzrlib/tests/test_status.py    test_status.py-20060516190614-fbf6432e4a6e8aa5
      bzrlib/tests/test_transform.py test_transaction.py-20060105172520-b3ffb3946550e6c4
      bzrlib/tests/test_ui.py        test_ui.py-20051130162854-458e667a7414af09
      bzrlib/tests/test_versionedfile.py test_versionedfile.py-20060222045249-db45c9ed14a1c2e5
      bzrlib/tests/tree_implementations/test_get_symlink_target.py test_get_symlink_tar-20070225165554-ickod3w3t7u0zzqh-1
      bzrlib/tests/tree_implementations/test_inv.py test_inv.py-20070312023226-0cdvk5uwhutis9vg-1
      bzrlib/tests/tree_implementations/test_path_content_summary.py test_path_content_su-20070904100855-3vrwedz6akn34kl5-1
      bzrlib/transform.py            transform.py-20060105172343-dd99e54394d91687
      bzrlib/transport/__init__.py   transport.py-20050711165921-4978aa7ce1285ad5
      bzrlib/transport/http/__init__.py http_transport.py-20050711212304-506c5fd1059ace96
      bzrlib/transport/http/_pycurl.py pycurlhttp.py-20060110060940-4e2a705911af77a6
      bzrlib/transport/http/_urllib.py _urlgrabber.py-20060113083826-0bbf7d992fbf090c
      bzrlib/transport/http/_urllib2_wrappers.py _urllib2_wrappers.py-20060913231729-ha9ugi48ktx481ao-1
      bzrlib/transport/sftp.py       sftp.py-20051019050329-ab48ce71b7e32dfe
      bzrlib/tree.py                 tree.py-20050309040759-9d5f2496be663e77
      bzrlib/ui/__init__.py          ui.py-20050824083933-8cf663c763ba53a9
      bzrlib/ui/text.py              text.py-20051130153916-2e438cffc8afc478
      bzrlib/upgrade.py              history2weaves.py-20050818063535-e7d319791c19a8b2
      bzrlib/util/bencode.py         bencode.py-20070220044742-sltr28q21w2wzlxi-1
      bzrlib/util/tests/test_bencode.py test_bencode.py-20070713042202-qjw8rppxaz7ky6i6-1
      bzrlib/versionedfile.py        versionedfile.py-20060222045106-5039c71ee3b65490
      bzrlib/workingtree.py          workingtree.py-20050511021032-29b6ec0a681e02e3
      doc/developers/HACKING.txt     HACKING-20050805200004-2a5dc975d870f78c
      doc/developers/api-versioning.txt apiversioning.txt-20070626065626-iiihgmhgkv91uphz-1
      doc/developers/index.txt       index.txt-20070508041241-qznziunkg0nffhiw-1
      doc/developers/plugin-api.txt  pluginapi.txt-20080229110225-q2j5y4agqhlkjn0s-1
      doc/developers/ppa.txt         ppa.txt-20080722055539-606u7t2z32t3ae4w-1
      doc/developers/releasing.txt   releasing.txt-20080502015919-fnrcav8fwy8ccibu-1
      doc/en/user-guide/installing_bazaar.txt installing_bazaar.tx-20071114035000-q36a9h57ps06uvnl-4
      setup.py                       setup.py-20050314065409-02f8a0a6e3f9bc70
      tools/win32/build_release.py   build_release.py-20081105204355-2ghh5cv01v1x4rzz-1
      tools/win32/bzr.iss.cog        bzr.iss.cog-20060622100836-b3yup582rt3y0nvm-5
    ------------------------------------------------------------
    revno: 3809.1.10
    revision-id: john at arbash-meinel.com-20090212210741-1azx1cuyvrdl2lfy
    parent: john at arbash-meinel.com-20090212205535-qcqdw8xdicm5es0i
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: hash_search_key
    timestamp: Thu 2009-02-12 15:07:41 -0600
    message:
      Don't track state for an infrequent edge case.
      
      Almost never will all search keys be identical. So rather than always tracking
      the state, add a function which can check. It is more expensive,
      but 99.9% of the time we never need to evaluate it.
    modified:
      bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
    ------------------------------------------------------------
    revno: 3809.1.9
    revision-id: john at arbash-meinel.com-20090212205535-qcqdw8xdicm5es0i
    parent: john at arbash-meinel.com-20090212202555-eb721ebcbm3arun1
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: hash_search_key
    timestamp: Thu 2009-02-12 14:55:35 -0600
    message:
      Handle collisions.
      
      When using a hash trie, it is possible for all keys to hash to the
      same value, even though the result no longer fits within the desired
      LeafNode maximum size.
      If this happens, we just keep growing the LeafNode. (The alternative
      causes infinite recursion as we try to put the keys in another node.)
    modified:
      bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
      bzrlib/tests/test_chk_map.py   test_chk_map.py-20081001014447-ue6kkuhofvdecvxa-2
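
The collision rule described in this revision can be sketched outside of bzrlib roughly as follows. This is a simplified illustration, not the code from chk_map.py: the standalone function names, the explicit `current_size` argument, and the deliberately colliding `constant_search_key` are assumptions made for the example, and the real check in the diff also compares the new search key against the node's `_search_prefix`.

```python
def are_search_keys_identical(keys, search_key_func):
    """Return True if every key maps to the same search key."""
    common_search_key = None
    for key in keys:
        search_key = search_key_func(key)
        if common_search_key is None:
            common_search_key = search_key
        elif search_key != common_search_key:
            return False
    return True


def should_split(keys, current_size, maximum_size, search_key_func):
    """Decide whether an over-full leaf should be split.

    A leaf whose keys all share one search key cannot be split usefully:
    every key would land in the same child again, so the leaf is allowed
    to grow past maximum_size instead.
    """
    if len(keys) > 1 and maximum_size and current_size > maximum_size:
        if not are_search_keys_identical(keys, search_key_func):
            return True
    return False


def plain_search_key(key):
    # Hypothetical stand-in for a non-hashing search key.
    return b'\x00'.join(key)


def constant_search_key(key):
    # Hypothetical worst case: every key collides.
    return b'collide'


keys = [(b'a',), (b'b',), (b'c',)]
split_plain = should_split(keys, 200, 100, plain_search_key)
split_collide = should_split(keys, 200, 100, constant_search_key)
```

The identical-keys scan only runs once the size limit has already been exceeded, which matches the reasoning in revno 3809.1.10 that the check is expensive but rarely evaluated.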
    ------------------------------------------------------------
    revno: 3809.1.8
    revision-id: john at arbash-meinel.com-20090212202555-eb721ebcbm3arun1
    parent: john at arbash-meinel.com-20090121230450-rcv4y4r3wsee87r8
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: hash_search_key
    timestamp: Thu 2009-02-12 14:25:55 -0600
    message:
      Expose 2 new formats for 'bzr init'.
      
      We can now create dev4, dev4 + 16-way hash, and dev4 + 255-way hash
      repositories.
    modified:
      bzrlib/bzrdir.py               bzrdir.py-20060131065624-156dfea39c4387cb
      bzrlib/repofmt/pack_repo.py    pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
    ------------------------------------------------------------
    revno: 3809.1.7
    revision-id: john at arbash-meinel.com-20090121230450-rcv4y4r3wsee87r8
    parent: john at arbash-meinel.com-20090121221958-73e6ejetze235lpn
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: hash_search_key
    timestamp: Wed 2009-01-21 17:04:50 -0600
    message:
      Start parameterizing CHKInventory and CHKSerializer so that we can
      have different repository formats which use different hash keys.
    modified:
      bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
      bzrlib/chk_serializer.py       chk_serializer.py-20081002064345-2tofdfj2eqq01h4b-1
      bzrlib/inventory.py            inventory.py-20050309040759-6648b84ca2005b37
      bzrlib/repofmt/pack_repo.py    pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
      bzrlib/tests/test_inv.py       testinv.py-20050722220913-1dc326138d1a5892
    ------------------------------------------------------------
    revno: 3809.1.6
    revision-id: john at arbash-meinel.com-20090121221958-73e6ejetze235lpn
    parent: john at arbash-meinel.com-20090121201426-dorxs36a1djjm6a2
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: hash_search_key
    timestamp: Wed 2009-01-21 16:19:58 -0600
    message:
      Include a _search_key_plain function.
    modified:
      bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
    ------------------------------------------------------------
    revno: 3809.1.5
    revision-id: john at arbash-meinel.com-20090121201426-dorxs36a1djjm6a2
    parent: john at arbash-meinel.com-20090121201027-ji143it572y82ou4
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: hash_search_key
    timestamp: Wed 2009-01-21 14:14:26 -0600
    message:
      Add tests that we can look up things after being serialized.
    modified:
      bzrlib/tests/test_chk_map.py   test_chk_map.py-20081001014447-ue6kkuhofvdecvxa-2
    ------------------------------------------------------------
    revno: 3809.1.4
    revision-id: john at arbash-meinel.com-20090121201027-ji143it572y82ou4
    parent: john at arbash-meinel.com-20090121200529-a1di2ljywetnbaz0
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: hash_search_key
    timestamp: Wed 2009-01-21 14:10:27 -0600
    message:
      Add some tests that we can use the search keys as proper mappings.
    modified:
      bzrlib/tests/test_chk_map.py   test_chk_map.py-20081001014447-ue6kkuhofvdecvxa-2
    ------------------------------------------------------------
    revno: 3809.1.3
    revision-id: john at arbash-meinel.com-20090121200529-a1di2ljywetnbaz0
    parent: john at arbash-meinel.com-20090121193956-ijxjv8tslli5vpir
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: hash_search_key
    timestamp: Wed 2009-01-21 14:05:29 -0600
    message:
      Add functions for _search_key_16 and _search_key_255 and some basic tests for them.
    modified:
      bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
      bzrlib/tests/test_chk_map.py   test_chk_map.py-20081001014447-ue6kkuhofvdecvxa-2
    ------------------------------------------------------------
    revno: 3809.1.2
    revision-id: john at arbash-meinel.com-20090121193956-ijxjv8tslli5vpir
    parent: john at arbash-meinel.com-20090112225502-lb8om88nqe1u5o3g
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: hash_search_key
    timestamp: Wed 2009-01-21 13:39:56 -0600
    message:
      Start passing around the search_key_func in more places.
    modified:
      bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
      bzrlib/tests/test_chk_map.py   test_chk_map.py-20081001014447-ue6kkuhofvdecvxa-2
    ------------------------------------------------------------
    revno: 3809.1.1
    revision-id: john at arbash-meinel.com-20090112225502-lb8om88nqe1u5o3g
    parent: john at arbash-meinel.com-20090112184455-cich99qy75uqt2v1
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: hash_search_key
    timestamp: Mon 2009-01-12 16:55:02 -0600
    message:
      (broken) Start tracking down more code that needs to pass around the 'search_key_func'
      and make sure that things get done correctly.
    modified:
      bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
      bzrlib/tests/test_chk_map.py   test_chk_map.py-20081001014447-ue6kkuhofvdecvxa-2
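
For readers who want to experiment with the three search key mappings named in the log, here is a minimal sketch ported to Python 3 bytes. It is an adaptation under stated assumptions, not the bzrlib code: the originals operate on Python 2 `str` objects, and the 16-way version in the diff uses `abs()` on a possibly signed CRC, whereas this sketch masks to an unsigned 32-bit value, so its hex output may differ from what bzrlib produced.

```python
import struct
import zlib


def search_key_plain(key):
    """Join the key tuple's elements with NUL; no hashing."""
    return b'\x00'.join(key)


def search_key_16(key):
    """Hash each element to 8 uppercase hex digits (16-way fan out per byte)."""
    return b'\x00'.join(
        b'%08X' % (zlib.crc32(bit) & 0xFFFFFFFF) for bit in key)


def search_key_255(key):
    """Hash each element to 4 raw bytes, avoiding the newline delimiter.

    Every byte value except b'\\n' can appear, hence 255-way fan out.
    """
    raw = b'\x00'.join(
        struct.pack('>I', zlib.crc32(bit) & 0xFFFFFFFF) for bit in key)
    return raw.replace(b'\n', b'_')


example_key = (b'file-id', b'revision-id')
plain = search_key_plain(example_key)
hashed_16 = search_key_16(example_key)
hashed_255 = search_key_255(example_key)
```

The hashed forms trade readable prefixes for a more even spread of keys across internal nodes, at the cost of possible collisions between distinct keys.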
-------------- next part --------------
=== modified file 'bzrlib/bzrdir.py'
--- a/bzrlib/bzrdir.py	2009-02-12 21:10:00 +0000
+++ b/bzrlib/bzrdir.py	2009-02-12 21:20:16 +0000
@@ -3241,5 +3241,27 @@
     hidden=True,
     experimental=True,
     )
+format_registry.register_metadir('development4-hash16',
+    'bzrlib.repofmt.pack_repo.RepositoryFormatPackDevelopment4Hash16',
+    help='1.9 with CHK inventories with parent_id index and 16-way hash trie. '
+        'Please read '
+        'http://doc.bazaar-vcs.org/latest/developers/development-repo.html '
+        'before use.',
+    branch_format='bzrlib.branch.BzrBranchFormat7',
+    tree_format='bzrlib.workingtree.WorkingTreeFormat4',
+    hidden=True,
+    experimental=True,
+    )
+format_registry.register_metadir('development4-hash255',
+    'bzrlib.repofmt.pack_repo.RepositoryFormatPackDevelopment4Hash255',
+    help='1.9 with CHK inventories with parent_id index and 255-way hash trie. '
+        'Please read '
+        'http://doc.bazaar-vcs.org/latest/developers/development-repo.html '
+        'before use.',
+    branch_format='bzrlib.branch.BzrBranchFormat7',
+    tree_format='bzrlib.workingtree.WorkingTreeFormat4',
+    hidden=True,
+    experimental=True,
+    )
 # The current format that is made on 'bzr init'.
 format_registry.set_default('pack-0.92')

=== modified file 'bzrlib/chk_map.py'
--- a/bzrlib/chk_map.py	2009-02-12 22:33:04 +0000
+++ b/bzrlib/chk_map.py	2009-02-12 22:35:53 +0000
@@ -41,11 +41,15 @@
 
 from bzrlib import lazy_import
 lazy_import.lazy_import(globals(), """
+import zlib
+import struct
+
 from bzrlib import versionedfile
 """)
 from bzrlib import (
     lru_cache,
     osutils,
+    registry,
     )
 
 # approx 2MB
@@ -56,19 +60,51 @@
 _page_cache = lru_cache.LRUSizeCache(_PAGE_CACHE_SIZE)
 
 
+def _search_key_plain(key):
+    """Map the key tuple into a search string that just uses the key bytes."""
+    return '\x00'.join(key)
+
+
+def _search_key_16(key):
+    """Map the key tuple into a search key string which has 16-way fan out."""
+    return '\x00'.join(['%08X' % abs(zlib.crc32(bit)) for bit in key])
+
+
+def _search_key_255(key):
+    """Map the key tuple into a search key string which has 255-way fan out.
+
+    We use 255-way because '\n' is used as a delimiter, and causes problems
+    while parsing.
+    """
+    bytes = '\x00'.join([struct.pack('>i', zlib.crc32(bit)) for bit in key])
+    return bytes.replace('\n', '_')
+
+
+search_key_registry = registry.Registry()
+search_key_registry.register('plain', _search_key_plain)
+search_key_registry.register('hash-16-way', _search_key_16)
+search_key_registry.register('hash-255-way', _search_key_255)
+
+
 class CHKMap(object):
     """A persistent map from string to string backed by a CHK store."""
 
-    def __init__(self, store, root_key):
+    def __init__(self, store, root_key, search_key_func=None):
         """Create a CHKMap object.
 
         :param store: The store the CHKMap is stored in.
         :param root_key: The root key of the map. None to create an empty
             CHKMap.
+        :param search_key_func: A function mapping a key => bytes. These bytes
+            are then used by the internal nodes to split up leaf nodes into
+            multiple pages.
         """
         self._store = store
+        if search_key_func is None:
+            search_key_func = _search_key_plain
+        self._search_key_func = search_key_func
         if root_key is None:
-            self._root_node = LeafNode()
+            self._root_node = LeafNode(search_key_func=search_key_func)
         else:
             self._root_node = self._node_key(root_key)
 
@@ -95,6 +131,8 @@
         if type(self._root_node) == tuple:
             # Demand-load the root
             self._root_node = self._get_node(self._root_node)
+            # XXX: Shouldn't this be put into _deserialize?
+            self._root_node._search_key_func = self._search_key_func
 
     def _get_node(self, node):
         """Get a node.
@@ -108,7 +146,8 @@
         """
         if type(node) == tuple:
             bytes = self._read_bytes(node)
-            return _deserialise(bytes, node)
+            return _deserialise(bytes, node,
+                search_key_func=self._search_key_func)
         else:
             return node
 
@@ -362,7 +401,8 @@
         if len(node_details) == 1:
             self._root_node = node_details[0][1]
         else:
-            self._root_node = InternalNode(prefix)
+            self._root_node = InternalNode(prefix,
+                                search_key_func=self._search_key_func)
             self._root_node.set_maximum_size(node_details[0][1].maximum_size)
             self._root_node._key_width = node_details[0][1]._key_width
             for split, node in node_details:
@@ -493,11 +533,15 @@
         the key/value pairs.
     """
 
-    def __init__(self):
+    def __init__(self, search_key_func=None):
         Node.__init__(self)
         # All of the keys in this leaf node share this common prefix
         self._common_serialised_prefix = None
         self._serialise_key = '\x00'.join
+        if search_key_func is None:
+            self._search_key_func = _search_key_plain
+        else:
+            self._search_key_func = search_key_func
 
     def _current_size(self):
         """Answer the current serialised size of this node.
@@ -522,13 +566,13 @@
             + bytes_for_items)
 
     @classmethod
-    def deserialise(klass, bytes, key):
+    def deserialise(klass, bytes, key, search_key_func=None):
         """Deserialise bytes, with key key, into a LeafNode.
 
         :param bytes: The bytes of the node.
         :param key: The key that the serialised node has.
         """
-        result = LeafNode()
+        result = LeafNode(search_key_func=search_key_func)
         # Splitlines can split on '\r' so don't use it, split('\n') adds an
         # extra '' if the bytes ends in a final newline.
         lines = bytes.split('\n')
@@ -600,6 +644,9 @@
                 + len(str(value.count('\n'))) + 1
                 + len(value) + 1)
 
+    def _search_key(self, key):
+        return self._search_key_func(key)
+
     def _map_no_split(self, key, value):
         """Map a key to a value.
 
@@ -626,7 +673,12 @@
         if (self._len > 1
             and self._maximum_size
             and self._current_size() > self._maximum_size):
-            return True
+            # Check to see if all of the search_keys for this node are
+            # identical. We allow the node to grow under that circumstance
+            # (we could track this as common state, but it is infrequent)
+            if (search_key != self._search_prefix
+                or not self._are_search_keys_identical()):
+                return True
         return False
 
     def _split(self, store):
@@ -654,7 +706,7 @@
             if len(prefix) < split_at:
                 prefix += '\x00'*(split_at - len(prefix))
             if prefix not in result:
-                node = LeafNode()
+                node = LeafNode(search_key_func=self._search_key_func)
                 node.set_maximum_size(self._maximum_size)
                 node._key_width = self._key_width
                 result[prefix] = node
@@ -707,10 +759,6 @@
         _page_cache.add(self._key, bytes)
         return [self._key]
 
-    def _search_key(self, key):
-        """Return the search key for a key in this node."""
-        return '\x00'.join(key)
-
     def refs(self):
         """Return the references to other CHK's held by this node."""
         return []
@@ -725,6 +773,22 @@
         self._search_prefix = self.common_prefix_for_keys(search_keys)
         return self._search_prefix
 
+    def _are_search_keys_identical(self):
+        """Check to see if the search keys for all entries are the same.
+
+        When using a hash as the search_key it is possible for non-identical
+        keys to collide. If that happens enough, we may try overflow a
+        LeafNode, but as all are collisions, we must not split.
+        """
+        common_search_key = None
+        for key in self._items:
+            search_key = self._search_key(key)
+            if common_search_key is None:
+                common_search_key = search_key
+            elif search_key != common_search_key:
+                return False
+        return True
+
     def _compute_serialised_prefix(self):
         """Determine the common prefix for serialised keys in this node.
 
@@ -757,12 +821,16 @@
         LeafNode or InternalNode.
     """
 
-    def __init__(self, prefix=''):
+    def __init__(self, prefix='', search_key_func=None):
         Node.__init__(self)
         # The size of an internalnode with default values and no children.
         # How many octets key prefixes within this node are.
         self._node_width = 0
         self._search_prefix = prefix
+        if search_key_func is None:
+            self._search_key_func = _search_key_plain
+        else:
+            self._search_key_func = search_key_func
 
     def __repr__(self):
         items_str = sorted(self._items)
@@ -794,14 +862,14 @@
             len(str(self._maximum_size)))
 
     @classmethod
-    def deserialise(klass, bytes, key):
+    def deserialise(klass, bytes, key, search_key_func=None):
         """Deserialise bytes to an InternalNode, with key key.
 
         :param bytes: The bytes of the node.
         :param key: The key that the serialised node has.
         :return: An InternalNode instance.
         """
-        result = InternalNode()
+        result = InternalNode(search_key_func=search_key_func)
         # Splitlines can split on '\r' so don't use it, remove the extra ''
         # from the result of split('\n') because we should have a trailing
         # newline
@@ -881,7 +949,8 @@
                 except KeyError:
                     continue
                 else:
-                    node = _deserialise(bytes, key)
+                    node = _deserialise(bytes, key,
+                        search_key_func=self._search_key_func)
                     self._items[keys[key]] = node
                     found_keys.add(key)
                     yield node
@@ -901,7 +970,8 @@
                 nodes = []
                 for record in stream:
                     bytes = record.get_bytes_as('fulltext')
-                    node = _deserialise(bytes, record.key)
+                    node = _deserialise(bytes, record.key,
+                        search_key_func=self._search_key_func)
                     nodes.append(node)
                     self._items[keys[record.key]] = node
                     _page_cache.add(record.key, bytes)
@@ -920,7 +990,8 @@
             # and then map this key into that node.
             new_prefix = self.common_prefix(self._search_prefix,
                                             search_key)
-            new_parent = InternalNode(new_prefix)
+            new_parent = InternalNode(new_prefix,
+                search_key_func=self._search_key_func)
             new_parent.set_maximum_size(self._maximum_size)
             new_parent._key_width = self._key_width
             new_parent.add_node(self._search_prefix[:len(new_prefix)+1],
@@ -971,6 +1042,7 @@
         child = klass()
         child.set_maximum_size(self._maximum_size)
         child._key_width = self._key_width
+        child._search_key_func = self._search_key_func
         self._items[search_key] = child
         return child
 
@@ -1013,7 +1085,7 @@
         """Return the serialised key for key in this node."""
         # search keys are fixed width. All will be self._node_width wide, so we
         # pad as necessary.
-        return ('\x00'.join(key) + '\x00'*self._node_width)[:self._node_width]
+        return (self._search_key_func(key) + '\x00'*self._node_width)[:self._node_width]
 
     def _search_prefix_filter(self, key):
         """Serialise key for use as a prefix filter in iteritems."""
@@ -1112,7 +1184,7 @@
         #       and cause size changes greater than the length of one key.
         #       So for now, we just add everything to a new Leaf until it
         #       splits, as we know that will give the right answer
-        new_leaf = LeafNode()
+        new_leaf = LeafNode(search_key_func=self._search_key_func)
         new_leaf.set_maximum_size(self._maximum_size)
         new_leaf._key_width = self._key_width
         # A batch_size of 16 was chosen because:
@@ -1132,12 +1204,13 @@
         return new_leaf
 
 
-def _deserialise(bytes, key):
+def _deserialise(bytes, key, search_key_func):
     """Helper for repositorydetails - convert bytes to a node."""
     if bytes.startswith("chkleaf:\n"):
-        return LeafNode.deserialise(bytes, key)
+        return LeafNode.deserialise(bytes, key, search_key_func=search_key_func)
     elif bytes.startswith("chknode:\n"):
-        return InternalNode.deserialise(bytes, key)
+        return InternalNode.deserialise(bytes, key,
+            search_key_func=search_key_func)
     else:
         raise AssertionError("Unknown node type.")
 
@@ -1157,7 +1230,9 @@
         if pb is not None:
             pb.tick()
         bytes = record.get_bytes_as('fulltext')
-        node = _deserialise(bytes, record.key)
+        # We don't care about search_key_func for this code, because we only
+        # care about external references.
+        node = _deserialise(bytes, record.key, search_key_func=None)
         if record.key in uninteresting_keys:
             if isinstance(node, InternalNode):
                 next_uninteresting.update(node.refs())
@@ -1221,7 +1296,9 @@
             else:
                 bytes = adapter.get_bytes(record,
                             record.get_bytes_as(record.storage_kind))
-            node = _deserialise(bytes, record.key)
+            # We don't care about search_key_func for this code, because we
+            # only care about external references.
+            node = _deserialise(bytes, record.key, search_key_func=None)
             if isinstance(node, InternalNode):
                 # uninteresting_prefix_chks.update(node._items.iteritems())
                 chks = node._items.values()
@@ -1289,7 +1366,9 @@
             else:
                 bytes = adapter.get_bytes(record,
                             record.get_bytes_as(record.storage_kind))
-            node = _deserialise(bytes, record.key)
+            # We don't care about search_key_func for this code, because we
+            # only care about external references.
+            node = _deserialise(bytes, record.key, search_key_func=None)
             if isinstance(node, InternalNode):
                 chks = set(node.refs())
                 chks.difference_update(all_uninteresting_chks)

=== modified file 'bzrlib/chk_serializer.py'
--- a/bzrlib/chk_serializer.py	2008-11-14 01:28:40 +0000
+++ b/bzrlib/chk_serializer.py	2009-01-21 23:04:50 +0000
@@ -46,9 +46,11 @@
         else:
             return xml6.Serializer_v6._unpack_entry(self, elt)
 
-    def __init__(self, node_size, parent_id_basename_index):
+    def __init__(self, node_size, parent_id_basename_index,
+                 search_key_name):
         self.maximum_size = node_size
         self.parent_id_basename_index = parent_id_basename_index
+        self.search_key_name = search_key_name
 
 
 class CHKSerializer(xml5.Serializer_v5):
@@ -58,12 +60,16 @@
     revision_format_num = None
     support_altered_by_hack = False
 
-    def __init__(self, node_size, parent_id_basename_index):
+    def __init__(self, node_size, parent_id_basename_index,
+                 search_key_name):
         self.maximum_size = node_size
         self.parent_id_basename_index = parent_id_basename_index
-
-
-chk_serializer_subtree = CHKSerializerSubtree(4096, False)
-chk_serializer = CHKSerializer(4096, False)
-chk_serializer_subtree_parent_id = CHKSerializerSubtree(4096, True)
-chk_serializer_parent_id = CHKSerializer(4096, True)
+        self.search_key_name = search_key_name
+
+
+chk_serializer_subtree = CHKSerializerSubtree(4096, False, 'plain')
+chk_serializer = CHKSerializer(4096, False, 'plain')
+chk_serializer_subtree_parent_id = CHKSerializerSubtree(4096, True, 'plain')
+chk_serializer_parent_id = CHKSerializer(4096, True, 'plain')
+chk_serializer_16_parent_id = CHKSerializer(4096, True, 'hash-16-way')
+chk_serializer_255_parent_id = CHKSerializer(4096, True, 'hash-255-way')

=== modified file 'bzrlib/inventory.py'
--- a/bzrlib/inventory.py	2008-12-19 23:07:32 +0000
+++ b/bzrlib/inventory.py	2009-01-21 23:04:50 +0000
@@ -1362,9 +1362,10 @@
     to reuse.
     """
 
-    def __init__(self):
+    def __init__(self, search_key_name):
         CommonInventory.__init__(self)
         self._entry_cache = {}
+        self._search_key_name = search_key_name
 
     def _entry_to_bytes(self, entry):
         """Serialise entry as a single bytestring.
@@ -1447,15 +1448,18 @@
         :param new_revision_id: The revision id of the resulting CHKInventory.
         :return: The new CHKInventory.
         """
-        result = CHKInventory()
+        result = CHKInventory(self._search_key_name)
+        search_key_func = chk_map.search_key_registry.get(self._search_key_name)
         result.revision_id = new_revision_id
         result.id_to_entry = chk_map.CHKMap(
             self.id_to_entry._store,
-            self.id_to_entry._root_node)
+            self.id_to_entry._root_node,
+            search_key_func=search_key_func)
         if self.parent_id_basename_to_file_id is not None:
             result.parent_id_basename_to_file_id = chk_map.CHKMap(
                 self.parent_id_basename_to_file_id._store,
-                self.parent_id_basename_to_file_id._root_node)
+                self.parent_id_basename_to_file_id._root_node,
+                search_key_func=search_key_func)
             parent_id_basename_delta = []
         else:
             result.parent_id_basename_to_file_id = None
@@ -1509,20 +1513,32 @@
             for.
         :return: A CHKInventory
         """
-        result = CHKInventory()
         lines = bytes.splitlines()
         if lines[0] != 'chkinventory:':
             raise ValueError("not a serialised CHKInventory: %r" % bytes)
-        result.revision_id = lines[1][13:]
-        result.root_id = lines[2][9:]
-        if lines[3].startswith('parent_id_basename_to_file_id:'):
+        revision_id = lines[1][13:]
+        root_id = lines[2][9:]
+        if lines[3].startswith('search_key_name:'):
+            search_key_name = lines[3][17:]
             next = 4
-            result.parent_id_basename_to_file_id = chk_map.CHKMap(
-                chk_store, (lines[3][31:],))
         else:
+            search_key_name = 'plain'
             next = 3
+        result = CHKInventory(search_key_name)
+        result.revision_id = revision_id
+        result.root_id = root_id
+        search_key_func = chk_map.search_key_registry.get(
+                            result._search_key_name)
+        if lines[next].startswith('parent_id_basename_to_file_id:'):
+            result.parent_id_basename_to_file_id = chk_map.CHKMap(
+                chk_store, (lines[next][31:],),
+                search_key_func=search_key_func)
+            next += 1
+        else:
             result.parent_id_basename_to_file_id = None
-        result.id_to_entry = chk_map.CHKMap(chk_store, (lines[next][13:],))
+
+        result.id_to_entry = chk_map.CHKMap(chk_store, (lines[next][13:],),
+                                            search_key_func=search_key_func)
         if (result.revision_id,) != expected_revision_id:
             raise ValueError("Mismatched revision id and expected: %r, %r" %
                 (result.revision_id, expected_revision_id))
@@ -1530,7 +1546,7 @@
 
     @classmethod
     def from_inventory(klass, chk_store, inventory, maximum_size=0,
-        parent_id_basename_index=False):
+        parent_id_basename_index=False, search_key_name='plain'):
         """Create a CHKInventory from an existing inventory.
 
         The content of inventory is copied into the chk_store, and a
@@ -1541,15 +1557,18 @@
         :param maximum_size: The CHKMap node size limit.
         :param parent_id_basename_index: If True create and use a
             parent_id,basename->file_id index.
+        :param search_key_name: The identifier for the search key function
         """
-        result = CHKInventory()
+        result = CHKInventory(search_key_name)
         result.revision_id = inventory.revision_id
         result.root_id = inventory.root.file_id
-        result.id_to_entry = chk_map.CHKMap(chk_store, None)
+        search_key_func = chk_map.search_key_registry.get(search_key_name)
+        result.id_to_entry = chk_map.CHKMap(chk_store, None, search_key_func)
         result.id_to_entry._root_node.set_maximum_size(maximum_size)
         file_id_delta = []
         if parent_id_basename_index:
-            result.parent_id_basename_to_file_id = chk_map.CHKMap(chk_store, None)
+            result.parent_id_basename_to_file_id = chk_map.CHKMap(chk_store,
+                None, search_key_func)
             result.parent_id_basename_to_file_id._root_node.set_maximum_size(
                 maximum_size)
             result.parent_id_basename_to_file_id._root_node._key_width = 2
@@ -1745,6 +1764,8 @@
         lines = ["chkinventory:\n"]
         lines.append("revision_id: %s\n" % self.revision_id)
         lines.append("root_id: %s\n" % self.root_id)
+        if self._search_key_name != 'plain':
+            lines.append('search_key_name: %s\n' % (self._search_key_name,))
         if self.parent_id_basename_to_file_id is not None:
             lines.append('parent_id_basename_to_file_id: %s\n' %
                 self.parent_id_basename_to_file_id.key())

=== modified file 'bzrlib/repofmt/pack_repo.py'
--- a/bzrlib/repofmt/pack_repo.py	2009-01-12 18:44:55 +0000
+++ b/bzrlib/repofmt/pack_repo.py	2009-02-12 20:25:55 +0000
@@ -872,11 +872,16 @@
             'chk_index')
         chk_nodes = self._index_contents(chk_indices, refs)
         new_refs = set()
+        # TODO: This isn't strictly tasteful as we are accessing some private
+        #       variables (_serializer). Perhaps a better way would be to have
+        #       Repository._deserialise_chk_node()
+        search_key_func = chk_map.search_key_registry.get(
+            self._pack_collection.repo._serializer.search_key_name)
         def accumlate_refs(lines):
             # XXX: move to a generic location
             # Yay mismatch:
             bytes = ''.join(lines)
-            node = chk_map._deserialise(bytes, ("unknown",))
+            node = chk_map._deserialise(bytes, ("unknown",), search_key_func)
             new_refs.update(node.refs())
         self._copy_nodes(chk_nodes, chk_index_map, self.new_pack._writer,
             self.new_pack.chk_index, output_lines=accumlate_refs)
@@ -2169,7 +2174,8 @@
         serializer = self._format._serializer
         result = CHKInventory.from_inventory(self.chk_bytes, inv,
             maximum_size=serializer.maximum_size,
-            parent_id_basename_index=serializer.parent_id_basename_index)
+            parent_id_basename_index=serializer.parent_id_basename_index,
+            search_key_name=serializer.search_key_name)
         inv_lines = result.to_lines()
         return self._inventory_add_lines(revision_id, parents,
             inv_lines, check_content=False)
@@ -3039,3 +3045,81 @@
         """See RepositoryFormat.get_format_description()."""
         return ("Development repository format, currently the same as "
             "1.9-subtree with B+Tree and chk support.\n")
+
+
+class RepositoryFormatPackDevelopment4Hash16(RepositoryFormatPack):
+    """A no-subtrees development repository.
+
+    This format should be retained until the second release after bzr 1.12.
+
+    This is pack-1.9 with CHKMap based inventories with 16-way hash tries.
+    """
+
+    repository_class = CHKInventoryRepository
+    _commit_builder_class = PackCommitBuilder
+    _serializer = chk_serializer.chk_serializer_16_parent_id
+    supports_external_lookups = True
+    # What index classes to use
+    index_builder_class = BTreeBuilder
+    index_class = BTreeGraphIndex
+    supports_chks = True
+    _commit_inv_deltas = True
+
+    def _get_matching_bzrdir(self):
+        return bzrdir.format_registry.make_bzrdir('development4-hash16')
+
+    def _ignore_setting_bzrdir(self, format):
+        pass
+
+    _matchingbzrdir = property(_get_matching_bzrdir, _ignore_setting_bzrdir)
+
+    def get_format_string(self):
+        """See RepositoryFormat.get_format_string()."""
+        return "Bazaar development format 4 hash 16 (needs bzr.dev from before 1.13)\n"
+
+    def get_format_description(self):
+        """See RepositoryFormat.get_format_description()."""
+        return ("Development repository format, currently the same as "
+            "1.9 with B+Trees and chk support and 16-way hash tries\n")
+
+    def check_conversion_target(self, target_format):
+        pass
+
+
+class RepositoryFormatPackDevelopment4Hash255(RepositoryFormatPack):
+    """A no-subtrees development repository.
+
+    This format should be retained until the second release after bzr 1.12.
+
+    This is pack-1.9 with CHKMap based inventories with 255-way hash tries.
+    """
+
+    repository_class = CHKInventoryRepository
+    _commit_builder_class = PackCommitBuilder
+    _serializer = chk_serializer.chk_serializer_255_parent_id
+    supports_external_lookups = True
+    # What index classes to use
+    index_builder_class = BTreeBuilder
+    index_class = BTreeGraphIndex
+    supports_chks = True
+    _commit_inv_deltas = True
+
+    def _get_matching_bzrdir(self):
+        return bzrdir.format_registry.make_bzrdir('development4-hash255')
+
+    def _ignore_setting_bzrdir(self, format):
+        pass
+
+    _matchingbzrdir = property(_get_matching_bzrdir, _ignore_setting_bzrdir)
+
+    def get_format_string(self):
+        """See RepositoryFormat.get_format_string()."""
+        return "Bazaar development format 4 hash 255 (needs bzr.dev from before 1.13)\n"
+
+    def get_format_description(self):
+        """See RepositoryFormat.get_format_description()."""
+        return ("Development repository format, currently the same as "
+            "1.9 with B+Trees and chk support and 255-way hash tries\n")
+
+    def check_conversion_target(self, target_format):
+        pass

=== modified file 'bzrlib/repository.py'
--- a/bzrlib/repository.py	2009-02-10 04:05:50 +0000
+++ b/bzrlib/repository.py	2009-02-12 21:20:16 +0000
@@ -2573,6 +2573,18 @@
     'bzrlib.repofmt.pack_repo',
     'RepositoryFormatPackDevelopment4Subtree',
     )
+format_registry.register_lazy(
+    ('Bazaar development format 4 hash 16'
+     ' (needs bzr.dev from before 1.13)\n'),
+    'bzrlib.repofmt.pack_repo',
+    'RepositoryFormatPackDevelopment4Hash16',
+    )
+format_registry.register_lazy(
+    ('Bazaar development format 4 hash 255'
+     ' (needs bzr.dev from before 1.13)\n'),
+    'bzrlib.repofmt.pack_repo',
+    'RepositoryFormatPackDevelopment4Hash255',
+    )
 
 
 class InterRepository(InterObject):

=== modified file 'bzrlib/tests/test_chk_map.py'
--- a/bzrlib/tests/test_chk_map.py	2009-01-07 22:15:57 +0000
+++ b/bzrlib/tests/test_chk_map.py	2009-02-12 20:55:35 +0000
@@ -18,12 +18,15 @@
 
 from itertools import izip
 
-from bzrlib import chk_map, osutils
+from bzrlib import (
+    chk_map,
+    osutils,
+    tests,
+    )
 from bzrlib.chk_map import (
     CHKMap,
     InternalNode,
     LeafNode,
-    _deserialise,
     )
 from bzrlib.tests import TestCaseWithTransport
 
@@ -891,11 +894,11 @@
         ptr2 = nodes[1]
         self.assertEqual('k1', ptr1[0])
         self.assertEqual('k2', ptr2[0])
-        node1 = _deserialise(chkmap._read_bytes(ptr1[1]), ptr1[1])
+        node1 = chk_map._deserialise(chkmap._read_bytes(ptr1[1]), ptr1[1], None)
         self.assertIsInstance(node1, LeafNode)
         self.assertEqual(1, len(node1))
         self.assertEqual({('k1'*50,): 'v1'}, self.to_dict(node1, chkmap._store))
-        node2 = _deserialise(chkmap._read_bytes(ptr2[1]), ptr2[1])
+        node2 = chk_map._deserialise(chkmap._read_bytes(ptr2[1]), ptr2[1], None)
         self.assertIsInstance(node2, LeafNode)
         self.assertEqual(1, len(node2))
         self.assertEqual({('k2'*50,): 'v2'}, self.to_dict(node2, chkmap._store))
@@ -1000,6 +1003,186 @@
             chkmap._dump_tree(include_keys=True))
 
 
+def _search_key_single(key):
+    """A search key function that maps all nodes to the same value"""
+    return 'value'
+
+def _test_search_key(key):
+    return 'test:' + '\x00'.join(key)
+
+
+class TestMapSearchKeys(TestCaseWithStore):
+
+    def test_default_chk_map_uses_flat_search_key(self):
+        chkmap = chk_map.CHKMap(self.get_chk_bytes(), None)
+        self.assertEqual('1',
+                         chkmap._search_key_func(('1',)))
+        self.assertEqual('1\x002',
+                         chkmap._search_key_func(('1', '2')))
+        self.assertEqual('1\x002\x003',
+                         chkmap._search_key_func(('1', '2', '3')))
+
+    def test_search_key_is_passed_to_root_node(self):
+        chkmap = chk_map.CHKMap(self.get_chk_bytes(), None,
+                                search_key_func=_test_search_key)
+        self.assertIs(_test_search_key, chkmap._search_key_func)
+        self.assertEqual('test:1\x002\x003',
+                         chkmap._search_key_func(('1', '2', '3')))
+        self.assertEqual('test:1\x002\x003',
+                         chkmap._root_node._search_key(('1', '2', '3')))
+
+    def test_search_key_passed_via__ensure_root(self):
+        chk_bytes = self.get_chk_bytes()
+        chkmap = chk_map.CHKMap(chk_bytes, None,
+                                search_key_func=_test_search_key)
+        root_key = chkmap._save()
+        chkmap = chk_map.CHKMap(chk_bytes, root_key,
+                                search_key_func=_test_search_key)
+        chkmap._ensure_root()
+        self.assertEqual('test:1\x002\x003',
+                         chkmap._root_node._search_key(('1', '2', '3')))
+
+    def test_search_key_with_internal_node(self):
+        chk_bytes = self.get_chk_bytes()
+        chkmap = chk_map.CHKMap(chk_bytes, None,
+                                search_key_func=_test_search_key)
+        chkmap._root_node.set_maximum_size(10)
+        chkmap.map(('1',), 'foo')
+        chkmap.map(('2',), 'bar')
+        chkmap.map(('3',), 'baz')
+        self.assertEqualDiff("'' InternalNode\n"
+                             "  'test:1' LeafNode\n"
+                             "      ('1',) 'foo'\n"
+                             "  'test:2' LeafNode\n"
+                             "      ('2',) 'bar'\n"
+                             "  'test:3' LeafNode\n"
+                             "      ('3',) 'baz'\n"
+                             , chkmap._dump_tree())
+        root_key = chkmap._save()
+        chkmap = chk_map.CHKMap(chk_bytes, root_key,
+                                search_key_func=_test_search_key)
+        self.assertEqualDiff("'' InternalNode\n"
+                             "  'test:1' LeafNode\n"
+                             "      ('1',) 'foo'\n"
+                             "  'test:2' LeafNode\n"
+                             "      ('2',) 'bar'\n"
+                             "  'test:3' LeafNode\n"
+                             "      ('3',) 'baz'\n"
+                             , chkmap._dump_tree())
+
+    def test_search_key_16(self):
+        chk_bytes = self.get_chk_bytes()
+        chkmap = chk_map.CHKMap(chk_bytes, None,
+                                search_key_func=chk_map._search_key_16)
+        chkmap._root_node.set_maximum_size(10)
+        chkmap.map(('1',), 'foo')
+        chkmap.map(('2',), 'bar')
+        chkmap.map(('3',), 'baz')
+        self.assertEqualDiff("'' InternalNode\n"
+                             "  '1' LeafNode\n"
+                             "      ('2',) 'bar'\n"
+                             "  '6' LeafNode\n"
+                             "      ('3',) 'baz'\n"
+                             "  '7' LeafNode\n"
+                             "      ('1',) 'foo'\n"
+                             , chkmap._dump_tree())
+        root_key = chkmap._save()
+        chkmap = chk_map.CHKMap(chk_bytes, root_key,
+                                search_key_func=chk_map._search_key_16)
+        # We can get the values back correctly
+        self.assertEqual([(('1',), 'foo')],
+                         list(chkmap.iteritems([('1',)])))
+        self.assertEqualDiff("'' InternalNode\n"
+                             "  '1' LeafNode\n"
+                             "      ('2',) 'bar'\n"
+                             "  '6' LeafNode\n"
+                             "      ('3',) 'baz'\n"
+                             "  '7' LeafNode\n"
+                             "      ('1',) 'foo'\n"
+                             , chkmap._dump_tree())
+
+    def test_search_key_255(self):
+        chk_bytes = self.get_chk_bytes()
+        chkmap = chk_map.CHKMap(chk_bytes, None,
+                                search_key_func=chk_map._search_key_255)
+        chkmap._root_node.set_maximum_size(10)
+        chkmap.map(('1',), 'foo')
+        chkmap.map(('2',), 'bar')
+        chkmap.map(('3',), 'baz')
+        self.assertEqualDiff("'' InternalNode\n"
+                             "  '\\x1a' LeafNode\n"
+                             "      ('2',) 'bar'\n"
+                             "  'm' LeafNode\n"
+                             "      ('3',) 'baz'\n"
+                             "  '\\x83' LeafNode\n"
+                             "      ('1',) 'foo'\n"
+                             , chkmap._dump_tree())
+        root_key = chkmap._save()
+        chkmap = chk_map.CHKMap(chk_bytes, root_key,
+                                search_key_func=chk_map._search_key_255)
+        # We can get the values back correctly
+        self.assertEqual([(('1',), 'foo')],
+                         list(chkmap.iteritems([('1',)])))
+        self.assertEqualDiff("'' InternalNode\n"
+                             "  '\\x1a' LeafNode\n"
+                             "      ('2',) 'bar'\n"
+                             "  'm' LeafNode\n"
+                             "      ('3',) 'baz'\n"
+                             "  '\\x83' LeafNode\n"
+                             "      ('1',) 'foo'\n"
+                             , chkmap._dump_tree())
+
+    def test_search_key_collisions(self):
+        chkmap = chk_map.CHKMap(self.get_chk_bytes(), None,
+                                search_key_func=_search_key_single)
+        # The node will want to expand, but it cannot, because it knows that
+        # all the keys must map to this node
+        chkmap._root_node.set_maximum_size(20)
+        chkmap.map(('1',), 'foo')
+        chkmap.map(('2',), 'bar')
+        chkmap.map(('3',), 'baz')
+        self.assertEqualDiff("'' LeafNode\n"
+                             "      ('1',) 'foo'\n"
+                             "      ('2',) 'bar'\n"
+                             "      ('3',) 'baz'\n"
+                             , chkmap._dump_tree())
+
+
+class TestSearchKeyFuncs(tests.TestCase):
+
+    def assertSearchKey16(self, expected, key):
+        self.assertEqual(expected, chk_map._search_key_16(key))
+
+    def assertSearchKey255(self, expected, key):
+        actual = chk_map._search_key_255(key)
+        self.assertEqual(expected, actual, 'actual: %r' % (actual,))
+
+    def test_simple_16(self):
+        self.assertSearchKey16('738C9ADF', ('foo',))
+        self.assertSearchKey16('738C9ADF\x00738C9ADF', ('foo', 'foo'))
+        self.assertSearchKey16('738C9ADF\x0076FF8CAA', ('foo', 'bar'))
+        self.assertSearchKey16('127D32EF', ('abcd',))
+
+    def test_simple_255(self):
+        self.assertSearchKey255('\x8cse!', ('foo',))
+        self.assertSearchKey255('\x8cse!\x00\x8cse!', ('foo', 'foo'))
+        self.assertSearchKey255('\x8cse!\x00v\xff\x8c\xaa', ('foo', 'bar'))
+        # The standard mapping for these would include '\n', so it should be
+        # mapped to '_'
+        self.assertSearchKey255('\xfdm\x93_\x00P_\x1bL', ('<', 'V'))
+
+    def test_255_does_not_include_newline(self):
+        # When mapping via _search_key_255, we should never have the '\n'
+        # character, but all other 255 values should be present
+        chars_used = set()
+        for char_in in range(256):
+            search_key = chk_map._search_key_255((chr(char_in),))
+            chars_used.update(search_key)
+        all_chars = set([chr(x) for x in range(256)])
+        unused_chars = all_chars.symmetric_difference(chars_used)
+        self.assertEqual(set('\n'), unused_chars)
+
+
 class TestLeafNode(TestCaseWithStore):
 
     def test_current_size_empty(self):
@@ -1247,7 +1430,7 @@
             keys)
         # We should be able to access deserialised content.
         bytes = self.read_bytes(chk_bytes, keys[1])
-        node = _deserialise(bytes, keys[1])
+        node = chk_map._deserialise(bytes, keys[1], None)
         self.assertEqual(1, len(node))
         self.assertEqual({('foo',): 'bar'}, self.to_dict(node, chk_bytes))
         self.assertEqual(3, node._node_width)

=== modified file 'bzrlib/tests/test_inv.py'
--- a/bzrlib/tests/test_inv.py	2008-12-03 22:53:37 +0000
+++ b/bzrlib/tests/test_inv.py	2009-01-21 23:04:50 +0000
@@ -198,6 +198,7 @@
         self.assertEqual(inv.root.parent_id, new_inv.root.parent_id)
         self.assertEqual(inv.root.name, new_inv.root.name)
         self.assertEqual("rootrev", new_inv.root.revision)
+        self.assertEqual('plain', new_inv._search_key_name)
 
     def test_deserialise_wrong_revid(self):
         inv = Inventory()
@@ -215,13 +216,53 @@
         inv.root.revision = "bar"
         chk_bytes = self.get_chk_bytes()
         chk_inv = CHKInventory.from_inventory(chk_bytes, inv)
-        self.assertEqual([
-            'chkinventory:\n',
-            'revision_id: foo\n',
-            'root_id: TREE_ROOT\n',
-            'id_to_entry: sha1:36219af8518a9bed1e52db58e99131db2a00b329\n',
-            ],
-            chk_inv.to_lines())
+        lines = chk_inv.to_lines()
+        self.assertEqual([
+            'chkinventory:\n',
+            'revision_id: foo\n',
+            'root_id: TREE_ROOT\n',
+            'id_to_entry: sha1:c9d15ff2621b8774506f702ff4ffd5f4af885a51\n',
+            ], lines)
+        chk_inv = CHKInventory.deserialise(chk_bytes, ''.join(lines), ('foo',))
+        self.assertEqual('plain', chk_inv._search_key_name)
+
+    def test_captures_parent_id_basename_index(self):
+        inv = Inventory()
+        inv.revision_id = "foo"
+        inv.root.revision = "bar"
+        chk_bytes = self.get_chk_bytes()
+        chk_inv = CHKInventory.from_inventory(chk_bytes, inv,
+                    parent_id_basename_index=True)
+        lines = chk_inv.to_lines()
+        self.assertEqual([
+            'chkinventory:\n',
+            'revision_id: foo\n',
+            'root_id: TREE_ROOT\n',
+            'parent_id_basename_to_file_id: sha1:46f33678d1c8cfd9b6d00dc658b6c8a9ac7bb0f0\n',
+            'id_to_entry: sha1:c9d15ff2621b8774506f702ff4ffd5f4af885a51\n',
+            ], lines)
+        chk_inv = CHKInventory.deserialise(chk_bytes, ''.join(lines), ('foo',))
+        self.assertEqual('plain', chk_inv._search_key_name)
+
+    def test_captures_search_key_name(self):
+        inv = Inventory()
+        inv.revision_id = "foo"
+        inv.root.revision = "bar"
+        chk_bytes = self.get_chk_bytes()
+        chk_inv = CHKInventory.from_inventory(chk_bytes, inv,
+                                              parent_id_basename_index=True,
+                                              search_key_name='hash-16-way')
+        lines = chk_inv.to_lines()
+        self.assertEqual([
+            'chkinventory:\n',
+            'revision_id: foo\n',
+            'root_id: TREE_ROOT\n',
+            'search_key_name: hash-16-way\n',
+            'parent_id_basename_to_file_id: sha1:46f33678d1c8cfd9b6d00dc658b6c8a9ac7bb0f0\n',
+            'id_to_entry: sha1:c9d15ff2621b8774506f702ff4ffd5f4af885a51\n',
+            ], lines)
+        chk_inv = CHKInventory.deserialise(chk_bytes, ''.join(lines), ('foo',))
+        self.assertEqual('hash-16-way', chk_inv._search_key_name)
 
     def test_directory_children_on_demand(self):
         inv = Inventory()

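Note on the search key functions: the hunks above reference chk_map._search_key_plain, chk_map._search_key_16 and chk_map._search_key_255, but their bodies are not part of this excerpt. The sketch below is an inference from the expected values in TestSearchKeyFuncs, not the committed code. It assumes each key element is hashed with CRC32, that the 16-way variant prints the absolute value of the signed 32-bit result as eight upper-case hex digits, and that the 255-way variant packs the raw four bytes and maps '\n' to '_'. It is written for Python 3 with bytes keys (bzrlib at the time was Python 2), and the function names without a leading underscore are hypothetical.

```python
import struct
import zlib


def search_key_plain(key):
    """Join the key elements with NUL, as the removed inline code did."""
    return b'\x00'.join(key)


def _signed_crc32(data):
    # Python 3 returns an unsigned CRC32. Emulate the signed 32-bit value
    # that the expected strings in the tests imply (assumption).
    crc = zlib.crc32(data) & 0xFFFFFFFF
    return crc - (1 << 32) if crc >= (1 << 31) else crc


def search_key_16(key):
    """Hex digits only, so every character has one of 16 values."""
    return b'\x00'.join(b'%08X' % abs(_signed_crc32(bit)) for bit in key)


def search_key_255(key):
    """Raw CRC32 bytes; newline is remapped so nodes stay line-based."""
    packed = b'\x00'.join(
        struct.pack('>L', zlib.crc32(bit) & 0xFFFFFFFF) for bit in key)
    return packed.replace(b'\n', b'_')
```

Under these assumptions, search_key_16((b'foo',)) gives b'738C9ADF' and search_key_255((b'foo',)) gives b'\x8cse!', which agrees with test_simple_16 and test_simple_255.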

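Note on InternalNode._search_key: the changed line pads or truncates whatever the search key function returns to exactly _node_width bytes. A minimal standalone sketch of that step follows, in Python 3 with bytes. The helper name node_search_key and its default argument are hypothetical; only the pad-then-slice expression is taken from the diff.

```python
def node_search_key(key, node_width, search_key_func=None):
    """Return the fixed-width search key for key at a given node width."""
    if search_key_func is None:
        # Mirrors the 'plain' default: NUL-joined key elements.
        search_key_func = lambda k: b'\x00'.join(k)
    # Pad with NULs so short keys reach node_width, then cut long ones.
    return (search_key_func(key) + b'\x00' * node_width)[:node_width]
```

With a width of 4, the key (b'ab',) becomes b'ab\x00\x00' and (b'abcdef',) becomes b'abcd', so every child of one internal node is addressed by a prefix of the same length.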

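Note on the serialised header: CHKInventory.to_lines() now writes a search_key_name line only when the name is not 'plain', and deserialise() falls back to 'plain' when the line is absent, so inventories written before this change still parse. The sketch below restates that header parsing as a standalone function. It is a simplification: parse_chk_inventory_header is a hypothetical name, it returns a dict rather than building CHKMap objects, and it targets Python 3 text strings.

```python
def parse_chk_inventory_header(text):
    """Parse the header lines of a serialised CHKInventory into a dict."""
    lines = text.splitlines()
    if lines[0] != 'chkinventory:':
        raise ValueError('not a serialised CHKInventory: %r' % (text,))
    header = {
        'revision_id': lines[1][len('revision_id: '):],
        'root_id': lines[2][len('root_id: '):],
    }
    pos = 3
    # Optional line: absent in inventories that use the 'plain' search key.
    if lines[pos].startswith('search_key_name:'):
        header['search_key_name'] = lines[pos][len('search_key_name: '):]
        pos += 1
    else:
        header['search_key_name'] = 'plain'
    # Optional line: only present when the parent_id,basename index exists.
    if lines[pos].startswith('parent_id_basename_to_file_id:'):
        header['parent_id_basename_to_file_id'] = (
            lines[pos][len('parent_id_basename_to_file_id: '):])
        pos += 1
    else:
        header['parent_id_basename_to_file_id'] = None
    header['id_to_entry'] = lines[pos][len('id_to_entry: '):]
    return header
```

Fed the lines expected by test_captures_search_key_name, this yields 'hash-16-way' for search_key_name; fed the four-line form from the older test, it yields 'plain' and None for the parent index.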