Rev 4461: (jam) Improve initial commit performance by creating a CHKMap in bulk, in file:///home/pqm/archives/thelove/bzr/%2Btrunk/

Canonical.com Patch Queue Manager pqm at pqm.ubuntu.com
Thu Jun 18 21:25:59 BST 2009


At file:///home/pqm/archives/thelove/bzr/%2Btrunk/

------------------------------------------------------------
revno: 4461
revision-id: pqm at pqm.ubuntu.com-20090618202552-xyl6tcvbxtm8bupf
parent: pqm at pqm.ubuntu.com-20090618191345-vgsr5zv78uesqsdg
parent: john at arbash-meinel.com-20090618191949-gd8yfhunrqobru15
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Thu 2009-06-18 21:25:52 +0100
message:
  (jam) Improve initial commit performance by creating a CHKMap in bulk,
  	rather than via O(tree) map() calls.
modified:
  NEWS                           NEWS-20050323055033-4e00b5db738777ff
  bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
  bzrlib/inventory.py            inventory.py-20050309040759-6648b84ca2005b37
  bzrlib/repofmt/groupcompress_repo.py repofmt.py-20080715094215-wp1qfvoo7093c8qr-1
  bzrlib/tests/per_repository/test_add_inventory_by_delta.py test_add_inventory_d-20081013002626-rut81igtlqb4590z-1
  bzrlib/tests/test_chk_map.py   test_chk_map.py-20081001014447-ue6kkuhofvdecvxa-2
  bzrlib/tests/test_inv.py       testinv.py-20050722220913-1dc326138d1a5892
    ------------------------------------------------------------
    revno: 4413.5.15
    revision-id: john at arbash-meinel.com-20090618191949-gd8yfhunrqobru15
    parent: john at arbash-meinel.com-20090618181836-biodfkat9a8eyzjz
    parent: pqm at pqm.ubuntu.com-20090618191345-vgsr5zv78uesqsdg
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: 1.16-chk-direct
    timestamp: Thu 2009-06-18 14:19:49 -0500
    message:
      Merge bzr.dev 4460 resolving NEWS
    modified:
      NEWS                           NEWS-20050323055033-4e00b5db738777ff
      bzrlib/builtins.py             builtins.py-20050830033751-fc01482b9ca23183
      bzrlib/bzrdir.py               bzrdir.py-20060131065624-156dfea39c4387cb
      bzrlib/fetch.py                fetch.py-20050818234941-26fea6105696365d
      bzrlib/knit.py                 knit.py-20051212171256-f056ac8f0fbe1bd9
      bzrlib/repofmt/groupcompress_repo.py repofmt.py-20080715094215-wp1qfvoo7093c8qr-1
      bzrlib/repofmt/pack_repo.py    pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
      bzrlib/tests/blackbox/test_branch.py test_branch.py-20060524161337-noms9gmcwqqrfi8y-1
      bzrlib/tests/blackbox/test_ls.py test_ls.py-20060712232047-0jraqpecwngee12y-1
      bzrlib/tests/bzrdir_implementations/test_bzrdir.py test_bzrdir.py-20060131065642-0ebeca5e30e30866
      bzrlib/tests/per_repository/test_repository.py test_repository.py-20060131092128-ad07f494f5c9d26c
      bzrlib/tests/test_commit_merge.py test_commit_merge.py-20050920084723-819eeeff77907bc5
      bzrlib/tests/test_pack_repository.py test_pack_repository-20080801043947-eaw0e6h2gu75kwmy-1
      bzrlib/tests/test_repository.py test_repository.py-20060131075918-65c555b881612f4d
      tools/win32/build_release.py   build_release.py-20081105204355-2ghh5cv01v1x4rzz-1
    ------------------------------------------------------------
    revno: 4413.5.14
    revision-id: john at arbash-meinel.com-20090618181836-biodfkat9a8eyzjz
    parent: john at arbash-meinel.com-20090617191253-1m90zv94aimg1orm
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: 1.16-chk-direct
    timestamp: Thu 2009-06-18 13:18:36 -0500
    message:
      The new add_inventory_by_delta returns a CHKInventory when mapping from NULL,
      which is completely valid, but 'broke' one of the tests.
      To fix it, changed the test to use CHKInventories on both sides, and added an __eq__
      member. The nice thing is that CHKInventory.__eq__ is fairly cheap, since it only
      has to check the root keys.
    modified:
      bzrlib/inventory.py            inventory.py-20050309040759-6648b84ca2005b37
      bzrlib/tests/per_repository/test_add_inventory_by_delta.py test_add_inventory_d-20081013002626-rut81igtlqb4590z-1
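
The note above that CHKInventory.__eq__ only has to check the root keys follows from content addressing: a node's key is a hash of its serialised content, so equal root keys imply equal contents. The sketch below is a toy model of that idea, not bzrlib code. ToyInventory, store_node and store_map are hypothetical names, and each map is flattened into a single node for brevity.

```python
import hashlib


def store_node(store, payload):
    """Store bytes under their SHA-1 and return that key (content addressing)."""
    key = "sha1:" + hashlib.sha1(payload).hexdigest()
    store[key] = payload
    return key


def store_map(store, items):
    """Serialise a dict in sorted key order and return the key of its root node.

    Assumption: one node per map. A real CHKMap splits large maps into a tree,
    but the root key still depends on every entry below it.
    """
    lines = ["%s\x00%s" % (k, items[k]) for k in sorted(items)]
    return store_node(store, "\n".join(lines).encode("utf-8"))


class ToyInventory:
    """Hypothetical stand-in for an inventory backed by two hashed maps."""

    def __init__(self, store, id_to_entry, parent_id_basename_to_file_id):
        self.id_root = store_map(store, id_to_entry)
        self.pid_root = store_map(store, parent_id_basename_to_file_id)

    def __eq__(self, other):
        if not isinstance(other, ToyInventory):
            return NotImplemented
        # Two key comparisons, regardless of how many entries each map holds.
        return (self.id_root, self.pid_root) == (other.id_root, other.pid_root)
```

Insertion order does not affect the result here because the serialisation sorts its keys. The comparison cost is independent of the number of entries.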
    ------------------------------------------------------------
    revno: 4413.5.13
    revision-id: john at arbash-meinel.com-20090617191253-1m90zv94aimg1orm
    parent: john at arbash-meinel.com-20090617191035-0vztgazsdwa3mwg4
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: 1.16-chk-direct
    timestamp: Wed 2009-06-17 14:12:53 -0500
    message:
      NEWS entry about --2a improvement.
    modified:
      NEWS                           NEWS-20050323055033-4e00b5db738777ff
    ------------------------------------------------------------
    revno: 4413.5.12
    revision-id: john at arbash-meinel.com-20090617191035-0vztgazsdwa3mwg4
    parent: john at arbash-meinel.com-20090617185949-7lfh5td7pipwp0ss
    parent: pqm at pqm.ubuntu.com-20090617100437-gavn9zkum4dj5yjz
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: 1.16-chk-direct
    timestamp: Wed 2009-06-17 14:10:35 -0500
    message:
      Merge bzr.dev 4454 in prep for NEWS
    removed:
      doc/developers/performance-contributing.txt performancecontribut-20070621063612-ac4zhhagjzkr21qp-1
    added:
      bzrlib/_known_graph_py.py      _known_graph_py.py-20090610185421-vw8vfda2cgnckgb1-1
      bzrlib/_known_graph_pyx.pyx    _known_graph_pyx.pyx-20090610194911-yjk73td9hpjilas0-1
      bzrlib/help_topics/en/diverged-branches.txt divergedbranches.txt-20090608035534-mb4ry8so4hw238n0-1
      bzrlib/tests/per_repository_reference/test_get_rev_id_for_revno.py test_get_rev_id_for_-20090615064050-b6mq6co557towrxh-1
      bzrlib/tests/test__known_graph.py test__known_graph.py-20090610185421-vw8vfda2cgnckgb1-2
      bzrlib/util/bencode.py         bencode.py-20090609141817-jtvhqq6vyryjoeky-1
      doc/developers/bug-handling.txt bughandling.txt-20090615072247-mplym00zjq2n4s61-1
      doc/index.ru.txt               index.ru.txt-20080819091426-kfq61l02dhm9pplk-1
      doc/ru/                        ru-20080818031309-t3nyctvfbvfh4h2u-1
      doc/ru/mini-tutorial/          minitutorial-20080818031309-t3nyctvfbvfh4h2u-2
      doc/ru/mini-tutorial/index.txt index.txt-20080818031309-t3nyctvfbvfh4h2u-4
      doc/ru/quick-reference/        quickreference-20080818031309-t3nyctvfbvfh4h2u-3
      doc/ru/quick-reference/Makefile makefile-20080818031309-t3nyctvfbvfh4h2u-5
      doc/ru/quick-reference/quick-start-summary.pdf quickstartsummary.pd-20080818031309-t3nyctvfbvfh4h2u-6
      doc/ru/quick-reference/quick-start-summary.png quickstartsummary.pn-20080818031309-t3nyctvfbvfh4h2u-7
      doc/ru/quick-reference/quick-start-summary.svg quickstartsummary.sv-20080818031309-t3nyctvfbvfh4h2u-8
      doc/ru/tutorials/              docrututorials-20090427084615-toum0jo7qohd807p-1
      doc/ru/tutorials/centralized_workflow.txt centralized_workflow-20090531190825-ex3ums4bcuaf2r6k-1
      doc/ru/tutorials/tutorial.txt  tutorial.txt-20090602180629-wkp7wr27jl4i2zep-1
      doc/ru/tutorials/using_bazaar_with_launchpad.txt using_bazaar_with_la-20090427084917-b22ppqtdx7q4hapw-1
      doc/ru/user-guide/             docruuserguide-20090601191403-rcoy6nsre0vjiozm-1
      doc/ru/user-guide/branching_a_project.txt branching_a_project.-20090602104644-pjpwfx7xh2k5l0ba-1
      doc/ru/user-guide/core_concepts.txt core_concepts.txt-20090602104644-pjpwfx7xh2k5l0ba-2
      doc/ru/user-guide/images/      images-20090601201124-cruf3mmq5cfxeb1w-1
      doc/ru/user-guide/images/workflows_centralized.png workflows_centralize-20090601201124-cruf3mmq5cfxeb1w-3
      doc/ru/user-guide/images/workflows_centralized.svg workflows_centralize-20090601201124-cruf3mmq5cfxeb1w-4
      doc/ru/user-guide/images/workflows_gatekeeper.png workflows_gatekeeper-20090601201124-cruf3mmq5cfxeb1w-5
      doc/ru/user-guide/images/workflows_gatekeeper.svg workflows_gatekeeper-20090601201124-cruf3mmq5cfxeb1w-6
      doc/ru/user-guide/images/workflows_localcommit.png workflows_localcommi-20090601201124-cruf3mmq5cfxeb1w-7
      doc/ru/user-guide/images/workflows_localcommit.svg workflows_localcommi-20090601201124-cruf3mmq5cfxeb1w-8
      doc/ru/user-guide/images/workflows_peer.png workflows_peer.png-20090601201124-cruf3mmq5cfxeb1w-9
      doc/ru/user-guide/images/workflows_peer.svg workflows_peer.svg-20090601201124-cruf3mmq5cfxeb1w-10
      doc/ru/user-guide/images/workflows_pqm.png workflows_pqm.png-20090601201124-cruf3mmq5cfxeb1w-11
      doc/ru/user-guide/images/workflows_pqm.svg workflows_pqm.svg-20090601201124-cruf3mmq5cfxeb1w-12
      doc/ru/user-guide/images/workflows_shared.png workflows_shared.png-20090601201124-cruf3mmq5cfxeb1w-13
      doc/ru/user-guide/images/workflows_shared.svg workflows_shared.svg-20090601201124-cruf3mmq5cfxeb1w-14
      doc/ru/user-guide/images/workflows_single.png workflows_single.png-20090601201124-cruf3mmq5cfxeb1w-15
      doc/ru/user-guide/images/workflows_single.svg workflows_single.svg-20090601201124-cruf3mmq5cfxeb1w-16
      doc/ru/user-guide/index.txt    index.txt-20090601201124-cruf3mmq5cfxeb1w-2
      doc/ru/user-guide/introducing_bazaar.txt introducing_bazaar.t-20090601221109-6ehwbt2pvzgpftlu-1
      doc/ru/user-guide/specifying_revisions.txt specifying_revisions-20090602104644-pjpwfx7xh2k5l0ba-3
      doc/ru/user-guide/stacked.txt  stacked.txt-20090602104644-pjpwfx7xh2k5l0ba-4
      doc/ru/user-guide/using_checkouts.txt using_checkouts.txt-20090602104644-pjpwfx7xh2k5l0ba-5
      doc/ru/user-guide/zen.txt      zen.txt-20090602104644-pjpwfx7xh2k5l0ba-6
      tools/time_graph.py            time_graph.py-20090608210127-6g0epojxnqjo0f0s-1
    modified:
      .bzrignore                     bzrignore-20050311232317-81f7b71efa2db11a
      Makefile                       Makefile-20050805140406-d96e3498bb61c5bb
      NEWS                           NEWS-20050323055033-4e00b5db738777ff
      bzr                            bzr.py-20050313053754-5485f144c7006fa6
      bzrlib/__init__.py             __init__.py-20050309040759-33e65acf91bbcd5d
      bzrlib/_dirstate_helpers_c.pyx dirstate_helpers.pyx-20070503201057-u425eni465q4idwn-3
      bzrlib/branch.py               branch.py-20050309040759-e4baf4e0d046576e
      bzrlib/builtins.py             builtins.py-20050830033751-fc01482b9ca23183
      bzrlib/bzrdir.py               bzrdir.py-20060131065624-156dfea39c4387cb
      bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
      bzrlib/chk_serializer.py       chk_serializer.py-20081002064345-2tofdfj2eqq01h4b-1
      bzrlib/commands.py             bzr.py-20050309040720-d10f4714595cf8c3
      bzrlib/commit.py               commit.py-20050511101309-79ec1a0168e0e825
      bzrlib/config.py               config.py-20051011043216-070c74f4e9e338e8
      bzrlib/dirstate.py             dirstate.py-20060728012006-d6mvoihjb3je9peu-1
      bzrlib/errors.py               errors.py-20050309040759-20512168c4e14fbd
      bzrlib/filters/__init__.py     __init__.py-20080416080515-mkxl29amuwrf6uir-2
      bzrlib/graph.py                graph_walker.py-20070525030359-y852guab65d4wtn0-1
      bzrlib/groupcompress.py        groupcompress.py-20080705181503-ccbxd6xuy1bdnrpu-8
      bzrlib/help.py                 help.py-20050505025907-4dd7a6d63912f894
      bzrlib/help_topics/__init__.py help_topics.py-20060920210027-rnim90q9e0bwxvy4-1
      bzrlib/help_topics/en/configuration.txt configuration.txt-20060314161707-868350809502af01
      bzrlib/help_topics/en/eol.txt  eol.txt-20090327060429-todzdjmqt3bpv5r8-3
      bzrlib/index.py                index.py-20070712131115-lolkarso50vjr64s-1
      bzrlib/knit.py                 knit.py-20051212171256-f056ac8f0fbe1bd9
      bzrlib/lock.py                 lock.py-20050527050856-ec090bb51bc03349
      bzrlib/mail_client.py          mail_client.py-20070809192806-vuxt3t19srtpjpdn-1
      bzrlib/osutils.py              osutils.py-20050309040759-eeaff12fbf77ac86
      bzrlib/progress.py             progress.py-20050610070202-df9faaab791964c0
      bzrlib/push.py                 push.py-20080606021927-5fe39050e8xne9un-1
      bzrlib/remote.py               remote.py-20060720103555-yeeg2x51vn0rbtdp-1
      bzrlib/repofmt/groupcompress_repo.py repofmt.py-20080715094215-wp1qfvoo7093c8qr-1
      bzrlib/repofmt/knitrepo.py     knitrepo.py-20070206081537-pyy4a00xdas0j4pf-1
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
      bzrlib/revisiontree.py         revisiontree.py-20060724012533-bg8xyryhxd0o0i0h-1
      bzrlib/serializer.py           serializer.py-20090402143702-wmkh9cfjhwpju0qi-1
      bzrlib/shellcomplete.py        shellcomplete.py-20050822153127-3be115ff5e70fc39
      bzrlib/smart/bzrdir.py         bzrdir.py-20061122024551-ol0l0o0oofsu9b3t-1
      bzrlib/smart/medium.py         medium.py-20061103051856-rgu2huy59fkz902q-1
      bzrlib/smart/repository.py     repository.py-20061128022038-vr5wy5bubyb8xttk-1
      bzrlib/smart/request.py        request.py-20061108095550-gunadhxmzkdjfeek-1
      bzrlib/tests/__init__.py       selftest.py-20050531073622-8d0e3c8845c97a64
      bzrlib/tests/blackbox/test_diff.py test_diff.py-20060110203741-aa99ac93e633d971
      bzrlib/tests/blackbox/test_init.py test_init.py-20060309032856-a292116204d86eb7
      bzrlib/tests/blackbox/test_pull.py test_pull.py-20051201144907-64959364f629947f
      bzrlib/tests/blackbox/test_push.py test_push.py-20060329002750-929af230d5d22663
      bzrlib/tests/blackbox/test_split.py test_split.py-20061008023421-qy0vdpzysh5rriu8-1
      bzrlib/tests/blackbox/test_status.py teststatus.py-20050712014354-508855eb9f29f7dc
      bzrlib/tests/branch_implementations/test_dotted_revno_to_revision_id.py test_dotted_revno_to-20090121014844-6x7d9jtri5sspg1o-1
      bzrlib/tests/branch_implementations/test_push.py test_push.py-20070130153159-fhfap8uoifevg30j-1
      bzrlib/tests/branch_implementations/test_stacking.py test_stacking.py-20080214020755-msjlkb7urobwly0f-1
      bzrlib/tests/bzrdir_implementations/test_bzrdir.py test_bzrdir.py-20060131065642-0ebeca5e30e30866
      bzrlib/tests/per_repository/test_repository.py test_repository.py-20060131092128-ad07f494f5c9d26c
      bzrlib/tests/per_repository_reference/__init__.py __init__.py-20080220025549-nnm2s80it1lvcwnc-2
      bzrlib/tests/test_bzrdir.py    test_bzrdir.py-20060131065654-deba40eef51cf220
      bzrlib/tests/test_commands.py  test_command.py-20051019190109-3b17be0f52eaa7a8
      bzrlib/tests/test_eol_filters.py test_eol_filters.py-20090327060429-todzdjmqt3bpv5r8-2
      bzrlib/tests/test_filters.py   test_filters.py-20080417120614-tc3zok0vvvprsc99-1
      bzrlib/tests/test_generate_docs.py test_generate_docs.p-20070102123151-cqctnsrlqwmiljd7-1
      bzrlib/tests/test_graph.py     test_graph_walker.py-20070525030405-enq4r60hhi9xrujc-1
      bzrlib/tests/test_help.py      test_help.py-20070419045354-6q6rq15j9e2n5fna-1
      bzrlib/tests/test_knit.py      test_knit.py-20051212171302-95d4c00dd5f11f2b
      bzrlib/tests/test_mail_client.py test_mail_client.py-20070809192806-vuxt3t19srtpjpdn-2
      bzrlib/tests/test_options.py   testoptions.py-20051014093702-96457cfc86319a8f
      bzrlib/tests/test_plugins.py   plugins.py-20050622075746-32002b55e5e943e9
      bzrlib/tests/test_progress.py  test_progress.py-20060308160359-978c397bc79b7fda
      bzrlib/tests/test_remote.py    test_remote.py-20060720103555-yeeg2x51vn0rbtdp-2
      bzrlib/tests/test_smart.py     test_smart.py-20061122024551-ol0l0o0oofsu9b3t-2
      bzrlib/tests/test_ui.py        test_ui.py-20051130162854-458e667a7414af09
      bzrlib/tests/tree_implementations/test_list_files.py test_list_files.py-20070216005501-cjh6fzprbe9lbs2t-1
      bzrlib/tests/workingtree_implementations/test_content_filters.py test_content_filters-20080424071441-8navsrmrfdxpn90a-1
      bzrlib/tests/workingtree_implementations/test_eol_conversion.py test_eol_conversion.-20090327060429-todzdjmqt3bpv5r8-4
      bzrlib/transform.py            transform.py-20060105172343-dd99e54394d91687
      bzrlib/transport/sftp.py       sftp.py-20051019050329-ab48ce71b7e32dfe
      bzrlib/ui/text.py              text.py-20051130153916-2e438cffc8afc478
      bzrlib/versionedfile.py        versionedfile.py-20060222045106-5039c71ee3b65490
      bzrlib/weave.py                knit.py-20050627021749-759c29984154256b
      bzrlib/win32utils.py           win32console.py-20051021033308-123c6c929d04973d
      bzrlib/workingtree.py          workingtree.py-20050511021032-29b6ec0a681e02e3
      bzrlib/workingtree_4.py        workingtree_4.py-20070208044105-5fgpc5j3ljlh5q6c-1
      bzrlib/xml4.py                 xml4.py-20050916091259-db5ab55e7e6ca324
      bzrlib/xml8.py                 xml5.py-20050907032657-aac8f960815b66b1
      bzrlib/xml_serializer.py       xml.py-20050309040759-57d51586fdec365d
      doc/developers/cycle.txt       cycle.txt-20081017031739-rw24r0cywm2ok3xu-1
      doc/developers/index.txt       index.txt-20070508041241-qznziunkg0nffhiw-1
      doc/developers/performance-roadmap.txt performanceroadmap.t-20070507174912-mwv3xv517cs4sisd-2
      doc/developers/planned-change-integration.txt plannedchangeintegra-20070619004702-i1b3ccamjtfaoq6w-1
      doc/developers/releasing.txt   releasing.txt-20080502015919-fnrcav8fwy8ccibu-1
      doc/en/developer-guide/HACKING.txt HACKING-20050805200004-2a5dc975d870f78c
      doc/en/quick-reference/Makefile makefile-20070813143223-5i7bgw7w8s7l3ae2-2
      doc/en/quick-reference/quick-start-summary.png quickstartsummary.pn-20071203142852-hsiybkmh37q5owwe-1
      doc/en/tutorials/using_bazaar_with_launchpad.txt using_bazaar_with_lp-20071211073140-7msh8uf9a9h4y9hb-1
      doc/en/user-guide/images/workflows_centralized.png workflows_centralize-20071114035000-q36a9h57ps06uvnl-8
      doc/en/user-guide/images/workflows_gatekeeper.png workflows_gatekeeper-20071114035000-q36a9h57ps06uvnl-9
      doc/en/user-guide/images/workflows_localcommit.png workflows_localcommi-20071114035000-q36a9h57ps06uvnl-10
      doc/en/user-guide/images/workflows_peer.png workflows_peer.png-20071114035000-q36a9h57ps06uvnl-11
      doc/en/user-guide/images/workflows_pqm.png workflows_pqm.png-20071114035000-q36a9h57ps06uvnl-12
      doc/en/user-guide/images/workflows_shared.png workflows_shared.png-20071114035000-q36a9h57ps06uvnl-13
      doc/en/user-guide/images/workflows_single.png workflows_single.png-20071114035000-q36a9h57ps06uvnl-14
      doc/en/user-guide/introducing_bazaar.txt introducing_bazaar.t-20071114035000-q36a9h57ps06uvnl-5
      doc/index.txt                  index.txt-20070813101924-07gd9i9d2jt124bf-1
      generate_docs.py               bzrinfogen.py-20051211224525-78e7c14f2c955e55
      setup.py                       setup.py-20050314065409-02f8a0a6e3f9bc70
    ------------------------------------------------------------
    revno: 4413.5.11
    revision-id: john at arbash-meinel.com-20090617185949-7lfh5td7pipwp0ss
    parent: john at arbash-meinel.com-20090617184126-i5u6odzoka4sk566
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: 1.16-chk-direct
    timestamp: Wed 2009-06-17 13:59:49 -0500
    message:
      Pull the common 'populate this CHKInventory' code out into a helper
      and share it between the creators.
    modified:
      bzrlib/inventory.py            inventory.py-20050309040759-6648b84ca2005b37
      bzrlib/repofmt/groupcompress_repo.py repofmt.py-20080715094215-wp1qfvoo7093c8qr-1
    ------------------------------------------------------------
    revno: 4413.5.10
    revision-id: john at arbash-meinel.com-20090617184126-i5u6odzoka4sk566
    parent: john at arbash-meinel.com-20090617182359-3ms8skqdaxn3db9m
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: 1.16-chk-direct
    timestamp: Wed 2009-06-17 13:41:26 -0500
    message:
      Clean up the test_inv tests that assumed _root_node was real and not just a key.
    modified:
      bzrlib/inventory.py            inventory.py-20050309040759-6648b84ca2005b37
      bzrlib/repofmt/groupcompress_repo.py repofmt.py-20080715094215-wp1qfvoo7093c8qr-1
      bzrlib/tests/test_inv.py       testinv.py-20050722220913-1dc326138d1a5892
    ------------------------------------------------------------
    revno: 4413.5.9
    revision-id: john at arbash-meinel.com-20090617182359-3ms8skqdaxn3db9m
    parent: john at arbash-meinel.com-20090608182135-mxmy7bbhluq9rm7x
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: 1.16-chk-direct
    timestamp: Wed 2009-06-17 13:23:59 -0500
    message:
      Some cleanup. Move the check that from_dict works into test_chk_map.
    modified:
      bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
      bzrlib/tests/test_chk_map.py   test_chk_map.py-20081001014447-ue6kkuhofvdecvxa-2
    ------------------------------------------------------------
    revno: 4413.5.8
    revision-id: john at arbash-meinel.com-20090608182135-mxmy7bbhluq9rm7x
    parent: john at arbash-meinel.com-20090608181741-qznlkpr8wi6kz8q6
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: 1.16-chk-direct
    timestamp: Mon 2009-06-08 13:21:35 -0500
    message:
      Change some asserts into raise: calls.
    modified:
      bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
      bzrlib/repofmt/groupcompress_repo.py repofmt.py-20080715094215-wp1qfvoo7093c8qr-1
    ------------------------------------------------------------
    revno: 4413.5.7
    revision-id: john at arbash-meinel.com-20090608181741-qznlkpr8wi6kz8q6
    parent: john at arbash-meinel.com-20090608181508-p2p2oqiy9e6bu0in
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: 1.16-chk-direct
    timestamp: Mon 2009-06-08 13:17:41 -0500
    message:
      Switch to using a single code path for from_dict().
      Remove an extra pdb.set_trace() statement.
    modified:
      bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
    ------------------------------------------------------------
    revno: 4413.5.6
    revision-id: john at arbash-meinel.com-20090608181508-p2p2oqiy9e6bu0in
    parent: john at arbash-meinel.com-20090608173356-6fry12l529kb9ylp
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: 1.16-chk-direct
    timestamp: Mon 2009-06-08 13:15:08 -0500
    message:
      Clean up the calls for '_create_inv_from_null' so they use the apis correctly.
      
      In the end this shaves off as much as 2s (15.5s => 13.5s) for an initial commit of
      a mysql tree. Some of that is potentially the InternalNode.map() fix, some of it is
      not going through a regular Inventory and then back into an apply_delta loop, etc.
      
      Work such as the 2 node._current_size() calls in InternalNode.map(), which let it
      see whether the size changed and so whether to check for a remap, is wasted on an
      initial build. Optimizing the 'build-from-scratch' case is somewhat reasonable,
      since it is the only time that we should be dealing with that many objects.
    modified:
      bzrlib/repofmt/groupcompress_repo.py repofmt.py-20080715094215-wp1qfvoo7093c8qr-1
    ------------------------------------------------------------
    revno: 4413.5.5
    revision-id: john at arbash-meinel.com-20090608173356-6fry12l529kb9ylp
    parent: john at arbash-meinel.com-20090608172947-r9hucea21l7mse3m
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: 1.16-chk-direct
    timestamp: Mon 2009-06-08 12:33:56 -0500
    message:
      Make it more obvious how the two creation methods are defined.
    modified:
      bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
    ------------------------------------------------------------
    revno: 4413.5.4
    revision-id: john at arbash-meinel.com-20090608172947-r9hucea21l7mse3m
    parent: john at arbash-meinel.com-20090606014721-xqbksp0fl6ossesk
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: 1.16-chk-direct
    timestamp: Mon 2009-06-08 12:29:47 -0500
    message:
      Change CHKMap.from_dict to create a LeafNode and split it.
      As opposed to using multiple calls to .map().
      There were a few edge cases that needed to be fixed.
      Such as:
      1) setting self._raw_size
      2) LeafNode.map() needed to properly handle when one of its child nodes also split.
      3) Update CHK1.apply_inventory_by_delta to use this new function when appropriate
      4) For now from_dict() continues to do it both ways, just to make sure it gives
      correct results.
    modified:
      bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
      bzrlib/repofmt/groupcompress_repo.py repofmt.py-20080715094215-wp1qfvoo7093c8qr-1
    ------------------------------------------------------------
    revno: 4413.5.3
    revision-id: john at arbash-meinel.com-20090606014721-xqbksp0fl6ossesk
    parent: john at arbash-meinel.com-20090605211101-cu88w49o0ys97r3c
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: 1.16-chk-direct
    timestamp: Fri 2009-06-05 20:47:21 -0500
    message:
      Try a method for apply_insert_delta.
      
      Instead of calling lots of map() calls, just build a leaf node, and split it.
    modified:
      bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
    ------------------------------------------------------------
    revno: 4413.5.2
    revision-id: john at arbash-meinel.com-20090605211101-cu88w49o0ys97r3c
    parent: john at arbash-meinel.com-20090605211029-1mc1tejpgyj7ww8g
    parent: john at arbash-meinel.com-20090605195715-q2gcpaypbixwk4wg
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: 1.16-chk-direct
    timestamp: Fri 2009-06-05 16:11:01 -0500
    message:
      Merge the chk_map.InternalNode._iter_nodes improvements.
    modified:
      bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
      bzrlib/tests/test_chk_map.py   test_chk_map.py-20081001014447-ue6kkuhofvdecvxa-2
      bzrlib/workingtree.py          workingtree.py-20050511021032-29b6ec0a681e02e3
    ------------------------------------------------------------
    revno: 4413.5.1
    revision-id: john at arbash-meinel.com-20090605211029-1mc1tejpgyj7ww8g
    parent: pqm at pqm.ubuntu.com-20090605081039-abvojdsxjbg5i4ff
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: 1.16-chk-direct
    timestamp: Fri 2009-06-05 16:10:29 -0500
    message:
      Prototype an alternative way to handle 'first commit'
      So far it is just as slow, because we still call .map() for every object.
      However, it at least gives us a route to make things better.
    modified:
      bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
      bzrlib/repofmt/groupcompress_repo.py repofmt.py-20080715094215-wp1qfvoo7093c8qr-1
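
The idea running through the log above is to put every item into one leaf and split it once, instead of calling map() once per item. The sketch below shows that idea on a toy hash trie; it is not the bzrlib implementation. MAX_ITEMS, search_key, build_bulk and insert are hypothetical names, and the limit counts items, whereas CHKMap limits the serialised byte size.

```python
import hashlib

MAX_ITEMS = 4  # hypothetical per-leaf limit (an item count, not bytes)


def search_key(key):
    """Hypothetical search key: the hex SHA-1 of the key string."""
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


def build_bulk(items, depth=0):
    """Build from a whole dict: hold everything in one leaf, split only if too big."""
    if len(items) <= MAX_ITEMS or depth >= 40:
        return ("leaf", dict(items))
    buckets = {}
    for key, value in items.items():
        buckets.setdefault(search_key(key)[depth], {})[key] = value
    children = {prefix: build_bulk(bucket, depth + 1)
                for prefix, bucket in buckets.items()}
    return ("internal", children)


def insert(node, key, value, depth=0):
    """Incremental path: one call per key, re-checking the size on every call."""
    kind, payload = node
    if kind == "leaf":
        merged = dict(payload)
        merged[key] = value
        return build_bulk(merged, depth)  # splits the leaf if it is now over the limit
    children = dict(payload)
    prefix = search_key(key)[depth]
    child = children.get(prefix, ("leaf", {}))
    children[prefix] = insert(child, key, value, depth + 1)
    return ("internal", children)


def count_items(node):
    """Count the items stored in all leaves under node."""
    kind, payload = node
    if kind == "leaf":
        return len(payload)
    return sum(count_items(child) for child in payload.values())
```

In this toy model both routes produce the same tree, but the bulk route visits each level once per item rather than repeating size checks on every insertion. That mirrors the temporary "do it both ways" cross-check that revno 4413.5.4 describes for from_dict().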
=== modified file 'NEWS'
--- a/NEWS	2009-06-18 19:13:45 +0000
+++ b/NEWS	2009-06-18 19:19:49 +0000
@@ -71,6 +71,9 @@
   to 1.1 seconds. The improvement for ``bzr ls -r-1`` is more
   substantial dropping from 54.3 to 1.1 seconds. (Ian Clatworthy)
 
+* Initial commit performance in ``--2a`` repositories has been improved by
+  making it cheaper to build the initial CHKMap. (John Arbash Meinel)
+
 * Resolving a revno to a revision id on a branch accessed via ``bzr://``
   or ``bzr+ssh://`` is now much faster and involves no VFS operations.
   This speeds up commands like ``bzr pull -r 123``.  (Andrew Bennetts)

=== modified file 'bzrlib/chk_map.py'
--- a/bzrlib/chk_map.py	2009-06-15 14:49:27 +0000
+++ b/bzrlib/chk_map.py	2009-06-17 19:10:35 +0000
@@ -203,13 +203,48 @@
             multiple pages.
         :return: The root chk of the resulting CHKMap.
         """
-        result = CHKMap(store, None, search_key_func=search_key_func)
+        root_key = klass._create_directly(store, initial_value,
+            maximum_size=maximum_size, key_width=key_width,
+            search_key_func=search_key_func)
+        return root_key
+
+    @classmethod
+    def _create_via_map(klass, store, initial_value, maximum_size=0,
+                        key_width=1, search_key_func=None):
+        result = klass(store, None, search_key_func=search_key_func)
         result._root_node.set_maximum_size(maximum_size)
         result._root_node._key_width = key_width
         delta = []
         for key, value in initial_value.items():
             delta.append((None, key, value))
-        return result.apply_delta(delta)
+        root_key = result.apply_delta(delta)
+        return root_key
+
+    @classmethod
+    def _create_directly(klass, store, initial_value, maximum_size=0,
+                         key_width=1, search_key_func=None):
+        node = LeafNode(search_key_func=search_key_func)
+        node.set_maximum_size(maximum_size)
+        node._key_width = key_width
+        node._items = dict(initial_value)
+        node._raw_size = sum([node._key_value_len(key, value)
+                              for key,value in initial_value.iteritems()])
+        node._len = len(node._items)
+        node._compute_search_prefix()
+        node._compute_serialised_prefix()
+        if (node._len > 1
+            and maximum_size
+            and node._current_size() > maximum_size):
+            prefix, node_details = node._split(store)
+            if len(node_details) == 1:
+                raise AssertionError('Failed to split using node._split')
+            node = InternalNode(prefix, search_key_func=search_key_func)
+            node.set_maximum_size(maximum_size)
+            node._key_width = key_width
+            for split, subnode in node_details:
+                node.add_node(split, subnode)
+        keys = list(node.serialise(store))
+        return keys[-1]
 
     def iter_changes(self, basis):
         """Iterate over the changes between basis and self.
@@ -764,7 +799,19 @@
                 result[prefix] = node
             else:
                 node = result[prefix]
-            node.map(store, key, value)
+            sub_prefix, node_details = node.map(store, key, value)
+            if len(node_details) > 1:
+                if prefix != sub_prefix:
+                    # This node has been split and is now found via a different
+                    # path
+                    result.pop(prefix)
+                new_node = InternalNode(sub_prefix,
+                    search_key_func=self._search_key_func)
+                new_node.set_maximum_size(self._maximum_size)
+                new_node._key_width = self._key_width
+                for split, node in node_details:
+                    new_node.add_node(split, node)
+                result[prefix] = new_node
         return common_prefix, result.items()
 
     def map(self, store, key, value):

=== modified file 'bzrlib/inventory.py'
--- a/bzrlib/inventory.py	2009-06-10 03:56:49 +0000
+++ b/bzrlib/inventory.py	2009-06-18 18:18:36 +0000
@@ -1470,6 +1470,19 @@
         self._path_to_fileid_cache = {}
         self._search_key_name = search_key_name
 
+    def __eq__(self, other):
+        """Compare two sets by comparing their contents."""
+        if not isinstance(other, CHKInventory):
+            return NotImplemented
+
+        this_key = self.id_to_entry.key()
+        other_key = other.id_to_entry.key()
+        this_pid_key = self.parent_id_basename_to_file_id.key()
+        other_pid_key = other.parent_id_basename_to_file_id.key()
+        if None in (this_key, this_pid_key, other_key, other_pid_key):
+            return False
+        return this_key == other_key and this_pid_key == other_pid_key
+
     def _entry_to_bytes(self, entry):
         """Serialise entry as a single bytestring.
 
@@ -1716,29 +1729,38 @@
         :param maximum_size: The CHKMap node size limit.
         :param search_key_name: The identifier for the search key function
         """
-        result = CHKInventory(search_key_name)
+        result = klass(search_key_name)
         result.revision_id = inventory.revision_id
         result.root_id = inventory.root.file_id
-        search_key_func = chk_map.search_key_registry.get(search_key_name)
-        result.id_to_entry = chk_map.CHKMap(chk_store, None, search_key_func)
-        result.id_to_entry._root_node.set_maximum_size(maximum_size)
-        file_id_delta = []
-        result.parent_id_basename_to_file_id = chk_map.CHKMap(chk_store,
-            None, search_key_func)
-        result.parent_id_basename_to_file_id._root_node.set_maximum_size(
-            maximum_size)
-        result.parent_id_basename_to_file_id._root_node._key_width = 2
-        parent_id_delta = []
+
+        entry_to_bytes = result._entry_to_bytes
+        parent_id_basename_key = result._parent_id_basename_key
+        id_to_entry_dict = {}
+        parent_id_basename_dict = {}
         for path, entry in inventory.iter_entries():
-            file_id_delta.append((None, (entry.file_id,),
-                result._entry_to_bytes(entry)))
-            parent_id_delta.append(
-                (None, result._parent_id_basename_key(entry),
-                 entry.file_id))
-        result.id_to_entry.apply_delta(file_id_delta)
-        result.parent_id_basename_to_file_id.apply_delta(parent_id_delta)
+            id_to_entry_dict[(entry.file_id,)] = entry_to_bytes(entry)
+            p_id_key = parent_id_basename_key(entry)
+            parent_id_basename_dict[p_id_key] = entry.file_id
+
+        result._populate_from_dicts(chk_store, id_to_entry_dict,
+            parent_id_basename_dict, maximum_size=maximum_size)
         return result
 
+    def _populate_from_dicts(self, chk_store, id_to_entry_dict,
+                             parent_id_basename_dict, maximum_size):
+        search_key_func = chk_map.search_key_registry.get(self._search_key_name)
+        root_key = chk_map.CHKMap.from_dict(chk_store, id_to_entry_dict,
+                   maximum_size=maximum_size, key_width=1,
+                   search_key_func=search_key_func)
+        self.id_to_entry = chk_map.CHKMap(chk_store, root_key,
+                                          search_key_func)
+        root_key = chk_map.CHKMap.from_dict(chk_store,
+                   parent_id_basename_dict,
+                   maximum_size=maximum_size, key_width=2,
+                   search_key_func=search_key_func)
+        self.parent_id_basename_to_file_id = chk_map.CHKMap(chk_store,
+                                                    root_key, search_key_func)
+
     def _parent_id_basename_key(self, entry):
         """Create a key for an entry in a parent_id_basename_to_file_id index."""
         if entry.parent_id is not None:

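The __eq__ added above treats two inventories as equal when the root keys of
both of their maps match, and as unequal when any root key is still None. A
minimal sketch of that idea follows. ToyMap and ToyInventory are hypothetical
stand-ins written for this note, not bzrlib classes, and the sha1-of-repr key
is an assumption made only to get a content-derived key.

```python
import hashlib


class ToyMap:
    """Hypothetical content-addressed map (not bzrlib's CHKMap)."""

    def __init__(self, items=None):
        # items=None models a map whose root has not been serialised yet.
        self._items = dict(items) if items is not None else None

    def key(self):
        if self._items is None:
            return None
        payload = repr(sorted(self._items.items())).encode("utf-8")
        return ("sha1:" + hashlib.sha1(payload).hexdigest(),)


class ToyInventory:
    """Hypothetical inventory holding the same two maps as CHKInventory."""

    def __init__(self, id_to_entry, parent_id_basename_to_file_id):
        self.id_to_entry = id_to_entry
        self.parent_id_basename_to_file_id = parent_id_basename_to_file_id

    def __eq__(self, other):
        if not isinstance(other, ToyInventory):
            # Let Python try the reflected comparison on the other operand.
            return NotImplemented
        this_key = self.id_to_entry.key()
        other_key = other.id_to_entry.key()
        this_pid_key = self.parent_id_basename_to_file_id.key()
        other_pid_key = other.parent_id_basename_to_file_id.key()
        if None in (this_key, this_pid_key, other_key, other_pid_key):
            # Without a root key there is nothing reliable to compare.
            return False
        return this_key == other_key and this_pid_key == other_pid_key
```

One consequence worth noting: under this rule an inventory whose maps have no
root key does not even compare equal to itself.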
=== modified file 'bzrlib/repofmt/groupcompress_repo.py'
--- a/bzrlib/repofmt/groupcompress_repo.py	2009-06-17 17:57:15 +0000
+++ b/bzrlib/repofmt/groupcompress_repo.py	2009-06-18 19:19:49 +0000
@@ -675,6 +675,42 @@
         return self._inventory_add_lines(revision_id, parents,
             inv_lines, check_content=False)
 
+    def _create_inv_from_null(self, delta, revision_id):
+        """Create a new CHKInventory from a delta against NULL_REVISION.
+
+        This is a simplified form of create_by_apply_delta which knows that all
+        the old values must be None, so everything is a create.
+        """
+        serializer = self._format._serializer
+        new_inv = inventory.CHKInventory(serializer.search_key_name)
+        new_inv.revision_id = revision_id
+        entry_to_bytes = new_inv._entry_to_bytes
+        id_to_entry_dict = {}
+        parent_id_basename_dict = {}
+        for old_path, new_path, file_id, entry in delta:
+            if old_path is not None:
+                raise ValueError('Invalid delta, somebody tried to delete %r'
+                                 ' from the NULL_REVISION'
+                                 % ((old_path, file_id),))
+            if new_path is None:
+                raise ValueError('Invalid delta, delta from NULL_REVISION has'
+                                 ' no new_path %r' % (file_id,))
+            if new_path == '':
+                new_inv.root_id = file_id
+                parent_id_basename_key = ('', '')
+            else:
+                utf8_entry_name = entry.name.encode('utf-8')
+                parent_id_basename_key = (entry.parent_id, utf8_entry_name)
+            new_value = entry_to_bytes(entry)
+            # Populate Caches?
+            # new_inv._path_to_fileid_cache[new_path] = file_id
+            id_to_entry_dict[(file_id,)] = new_value
+            parent_id_basename_dict[parent_id_basename_key] = file_id
+
+        new_inv._populate_from_dicts(self.chk_bytes, id_to_entry_dict,
+            parent_id_basename_dict, maximum_size=serializer.maximum_size)
+        return new_inv
+
     def add_inventory_by_delta(self, basis_revision_id, delta, new_revision_id,
                                parents, basis_inv=None, propagate_caches=False):
         """Add a new inventory expressed as a delta against another revision.
@@ -700,24 +736,29 @@
             repository format specific) of the serialized inventory, and the
             resulting inventory.
         """
-        if basis_revision_id == _mod_revision.NULL_REVISION:
-            return KnitPackRepository.add_inventory_by_delta(self,
-                basis_revision_id, delta, new_revision_id, parents)
         if not self.is_in_write_group():
             raise AssertionError("%r not in write group" % (self,))
         _mod_revision.check_not_reserved_id(new_revision_id)
-        basis_tree = self.revision_tree(basis_revision_id)
-        basis_tree.lock_read()
-        try:
-            if basis_inv is None:
+        basis_tree = None
+        if basis_inv is None:
+            if basis_revision_id == _mod_revision.NULL_REVISION:
+                new_inv = self._create_inv_from_null(delta, new_revision_id)
+                inv_lines = new_inv.to_lines()
+                return self._inventory_add_lines(new_revision_id, parents,
+                    inv_lines, check_content=False), new_inv
+            else:
+                basis_tree = self.revision_tree(basis_revision_id)
+                basis_tree.lock_read()
                 basis_inv = basis_tree.inventory
+        try:
             result = basis_inv.create_by_apply_delta(delta, new_revision_id,
                 propagate_caches=propagate_caches)
             inv_lines = result.to_lines()
             return self._inventory_add_lines(new_revision_id, parents,
                 inv_lines, check_content=False), result
         finally:
-            basis_tree.unlock()
+            if basis_tree is not None:
+                basis_tree.unlock()
 
     def _iter_inventories(self, revision_ids):
         """Iterate over many inventory objects."""

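The validation loop in _create_inv_from_null can be read on its own: every
delta item must be a pure add, the root is recognised by its empty path, and
two plain dicts are filled for a later bulk build. The sketch below is an
illustration only. Entry, dicts_from_null_delta and the entry_to_bytes
callable passed in are hypothetical names, and bytes are used for ids as an
assumption so that the example runs on Python 3.

```python
from collections import namedtuple

# Hypothetical minimal entry type; bzrlib's InventoryEntry carries much more.
Entry = namedtuple("Entry", "file_id parent_id name")


def dicts_from_null_delta(delta, entry_to_bytes):
    """Turn a delta against an empty tree into two plain dicts."""
    root_id = None
    id_to_entry = {}
    parent_id_basename = {}
    for old_path, new_path, file_id, entry in delta:
        if old_path is not None:
            # Nothing exists in the empty tree, so nothing can be deleted.
            raise ValueError("Invalid delta, somebody tried to delete %r"
                             " from the empty tree" % ((old_path, file_id),))
        if new_path is None:
            raise ValueError("Invalid delta, no new_path for %r" % (file_id,))
        if new_path == "":
            # The root entry gets a fixed key of two empty strings.
            root_id = file_id
            key = (b"", b"")
        else:
            key = (entry.parent_id, entry.name.encode("utf-8"))
        id_to_entry[(file_id,)] = entry_to_bytes(entry)
        parent_id_basename[key] = file_id
    return root_id, id_to_entry, parent_id_basename
```

A caller would hand the two dicts to whatever bulk constructor it has; in the
patch that role is played by _populate_from_dicts.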
=== modified file 'bzrlib/tests/per_repository/test_add_inventory_by_delta.py'
--- a/bzrlib/tests/per_repository/test_add_inventory_by_delta.py	2009-04-09 20:23:07 +0000
+++ b/bzrlib/tests/per_repository/test_add_inventory_by_delta.py	2009-06-18 18:18:36 +0000
@@ -47,8 +47,17 @@
 
     def make_inv_delta(self, old, new):
         """Make an inventory delta from two inventories."""
-        old_ids = set(old._byid.iterkeys())
-        new_ids = set(new._byid.iterkeys())
+        by_id = getattr(old, '_byid', None)
+        if by_id is None:
+            old_ids = set(entry.file_id for entry in old.iter_just_entries())
+        else:
+            old_ids = set(by_id)
+        by_id = getattr(new, '_byid', None)
+        if by_id is None:
+            new_ids = set(entry.file_id for entry in new.iter_just_entries())
+        else:
+            new_ids = set(by_id)
+
         adds = new_ids - old_ids
         deletes = old_ids - new_ids
         common = old_ids.intersection(new_ids)
@@ -68,7 +77,11 @@
         # validator.
         tree = self.make_branch_and_tree('tree')
         revid = tree.commit("empty post")
-        revtree = tree.basis_tree()
+        # tree.basis_tree() always uses a plain Inventory from the dirstate, we
+        # want the same format inventory as we have in the repository
+        revtree = tree.branch.repository.revision_tree(
+                    tree.branch.last_revision())
+        tree.basis_tree()
         revtree.lock_read()
         self.addCleanup(revtree.unlock)
         new_inv = revtree.inventory

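The make_inv_delta change above stops assuming every inventory has a _byid
dict and falls back to iterating entries when it does not. A small sketch of
that fallback, with hypothetical DictInventory and IterInventory classes
standing in for the two kinds of inventory:

```python
from collections import namedtuple

Entry = namedtuple("Entry", "file_id")  # hypothetical minimal entry


class DictInventory:
    """Hypothetical inventory that keeps an in-memory _byid dict."""

    def __init__(self, file_ids):
        self._byid = dict((file_id, Entry(file_id)) for file_id in file_ids)


class IterInventory:
    """Hypothetical inventory that can only be iterated."""

    def __init__(self, file_ids):
        self._file_ids = list(file_ids)

    def iter_just_entries(self):
        for file_id in self._file_ids:
            yield Entry(file_id)


def file_ids_of(inv):
    """Return the set of file ids, whichever kind of inventory inv is."""
    by_id = getattr(inv, "_byid", None)
    if by_id is None:
        return set(entry.file_id for entry in inv.iter_just_entries())
    return set(by_id)
```

With a helper like this, adds and deletes are plain set differences between
file_ids_of(new) and file_ids_of(old).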
=== modified file 'bzrlib/tests/test_chk_map.py'
--- a/bzrlib/tests/test_chk_map.py	2009-06-05 18:03:40 +0000
+++ b/bzrlib/tests/test_chk_map.py	2009-06-17 18:23:59 +0000
@@ -78,6 +78,11 @@
         root_key = CHKMap.from_dict(chk_bytes, a_dict,
             maximum_size=maximum_size, key_width=key_width,
             search_key_func=search_key_func)
+        root_key2 = CHKMap._create_via_map(chk_bytes, a_dict,
+            maximum_size=maximum_size, key_width=key_width,
+            search_key_func=search_key_func)
+        self.assertEqual(root_key, root_key2, "CHKMap.from_dict() did not"
+                         " match CHKMap._create_via_map")
         chkmap = CHKMap(chk_bytes, root_key, search_key_func=search_key_func)
         return chkmap
 

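The test change above checks that the bulk builder and the item-by-item
builder end up at the same root key. The property being relied on is that the
serialised form depends only on the final contents, not on how they were
inserted. The toy below shows that property under loud assumptions: the
one-character bucketing, the text format and all function names are invented
for this note and do not reflect the real CHKMap node layout.

```python
import hashlib


def _sha1(text):
    return "sha1:" + hashlib.sha1(text.encode("utf-8")).hexdigest()


def serialise(buckets):
    """Return a root key for a {prefix: {key: value}} mapping (toy format)."""
    lines = []
    for prefix in sorted(buckets):
        items = sorted(buckets[prefix].items())
        leaf = "".join("%s=%s\n" % item for item in items)
        lines.append("%s:%s\n" % (prefix, _sha1(leaf)))
    return _sha1("".join(lines))


def from_dict_bulk(a_dict):
    # Group every item first, then serialise exactly once.
    buckets = {}
    for key, value in a_dict.items():
        buckets.setdefault(key[:1], {})[key] = value
    return serialise(buckets)


def create_via_map(a_dict):
    # One insert per item, serialising after each one. This only loosely
    # models building a map through repeated map() calls.
    buckets = {}
    root_key = serialise(buckets)
    for key, value in a_dict.items():
        buckets.setdefault(key[:1], {})[key] = value
        root_key = serialise(buckets)
    return root_key
```

Both functions sort before hashing, so insertion order cannot leak into the
root key; that is what makes an equality check between the two paths a
meaningful regression test.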
=== modified file 'bzrlib/tests/test_inv.py'
--- a/bzrlib/tests/test_inv.py	2009-04-09 20:23:07 +0000
+++ b/bzrlib/tests/test_inv.py	2009-06-17 18:41:26 +0000
@@ -297,7 +297,13 @@
         inv.root.revision = "rootrev"
         chk_bytes = self.get_chk_bytes()
         chk_inv = CHKInventory.from_inventory(chk_bytes, inv, 120)
+        chk_inv.id_to_entry._ensure_root()
         self.assertEqual(120, chk_inv.id_to_entry._root_node.maximum_size)
+        self.assertEqual(1, chk_inv.id_to_entry._root_node._key_width)
+        p_id_basename = chk_inv.parent_id_basename_to_file_id
+        p_id_basename._ensure_root()
+        self.assertEqual(120, p_id_basename._root_node.maximum_size)
+        self.assertEqual(2, p_id_basename._root_node._key_width)
 
     def test___iter__(self):
         inv = Inventory()
@@ -454,6 +460,8 @@
         # new_inv should be the same as reference_inv.
         self.assertEqual(reference_inv.revision_id, new_inv.revision_id)
         self.assertEqual(reference_inv.root_id, new_inv.root_id)
+        reference_inv.id_to_entry._ensure_root()
+        new_inv.id_to_entry._ensure_root()
         self.assertEqual(reference_inv.id_to_entry._root_node._key,
             new_inv.id_to_entry._root_node._key)
 
@@ -473,6 +481,10 @@
         reference_inv = CHKInventory.from_inventory(chk_bytes, inv)
         delta = [(None, "A",  "A-id", a_entry)]
         new_inv = base_inv.create_by_apply_delta(delta, "expectedid")
+        reference_inv.id_to_entry._ensure_root()
+        reference_inv.parent_id_basename_to_file_id._ensure_root()
+        new_inv.id_to_entry._ensure_root()
+        new_inv.parent_id_basename_to_file_id._ensure_root()
         # new_inv should be the same as reference_inv.
         self.assertEqual(reference_inv.revision_id, new_inv.revision_id)
         self.assertEqual(reference_inv.root_id, new_inv.root_id)



