[MERGE] Updates to annotate

John Arbash Meinel john at arbash-meinel.com
Tue Feb 26 23:33:59 GMT 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> On Thu, 2008-02-21 at 12:42 -0600, John Arbash Meinel wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Robert Collins wrote:
>> | On Wed, 2008-02-20 at 16:26 -0600, John Arbash Meinel wrote:
>> |> -----BEGIN PGP SIGNED MESSAGE-----
>> |> Hash: SHA1
>> |>
>> |> The attached patch changes how annotate works. This isn't the last step that
>> |> I'll be doing to refine annotate, but it does make a significant improvement.
>> |
>> | I don't like the change to get_components_positions - can you enlarge on
>> | what its for? I'd rather see a new method that does what you need
>> | without causing more data to be pushed around for things that don't need
>> | it.
>> |
>> | The rest seems fine.
>> |
>> | -Rob
>>
>> Well, the places that are currently using get_component_positions are actually
>> using the extra data, I just didn't finish updating them.
>>
>> Specifically, get_line_list needs to know about newlines and currently has to
>> re-query the index to find that out after the fact.
>>
>> It doesn't need the parent_ids, though. But it seemed to make sense to have an
>> api that gives slightly more info than you need rather than creating 2 apis that
>> give almost identical values.
> 
> Depends on the work the index has to do. I'm looking forward slightly to
> the new delta support and external references, and there I figure
> generalising knits code further to be less specific will be useful; In
> particular the newline thing is an abortion :|. It would be nice to
> include it in the index result but as something optional (because its
> compressor specific) rather than as part of the
> everything-can-rely-on-this api.
> 
> -Rob

I agree that noeol is a bit of an abomination (not sure about it being
an abortion :). I went ahead and removed it from
_get_components_positions. Ideally it would be a delta-specific data
field, which would then get stored as part of the content rather than in
the index anyway.

I would like to remove it from get_build_details, but there is a bit of
friction again. _KnitIndex is sort of designed around a query for info
structure, but you designed pack index (GraphIndex?) to be more of a
"ask me everything you want, but only ask me once". I think the new
design can scale up a bit better, but it shifts the burden of caching
onto the callers, which means it gets spread around and is a
harder-to-use api.

Of course the fact that we put critical data in the index which must be
used to interpret the data hunks was just poor design. At present it
means you need to keep the data from the index around so that you can
pass it back to the data building routines.

So I think my problems are that you would like stuff like noeol to be
abstracted behind the Knit logic so that callers wouldn't care. But we
are asking that _KnitData not query the _KnitIndex directly, so the
caller needs to track what is stored in the index. But the index will
change what is stored because it is associated with the data.

Maybe we could change the Content layering such that you pass in an
opaque "index record" and the content figures out how to interpret it?

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHxKHnJdeBCYSNAAMRAqUIAJ9bv095UT0QMktl9yE9obElE/SmfgCgzCTk
CVU8ApZhXXsjOboAcY49Tl8=
=/jgM
-----END PGP SIGNATURE-----



More information about the bazaar mailing list