Repository.insert_data_stream contract [Re: [MERGE] Packs. Kthxbye.]
Andrew Bennetts
andrew at canonical.com
Fri Oct 19 07:22:58 BST 2007
Robert Collins wrote:
> On Fri, 2007-10-19 at 15:03 +1000, Andrew Bennetts wrote:
[...]
> >
> > I sat down to write one, and found that I'm not really sure exactly what the
> > contract for this method should be. Here's an intentionally vague docstring for
> > it that avoids the hard questions:
>
> Some specific questions:
> - is it keyed to a RepositoryFormat? is it able to be used between
> arbitrary repositories?
Looks like the answer is that it is tied to a RepositoryFormat.
> - At the serialisation layer, is it a fixed format, or are we expecting
> to use magic to determine whats in there?
Ditto.
> - At the semantic layer, What about Knit1 to Knit3? Knit to Pack? Knit3
> to Knit1?
Good question. I don't know the answer to this yet.
> > That's pretty unsatisfactory, though I suppose it's better than nothing.
> >
> > This doesn't say anything concrete about what “bytes” contains. At the moment
> > the implementation assumes it is a bencoded list of knit records, that is a list
> > whose elements are tuples of (version, options, parents, knit_bytes). This is
> > convenient for stuffing into a knit versionedfile (via
> > VersionedFile.insert_data_stream). I'm not sure how convenient this is for
> > other formats. Robert, does this seem a reasonable interface for a stream
> > between packs?
>
> An easier format for a stream between packs is probably the bytes for
> the pack, and each index. So 5 byte strings. Plus the expected hash of
> the pack. That avoids the knit layers altogether, and the requirement to
> do knit-at-a-time.
Ok, so a knit data stream isn't convenient for packs. So let's not standardise
the existing stream format as One True Format.
> > To write a good docstring for this, we need to figure out if it can (and should)
> > work with a stream from a repository in a different format. I guess some
> > streams have to be incompatible, e.g. a repository supporting tree roots can have
> > changes that cannot be represented in a repository that doesn't support tree
> > roots.
>
> Agreed. And we should have tests for this. I also noticed one test in
> tests.test_repository that this method should error in some
> circumstances, and I think that that should be an interface test.
Agreed.
> > So I think perhaps this docstring should say that the encoding is
> > format-specific? e.g.
> >
[...]
>
> I think thats a massive improvement over the current docstring. What
> about error conditions? Like, you say same format, and should, but what
> happens if you get it wrong - and because this is likely to matter, I
> think we should define it - e.g. 'a mismatched format will cause an
> error and not modify the repository at all'.
Agreed. I'll work on details and tests for this. The current stream doesn't
have an explicit version/format marker, and this is looking like a bit of an
issue. Repository.insert_data_stream hasn't been in a release yet, so we can
still change this without breaking much, so I'll do that.
> We can follow up a more clear definition of it without expanding its
> scope to cross-format situations. Once we have a good solid definition
> of it today merged into bzr.dev, then I'd like to move onto discussing
> cross-format scenarios.
Sounds good.
I think perhaps I should also write “I will write docstrings” on a blackboard
100 times until I learn my lesson ;)
Thanks,
-Andrew.
More information about the bazaar
mailing list