Next Steps for hpss, etc

Andrew Bennetts andrew.bennetts at canonical.com
Sun Nov 21 21:30:57 GMT 2010


John Arbash Meinel wrote:
> I'm roughly finishing up my current work items, and figured I'd put out
> some ideas for what I should be working on next. We've mentioned wanting
> to do a bit more collaboration, so at least working in the same area,
> even if we aren't working on the same bug.
> 
> The top two items in my head are:
> 
> 1) Commit to stacked branch
> 2) Shallow branching
>    Defined as creating a stacked local copy that has the data it
>    needs to create the working copy, with no further policy. (It
>    doesn't explicitly cache, but it doesn't avoid extra history, either.)

+1000 !

These would both be excellent to have for UDD — and for regular use.

> (1) seems like a prerequisite for (2), since having a shallow branch
> doesn't help you if you still can't commit to it.

(I can probably imagine some rare cases where it would be useful, but
basically yes.)

> (2) seems like something that could use a really good HPSS verb for
> "give me everything I need to start hacking on the tip revision of this
> branch".

Agreed.  Happily, I'm currently improving the “fetch spec” code (as
part of fixing the “fetching a branch should by default fetch all the
branch's tags too” bug), which I think will help with this.

> I've considered making it a branch that will cache anything it has to
> access (so any time it falls back to the stacked-on repository, it will
> save that data locally.) The primary reason I would skip that is
> that inserting into a repository is transactional. We require both a
> write lock, and a write group that is then committed. And defining the
> transaction boundaries during an otherwise read-only operation is
> difficult at best.
> Also, you get into problems of data model violation. (log needs to look
> at only the revision texts, but the presence of a revision text implies
> the presence of the associated inventory and texts.)

We can always tackle this later.  I agree it sounds a bit hairy... but
when we get there perhaps we'll see some ways to deal with that.  We'll
see, eventually :)
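
To make the transaction-boundary worry concrete, here is a rough toy
sketch in Python.  It is not bzrlib code: ToyRepository, insert_record
and caching_get are invented for illustration, and the lock/write-group
method names only mirror the vocabulary used above.  It shows how a
cache-on-fallback "read" ends up taking a write lock and running a
whole write group for every miss.

```python
class ToyRepository:
    """Hypothetical stand-in for a repository; NOT the bzrlib API."""

    def __init__(self):
        self.records = {}          # committed records: key -> value
        self._write_locked = False
        self._pending = None       # records in the open write group

    def lock_write(self):
        self._write_locked = True

    def unlock(self):
        if self._pending is not None:
            raise RuntimeError("write group still open")
        self._write_locked = False

    def start_write_group(self):
        if not self._write_locked:
            raise RuntimeError("write lock required")
        self._pending = {}

    def insert_record(self, key, value):
        if self._pending is None:
            raise RuntimeError("no write group open")
        self._pending[key] = value

    def commit_write_group(self):
        self.records.update(self._pending)
        self._pending = None

    def abort_write_group(self):
        self._pending = None


def caching_get(local, fallback, key):
    """Read `key`, copying it into `local` if only `fallback` has it."""
    if key in local.records:
        return local.records[key]
    value = fallback.records[key]
    # The awkward part: a nominally read-only call now needs a write
    # lock plus a complete write group, once per cache miss.
    local.lock_write()
    try:
        local.start_write_group()
        try:
            local.insert_record(key, value)
        except Exception:
            local.abort_write_group()
            raise
        local.commit_write_group()
    finally:
        local.unlock()
    return value


stacked_on = ToyRepository()
stacked_on.records[("revisions", "rev-1")] = "revision text"
local_repo = ToyRepository()
cached = caching_get(local_repo, stacked_on, ("revisions", "rev-1"))
```

Note that the toy happily caches a revision text without its inventory
or file texts, which is exactly the data model violation described
above; a real implementation would have to decide what else to pull in.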

> I think the two of these would be very good for UDD stuff, since they
> allow people to download ~tarball sized content, and still give them a
> branch they can hack on and upload.

I believe Launchpad's build-from-recipe infrastructure would like to use
this as well.

> I also think most of the bits are there, so we can migrate
> incrementally, rather than having to do a large amount of work to get
> anything that can land.
> 
> I also think these two things could be worked on in parallel. You may
> end up with a shallow branch that you can't commit to yet, but it would
> still be a productive thing to get done.

Yes, I think the verb can be done at least somewhat in parallel... and
on that note: precisely which records should a hypothetical
“get_shallow_stream for revid X” HPSS operation fetch?

Off the top of my head you want:

 * revision X
 * inventory X
 * at least one parent inventory (as usual for stacking), unless of
   course the parent is null:
 * all texts referenced in inventory X (available as fulltexts, of
   course)
 * signature X (if present)
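
As a rough sketch only, the selection above could look like this in
Python.  The repository here is a plain dict invented for
illustration (not bzrlib's Repository), shallow_record_keys is a
hypothetical name, and "one parent inventory" is arbitrarily taken to
be the first non-null parent.

```python
NULL_REVISION = "null:"


def shallow_record_keys(repo, revid):
    """List the (kind, key) records a shallow fetch of `revid` needs.

    `repo` is a toy dict with "revisions", "inventories" and
    "signatures" maps; an inventory maps file_id -> text revision id.
    """
    revision = repo["revisions"][revid]
    inventory = repo["inventories"][revid]
    keys = [("revisions", revid), ("inventories", revid)]
    # One parent inventory (as usual for stacking), unless the only
    # parent is null:.
    parents = [p for p in revision["parents"] if p != NULL_REVISION]
    if parents:
        keys.append(("inventories", parents[0]))
    # Every text referenced by inventory X; these would be sent as
    # fulltexts rather than deltas.
    for file_id, text_revid in sorted(inventory.items()):
        keys.append(("texts", (file_id, text_revid)))
    # The signature is optional.
    if revid in repo["signatures"]:
        keys.append(("signatures", revid))
    return keys


toy_repo = {
    "revisions": {
        "rev-1": {"parents": [NULL_REVISION]},
        "rev-2": {"parents": ["rev-1"]},
    },
    "inventories": {
        "rev-1": {"file-a": "rev-1"},
        "rev-2": {"file-a": "rev-1", "file-b": "rev-2"},
    },
    "signatures": {"rev-2": "toy signature"},
}
tip_keys = shallow_record_keys(toy_repo, "rev-2")
```

For "rev-2" this picks up the text of file-a even though it was last
changed in rev-1, because inventory X still references it.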

-Andrew.



