Article: An Idea for a Revision Control System
Jari Aalto
jari.aalto at cante.net
Tue Mar 14 09:11:52 GMT 2006
I came across an interesting article. Perhaps the developers will find
some ideas that could be implementable in bzr?
Jari
Zed's blog: An Idea for a Revision Control System
http://www.zedshaw.com/blog/programming/an_rcs_idea.html
[...]
I reviewed several others, but I was never really satisfied with
how they work. I don't have a list of specific problems just yet,
but I do have an idea of how I would like to do my SCM work.
Basically, I would like to have a distributed kind of agent based
system that lets everyone exchange revisions freely. The revisions
would be in discrete chunks and published or sent to participants.
A published revision would be "ready for others". Sending a
revision to someone (P2P style) would be for short collaborations
between people before publishing. As people work, exchanging
revisions and publishing them, the SCM (or others) would grab the
changes and create the releases from them, potentially updating
the main branch.
The distributed agent based exchange part would be implemented by
a publish/subscribe style P2P network. There would be two general
types of usage: people sending revisions between each other and
people posting revisions to a central mediator service for others
to pick up later. In the first instance, people would need to be
on-line at the same time and coordinate sending/receiving the
revisions. In the second instance, people would want to publish
the revision for others to pick up when they're offline (or, as a
more permanent publishing). This allows for people to collaborate
in the way that is most natural to create changes to the source
(through P2P), and then use the publish mechanism to get "ready
for others" revisions to others (through publishing to the
mediator).
The SCM would then use these two mechanisms to grab revisions from
people and apply them to the target deployment. An SCM would
subscribe to the revisions (based on a naming system) they are
interested in from the mediator. When revisions are published by
others, the system notifies the SCM that they are available. The
SCM is then able to grab them and apply them to the source in a
controlled manner. Other members can also subscribe to changes
they are interested in so they can coordinate their work. The SCM
could also use the P2P transport system for verifying revisions
quickly before the person who made them publishes them for others.
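A minimal sketch of the mediator side of this idea, assuming an in-memory
service and glob-style revision names. The class name Mediator, its
methods, and the "parser/..." naming scheme are hypothetical and do not
come from the article.

```python
from collections import defaultdict
from fnmatch import fnmatchcase


class Mediator:
    """Hypothetical in-memory stand-in for the central mediator service."""

    def __init__(self):
        self.published = []                   # every (name, revision) posted so far
        self.subscribers = defaultdict(list)  # name pattern -> list of callbacks

    def subscribe(self, pattern, callback):
        """Register interest in revision names matching a glob pattern."""
        self.subscribers[pattern].append(callback)

    def publish(self, name, revision):
        """Store a "ready for others" revision and notify matching subscribers."""
        self.published.append((name, revision))
        for pattern, callbacks in self.subscribers.items():
            if fnmatchcase(name, pattern):
                for callback in callbacks:
                    callback(name, revision)

    def fetch(self, pattern):
        """Let someone who was offline pick up matching revisions later."""
        return [(n, r) for n, r in self.published if fnmatchcase(n, pattern)]


# The SCM subscribes only to the revisions it cares about.
mediator = Mediator()
inbox = []
mediator.subscribe("parser/*", lambda name, revision: inbox.append(name))

mediator.publish("parser/fix-quotes", b"revision payload")
mediator.publish("docs/typo", b"revision payload")
```

Here the SCM is notified about "parser/fix-quotes" only, while
mediator.fetch("docs/*") still returns the other revision for anyone who
asks for it later.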
I think that using this publish/subscribe mechanism allows for
better coordination between all participants because it gives them
the following three control mechanisms for revisions:
* Filtering through subscriptions. You just don't subscribe to
  the stuff you don't care about. You don't need to take down
  anything you don't want.
* Ordering revision applications to the source. In CVS, you
  just have to take everything all at once or go through great
  pains to break out the parts you need. With this system, you
  would select the revisions you want to apply, and can
  specify that one is more important than another. There would
  need to be an "auto-pilot" mode for the times when you
  don't care though.
* Flexible sources of revisions. You can either get revisions
  from various subscription points, or you can just get them
  directly from another member. Because of this, it may be
  possible to have entirely anonymous distribution networks
  (although, I'd think that is retarded).
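The first two control mechanisms can be sketched as a small planning
function. This is only an illustration under assumptions: the Revision
fields (name, sequence, priority) and the plan() function are invented
here, not taken from the article.

```python
from dataclasses import dataclass


@dataclass
class Revision:
    name: str          # hypothetical naming scheme, e.g. "parser/fix-quotes"
    sequence: int      # arrival order, used by the "auto-pilot" mode
    priority: int = 0  # higher means "apply this one first"


def plan(revisions, wanted=(), auto_pilot=False):
    """Return the revisions to apply, in the order to apply them."""
    if auto_pilot:
        # Don't care: take everything, in arrival order.
        return sorted(revisions, key=lambda r: r.sequence)
    # Filtering: keep only the revisions that were explicitly selected.
    chosen = [r for r in revisions if r.name in wanted]
    # Ordering: most important first, ties fall back to arrival order.
    return sorted(chosen, key=lambda r: (-r.priority, r.sequence))


available = [
    Revision("a", sequence=1),
    Revision("b", sequence=2, priority=5),
    Revision("c", sequence=3),
]
selected = [r.name for r in plan(available, wanted={"a", "b"})]
everything = [r.name for r in plan(available, auto_pilot=True)]
```

With these inputs, selected puts "b" ahead of "a" because of its higher
priority and leaves "c" out, while everything keeps all three in arrival
order.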
In order for this to work, the revision packaging system has to be
capable of handling changes to any type of file, changes to
directory structure, and logs of activities. This last part is
something I thought would be nice: Separate the logging of
developer activity from the revision publishing system. My idea
here is that nobody really uses commit logs like they were
intended in CVS, and really what you want is meta-data on the
particular work completed with the given revision. It would be
better to have a developer do their work and use a small tool
to record their work as they go in a separate "development log
file". When the revision is created, this development log file
is packaged with the revision. This allows the source tree to
remain free of any unnecessary files, and lets the developer edit
the log to make it sane before publishing or sending.
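One way to picture the packaging step, as a rough sketch: the changed
files and the separately kept development log are bundled into a single
archive. The function name package_revision, the tar.gz format, and the
"meta/development.log" path inside the archive are assumptions made for
this example. The article does not say how its revisions are stored.

```python
import tarfile
from pathlib import Path


def package_revision(changes_dir, dev_log, out_path):
    """Bundle a directory of changed files with the development log.

    The log lives outside the source tree and is only added here, at
    packaging time, so the tree itself stays free of extra files.
    """
    changes_dir, dev_log = Path(changes_dir), Path(dev_log)
    with tarfile.open(str(out_path), "w:gz") as archive:
        # Adds the directory and everything below it under "changes/".
        archive.add(str(changes_dir), arcname="changes")
        # The developer can edit the log before this point to make it sane.
        archive.add(str(dev_log), arcname="meta/development.log")
    return out_path
```

A call such as package_revision("work/changes", "work/dev.log",
"rev-0001.tar.gz") would produce one file that can be published to a
mediator or sent directly to another person.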
I have a small python program that creates very primitive
revisions right now. I got it to work, but it's totally not
optimal. I may try it out in several scenarios doing revisions by
hand to see how the work-flow operates. I may also try to get the
revisions to work over e-mail with some other people to see if the
publish/subscribe stuff works. I have found that, with this method
of doing revisions, creating branches is really easy, involving
nothing more than a directory copy. It doesn't handle conflicts,
but that might not be such a bad thing since I haven't found a
tool yet that handles conflicts appropriately. I think it might be
better for an SCM to sort out whose revisions are better based on
the merits of each revision and the logs, possibly telling the
publisher to change them as needed.
Right now the ideas are flowing and I'm having fun, but we'll see
if I don't run into the same problems that everyone else does.
There are several issues which I haven't confronted yet, but I
hope to re-use as much as possible and keep the system as simple
as possible. One major goal would be to keep the system language
and platform neutral with very minimal requirements for installing
a client or mediator (or, even allow operating without a
mediator).