Problems with deleting tags?

Wed Apr 20 05:03:17 UTC 2016

[ Hey, lookit how long this got.  It's really your own fault for
engaging me... ]

On Tue, Apr 19, 2016 at 01:55:47PM -0600 I heard the voice of
Richard Wilbur, and lo! it spake thus:
>
> You make some pretty strong arguments in favor of version-controlled
> tags.

Well, that's not hard to do.  Versioned tags are strictly more
powerful than non-versioned tags, so it's Obvious(tm) that they're the
proper way to go.  I mean, as long as you ignore implementation
questions and installed-base compatibility and whatnot, but those
don't really matter to anything in the real world, so...

One could infer from how everyone-but-hg doesn't have them, that the
advantage is so minimal that nobody bothers.  And actually, this
thread even sorta supports that; it starts off with a problem that's
known, acknowledged, and fundamental to the design, and also generally
comes up so rarely that nobody's motivated to do much about it.

I'd suggest a second contributing factor; that it's something very
hard to retrofit well into a design.  If your model doesn't already
have a place to slot them in, it's very hard to do unless you can
change the model.  Remember, bzr didn't originally have tags at all;
it was already widespread (well, FSVO) and deployed before anybody
tried implementing them.  So _NO_ thought was given to them in the
initial design.  I'm pretty sure the same is true of every other
system.  Non-versioned tags aren't too hard to bolt onto the side
reasonably cleanly; versioned are.

I think they're definitely the better course.  And I would consider
them an "of-course".  In a greenfield project.  An existing one,
that's harder.  And for bzr specifically...  well.  It's a volunteer
project, and that means people can, will, should, and should be
encouraged to, work on the things they want to work on.  I certainly
wouldn't want to say "If you want to work on versioned tags for bzr,
you shouldn't" (though deployment issues are still a thing, even if
you figure out all the implementation troubles to do it well).  But I
wouldn't really consider them in the top 20 of improvements bzr could
really use, and with the man-hours going into development nowadays...
if somebody said "I'd like to do something on bzr.  Should I look at
doing versioned tags?", I'm pretty sure an appropriate response would
be pretty negative.  Perhaps laced with colorful metaphors.

> On Mon, Apr 18, 2016 at 3:20 AM, Matthew D. Fuller
> <fullermd at over-yonder.net> wrote:
> > And there's the creepy layering inversion
> > that you have to pull something out of a revision to figure out which
> > revision you're looking for.
> 
> That would be an outgrowth of having the tags under normal version
> control.  What type of version control would you propose that would
> avoid this situation?

That's a specifically hg-related issue.  It's not because the tags are
versioned per se; it's that the tag storage is Just A File in the
tree.  You could implement it such that $ROOT/.myvcs-tags was a
pseudo-file for the UI of editing and merging and conflicts and such,
while the actual tags storage was some internal data structure.  And
maybe you should [make the UI that]; it's an obvious abstraction and
makes good intuitive sense.  But that's not how it is in hg.  It
really is just another file, no different from any other file in your
project.  So looking up a tag means actually checking out a file from
a revision's tree, and then parsing through it to find the rev
identifier to go from.

  n.b.: I'm not an hg expert, so I don't guarantee the above.  But it
  was correct as far as I could tell some time back when I actually
  looked into it, and I'm assuming it's not changed since.
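To make the layering inversion concrete, here's a rough sketch of what "tags are Just A File in the tree" implies for lookup.  Everything in it is made up for illustration: the repository is a plain dict of revision id -> tree, the tags file borrows the .myvcs-tags name from above, and the "REVID TAGNAME" line format is an assumption, not a claim about what hg actually stores.

```python
# Hypothetical model: a "repository" is a dict mapping revision id ->
# tree, and a tree is a dict mapping path -> file contents.
TAGS_PATH = ".myvcs-tags"  # made-up name, borrowed from the text above


def parse_tags(text):
    """Parse assumed 'REVID TAGNAME' lines; later lines override earlier."""
    tags = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        revid, name = line.split(" ", 1)
        tags[name] = revid
    return tags


def resolve_tag(repo, from_rev, name):
    """Find which revision `name` points at, as seen from `from_rev`."""
    tree = repo[from_rev]               # "check out" a revision's tree...
    text = tree.get(TAGS_PATH, "")      # ...pull the tags file out of it...
    return parse_tags(text).get(name)   # ...and parse it to get a revid


repo = {
    "r1": {"README": "hello\n"},
    "r2": {"README": "hello\n", TAGS_PATH: "r1 v1.0\n"},
    "r3": {"README": "hi\n", TAGS_PATH: "r1 v1.0\nr2 v1.1\n"},
}

print(resolve_tag(repo, "r3", "v1.1"))  # r2
print(resolve_tag(repo, "r1", "v1.0"))  # None: r1's tree has no tags file
```

Note that the answer depends on which revision you start from: you need a revision in hand before you can ask which revision a tag means.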

> So how would you characterize this use case where the user misuses
> tags and where you suggest fixing the user by possibly "adding
> another feature to the handle the use-case that's actually desired"?
> (In other words, what feature is needed to fix the user?  Possibly
> "local" or "personal" tags?)

There's probably a lot of them.

"tag" has an established meaning in the VCS world.  Back in the CVS
era, it was [almost; ignoring date-based] the only way you could
actually refer to a specific complete coherent tree, since all the
versions were of files.  With modern systems with atomic commits,
you don't need that meaning anymore, but assigning a specific
human-meaningful label to a specific revision is still useful.  We use
it for marking releases, and for other similarly momentous occasions.
For instance, I laid down tags in ctwm before and after reindenting
the whole codebase, because those are points in time that deserve
special notice.

The word also got sopped up in the "tag cloud" abstraction on blogs and
such, where it gets used in a classificatory role.  This is a _very_
divergent usecase from the VCS-tag one, because the whole point is
having a common label that applies to many revisions.  We've
occasionally had people in the past come in apparently very puzzled
that they couldn't make a tag point to multiple revisions, because
they wanted something in this role.  Often for usecases like "tests
passing/failing", etc.  This asks more for something like
post-annotations on revisions.  In monotone, you could do this by
defining a new certificate type.
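
The difference between the two meanings is mostly the shape of the mapping.  A minimal sketch, with hypothetical class names and no relation to how bzr or monotone actually store anything: a VCS-tag is one name to one revision, while the classificatory kind is one label to a set of revisions.

```python
from collections import defaultdict


class Tags:
    """VCS-style tags: each name points at exactly one revision."""

    def __init__(self):
        self._by_name = {}

    def set(self, name, revid):
        self._by_name[name] = revid      # re-tagging moves the name

    def lookup(self, name):
        return self._by_name.get(name)


class Annotations:
    """Classificatory labels: one label may apply to many revisions."""

    def __init__(self):
        self._by_label = defaultdict(set)

    def add(self, label, revid):
        self._by_label[label].add(revid)

    def revisions(self, label):
        return sorted(self._by_label.get(label, ()))


tags = Tags()
tags.set("tests-passing", "r1")
tags.set("tests-passing", "r2")          # the tag just moves
print(tags.lookup("tests-passing"))      # r2

notes = Annotations()
notes.add("tests-passing", "r1")
notes.add("tests-passing", "r2")         # both revisions keep the label
print(notes.revisions("tests-passing"))  # ['r1', 'r2']
```

Trying to use the first structure for the second job is exactly the "why can't my tag point to multiple revisions" confusion.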

The "local tags" or the like concept is generally (IMO) more in line
with a concept like "bookmarks"[0].  "I need to go back and look at
this", or "Lemme drop a pin here and go try something...  [commit 5x]
...  nope, nevermind, let's dump that and go back".  In some ways,
this is very similar to the straight VCS-tags concept; in fact, it may
be technically identical.  But it's socially distinct.

    [0] This leaves aside how we already have a "bookmarks" plugin in
        bzr, which does something completely unrelated.  And then
        there's hg's "bookmarks", which are a third unrelated thing.
        Naming is hard  :)

Something like the git reflog is a fourth almost-but-not-quite idea.
It's more like "breadcrumbs", as a record of where you've been, just
in case you need to quickly hop back to somewhere.

There are liable to be a lot more that aren't on the top of my head.

Generally, it's always nice when you can say "Hey, we don't need a new
feature for this, you can just use $EXISTING_FEATURE".  And it's
sometimes nice to be able to say "Hey, we can just slightly generalize
$EXISTING_FEATURE and use it for both".  But there are at least two
problems with doing that.  One is that you generalize it far enough to
cover both cases, and you find that it's now a pretty rough fit for at
least one (and usually the older one, if you care at all about the
newer) and often both.  I believe this technique is SVN's slogan  ;)

The other is suggested by the difference above between "VCS-tag" and
"local-tag"; even things which are technically the same are often
socially not, and having them in the same bucket can be confusing.  So
it can be a gain even if we have (to make up names) "bzr tag1" and
"bzr tag2", which use the same exact code on the backend, but maintain
separate lists of names, if they get used for socially and mentally
very different tasks, and users can be consistent about it.

  So, for the VCS-tag vs. local-tag, it may be reasonable to have the
  same backend code (maybe slightly different to maintain disjoint
  lists, or maybe an extra flag in the data structure saying which
  it's on) and just different commands or flags for them.  Though that
  can intersect oddly with some questions of how the history is
  represented too as far as how it gets publicized, so maybe it's not
  quite so simple.
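
As a sketch of the "same backend, extra flag" option: the kind names below are hypothetical stand-ins for "tag1" and "tag2", and the published() rule (only one list travels) is just one assumption about how the two might differ in publication.

```python
class TagStore:
    """One backend; each entry carries a flag saying which list it is on."""

    KINDS = ("release", "local")   # hypothetical stand-ins for tag1/tag2

    def __init__(self):
        self._entries = {}         # (kind, name) -> revid

    def set(self, kind, name, revid):
        if kind not in self.KINDS:
            raise ValueError("unknown tag kind: %r" % (kind,))
        self._entries[(kind, name)] = revid

    def lookup(self, kind, name):
        return self._entries.get((kind, name))

    def names(self, kind):
        return sorted(n for (k, n) in self._entries if k == kind)

    def published(self):
        """In this sketch only the 'release' list would be sent to others."""
        return {n: r for (k, n), r in self._entries.items() if k == "release"}


store = TagStore()
store.set("release", "1.0", "r10")
store.set("local", "look-at-this", "r7")
store.set("local", "1.0", "r3")    # same name, different list, no clash

print(store.lookup("release", "1.0"))  # r10
print(store.lookup("local", "1.0"))    # r3
print(store.published())               # {'1.0': 'r10'}
```

The code is shared, but the two lists never collide, which is the part that lets users keep the two social roles apart.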

So, on the one hand, you have to be skeptical about saying "here's a
new use-case, we need to build a new feature to handle it", but on the
other, you have to be cautious about saying "hey, we can change this
just a bit, and it can handle both!" too.

> Maybe a new {tree|branch|repository} format (first supported in bzr
> 2.8.0?) would suffice which would allow us to recognize during a merge
> that we need to try to bridge the gap.  bzr 2.8.0 would include code
> in `bzr upgrade` to bring old {trees|branches|repositories} up to the
> new format.

It'd have to be.  But it would also have to be[1] a trapdoor upgrade,
which isn't backward compatible, so you need a new (probably
non-default at the start) format, and potentially two new versions of
every other format until we're ready to desupport non-versioned-tag
formats.  Hearkens back to the "fun" we went through getting rich-root
formats in and default; I don't think anybody wants to revisit THAT
territory...

    [1] It's _just_ possible you could find dodges that let you not
        quite rule out downgrades, but it's dangerous ground.  You're
        throwing away information when you do, so an A->B->A'
        conversion leads to A != A'.  And if you have a downgraded
        version, you presumably want to support people using it to
        contribute, which means they might be contributing tags too,
        and you have to worry a lot about what might happen in all
        those cases.  So it's a lot easier to just say "nope,
        trapdoor".
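
A toy illustration of the A->B->A' problem in [1].  The data shapes are invented for the example (a versioned format that keeps a history list per tag, with None marking a deletion, and a flat format that keeps only the current target); no real bzr format looks like this.

```python
def downgrade(versioned):
    """Versioned -> flat: keep only each tag's current target."""
    flat = {}
    for name, history in versioned.items():
        current = history[-1]
        if current is not None:    # None marks a deleted tag in this sketch
            flat[name] = current
    return flat


def upgrade(flat):
    """Flat -> versioned: all we can do is start a one-entry history."""
    return {name: [revid] for name, revid in flat.items()}


a = {"v1.0": ["r1", "r5"], "oops": ["r2", None]}  # moved once; deleted once
b = downgrade(a)
a_prime = upgrade(b)

print(b)             # {'v1.0': 'r5'}
print(a_prime)       # {'v1.0': ['r5']}
print(a == a_prime)  # False
```

The move of v1.0 and the deletion of oops are both gone after the round trip, and anything contributed against B in the meantime would have to be merged into A without that context.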

-- 
Matthew Fuller     (MF4839)   |  fullermd at over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.