Nomenclature

John A Meinel john at arbash-meinel.com
Fri Sep 30 17:03:04 BST 2005


Gustavo Niemeyer wrote:
> Greetings!
> 
> Since I've been coding in bzr, I've perceived that some parts of the
> code are slightly inconsistent regarding nomenclature. This is
> certainly not a major issue, and is of course trivial to fix. Even
> then, I'd like to bring some of them to your attention.
> 
> I believe it'd be worthwhile to fix them at some point, and possibly
> introduce a convention to guide future development. At this stage
> it's hard to tell what convention one should follow when writing
> new code.

Sounds like a good idea. I will try to give my understanding of the 
different conventions in use. Obviously Martin is the ultimate locus of 
meaning, since he did the bulk of the work. Some of the differences are 
probably due to the many cooks, and that does call for a good 
convention to keep things from spoiling. :)

> 
> * revno vs. rev_id

revno is an integer which represents a particular revision present in 
"revision-history".
rev_id or revision_id is the actual hash (such as 
john at arbash-meinel.com-2005060912341-a34aeou2h34a)

The idea is that user input will frequently be a revision number (it 
used to be the only way you could do input), because nobody wants to 
type a hash.

The merge code also currently only lets you state a merge based on the 
revision number.

In general, revision_id is the direction to move in, since we are 
starting to allow you to do things to revisions which are not in your 
revision history.

> * branch.get_revision_*() vs. branch.revision_*()

I personally prefer the get_revision* forms. But I think in general the 
usage has been get_revision* when you are getting some particular bit of 
a revision, and revision_* when you are getting something related. For 
instance, revision_history() is a list of revisions, not an individual 
revision. revision_tree() is the Tree associated with that particular 
revision.

I don't know what the general preference is.

> * branch.get_rev_id() vs. branch.revision_id_to_revno() vs. *.id2*()

get_rev_id translates the above revno into a revision_id. 
revision_id_to_revno is the opposite translation.
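
As a rough illustration of the two translations, here is a minimal 
sketch. ToyBranch is a hypothetical stand-in, not bzrlib's Branch, and 
it assumes that revnos are 1-based positions in the revision history; 
the example revision ids are made up.

```python
class ToyBranch:
    """Hypothetical stand-in for a branch; not bzrlib's real Branch."""

    def __init__(self, history):
        # Ordered list of revision ids, oldest first.
        self._history = list(history)

    def revision_history(self):
        """Return a copy of the ordered list of revision ids."""
        return list(self._history)

    def get_rev_id(self, revno):
        """Translate a revno (assumed 1-based) into a revision_id."""
        if revno < 1 or revno > len(self._history):
            raise ValueError("no such revno: %r" % (revno,))
        return self._history[revno - 1]

    def revision_id_to_revno(self, revision_id):
        """Translate a revision_id back into its revno."""
        if revision_id not in self._history:
            raise ValueError("not in history: %r" % (revision_id,))
        return self._history.index(revision_id) + 1


# Made-up revision ids, purely for illustration.
branch = ToyBranch(["alice-20050601-aaa", "bob-20050602-bbb"])
rev_id = branch.get_rev_id(2)
assert branch.revision_id_to_revno(rev_id) == 2
```

A revision_id that is not in the history has no revno at all in this 
sketch, which is one reason to prefer revision_ids internally.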

id2* are in the inventories, and translate a file id into its path, 
which is a rather different operation (and a different id: file_ids are 
in a different namespace from revision_ids).

> * branch.get_*() vs. branch.(!get_)*() in general
> * branch.controlfilename() and branch.controlfile() vs.
>   branch.revision_history()

controlfilename is "give me the full path to a file, given that its 
control name is X". For example, "pending_merges" is the control name, 
and its full path is
$wd/.bzr/pending_merges

controlfile returns a file-like object, which contains the actual 
contents of that file.
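
The difference between the two can be sketched roughly as follows. 
These are hypothetical free functions written for illustration (the 
real ones are Branch methods, and their exact signatures are not 
assumed here); the `base` and `mode` parameters are my own additions.

```python
import os
import tempfile


def controlfilename(base, name):
    """Return the full path of the control file called `name`."""
    return os.path.join(base, ".bzr", name)


def controlfile(base, name, mode="r"):
    """Return an open file object for the control file called `name`."""
    return open(controlfilename(base, name), mode)


# Build a throwaway working directory containing one control file.
wd = tempfile.mkdtemp()
os.mkdir(os.path.join(wd, ".bzr"))
with controlfile(wd, "pending_merges", "w") as f:
    f.write("some-revision-id\n")

# controlfilename gives a path; controlfile gives the contents.
path = controlfilename(wd, "pending_merges")
with controlfile(wd, "pending_merges") as f:
    contents = f.read()
```

Only the second form still makes sense when the file is not on the 
local disk, since a caller never needs to see a path.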

I think controlfilename should be deprecated, especially since my 
Transport stuff means that it may be a remote file.

*Most* of the time, the only caller of Branch.controlfile should be a 
method of Branch (but sometimes plugins add functionality, and need to 
hook into it).

revision_history() exists because, when possible, we don't want people 
reading the control files directly.

> * inventory.get_idpath() vs. inventory.id2path() (what's the
>   difference?)

get_idpath() returns the list of ids from the root to a file, so if you have
dir/subdir/subsubdir/file.txt
get_idpath() will return [dir-XXYYZZ, subdir-ZZYYXX, subsubdir-YYZZXX, 
file.txt-YYZZXX]

id2path just translates an id into its full path:

inv.id2path(file.txt-YYZZXX) = dir/subdir/subsubdir/file.txt

They are something like inverses, but not completely.
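
Here is a small sketch of the difference, using the same example ids 
as above. ToyInventory is a hypothetical miniature written for this 
message, not bzrlib's Inventory, and it assumes each entry simply 
records its parent id and its name.

```python
class ToyInventory:
    """Hypothetical miniature inventory; not bzrlib's real Inventory."""

    def __init__(self):
        # file_id -> (parent_id, name); top-level entries have parent None.
        self._entries = {}

    def add(self, file_id, parent_id, name):
        self._entries[file_id] = (parent_id, name)

    def get_idpath(self, file_id):
        """Return the file ids from the top-level entry down to file_id."""
        ids = []
        while file_id is not None:
            ids.append(file_id)
            file_id = self._entries[file_id][0]  # step up to the parent
        ids.reverse()
        return ids

    def id2path(self, file_id):
        """Return the slash-separated path for file_id."""
        names = [self._entries[i][1] for i in self.get_idpath(file_id)]
        return "/".join(names)


inv = ToyInventory()
inv.add("dir-XXYYZZ", None, "dir")
inv.add("subdir-ZZYYXX", "dir-XXYYZZ", "subdir")
inv.add("subsubdir-YYZZXX", "subdir-ZZYYXX", "subsubdir")
inv.add("file.txt-YYZZXX", "subsubdir-YYZZXX", "file.txt")

print(inv.get_idpath("file.txt-YYZZXX"))
print(inv.id2path("file.txt-YYZZXX"))
```

Both take a file id; one answers with a list of ids and the other with 
a path string built from the names of those same entries.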

> * osutils.isfile()/isdir() vs. osutils.is_inside()/is_inside_any()

I think isfile is spelled that way because it mimics os.path.isfile(), 
while most people prefer the is_* form.

> * osutils.filesize() vs. osutils.sha_file() vs. osutils.file_kind()
> * osutils.splitpath() vs. osutils.split_lines()
> * weave.idx_to_name() vs inventory.id2path() 
> * weave.numversions() vs weave.*_*()
> * weavestore.filename() vs. weavestore.get_*()

filename is frequently thought of as a single word, and the function 
translates the lookup key into the full filename (similar to 
controlfilename).
Again, it is more of an internal-use function, so it should probably be 
named something like _filename().

To me, get_* is used when you return something of importance, not just 
translating X => Y.

> * weavestore.file_class() vs. weavestore.get_file_*()
> * weavestore.ignored_files() vs. weavestore.get_ignore_list()
> 

I don't see anything like weavestore.ignored_files(). I think you mean 
WorkingTree.ignored_files().

I think they are consistent there in a localized fashion. For example, 
you have .extras(), .unknowns(), .ignored_files(), and then get_file* 
(though you could certainly argue that it should be .ignored() if it 
wanted to really be consistent).

You are right that we could convert everything to get_* forms, or 
(!get_*) forms.

I think get_ makes sense, especially if you are going to have a matching 
put_ or set_.

However, having every function be get_* doesn't seem right either. I'm 
not sure what the specific cutoff should be. But if you aren't going to 
have a set/put form, I would argue that you don't need the get.
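
To make that rule of thumb concrete, here is a rough sketch. 
ToyWorkingTree and set_ignore_list are hypothetical names invented for 
this example (only get_ignore_list and ignored_files come up in the 
thread), and the fnmatch-based matching is just an assumption to keep 
the example short.

```python
import fnmatch


class ToyWorkingTree:
    """Hypothetical example class; not bzrlib's real WorkingTree."""

    def __init__(self, filenames):
        self._filenames = list(filenames)
        self._ignore_patterns = []

    # Paired accessors: the get_ form has a matching set_ form.
    def get_ignore_list(self):
        return list(self._ignore_patterns)

    def set_ignore_list(self, patterns):
        self._ignore_patterns = list(patterns)

    # Read-only query with no setter, so it is named without get_.
    def ignored_files(self):
        return [
            name for name in self._filenames
            if any(fnmatch.fnmatchcase(name, pattern)
                   for pattern in self._ignore_patterns)
        ]


tree = ToyWorkingTree(["a.py", "a.pyc", "notes.txt"])
tree.set_ignore_list(["*.pyc"])
print(tree.ignored_files())
```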

John
=:->