Rev 40: Add wip about a new configuration implementation. in file:///home/vila/src/bzr/devnotes/

Vincent Ladeuil v.ladeuil+lp at free.fr
Thu Dec 2 14:00:03 GMT 2010


At file:///home/vila/src/bzr/devnotes/

------------------------------------------------------------
revno: 40
revision-id: v.ladeuil+lp at free.fr-20101202140003-6cqwy38cvgz6hwvj
parent: mbp at canonical.com-20100609065349-9l6vsvylq6szlein
committer: Vincent Ladeuil <v.ladeuil+lp at free.fr>
branch nick: devnotes
timestamp: Thu 2010-12-02 15:00:03 +0100
message:
  Add wip about a new configuration implementation.
-------------- next part --------------
=== added file 'configuration.txt'
--- a/configuration.txt	1970-01-01 00:00:00 +0000
+++ b/configuration.txt	2010-12-02 14:00:03 +0000
@@ -0,0 +1,433 @@
+==================
+Configuring Bazaar
+==================
+
+Goal
+====
+
+Not all needs can be addressed by the default values used inside bzr and
+bzrlib, no matter how well they are chosen (and they are ;).
+
+Many parts of ``bzrlib`` depend on some constants though, and the user
+should be able to customize the behavior to suit their needs, so these
+constants need to become configuration options.
+
+These options can be set from the command-line, in an environment variable
+or recorded in a configuration file.
+
+Current issues
+==============
+
+* Many parts of ``bzrlib`` declare constants and there is no way for the
+  user to look at or modify them.
+
+* The current API requires a configuration object to create, modify or
+  delete a configuration option in a given configuration file.  ``bzr
+  config`` makes it almost transparent for the user. Internally though, not
+  all cases are handled: only BranchConfig implements chained configs,
+  nothing is provided at the repository level and too many plugins define
+  their own section or even their own config file.
+
+* ``locations.conf`` defines the options that need to override any setting
+  in ``branch.conf`` for both local and remote branches. Many users want a
+  way to define default values for options that are not defined in
+  ``branch.conf``. This could be approximated today by *not* defining these
+  options in ``branch.conf`` but in ``locations.conf`` instead. This
+  workaround doesn't allow a user to define defaults in ``locations.conf``
+  and override them in ``branch.conf``.
+
+* Defining a new option requires adding a new method in the ``Config``
+  object to get access to features like:
+
+  * should the option be inherited by more specific sections,
+
+  * should the inherited value append the relative path between the
+    section path and the location it applies to,
+
+  * the default value (including calling any Python code that may be
+    required to calculate this value),
+
+  * priority between sections and various config files.
+
+  A related problem is that, in the current implementation, some
+  configuration options have methods defined for them while others don't,
+  which is inconsistent.
+
+* Access to the 'active' configuration option value from the command line
+  doesn't give access to a specific section.
+
+* Rules for configuration options are not clearly defined for remote
+  branches (they may differ between dumb and smart servers).
+
+* The features offered by the Bazaar configuration files should be easily
+  accessible to plugin authors either by supporting plugin configuration
+  options in the configuration files or allowing plugins to define their
+  own configuration files.
+
+* While the current configuration files support sections, they are used in
+  mutually exclusive ways that make it impossible to offer the same set of
+  features to all configuration files:
+
+  * ``bazaar.conf`` uses arbitrary names for sections. ``DEFAULT`` is used
+    for global options, ``ALIASES`` is used to define command aliases, and
+    plugins can define their own sections: some plugins do that
+    (``bzr-bookmarks`` uses ``BOOKMARKS`` for example), some others define
+    their own sections.
+
+  * ``locations.conf`` uses globs as section names. This provides an easy
+    way to associate a set of options to a matching working tree or
+    branch, including remote ones.
+
+  * ``branch.conf`` doesn't use any section.
+
+* There is no easy way to get configuration options for a given repository
+  or an arbitrary path. Working trees and branches are generally organized
+  in hierarchies and being able to share the option definitions is an often
+  required feature. This can also address some needs exhibited by various
+  branch schemes like looms, pipeline, colocated branches and nested
+  trees. Being able to specify options *in* a working tree could also help
+  support conflict resolution options for a given file, directory or
+  subtree.
+
+* Since sections allow different definitions for the same option, a total
+  order should be defined between sections to select the right definition
+  for a given path. Allowing globs for section names is harmful in this
+  respect since the order is currently defined as being the lexicographical
+  one. The caveat here is that, although the order is always defined for a
+  given set of sections, it can change when one or several globs are modified
+  and the user may get surprising and unwanted results in these cases. The
+  lexicographical order is otherwise fine to define what section is more
+  specific than another. (This may not be a problem in real life since
+  longer globs are generally more specific than shorter ones and explicit
+  paths should also be longer than matching globs. That may leave a glob and
+  a path of equal length in a gray area but in practice using ``bzr config``
+  should give enough feedback to address them).
+
+* Internally, configuration files (and their fallbacks, ``bazaar.conf`` and
+  ``locations.conf`` for ``branch.conf``) are read every time *one* option is
+  queried. Likewise, setting or deleting a configuration option implies
+  writing the configuration file *immediately* after re-reading the file to
+  avoid racing updates.
+
+* The current implementation uses a mix of transport-based and direct file
+  system operations.
+
+* While the underlying ``ConfigObj`` implementation provides an
+  interpolation feature, the ``bzrlib`` implementation doesn't provide an
+  easy handling of templates where other configuration options can be
+  interpolated. Instead, ``locations.conf`` (and only it) allows for
+  ``appendpath`` and ``norecurse``.
+
+* Inherited list values can't be modified: a more specific configuration can
+  only redefine the whole list.
+
+* There is no easy way to define dicts (the most obvious one being to use a
+  dedicated section which is already overloaded). Using embedded sections
+  for this would not be practical either if we keep using a no-name section
+  for default values. In a few known cases, a bencoded dict is stored in a
+  config value, so while this isn't user-friendly, not providing a better
+  alternative shouldn't be a concern.
+
+
+Proposed implementation
+=======================
+
+
+Configuration files definition
+------------------------------
+
+While configuration files can of course be versioned, they are not intended
+to be accessed in sync with the files they refer to (one can imagine
+handling versioned properties this way but this is *not* what the bazaar
+configuration files are targeted at). ``bzr`` will always refer to
+configuration files as they exist on disk when an option is queried or set.
+
+The configuration files are generally local to the file system but some of
+them can be accessed remotely (``branch.conf``, ``repo.conf``).
+
+
+Naming
+------
+
+The option name space is organized as follows:
+
+* Bazaar itself defines all its constants as ``bzr.option_name``.
+ 
+* plugins can define their own options by prefixing them with the plugin
+  name as ``svn.option_name`` for the ``svn`` plugin.
+
+Using valid python identifiers is recommended but not enforced (but we may
+do so in the future).
+
+Value
+-----
+
+All option values are text. They are provided as Unicode strings to API
+users with some refinements:
+
+* boolean values can be obtained for a set of acceptable strings (yes/no,
+  y/n, on/off, etc),
+
+* a list of strings can be obtained from a value containing a
+  comma-separated list of strings.
+
+Since the configuration files can be edited by the user, ``bzr`` doesn't
+expect their content to be validated. Instead, the code using options should
+be ready to handle *invalid* values by warning the user and falling back to a
+default value. Likewise, if an option is not defined in any configuration
+file, the code should fall back to a default value (helpers should be
+provided by the API to handle common cases: warning the user, getting a
+particular type of value, returning a default value).
+
+This also ensures compatibility with values provided via environment
+variables or from the command line.
+
+Interpolation
+-------------
+
+Some option values can be templates and contain references to other
+options. This is especially useful to define URLs in sections shared for
+multiple branches for example. It can also be used to describe commands
+where some parameters are set by ``bzrlib`` at runtime.
+
+Since options values are text-only, and to avoid clashing with other
+interpolation syntaxes, references are enclosed with curly brackets::
+
+  push_location = lp:~{launchpad_username}/bzr/{nick}
+
+In the example above, ``launchpad_username`` is an already defined
+configuration option while ``nick`` is the branch nickname and is set when a
+configuration applies to a given branch.
+
+The interpolation implementation should accept an additional dict so that
+``bzrlib`` or plugins can define references that can be interpolated without
+being existing configuration options::
+
+  diff_command={cmd} {cmd_opts} {file_a} {file_b}
+
+Two common errors should be handled when performing interpolation:
+
+* loops: when a configuration value refers to itself, directly or indirectly,
+
+* undefined references: when a configuration value refers to an unknown option.
+
+One possible implementation is to report errors when such references are
+encountered.
+
+Another implementation could be envisioned though: when a loop is
+encountered, we can fall back to the less specific configurations. This
+allows list values to refer to the definition in the less specific
+configurations allowing::
+
+  bazaar.conf:
+    debug_flags = hpss
+
+  branch.conf for mybranch:
+    debug_flags = {debug_flags}, hpssdetail
+
+  $ bzr -d mybranch config debug_flags
+  hpss, hpssdetail
+
+Undefined references would still be detected if they are not defined in any
+configuration, or could just stay unresolved, which should be enough to trigger
+errors displaying the value. Diagnosing typos should be doable in this case.
+
+Configuration file syntax
+-------------------------
+
+The configuration file is mostly an ``ini-file``. It contains ``name =
+value`` lines grouped in sections. Comments are allowed by prefixing them
+with the '#' character.
+
+A section is named by the path it should apply to (more examples below).
+
+Options defined outside of any section act as defaults when no section
+applies. This means that in the most common cases, the user doesn't need to
+define any section.
+
+When sections are used, they provide a finer grain of configuration by
+defining option sets that apply to some working trees, branches,
+repositories or part of them.
+
+The subset is defined by the common leading path or a glob.
+
+* a full URL: used to describe options for remote branches and
+  repositories.
+
+* local absolute path: used for working trees, branches or repositories
+  on the local disks.
+
+* relative path: the path is relative to the configuration file and can be
+  used for colocated branches or threads in a loom, i.e. any working tree,
+  branch or repository that is located in a place related to the
+  configuration file path. Some configuration files may define this path
+  relationship in specific ways to make them easier to use (i.e. if a config
+  file is somewhere below ``.bzr`` and refers to threads in a loom for
+  example, the relative path would be the thread name, it doesn't have to be
+  an *exact* relative path, as long as its interpretation is unambiguous and
+  clear for the user).
+
+Whatever path is used, the options apply if the branch path starts with
+the path defining the section (or if the glob matches).
+
+The ConfigOption object
+-----------------------
+
+In addition to the configuration files, one internal configuration dict can
+contain definitions for some configuration options. This will allow a finer
+grained definition of the default values and the online help.
+
+The ConfigOption object will define:
+
+* name
+
+* default value. ``None`` is the "default" default value.
+
+* docstring used for the help
+
+* a list of config files where the option can be defined.
+
+The ConfigFile object
+---------------------
+
+This is an implementation-level object that should rarely be used directly.
+
+* it can be local or remote
+
+* locking
+
+  All lock operations should be implemented via transport objects.
+
+* option life cycle
+
+  Working trees, branches and repositories should define a config attribute
+  following the same life cycle as their lock: the associated config file is
+  read once and written once if needed. This should minimize the file system
+  accesses or the network requests. There are no known racing scenarios for
+  configuration options, so changing the existing implementation to this less
+  constrained one shouldn't introduce any. Yet, in order to detect such
+  racing scenarios, we can check that the current content of the
+  configuration file is the expected one before writing the new content and
+  emit warnings if differences occurred. The checks should be performed for
+  the modified values only. As of today, the configuration files are small
+  enough to be kept in memory.
+
+The Config object
+-----------------
+
+This is the object that provides access to the needed features:
+
+* getting an option value,
+
+* setting an option value,
+
+* deleting an option value,
+
+* handling a list of configuration files that should be tried in the given
+  order to find an option.
+
+Depending on the files involved, a working tree, branch or repository object
+should be provided to access the corresponding configuration files. Note
+that providing a working tree object also implicitly provides the
+associated branch and repository object so only one of them is required (or
+none for configuration files specific to the user like bazaar.conf and
+locations.conf).
+
+Getting an option value
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Depending on the option, there are various places where it can be defined
+and several ways to override these settings when needed.
+
+The following lists all possible places where a configuration option can
+be defined, but some options will make sense in only some of them. The
+first to define a value for an option wins (None is therefore used to
+express that an option is not set).
+
+* command-line (Not Implemented Yet)
+  ``-Ooption=value`` see bug #491196.
+
+* environment variable
+
+  ``export BZR_OPTION=value``
+
+  Some environment variables don't have a corresponding configuration
+  option (BZR_PLUGIN_PATH) and most configuration options don't have a
+  corresponding environment variable.
+
+* locations.conf
+
+  When an option is set in ``locations.conf`` it overrides any other
+  configuration file. This should be used with care as it allows setting a
+  different value than what is recommended by the project.
+
+* tree.conf (Not Implemented Yet)
+
+  The options related to the working tree.
+
+  This includes all options related to commits, ignored files, junk files,
+  etc.
+
+  Note that the sections defined there can use relative paths if some
+  options should apply to a subtree or some specific files only.
+
+  See bug #430538 and bug #654998.
+
+* branch.conf
+
+  The options related to the branch.
+
+  Sections can be defined for co-located branches or loom threads.
+
+* repo.conf (Not Implemented Yet)
+
+  The options related to the repository.
+
+  Using an option to describe whether or not a repository is shared could
+  help address bug #342119 (but this will probably require a format bump).
+
+* project.conf (Not Implemented Yet)
+
+  The options common to all branches and working trees for a project.
+
+* organization.conf (Not Implemented Yet)
+
+  The options common to all branches and working trees for an organization.
+
+  See bug #419854.
+
+* bazaar.conf
+
+  The options the user has selected for the host he is using.
+
+  Sections can be defined for both remote and local branches to define
+  default values (i.e. the most common use of ``locations.conf`` today).
+
+* default (Not Implemented Yet)
+
+  The options defined in the ``bzr`` source code.
+
+  This will be implemented via the ConfigOption objects.
+
+Plugins can define additional configuration files as they see fit and
+insert them in this list, see their documentation for details.
+
+Compatibility
+=============
+
+* The ``DEFAULT`` section in bazaar.conf should still be recognized but
+  won't be mandatory anymore.
+
+* Other sections in the ``bazaar.conf`` configuration file are still
+  supported but their use is discouraged and we may deprecate them in the
+  future. Plugin authors are encouraged to migrate to the new name space
+  scheme by prefixing their options with their plugin name.
+
+* Option policies should be deprecated:
+
+  * The ``norecurse`` policy is useless: all options are recursive by
+    default. If specific values are needed for specific paths, they can just
+    be defined as such.
+
+  * The ``appendpath`` policy should be implemented via interpolation and a
+    ``relpath`` option provided by the configuration framework.
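The interpolation rules from the "Interpolation" section of the note can be illustrated with a minimal sketch. It is not bzrlib code: `expand`, `lookup` and `InterpolationError` are hypothetical names, and each configuration is modelled as a plain dict, ordered from most specific to least specific. When a value refers to the option being expanded, the lookup resumes in the less specific configurations, as the note proposes; a reference that is defined nowhere is reported as an error.

```python
import re

# Assumption: references look like {name} and cannot be nested.
_REF = re.compile(r"\{([^{}]+)\}")


class InterpolationError(Exception):
    """Raised for loops and undefined references (hypothetical name)."""


def lookup(name, configs, start=0):
    """Return (level, value) from the first config at or after `start`."""
    for level in range(start, len(configs)):
        if name in configs[level]:
            return level, configs[level][name]
    return None


def expand(name, configs, extra=None, _active=None):
    """Expand option `name`; `configs` is ordered most specific first.

    `extra` holds references supplied at runtime (such as the branch
    nick) that are not configuration options themselves.
    """
    extra = extra or {}
    active = {} if _active is None else _active
    # If `name` is already being expanded, skip the level it came from
    # and fall back to the less specific configurations.
    start = active[name] + 1 if name in active else 0
    found = lookup(name, configs, start)
    if found is None:
        kind = "loop" if name in active else "undefined reference"
        raise InterpolationError("%s while expanding {%s}" % (kind, name))
    level, value = found
    previous = active.get(name)
    active[name] = level

    def replace(match):
        ref = match.group(1)
        if ref in extra:
            return extra[ref]
        return expand(ref, configs, extra, active)

    try:
        return _REF.sub(replace, value)
    finally:
        # Restore the bookkeeping so sibling references are unaffected.
        if previous is None:
            del active[name]
        else:
            active[name] = previous


branch_conf = {"debug_flags": "{debug_flags}, hpssdetail"}
bazaar_conf = {"debug_flags": "hpss"}
print(expand("debug_flags", [branch_conf, bazaar_conf]))
```

With the two dicts above, the call reproduces the `debug_flags` example from the note. A single configuration containing `a = {b}` and `b = {a}` has no less specific level to fall back to, so the sketch raises `InterpolationError`.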

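The lookup described under "Configuration file syntax" and "Getting an option value" can be sketched in a few lines. This is an illustration under stated assumptions, not the proposed API: `section_applies`, `get_from_file` and `get_option` are hypothetical names, each file is modelled as a dict mapping a section name to its options, and the key `None` stands for options defined outside of any section. Two simplifications are assumed: a glob must match the whole location (with `*` also matching `/`), and among the applicable sections the longest name is treated as the most specific.

```python
from fnmatch import fnmatchcase


def section_applies(section, location):
    """Tell whether a section (a path or a glob) applies to `location`."""
    if any(char in section for char in "*?["):
        # Assumption: the glob is matched against the whole location.
        return fnmatchcase(location, section)
    prefix = section.rstrip("/")
    return location == prefix or location.startswith(prefix + "/")


def get_from_file(name, location, sections):
    """Return the value of `name` in one file, or None if it is not set."""
    candidates = [
        section for section in sections
        if section is not None
        and name in sections[section]
        and section_applies(section, location)
    ]
    if candidates:
        # Assumption: the longest section name is the most specific one.
        return sections[max(candidates, key=len)][name]
    # Options outside of any section act as defaults for the file.
    return sections.get(None, {}).get(name)


def get_option(name, location, files, default=None):
    """Try each file in order; the first one defining the option wins."""
    for sections in files:
        value = get_from_file(name, location, sections)
        if value is not None:
            return value
    return default


locations_conf = {"/home/jdoe/src/*": {"email": "work@example.com"}}
branch_conf = {None: {"push_location": "lp:proj"}}
bazaar_conf = {
    None: {"email": "me@example.com", "editor": "vi"},
    "/home/jdoe/src/proj": {"editor": "emacs"},
}
files = [locations_conf, branch_conf, bazaar_conf]
print(get_option("email", "/home/jdoe/src/proj/trunk", files))
print(get_option("editor", "/home/jdoe/src/proj/trunk", files))
```

The order of `files` follows the priority list in the note: `locations.conf` first, then `branch.conf`, then `bazaar.conf`. The paths and values are made up for the example.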

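The "Value" section asks for helpers that turn text values into booleans or lists and fall back to a default when a value is invalid. The following sketch shows one possible shape for such helpers. The names `as_bool` and `as_list` are hypothetical, and the accepted spellings beyond yes/no, y/n and on/off (`true`, `false`, `1`, `0`) are an assumption, since the note only says "etc".

```python
import warnings

# Assumption: the exact set of accepted spellings is not fixed by the note.
_TRUE = {"yes", "y", "on", "true", "1"}
_FALSE = {"no", "n", "off", "false", "0"}


def as_bool(value, default=None, name="option"):
    """Convert a text value to a boolean, warning on invalid input."""
    if value is None:
        # The option is not set anywhere: use the default silently.
        return default
    text = value.strip().lower()
    if text in _TRUE:
        return True
    if text in _FALSE:
        return False
    warnings.warn(
        "invalid boolean %r for %s, using default %r" % (value, name, default)
    )
    return default


def as_list(value, default=None):
    """Split a comma-separated value into a list of stripped strings."""
    if value is None:
        return default
    return [item.strip() for item in value.split(",") if item.strip()]


print(as_bool("Yes"))
print(as_list("hpss, hpssdetail"))
```

Because both helpers accept plain strings, the same code path works for values read from a configuration file, an environment variable or the command line.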
