Rev 40: Add wip about a new configuration implementation. in file:///home/vila/src/bzr/devnotes/
Vincent Ladeuil
v.ladeuil+lp at free.fr
Thu Dec 2 14:00:03 GMT 2010
At file:///home/vila/src/bzr/devnotes/
------------------------------------------------------------
revno: 40
revision-id: v.ladeuil+lp at free.fr-20101202140003-6cqwy38cvgz6hwvj
parent: mbp at canonical.com-20100609065349-9l6vsvylq6szlein
committer: Vincent Ladeuil <v.ladeuil+lp at free.fr>
branch nick: devnotes
timestamp: Thu 2010-12-02 15:00:03 +0100
message:
Add wip about a new configuration implementation.
-------------- next part --------------
=== added file 'configuration.txt'
--- a/configuration.txt 1970-01-01 00:00:00 +0000
+++ b/configuration.txt 2010-12-02 14:00:03 +0000
@@ -0,0 +1,433 @@
+==================
+Configuring Bazaar
+==================
+
+Goal
+====
+
+Not all needs can be addressed by the default values used inside bzr and
+bzrlib, no matter how well they are chosen (and they are ;).
+
+Many parts of ``bzrlib`` depend on some constants, though, and the user
+should be able to customize the behavior to suit his needs, so these
+constants need to become configuration options.
+
+These options can be set from the command-line, in an environment variable
+or recorded in a configuration file.
+
+Current issues
+==============
+
+* Many parts of ``bzrlib`` declare constants and there is no way for the
+ user to look at or modify them.
+
+* The current API requires a configuration object to create, modify or
+ delete a configuration option in a given configuration file. ``bzr
+ config`` makes it almost transparent for the user. Internally though, not
+ all cases are handled: only BranchConfig implements chained configs,
+ nothing is provided at the repository level and too many plugins define
+ their own section or even their own config file.
+
+* ``locations.conf`` defines the options that need to override any setting
+ in ``branch.conf`` for both local and remote branches. Many users want a
+ way to define default values for options that are not defined in
+ ``branch.conf``. This could be approximated today by *not* defining these
+ options in ``branch.conf`` but in ``locations.conf`` instead. This
+ workaround doesn't allow a user to define defaults in ``locations.conf``
+ and override them in ``branch.conf``.
+
+* Defining a new option requires adding a new method in the ``Config``
+ object to get access to features like:
+
+ * should the option be inherited by more specific sections,
+
+ * should the relative path between the section path and the location it
+ applies to be appended to the inherited value,
+
+ * the default value (including calling any Python code that may be
+ required to calculate this value),
+
+ * priority between sections and various config files.
+
+ A related problem is that, in the current implementation, some
+ configuration options have dedicated methods while others don't, which is
+ inconsistent.
+
+* Access to the 'active' configuration option value from the command line
+ doesn't give access to a specific section.
+
+* Rules for configuration options are not clearly defined for remote
+ branches (they may differ between dumb and smart servers).
+
+* The features offered by the Bazaar configuration files should be easily
+ accessible to plugin authors either by supporting plugin configuration
+ options in the configuration files or allowing plugins to define their
+ own configuration files.
+
+* While the current configuration files support sections, they are used in
+ mutually exclusive ways that make it impossible to offer the same set of
+ features to all configuration files:
+
+ * ``bazaar.conf`` uses arbitrary names for sections. ``DEFAULT`` is used
+ for global options and ``ALIASES`` is used to define command aliases.
+ Plugins can define their own sections too, and some plugins do so
+ (``bzr-bookmarks`` uses a ``BOOKMARKS`` section, for example, and other
+ plugins define sections of their own).
+
+ * ``locations.conf`` uses globs as section names. This provides an easy
+ way to associate a set of options to a matching working tree or
+ branch, including remote ones.
+
+ * ``branch.conf`` doesn't use any section.
+
+* There is no easy way to get configuration options for a given repository
+ or an arbitrary path. Working trees and branches are generally organized
+ in hierarchies and being able to share the option definitions is an often
+ required feature. This can also address some needs exhibited by various
+ branch schemes like looms, pipeline, colocated branches and nested
+ trees. Being able to specify options *in* a working tree could also help
+ support conflict resolution options for a given file, directory or
+ subtree.
+
+* Since sections allow different definitions for the same option, a total
+ order should be defined between sections to select the right definition
+ for a given path. Allowing globs for section names is harmful in this
+ respect since the order is currently defined as being the lexicographical
+ one. The caveat here is that, while the order is always defined for a given
+ set of sections, it can change when one or several globs are modified and
+ the user may get surprising and unwanted results in these cases. The
+ lexicographical order is otherwise fine to define which section is more
+ specific than another. (This may not be a problem in real life since
+ longer globs are generally more specific than shorter ones and explicit
+ paths should also be longer than matching globs. That may leave a glob and
+ a path of equal length in a gray area but in practice using ``bzr config``
+ should give enough feedback to address them).
+
+* Internally, configuration files (and their fallbacks, ``bazaar.conf`` and
+ ``locations.conf`` for ``branch.conf``) are read every time *one* option is
+ queried. Likewise, setting or deleting a configuration option implies
+ writing the configuration file *immediately* after re-reading the file to
+ avoid racing updates.
+
+* The current implementation uses a mix of transport-based and direct file
+ system operations.
+
+* While the underlying ``ConfigObj`` implementation provides an
+ interpolation feature, the ``bzrlib`` implementation doesn't provide
+ easy handling of templates where other configuration options can be
+ interpolated. Instead, ``locations.conf`` (and only it) allows for the
+ ``appendpath`` and ``norecurse`` policies.
+
+* Inherited list values can't be modified: a more specific configuration can
+ only redefine the whole list.
+
+* There is no easy way to define dicts (the most obvious one being to use a
+ dedicated section which is already overloaded). Using embedded sections
+ for this would not be practical either if we keep using a no-name section
+ for default values. In a few known cases, a bencoded dict is stored in a
+ config value, so while this isn't user-friendly, not providing a better
+ alternative shouldn't be a concern.
+
+
+Proposed implementation
+=======================
+
+
+Configuration files definition
+------------------------------
+
+While configuration files can of course be versioned, they are not intended
+to be accessed in sync with the files they refer to (one can imagine
+handling versioned properties this way but this is *not* what the Bazaar
+configuration files are targeted at). ``bzr`` will always refer to
+configuration files as they exist on disk when an option is queried or set.
+
+The configuration files are generally local to the file system but some of
+them can be accessed remotely (``branch.conf``, ``repo.conf``).
+
+
+Naming
+------
+
+The option name space is organized as follows:
+
+* Bazaar itself defines all its constants as ``bzr.option_name``.
+
+* Plugins can define their own options by prefixing them with the plugin
+ name, as in ``svn.option_name`` for the ``svn`` plugin.
+
+Using valid Python identifiers is recommended but not enforced (but we may
+do so in the future).
+
+Value
+-----
+
+All option values are text. They are provided as Unicode strings to API
+users with some refinements:
+
+* boolean values can be obtained from a set of acceptable strings (yes/no,
+ y/n, on/off, etc),
+
+* a list of strings can be obtained from a value containing a
+ comma-separated list of strings.
+
+Since the configuration files can be edited by the user, ``bzr`` doesn't
+expect their content to be validated. Instead, the code using options should
+be ready to handle *invalid* values by warning the user and falling back to
+a default value. Likewise, if an option is not defined in any configuration
+file, the code should fall back to a default value (helpers should be
+provided by the API to handle common cases: warning the user, getting a
+particular type of value, returning a default value).
+
+This also ensures compatibility with values provided via environment
+variables or from the command line.
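+
+As an illustration of the refinements above, a configuration file could
+contain lines like the following. This is only a sketch: apart from
+``debug_flags``, the option names are made up for the example::
+
+ # read as a boolean by code asking for one
+ bzr.some_flag = yes
+ # read as a list of two strings by code asking for a list
+ debug_flags = hpss, hpssdetail
+ # not an acceptable boolean string: the code using the option is
+ # expected to warn the user and fall back to its default value
+ bzr.other_flag = maybe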
+
+Interpolation
+-------------
+
+Some option values can be templates and contain references to other
+options. This is especially useful to define URLs in sections shared for
+multiple branches for example. It can also be used to describe commands
+where some parameters are set by ``bzrlib`` at runtime.
+
+Since option values are text-only, and to avoid clashing with other
+interpolation syntaxes, references are enclosed with curly brackets::
+
+ push_location = lp:~{launchpad_username}/bzr/{nick}
+
+In the example above, ``launchpad_username`` is an already defined
+configuration option while ``nick`` is the branch nickname and is set when a
+configuration applies to a given branch.
+
+The interpolation implementation should accept an additional dict so that
+``bzrlib`` or plugins can define references that can be interpolated without
+being existing configuration options::
+
+ diff_command={cmd} {cmd_opts} {file_a} {file_b}
+
+There are two common errors that should be handled when performing interpolation:
+
+* loops: when a configuration value refers to itself, directly or indirectly,
+
+* undefined references: when a configuration value refers to an unknown option.
+
+One possible implementation is to report errors when such references are
+encountered.
+
+Another implementation could be envisioned though: when a loop is
+encountered, we can fall back to the less specific configurations. This
+allows list values to refer to the definition in the less specific
+configurations allowing::
+
+ bazaar.conf:
+ debug_flags = hpss
+
+ branch.conf for mybranch:
+ debug_flags = {debug_flags}, hpssdetail
+
+ $ bzr -d mybranch config debug_flags
+ hpss, hpssdetail
+
+Undefined references would still be detected if they are not defined in any
+configuration, or would just stay unresolved, which should be enough to trigger
+errors displaying the value. Diagnosing typos should be doable in this case.
+
+Configuration file syntax
+-------------------------
+
+The configuration file is mostly an ``ini-file``. It contains ``name =
+value`` lines grouped in sections. Comments are allowed by prefixing them
+with the '#' character.
+
+A section is named by the path it should apply to (more examples below).
+
+Options defined outside of any section act as defaults when no section
+applies. This means that in the most common cases, the user doesn't need to
+define any section.
+
+When sections are used, they provide a finer grain of configuration by
+defining option sets that apply to some working trees, branches,
+repositories or part of them.
+
+The subset is defined by the common leading path or a glob. A path can be:
+
+* a full URL: used to describe options for remote branches and
+ repositories.
+
+* local absolute path: used for working trees, branches or repositories
+ on the local disks.
+
+* relative path: the path is relative to the configuration file and can be
+ used for colocated branches or threads in a loom, i.e. any working tree,
+ branch or repository that is located in a place related to the
+ configuration file path. Some configuration files may define this path
+ relationship in specific ways to make them easier to use (e.g. if a config
+ file is somewhere below ``.bzr`` and refers to threads in a loom, the
+ relative path would be the thread name; it doesn't have to be an *exact*
+ relative path, as long as its interpretation is unambiguous and clear for
+ the user).
+
+Whatever path is used, the options apply if the branch path starts with
+the path defining the section (or if the glob matches).
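+
+The following sketch shows what such a file could look like. It assumes the
+usual ini-style bracketed section headers, and the URL, paths and values are
+made up for illustration::
+
+ # Outside of any section: defaults used when no section applies.
+ launchpad_username = jdoe
+
+ # Full URL: remote branches and repositories below this URL.
+ [http://example.com/code/project]
+ debug_flags = hpss
+
+ # Local absolute path: anything below this directory.
+ [/home/jdoe/src/project]
+ push_location = lp:~jdoe/project/trunk
+
+ # Glob: any location matching the pattern.
+ [/home/jdoe/src/*/experimental]
+ debug_flags = hpss, hpssdetail
+
+ # Relative path: interpreted relative to this configuration file.
+ [feature-x]
+ debug_flags = hpssdetail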
+
+The ConfigOption object
+-----------------------
+
+In addition to the configuration files, one internal configuration dict can
+contain definitions for some configuration options. This will allow a finer
+grained definition of the default values and the online help.
+
+The ConfigOption object will define:
+
+* name
+
+* default value. ``None`` is the "default" default value.
+
+* docstring used for the help
+
+* a list of config files where the option can be defined.
+
+The ConfigFile object
+---------------------
+
+This is an implementation-level object that should rarely be used directly.
+
+* it can be local or remote
+
+* locking
+
+ All lock operations should be implemented via transport objects.
+
+* option life cycle
+
+ Working trees, branches and repositories should define a config attribute
+ following the same life cycle as their lock: the associated config file is
+ read once and written once if needed. This should minimize the file system
+ accesses or the network requests. There are no known racing scenarios for
+ configuration options, so changing the existing implementation to this less
+ constrained one shouldn't introduce any. Yet, in order to detect such
+ racing scenarios, we can check that the current content of the
+ configuration file is the expected one before writing the new content and
+ emit warnings if differences occurred. The checks should be performed for
+ the modified values only. As of today, the configuration files are small
+ enough to be kept in memory.
+
+The Config object
+-----------------
+
+This is the object that provides access to the needed features:
+
+* getting an option value,
+
+* setting an option value,
+
+* deleting an option value,
+
+* handling a list of configuration files that should be tried in the given
+ order to find an option.
+
+Depending on the files involved, a working tree, branch or repository object
+should be provided to access the corresponding configuration files. Note
+that providing a working tree object also implicitly provides the
+associated branch and repository object so only one of them is required (or
+none for configuration files specific to the user like ``bazaar.conf`` and
+``locations.conf``).
+
+Getting an option value
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Depending on the option, there are various places where it can be defined
+and several ways to override these settings when needed.
+
+The following lists all possible places where a configuration option can
+be defined, but some options will make sense in only some of them. The
+first to define a value for an option wins (None is therefore used to
+express that an option is not set).
+
+* command-line (Not Implemented Yet)
+ ``-Ooption=value`` see bug #491196.
+
+* environment variable
+
+ ``export BZR_OPTION=value``
+
+ Some environment variables don't have a corresponding configuration
+ option (BZR_PLUGIN_PATH) and most configuration options don't have a
+ corresponding environment variable.
+
+* locations.conf
+
+ When an option is set in ``locations.conf`` it overrides any other
+ configuration file. This should be used with care as it allows setting a
+ different value than what is recommended by the project
+
+* tree.conf (Not Implemented Yet)
+
+ The options related to the working tree.
+
+ This includes all options related to commits, ignored files, junk files,
+ etc.
+
+ Note that the sections defined there can use relative paths if some
+ options should apply to a subtree or some specific files only.
+
+ See bug #430538 and bug #654998.
+
+* branch.conf
+
+ The options related to the branch.
+
+ Sections can be defined for co-located branches or loom threads.
+
+* repo.conf (Not Implemented Yet)
+
+ The options related to the repository.
+
+ Using an option to describe whether or not a repository is shared could
+ help address bug #342119 (but this will probably require a format bump).
+
+* project.conf (Not Implemented Yet)
+
+ The options common to all branches and working trees for a project.
+
+* organization.conf (Not Implemented Yet)
+
+ The options common to all branches and working trees for an organization.
+
+ See bug #419854.
+
+* bazaar.conf
+
+ The options the user has selected for the host he is using.
+
+ Sections can be defined for both remote and local branches to define
+ default values (i.e. the most common use of ``locations.conf`` today).
+
+* default (Not Implemented Yet)
+
+ The options defined in the ``bzr`` source code.
+
+ This will be implemented via the ConfigOption objects.
+
+Plugins can define additional configuration files as they see fit and
+insert them in this list, see their documentation for details.
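+
+As an illustration of the "first to define a value wins" rule, the same
+option could be defined in several of the files listed above. The branch
+path below is made up for the example::
+
+ locations.conf:
+ [/home/jdoe/src/project]
+ debug_flags = hpss, hpssdetail
+
+ branch.conf for /home/jdoe/src/project/trunk:
+ debug_flags = hpssdetail
+
+ bazaar.conf:
+ debug_flags = hpss
+
+With the order listed above, and no value given on the command line or in
+the environment, ``locations.conf`` would provide the value for that branch.
+The ``bazaar.conf`` definition would only be used for locations where no
+file earlier in the list defines the option.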
+
+Compatibility
+=============
+
+* The ``DEFAULT`` section in bazaar.conf should still be recognized but
+ won't be mandatory anymore.
+
+* Other sections in the ``bazaar.conf`` configuration file are still
+ supported but their use is discouraged and we may deprecate them in the
+ future. Plugin authors are encouraged to migrate to the new name space
+ scheme by prefixing their options with their plugin name.
+
+* Option policies should be deprecated:
+
+ * The ``norecurse`` policy is useless: all options are recursive by
+ default. If specific values are needed for specific paths, they can just
+ be defined as such.
+
+ * The ``appendpath`` policy should be implemented via interpolation and a
+ ``relpath`` option provided by the configuration framework.