[apparmor] RFC: Policy versioning

Sun Dec 10 11:05:53 UTC 2017

Currently we have a few problems with policy that must be addressed.

------------------------------------------------------------------------
Problems
1. Current policy on new kernels

  Policy authored with an older feature set abi will compile and load
  to the current kernel abi, potentially resulting in denials when a
  kernel with a newer feature set abi than the policy was designed for
  is booted.

  The current solution of policy feature abi pinning has a couple of
  problems.

  - it currently only supports a single feature set abi, which can
    cause its own problems when a user is trying to boot between
    different kernels.

  - it is applied globally to all policy loaded from userspace,
    potentially breaking distributed policy, which may have been
    developed against a different abi.

  - no distro is currently using it, which means all users who run
    non-distro kernels are being opted into policy dev. That is nice
    for distro maintainers who want bugs reported so the policy can be
    improved, but it is not a great experience for users, and it is
    fatal for our ability to get new features upstream, as this will
    be counted as a kernel regression.

2. Split packaging of policy

  Instead of having a single policy package, policy is being split
  up, distributed and installed via multiple packages. This leads to
  policy being developed and maintained against different versions of
  apparmor, different feature set abis, and even with features that
  are not supported by the installed apparmor user space.

  - Currently when these policies are loaded, they will all be loaded
    under the same feature abi, which can result in the same problems
    as discussed in 1 for some policies.

    What is needed is the ability to support multiple feature
    sets/abis simultaneously. The kernel already does this, the
    userspace needs to grow support for it.

  - Policy that makes use of newer features available in new versions of
    apparmor may not be supported by the installed apparmor user space.

  - In addition the split packaging of policy can lead to the problem
    where either the application policy must support multiple versions
    of policy, or hope that the newest version of policy is supported
    on an older release.

    Experience has shown this not to be the case and application
    policy for LXD etc. are having to special case policy for
    different versions of apparmor.

    Unfortunately apparmor does not provide any way to do this within
    policy so the special casing must be done in the application or
    packaging.

3. We don't directly support policy in early boot.

  Currently booting into different kernels means a cache flush and
  recompile. This is problematic for a few reasons.

  - It slows down boot.

  - The compile can not be done in early boot, meaning that either the
    wrong policy is loaded, or no policy is loaded. For tasks that
    need to have early policy applied neither solution is acceptable.

4. Support for multiple kernels

  - currently we don't retain a fallback policy for older kernels if
    there are problems with newer kernels, or with newer binary policy
    generated for those kernels.

  - Booting a new kernel means dealing with all of the above from #3
    and #4. The chance of policy errors causing new kernel failures at
    boot for kernel devs is unacceptable and could lead to more
    problems for apparmor upstream.

5. Policy being split across multiple locations.

  Some distros have split policy across multiple locations. This has
  resulted in initscripts being modified to handle the split instead
  of having a defined standard of where policy is.

6. Applications managing their own policy

  Some applications like LXD and libvirt are doing their own policy
  management, which has made it difficult to properly manage policy as
  a set. In particular there was a policy replacement "bug" where
  unknown policy was being removed from the system on policy
  replacement.

  I say "bug" because policy replacement was in fact functioning as
  designed (whether the design was correct is a different argument) and
  the problem was that these applications were loading policy without
  informing the system about it. This was "fixed" by removing
  the ability from the policy management scripts to identify and
  remove unknown and presumably unused and stale policy, with a new
  script aa-remove-unknown being added that could be called manually.

  This was an expedient but less than satisfactory solution that
  introduced a regression in policy management (manually removing
  policy files no longer results in the correct removal of policy
  from the kernel on reload). Nor did it really solve other problems
  around applications managing policy.

7. Conflicting requirements

  1. Current policy on new kernels and 2. Split packaging of policy
  have conflicting requirements. With policy being split between
  different packages and having different versions, it becomes very
  possible that we run into problems very similar to those in 1.

  To be precise if a user is using newer policy with an older kernel
  parts of that policy may be downgraded or not enforced. When that
  user moves to a newer kernel, policy rules that weren't enforced
  before are enforced under the new kernel, potentially leading to
  breakage for that user's use case.

  Unfortunately this is impossible to avoid, as we can't control what
  policy and kernels a user will use. However we can take some
  measures to help reduce the risk.

------------------------------------------------------------------------

While Problem #1 is the current looming emergency, Problem #2 is real
and becoming more of a problem every day. Problem #3 is more of a nice
to have until you need early policy (which we are working towards
having better support for). Problem #4 is something we need to solve to
safely continue landing new features upstream. And Problems 5 and 6
are fairly easy to address and it's more a matter of making sure we can
address them or at least not make them worse as we move forward
solving problems 1-4.

I. Proposed Solution

The basic proposal to address the issues is fairly simple; some of the
details are harder. However the work can be split into several phases
so we can move forward immediately.

1. We extend policy so that the feature file can be included directly into
   profiles.

  something like

    features=/etc/apparmor/featuresX

  OR

    #pragma features=/etc/apparmor/featuresX

  and the feature set will be made available to policy conditionals
  (see 6. Handling abstractions below). Hiding the features as a
  pragma comment would allow this policy change to be transparent to
  older parsers and tools, but I am unsure if it is worth doing.
  The feature definition is not the only thing in this proposal that
  would break on older parsers, so I am leaning towards features=

  The specified features will be used instead of the kernel's exported
  feature set abi. If a policy feature abi version is not supported by
  the kernel it may fail to load (at least for the first pass at
  this).  The compiler will of course still be free to take in the
  live kernel information and use it in conjunction with the policy
  abi, to generate rule downgrades or broader support. It just won't
  use new features that the kernel supports and the policy doesn't.
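
  As a rough illustration of the paragraph above, the split between
  what is compiled as written, what needs a downgrade (or fails to
  load), and what is left unused can be modelled with plain set
  operations. This is only a sketch: plan_compile() is a hypothetical
  helper, and every feature name except network/af_unix is made up.

```python
# Sketch only: feature sets are modelled as sets of strings.
# Feature names other than network/af_unix are made up for illustration.
def plan_compile(policy_features, kernel_features):
    """Split the features by who declares or supports them."""
    # Declared by the policy and supported by the kernel.
    usable = policy_features & kernel_features
    # Declared by the policy but missing from the kernel:
    # candidates for a rule downgrade, or a failed load.
    needs_downgrade = policy_features - kernel_features
    # Supported by the kernel but not declared by the policy:
    # never used, so new kernels cannot change behavior.
    left_unused = kernel_features - policy_features
    return usable, needs_downgrade, left_unused


policy = {"file", "network/af_unix", "dbus"}
kernel = {"file", "network/af_unix", "signal"}
usable, needs_downgrade, left_unused = plan_compile(policy, kernel)
print("compile as written:", sorted(usable))
print("downgrade or fail: ", sorted(needs_downgrade))
print("left unused:       ", sorted(left_unused))
```

  The third set is the important one for kernel developers: nothing in
  it can start being enforced just because a newer kernel was booted.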

  Directly referencing the features file however is somewhat ugly and
  doesn't deal with abstractions, tunables, or making the abi
  information human friendly. Instead I propose we wrap the feature
  file with an include, giving the include a meaningful name.

    include <version/4.14>

    include <version/ubuntu-17.10>

  or maybe

    include <abi/4.14>

  and of course instead of using the name for specific kernel or
  release we can use a simple revision number if we so choose.

    include <version/11>

  we could use a new keyword but using an include gives us
  flexibility, especially wrt supporting older parsers with new
  policy.

  Besides declaring the feature file the include should define a newly
  standardized variable

    @{version}
  OR
    @{abi}

  dependent on the specifics of the include above; which can be used
  by policy to find abi specific includes and potentially make
  conditional policy decisions (see 6. Handling abstractions).

  Old policy that does not specify the feature file will fall back to
  the lesser of the running kernel abi or the 4.14 abi. This is
  necessary to try and avoid breaking existing behavior on older
  kernels and to ensure policy doesn't break newer kernels going
  forward.
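
  The fallback rule for old policy can be sketched as below. Treating
  an abi as a (major, minor) tuple is purely an assumption to make the
  comparison concrete; effective_abi() is a hypothetical name.

```python
BASELINE_ABI = (4, 14)  # the 4.14 abi named above


def effective_abi(declared_abi, running_kernel_abi):
    """Return the abi a profile is compiled against.

    declared_abi is None for old policy that names no feature file.
    Modelling an abi as a (major, minor) tuple is an assumption.
    """
    if declared_abi is not None:
        return declared_abi
    # Old policy: never go past 4.14, never past the running kernel.
    return min(running_kernel_abi, BASELINE_ABI)


print(effective_abi(None, (4, 4)))
print(effective_abi(None, (4, 20)))
print(effective_abi((4, 18), (4, 20)))
```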

  It will be easy to inspect what policy versions are supported on the
  system by listing the version/abi directory

    ls /etc/apparmor.d/version/
    7
    8
    9

  And if a particular policy version include does not exist on the
  system the policy will fail to compile.
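
  A minimal sketch of that lookup, assuming the version includes live
  in a version/ directory under the policy root as in the listing
  above. PolicyCompileError and both helper names are hypothetical.

```python
import tempfile
from pathlib import Path


class PolicyCompileError(Exception):
    """Raised when a profile names a version include that is absent."""


def supported_versions(policy_root):
    """List the version includes present under policy_root/version."""
    version_dir = Path(policy_root) / "version"
    if not version_dir.is_dir():
        return []
    return sorted(entry.name for entry in version_dir.iterdir())


def resolve_version_include(policy_root, version):
    """Return the include path, or fail the compile if it is missing."""
    path = Path(policy_root) / "version" / version
    if not path.is_file():
        raise PolicyCompileError(f"no version include for {version!r}")
    return path


# Demo against a throwaway directory rather than /etc/apparmor.d
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "version").mkdir()
    for name in ("7", "8", "9"):
        (Path(root) / "version" / name).write_text("# feature file\n")
    print(supported_versions(root))
    try:
        resolve_version_include(root, "11")
    except PolicyCompileError as err:
        print("compile failed:", err)
```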

2. Support multiple policy caches

  To address problems 3-4 we extend the policy cache so that we retain
  a compiled policy per kernel installed. When a new kernel is
  installed we build a new policy cache for it.

  Handling this correctly is really important. We need to move away
  from building policy on boot, as that is already not viable for some
  policy and will become even more so in the future. Nor does it match
  well with systemd doing policy loads without having to call out to
  the compiler.

3. Policy hashing for better cache consistency

  We need to adopt policy hashing to provide better cache consistency.
  This is not only so we can fix problems with using file time stamps
  but also as a way to detect inconsistencies with the compiled feature
  set.

  With the feature abi becoming an integral part of policy compiles it
  is critical we detect any changes to the features abi. Previously
  the cache was cleared when the kernel features abi was changed but
  that is no longer the case, with multiple caches being retained.
  However within each of those caches profile abis can change and we
  need to ensure that the change is picked up.

3. Standardize policy config dir and files

  Problem 5 is addressed by standardizing a config directory and file
  layout. New locations must be added to the config dir to inform
  apparmor of new policy locations and how they should be handled.

  The parser config has proven insufficient so Ubuntu has been
  modifying the initscript to manage this which is not a solution that
  can be shared across distros, nor does it provide a solution that
  works with other parts of apparmor like the tools.

  Instead we have a directory in which each new location can drop its
  own config, allowing it to set its policy and include locations,
  cache, and even compiler options if so desired.

4. Limit distros' ability to compile policy to the current kernel's
   feature abi

  Along with this, distros will no longer be able to set a default
  policy compile that will use the current kernel's abi. This will not
  even be supported at the distro level as the project can not afford
  to break the feature abi of current policy for kernel developers.

  To address this a new tool will be added to extract the kernel
  features abi, and tooling will be updated to allow users to update a
  profile's abi and thus begin development on newer versions. Basically
  a per user opt in only approach.

5. Applications managing policy and unknown profiles

  The current solution to problem 6 of having unknown policy and
  relying on aa-remove-unknown is more problematic. We are going to
  have to break existing behavior to fix it.

  Applications that want to manage their own policy are going to have
  to register to do so. This will require a new API for applications
  to use which could just be a thin layer on top of the policy config
  file.

  Ideally that policy will be placed into a unique policy namespace so
  that it is easy to identify and control. However we will not be able
  to enforce this at first as we need to get current applications that
  are dynamically creating and managing policy to migrate.

  After this is done we can deprecate the use of aa-remove-unknown.
  The tool itself can still be useful for developers and people who
  are manually tinkering with policy so it will likely remain but it
  shouldn't be needed to manage policy reloads.

6. Handling Abstractions

  With multiple versions of policy needing to be simultaneously
  supported, we are going to have to improve how the abstractions and
  tunables are handled. I'd like to keep the change as transparent as
  possible at the regular policy level.

  ie. I would rather NOT have to have
    include <abstractions/4.14/base>

  instead of the current
    include <abstractions/base>

  We can achieve this using conditionals, introducing a few variables
  and extending the include mechanism to allow for conditional
  includes.

  Many of the rules will be able to be shared between different
  versions; only when they can't do we need to fall back to the custom
  includes and conditionals.

6.1 Extending conditionals

  The current conditional statements are rather limited and will need
  to be extended to support a broader range of tests. There is an open
  question as to how much needs to be done, partly dependent on how
  other features like conditional includes are implemented.

6.1 New variables

  The @{abi} or @{version} variable that will be defined as part of
  the abi include can be used by the rest of the includes to
  selectively include rules that are abi specific

  eg. the <abstractions/base> abstraction can do an include on

   <abstractions/@{abi}/base>

  In addition to the @{abi} variable the parser should make the full
  feature set available for finer grained decision making.

  if @{features/network/af_unix} {
     ...
  }
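
  To make the two uses of variables above concrete, here is a small
  sketch of @{abi} expansion in an include path and of a bare
  @{features/...} test. It is not the parser's implementation: expand()
  and feature_test() are hypothetical helpers, and variables are
  simplified to single string values.

```python
import re

VARIABLE = re.compile(r"@\{([^}]+)\}")


def expand(text, variables):
    """Replace each @{name} in text with its value from variables."""
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"undefined variable @{{{name}}}")
        return variables[name]
    return VARIABLE.sub(substitute, text)


def feature_test(condition, features):
    """Evaluate a bare @{features/...} condition against a feature set."""
    match = VARIABLE.fullmatch(condition.strip())
    if match is None or not match.group(1).startswith("features/"):
        raise ValueError(f"not a feature test: {condition!r}")
    return match.group(1)[len("features/"):] in features


print(expand("abstractions/@{abi}/base", {"abi": "4.14"}))
print(feature_test("@{features/network/af_unix}", {"network/af_unix"}))
```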

6.2 Conditional include

  The current include fails if the file or dir it references doesn't
  exist. We need to extend the include mechanism so that it can
  conditionally not fail if the referenced entry does not exist.

  The syntax needs to be decided on, but some suggestions that have
  been thrown around in the past are:

  * Make style

    with a - at the start of the line, in apparmor's case it would be
    a special qualifier that for the time being only applies to
    includes.

    - include <abstractions/@{abi}>

  * systemd style

    with a - just before the include parameter.

    include -<abstractions/@{abi}>

  * bash style

    Uh no, let's just throw my NAK on this one right now

  * a new keyword

    include_if_exists <abstractions/@{abi}>

  * wrapping it in a conditional

    this requires extending conditions to support an existence test

    if -e <abstractions/@{abi}> {
      include <abstractions/@{abi}>
    }

  I would like to note that the - is going to be used to indicate set
  subtraction in future expressions, to aid in writing tighter
  expressions.

  eg.

    allow rw /** - {/foo,/bar*},

  the space will be required (as - is allowed within paths today). I
  just raise this point as something to consider when choosing a
  format. And to make sure if one of the - formats is chosen it will
  not conflict or be confused with this use.
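
  Whichever syntax is chosen, the behavior being asked for is the
  same. A minimal sketch, assuming includes are plain files under an
  include root; include() and its if_exists flag are hypothetical and
  do not cover the directory case.

```python
import tempfile
from pathlib import Path


def include(include_root, name, if_exists=False):
    """Return the text of an include file.

    With if_exists=True a missing file contributes no rules instead
    of failing the compile.
    """
    path = Path(include_root) / name
    if not path.is_file():
        if if_exists:
            return ""
        raise FileNotFoundError(f"include <{name}> not found")
    return path.read_text()


# Demo against a throwaway include root
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "abstractions").mkdir()
    (Path(root) / "abstractions" / "base").write_text("/etc/passwd r,\n")
    print(repr(include(root, "abstractions/base")))
    print(repr(include(root, "abstractions/4.14", if_exists=True)))
```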

7. Dealing with new policy features on older releases.

  Where possible the parser supports downgrading rules. However this
  only works for rule types that the parser knows about. To support
  newer policy features on older releases the best solution is
  dropping the newest version of apparmor into an older release.
  However this is not always possible.

7.1 Wrapping rules in conditionals

  With the feature set being exported as conditionals it becomes
  possible for policy to wrap new feature rules in conditionals.

  eg.

    if @{features/network/af_unix} {
       unix peer=foo,
    }

  While this addresses the need to do special casing in policy
  packaging, it makes policy harder to read.

7.2 Supporting unknown rule templates

  Instead of wrapping new rule types in conditionals we should extend
  policy to support rule templates. Rule templates would allow userspace
  to specify patterns for unknown rule types, so that the parser or
  tools can parse the rule, and ignore it.

  The Rule templates could then be dropped into the abstractions,
  as new features are added providing an easy way to update older
  userspaces to ignore new rule types.

  eg.
    if !supports(key) {
      template key='key\w.*,'		# yes its overly simple
    }

  Such rule templates wouldn't completely remove the need for being
  able to wrap some policy in conditionals, but if done properly it
  should be able to support most cases.
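
  To show the idea, here is a sketch of an older userspace using
  templates to recognize and skip rule types it does not know. The
  template syntax is undecided, so this just treats a template as a
  regular expression matched against a whole rule; the "key read,"
  rule and the pattern are invented for the example.

```python
import re


def filter_rules(lines, templates):
    """Split rules into those to compile and those matched by a template.

    A rule that fully matches any template is parsed and then ignored.
    """
    compiled = [re.compile(template) for template in templates]
    kept, ignored = [], []
    for line in lines:
        rule = line.strip()
        if any(pattern.fullmatch(rule) for pattern in compiled):
            ignored.append(rule)
        else:
            kept.append(rule)
    return kept, ignored


rules = ["/etc/passwd r,", "key read,", "unix peer=foo,"]
kept, ignored = filter_rules(rules, [r"key\s.*,"])
print("compiled:", kept)
print("ignored: ", ignored)
```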

II. Impact on caching

1. There is a cache per kernel feature abi, which obviously means
   multiple caching support is needed.

  Extending the cache to support multiple kernels can use the current
  proposed design with only minor modifications. The current design for
  multiple versions of policy cache creates a hash of the kernel
  features abi, and creates a cache dir based on the hash for each
  different kernel features abi, with the full kernel feature set
  stored in the .features file in the cache dir so that hash
  collisions can be detected.

  The only change needed is that the policy within the cache is NOT
  built with kernel features abi, but based on the policy version
  features as its base.
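
  The cache layout described above can be sketched as follows. The
  choice of SHA-256, the truncated directory name, and the function
  name are all assumptions; only the hash-named dir and the .features
  file come from the design.

```python
import hashlib
import tempfile
from pathlib import Path


def cache_dir_for(cache_root, kernel_features):
    """Return (creating if needed) the cache dir for a kernel feature set."""
    digest = hashlib.sha256(kernel_features.encode()).hexdigest()[:16]
    cache_dir = Path(cache_root) / digest
    stored = cache_dir / ".features"
    if stored.is_file():
        # Same hash but different feature text means a collision.
        if stored.read_text() != kernel_features:
            raise RuntimeError(f"hash collision in {cache_dir}")
    else:
        cache_dir.mkdir(parents=True, exist_ok=True)
        stored.write_text(kernel_features)
    return cache_dir


# Demo: two kernels with the same features share one cache dir
with tempfile.TemporaryDirectory() as root:
    first = cache_dir_for(root, "file network/af_unix")
    second = cache_dir_for(root, "file network/af_unix")
    other = cache_dir_for(root, "file network/af_unix signal")
    print(first == second, first == other)
```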

2. Cache contains multiple features abi.

  Because different parts of policy may be using different feature abi
  versions, policy caching will have to support having different
  versions of policy in cache. This can mostly be done with the
  current cache design, where different compiles drop the compiled
  policy into the shared cache, with each file being capable of being
  built against a different feature abi version.

  The .features abi file changes meaning from the features abi of the
  cache to the features abi of the kernel the cache was compiled for.
  There is no features abi for the cache, nor can we completely
  reconstruct the features abi from the compiled cache files,
  thankfully this isn't required for this to work, and adding an
  extension to the binary format or per cache file shadow file to
  store this information can be done in the future.

3. Caches between kernels can be shared

  The cache doesn't need to be per kernel, but per kernel feature abi
  so if kernels use the same feature abi they will share the same cache.

  For cases where the kernel feature abis differ, while the kernel
  abi might cause changes to the compile of a policy, since the cache
  contents are based on the policy feature abi, the cache will actually
  be the same for many kernels. And cache files can be shared via a
  symlink or hardlink.

  Detecting that the generated policy is the same can be done by a
  dedup operation, which could be sped up by leveraging the per cache
  file policy hash that is proposed for consistency checks.
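
  A rough sketch of such a dedup pass, using hardlinks. Note that it
  hashes the compiled files directly, which is the slow path; reusing
  the per cache file policy hash is the speed-up mentioned above and
  is not shown. dedup_cache_files() is a hypothetical name, and
  hardlinks require the caches to be on one filesystem.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def dedup_cache_files(cache_dirs):
    """Hardlink identical cache files together; return how many were linked."""
    first_seen = {}  # content digest -> first path with that content
    linked = 0
    for cache_dir in cache_dirs:
        for path in sorted(Path(cache_dir).iterdir()):
            # Skip subdirs and dotfiles such as .features
            if not path.is_file() or path.name.startswith("."):
                continue
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            original = first_seen.setdefault(digest, path)
            if original is path or os.path.samefile(original, path):
                continue
            path.unlink()
            os.link(original, path)
            linked += 1
    return linked


# Demo: two caches holding one identical and one differing file
with tempfile.TemporaryDirectory() as root:
    old, new = Path(root) / "old", Path(root) / "new"
    for cache in (old, new):
        cache.mkdir()
        (cache / "usr.bin.foo").write_bytes(b"same compiled policy")
    (old / "usr.bin.bar").write_bytes(b"old compile")
    (new / "usr.bin.bar").write_bytes(b"new compile")
    print(dedup_cache_files([old, new]))
```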

4. Improved cache consistency checks are needed

  This is another improvement that was already needed, but it becomes
  more important with multiple caches. Instead of just checking time
  stamps each cache file will contain a hash extension providing a
  hash of the policy text used to generate the cache.

  Hashing the policy text is preferred over just hashing the time
  stamps of the files involved as it is more robust and will only be
  moderately slower as the policy text has to be read and fully parsed
  to properly determine the include files.
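
  The idea can be sketched like this: follow the includes, then hash
  the text of every file involved. This is not the parser's code; the
  regex only handles the "include <...>" and "#include <...>" forms,
  and SHA-256 and the helper names are assumptions.

```python
import hashlib
import re
import tempfile
from pathlib import Path

INCLUDE = re.compile(r"^\s*#?include\s+<([^>]+)>", re.MULTILINE)


def collect_sources(path, include_root, seen=None):
    """Return the profile plus every file it includes, each listed once."""
    seen = [] if seen is None else seen
    path = Path(path)
    if path in seen:
        return seen
    seen.append(path)
    for name in INCLUDE.findall(path.read_text()):
        collect_sources(Path(include_root) / name, include_root, seen)
    return seen


def policy_hash(profile_path, include_root):
    """Hash the policy text; time stamps play no part."""
    hasher = hashlib.sha256()
    for source in collect_sources(profile_path, include_root):
        hasher.update(source.read_bytes())
        hasher.update(b"\0")  # separator between files
    return hasher.hexdigest()


# Demo: changing an include changes the hash of the profile
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "abstractions").mkdir()
    base = Path(root) / "abstractions" / "base"
    base.write_text("/etc/passwd r,\n")
    profile = Path(root) / "usr.bin.foo"
    profile.write_text("include <abstractions/base>\n/usr/bin/foo rix,\n")
    before = policy_hash(profile, root)
    base.write_text("/etc/passwd r,\n/etc/group r,\n")
    print(before != policy_hash(profile, root))
```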

5. Cache fallback if kernel doesn't match

  It is possible that a precompiled policy will be loadable on a new
  kernel that is not an exact match. While I do think it is generally
  best to have a per kernel feature set cache, using a fallback policy
  for cache is preferable in one case, that of kernel developers.

  This is because it is possible that a user will be using a policy
  developed against a newer kernel feature set abi than their current
  kernel, and when they roll forward to a new kernel that starts
  enforcing rules in the existing policy, something for that user's
  use case breaks.

  Unfortunately I don't see a way to guarantee this won't happen as
  it's impossible to guarantee complete coverage and testing of policy
  under all use cases. The best we can do is encourage policy be
  extensively updated before distributing it.

  The fix for this type of breakage would be having the user modify
  the broken profile's feature abi or using pinning (as now) to the
  previous kernel feature abi, but as we have learned this is not
  considered an acceptable solution.

  Hence for cases where a new custom kernel is in use we should use a
  best match* fallback (using closest kernel version and maybe some
  features) to reduce the chance that we will have a policy break when
  a new kernel is loaded.

  * The exact definition of best match still needs to be worked out
    and will likely have to be based on a distance function.
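
  Purely as a strawman for what such a distance function might look
  like (the real definition is still open), the sketch below ranks
  caches by kernel version gap first and by differing features second.
  The weighting, the data layout, and all names are assumptions.

```python
def distance(target, candidate):
    """Smaller is closer. Kernel version dominates, features break ties."""
    (t_version, t_features), (c_version, c_features) = target, candidate
    version_gap = tuple(abs(a - b) for a, b in zip(t_version, c_version))
    # Count features present in one set but not the other.
    feature_gap = len(t_features ^ c_features)
    return (version_gap, feature_gap)


def best_match(target, caches):
    """Pick the cache name whose kernel is closest to target."""
    return min(caches, key=lambda name: distance(target, caches[name]))


# Hypothetical caches: name -> (kernel version, kernel feature set)
caches = {
    "cache-4.13": ((4, 13), {"file", "network/af_unix"}),
    "cache-4.14a": ((4, 14), {"file"}),
    "cache-4.14b": ((4, 14), {"file", "network/af_unix"}),
}
new_kernel = ((4, 15), {"file", "network/af_unix", "signal"})
print(best_match(new_kernel, caches))
```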

6. Shipping precompiled policy

  With the cache supporting multiple kernels, shipping precompiled
  policy cache files along with the policy text becomes possible. This
  is largely a packaging issue and the details can be resolved later.

III. Impact on the compiler and tools

- The compiler will have to be modified to support the feature tagging.

- The compiler will also need to be able to accept a kernel features
  file specification (which it already does), that will only be
  used for generating the cache, and modifying policy specified
  abi to be loadable on the given kernel.

- For policy that is compiled together it will need to order and group
  profiles based on the specified features abis.

- Some changes will be needed to properly support conditionals.

- a few new tools will need to be created

- current tools will need to be updated to support the language
  extensions

IV. Impact on packaging

- It will enable shipping precompiled policy

- It will enable shipping config snippets to update policy locations

- It will require packaging to be able to cleanup old policy caches
  that are no longer needed

V. Roll Out

I've tried to break the work down into phases, partly by priority
and partly by dependencies. Work items from later phases can always
be moved forward as long as their dependencies are met.

Phase 1

- add features keyword support

- multiple caches

- conditional include

Phase 2

- fallback to best match cache for kernels that don't have a cache
  - requires multiple caches

- fallback to using pinning/kernel features (up to 4.14) for when
  features is not present in policy.
  - requires features support/detection within policy
  - interacts with multiple cache

- start reworking includes and policy
  - introduce @{abi} and abstraction structure to use it
  - requires features keyword
  - requires conditional include

Phase 3
- improving/extending conditionals

- hashing of policy text (improved cache consistency)

- intersection of kernel features and policy features to determine
  abi and rule downgrades.

Phase 4
- making features available as conditionals
  - requires: improving/extending conditionals

- standardized config file for locations

Phase 5

- policy management api/tools
  - requires: standardized config file for locations

- unknown rule templates
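
Since work items can be pulled forward once their dependencies are
met, one way to check a proposed ordering is a topological sort over
the "requires" lines above. This is only a sketch; the item names are
shortened, and mapping "features support/detection within policy" to
the features keyword is my reading of the list.

```python
from graphlib import TopologicalSorter

# item -> the items it requires, taken from the phases above
requires = {
    "features keyword": set(),
    "multiple caches": set(),
    "conditional include": set(),
    "best match cache fallback": {"multiple caches"},
    "pinning/kernel features fallback": {"features keyword"},
    "rework includes and policy": {"features keyword",
                                   "conditional include"},
    "extending conditionals": set(),
    "features as conditionals": {"extending conditionals"},
    "standardized config file": set(),
    "policy management api/tools": {"standardized config file"},
}

# static_order() yields every item after all of its requirements
order = list(TopologicalSorter(requires).static_order())
for position, item in enumerate(order, start=1):
    print(position, item)
```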