[apparmor] [patch] [utils] Refactor Severity module [Refactor series]

Wed Jun 8 20:20:12 UTC 2016

Hello,

Am Donnerstag, 2. Juni 2016, 03:02:43 CEST schrieb Kshitij Gupta:
> The following patch:
> 1. Refactors Severity module in hopes of making it more readable and
> simplifying up some of the terse logic. (This resulted in re-write of
> some segments)
> 2. Updates relevant modules/tools accordingly (checked with grep for
> usages)
> 
> Side notes:
> 1. The tests were ultra-useful while refactoring, hopefully they cover
> sufficient stuff.

Use   make coverage-html   and have a look at htmlcov/* to get an exact 
picture of test coverage ;-)

> 2. I missed not working with git (but then I am not familiar with bzr)

Whenever I read an introduction to git, I'm scared by all the commands
and options it offers. Therefore I still prefer something simple like 
bzr ;-)  (nevertheless I won't object to switching to git if I'm the 
only one who thinks so)

> 3. Comments are welcome regarding bugs and style

> === modified file 'utils/aa-mergeprof'
> --- utils/aa-mergeprof	2016-05-10 12:32:46 +0000
> +++ utils/aa-mergeprof	2016-06-01 21:16:50 +0000
> @@ -18,14 +18,18 @@
>  import os
>  
>  import apparmor.aa
> -from apparmor.aa import available_buttons, combine_name, delete_duplicates, get_profile_filename, is_known_rule, match_includes
>  import apparmor.aamode
> -from apparmor.common import AppArmorException
> -from apparmor.regex import re_match_include
> +
>  import apparmor.severity
>  import apparmor.cleanprofile as cleanprofile
>  import apparmor.ui as aaui
>  
> +from apparmor.aa import (available_buttons, combine_name, delete_duplicates, 

superfluous trailing space

> +                         get_profile_filename, is_known_rule, match_includes)
> +from apparmor.common import AppArmorException
> +from apparmor.regex import re_match_include

The imports reordering makes sense, even if it will probably make it a
bit more interesting for my FileRule series (whoever commits last will 
need to adjust the patch ;-)

To ensure a defined commit order, maybe I should just let you win the 
race and add
    Acked-by: Christian Boltz <apparmor at cboltz.de>
for the import reordering above and in rule/capability.py.

> @@ -660,7 +664,7 @@
>  
>                              # Load variables into sev_db? Not needed/used for capabilities and network rules.
>                              severity = rule_obj.severity(sev_db)
> -                            if severity != sev_db.NOT_IMPLEMENTED:
> +                            if severity != apparmor.severity.NOT_IMPLEMENTED:

I'm not sure if I like this change.
sev_db gets passed into *Rule as function parameter, so it feels more
natural to check against sev_db.NOT_IMPLEMENTED instead of checking
against something that's imported in a totally different way.

> === modified file 'utils/apparmor/aa.py'
> --- utils/apparmor/aa.py	2016-05-10 12:32:46 +0000
> +++ utils/apparmor/aa.py	2016-05-29 18:40:31 +0000
> @@ -1644,7 +1644,7 @@
>  
>                              # Load variables into sev_db? Not needed/used for capabilities and network rules.
>                              severity = rule_obj.severity(sev_db)
> -                            if severity != sev_db.NOT_IMPLEMENTED:
> +                            if severity != apparmor.severity.NOT_IMPLEMENTED:

(same as above)

> === modified file 'utils/apparmor/rule/__init__.py'
> --- utils/apparmor/rule/__init__.py	2016-01-25 22:48:34 +0000
> +++ utils/apparmor/rule/__init__.py	2016-05-29 18:42:17 +0000
> @@ -13,6 +13,8 @@
>  #
>  # ----------------------------------------------------------------------
>  
> +import apparmor.severity
> +
>  from apparmor.aare import AARE
>  from apparmor.common import AppArmorBug, type_is_str
>  
> @@ -237,9 +239,9 @@
>          '''return severity of this rule, which can be:
>             - a number between 0 and 10, where 0 means harmless and 10 means critical,
>             - "unknown" (to be exact: the value specified for "unknown" as set when loading the severity database), or
> -           - sev_db.NOT_IMPLEMENTED if no severity check is implemented for this rule type.
> +           - apparmor.severity.NOT_IMPLEMENTED if no severity check is implemented for this rule type.
>             sev_db must be an apparmor.severity.Severity object.'''
> -        return sev_db.NOT_IMPLEMENTED
> +        return apparmor.severity.NOT_IMPLEMENTED

(same as above)

> === modified file 'utils/apparmor/rule/capability.py'
> --- utils/apparmor/rule/capability.py	2016-01-25 22:48:34 +0000
> +++ utils/apparmor/rule/capability.py	2016-06-01 21:17:54 +0000
> @@ -13,10 +13,11 @@
>  #
>  # ----------------------------------------------------------------------
>  
> +import re
> +
>  from apparmor.regex import RE_PROFILE_CAP
>  from apparmor.common import AppArmorBug, AppArmorException, type_is_str
>  from apparmor.rule import BaseRule, BaseRuleset, logprof_value_or_all, parse_modifiers
> -import re
>  
>  # setup module translations
>  from apparmor.translations import init_translation

    Acked-by: Christian Boltz <apparmor at cboltz.de>
for the import reordering in rule/capability.py.

Feel free to commit these changes _now_ ;-)

> === modified file 'utils/apparmor/severity.py'
> --- utils/apparmor/severity.py	2015-06-19 19:43:19 +0000
> +++ utils/apparmor/severity.py	2016-06-01 21:11:08 +0000

>      def rank_capability(self, resource):
>          """Returns the severity of for the capability resource, default value if no match"""
> +        resource = resource.upper()
>          cap = 'CAP_%s' % resource.upper()

with the added resource = resource.upper(), you can drop the .upper() 
in the cap = ... line.

>      def handle_file(self, resource, mode):
>          """Returns the severity for the file, default value if no match found"""
>          resource = resource[1:]    # remove initial / from path
>          pieces = resource.split('/')    # break path into directory level chunks
> -        sev = None
> -        # Check for an exact match in the db
> -        if resource in self.severity['FILES'].keys():
> -            # Find max value among the given modes
> -            for m in mode:
> -                if sev is None or self.severity['FILES'][resource].get(m, -1) > sev:
> -                    sev = self.severity['FILES'][resource].get(m, None)
> +        severity = None

I know severity looks better than sev, but it might also lead to 
confusion with self.severity. If things go worse, mixing up severity
and self.severity can even lead to funny[tm] effects.

Therefore I tend to keep sev as name for the local variable.

>      def rank(self, resource, mode=None):
>          """Returns the rank for the resource file/capability"""
>          if '@' in resource:    # path contains variable
>              return self.handle_variable_rank(resource, mode)
> -        elif resource[0] == '/':    # file resource
> +        elif resource.startswith('/'):    # file resource
>              return self.handle_file(resource, mode)
> -        elif resource[0:4] == 'CAP_':    # capability resource
> +        elif is_capability(resource):
>              return self.rank_capability(resource[4:])
>          else:
>              raise AppArmorException("Unexpected rank input: %s" % resource)

My plan is to get rid of the rank() dispatcher - the calling classes
(FileRule, CapabilityRule) know what they are asking about, and can call
rank_capability() etc. directly.

Please drop your changes to rank() - I'll drop this function as part of 
the FileRule series (I hope to finish it soon[tm], and yes, I know that 
I'm saying that since weeks ;-)

> @@ -178,35 +216,104 @@
>              replacement = replacement[:-1]
>          return resource.replace(variable, replacement)
>  
> -    def load_variables(self, prof_path):
> -        """Loads the variables for the given profile"""
> -        if os.path.isfile(prof_path):
> -            with open_file_read(prof_path) as f_in:
> -                for line in f_in:
> -                    line = line.strip()
> -                    # If any includes, load variables from them first
> -                    match = re_match_include(line)
> -                    if match:
> -                        new_path = self.PROF_DIR + '/' + match
> -                        self.load_variables(new_path)
> -                    else:
> -                        # Remove any comments
> -                        if '#' in line:
> -                            line = line.split('#')[0].rstrip()
> -                        # Expected format is @{Variable} = value1 value2 ..
> -                        if line.startswith('@') and '=' in line:
> -                            if '+=' in line:
> -                                line = line.split('+=')
> -                                try:
> -                                    self.severity['VARIABLES'][line[0]] += [i.strip('"') for i in line[1].split()]
> -                                except KeyError:
> -                                    raise AppArmorException("Variable %s was not previously declared, but is being assigned additional value in file: %s" % (line[0], prof_path))
> -                            else:
> -                                line = line.split('=')
> -                                if line[0] in self.severity['VARIABLES'].keys():
> -                                    raise AppArmorException("Variable %s was previously declared in file: %s" % (line[0], prof_path))
> -                                self.severity['VARIABLES'][line[0]] = [i.strip('"') for i in line[1].split()]
> +    def load_variables(self, file_path):
> +        """Loads the variables from the given file"""
> +        if not os.path.isfile(file_path):
> +            return

Good idea, that easily saves us one tab level :-)

> +        with open_file_read(file_path) as f_in:
> +            for line in f_in:
> +                line = line.strip()
> +                # If any includes, load variables from them first
> +                match = re_match_include(line)
> +                if match:
> +                    new_path = self.profile_dir + '/' + match
> +                    self.load_variables(new_path)
> +                else:
> +                    # Remove any comments
> +                    if '#' in line:
> +                        line = line.split('#')[0].rstrip()
> +                    # Expected format is @{Variable} = value1 value2 ..
> +                    if line.startswith('@') and '=' in line:
> +                        if '+=' in line:
> +                            line = line.split('+=')
> +                            try:
> +                                self.variables[line[0]] += [i.strip('"') for i in line[1].split()]
> +                            except KeyError:
> +                                raise AppArmorException("Variable %s was not previously declared, but is being assigned additional value in file: %s" % (line[0], file_path))
> +                        else:
> +                            line = line.split('=')
> +                            if line[0] in self.variables.keys():
> +                                raise AppArmorException("Variable %s was previously declared in file: %s" % (line[0], file_path))
> +                            self.variables[line[0]] = [i.strip('"') for i in line[1].split()]

These changes are mostly whitespace changes, but also hide some other
small changes (self.severity['VARIABLES'] -> self.variables etc.).

I'd prefer to first have a patch that adds the "if not
os.path.isfile(prof_path): return" and does the whitespace changes, and
then the actual changes (variable renaming etc.) in a second patch.

>      def unload_variables(self):
>          """Clears all loaded variables"""
> -        self.severity['VARIABLES'] = dict()
> +        self.variables = dict()
> +
> +
> +def is_severity_valid(severity):
> +    return severity in VALID_SEVERITY_RANGE
> +
> +def path_contains_regex(path):
> +    return '*' in path

That's something that should be done in AARE (the "boring" flag we
discussed on IRC, but that name isn't set in stone ;-) - something 
everybody understands (maybe "contains_regex"?) is, well, boring, but 
still better)

You should also check for other special chars. From the top of my head, 
I'd at least say ?, [...] and {...,...} are missing, and \ escaping 
might also be an interesting thing. Variables might also qualify as
something special (even if not handled/expanded by AARE yet).

> +def is_capability(resource):
> +    return resource.startswith('CAP_')

rank_capability() is only called by CapabilityRule, therefore I
seriously expect that it only asks about valid capabilities ;-)

I noticed you also call this function from load_database() - maybe the
easiest solution is to keep the line.startswith('CAP_') there.

> +def find_max_severity_for_modes(modes, mode_severities, severity=MINIMUM_SEVERITY):
> +    if severity is None:
> +        severity = MINIMUM_SEVERITY
> +
> +    for mode in modes:
> +        severity = max(severity, mode_severities.get(mode, MINIMUM_SEVERITY))
> +
> +    if severity == MINIMUM_SEVERITY:
> +        return None
> +
> +    return severity
> +
> +def add_path_to_trie(trie, path, mode_severities):
> +    node = trie.get_head()
> +
> +    path_pieces = path.split('/')
> +
> +    for index, path_piece in enumerate(path_pieces):
> +        if path_contains_regex(path_piece):
> +            path_onwards = '/'.join(path_pieces[index:])
> +
> +            node.add_glob(convert_regexp(path_onwards), mode_severities)
> +            break
> +        else:
> +            if node.is_present(path_piece):
> +                node = node.get_child(path_piece)
> +            else:
> +                child_node = SeverityRegexTrieNode(path_piece)
> +                node.add_child(path_piece, child_node)
> +                node = child_node
> +
> +    return trie
> +
> +def find_matching_modes_for_path(node, keys):
> +    if not keys:
> +        ## TODO: Possibly merge file paths into the trie
> +        #if node.has_value():
> +        #    return node.get_mode()
> +        return None
> +
> +    key = keys[0]
> +    path = '/'.join(keys)
> +    modes = None
> +
> +    if node.is_present(key):
> +        modes = find_matching_modes_for_path(node.get_child(key), keys[1:])
> +
> +    globs_to_mode = node.get_globs()
> +    if modes is None and globs_to_mode is not None:
> +        modes = []
> +
> +        for glob in globs_to_mode:
> +            if re.search("^" + glob, path):

This re.search() looks expensive, but probably isn't easy to avoid or
cache.

> +                modes.append(globs_to_mode[glob])
> +
> +    return modes

Can you please explain what the advantage of a trie is for us?

I'd guess AARE can cover our usecases (and would even argue it's a bug
if it doesn't), but maybe I'm overlooking something.

I'll give all the trie-related code a second look once I understand why
we need it ;-)

> === added file 'utils/apparmor/utility.py'
> --- utils/apparmor/utility.py	1970-01-01 00:00:00 +0000
> +++ utils/apparmor/utility.py	2016-06-01 21:04:40 +0000
> @@ -0,0 +1,43 @@
> +class Trie(object):

Please always use filenames that matches the class name ("trie.py"
instead of "utility.py"). Having a collection class (like the *Ruleset 
classes) in the same file is OK.

This will hopefully lead to small, readable files instead of one big
monster file (like aa.py) for everything.

> +    """A basic skeleton of a trie data structure
> +        supports basic head, find operations for a 
                                                     ^
superfluous trailing space

> +        type of node using custom search function
> +    """
> +    def __init__(self, node_type):
> +        self.root = node_type(None)
> +    
   ^^^^
more space

> +    def get_head(self):
> +        return self.root
> +    
   ^^^^
and even more

> === modified file 'utils/test/test-severity.py'
> --- utils/test/test-severity.py	2015-06-06 11:59:11 +0000
> +++ utils/test/test-severity.py	2016-06-01 20:35:09 +0000
> @@ -65,6 +65,7 @@
>          ('UNKNOWN', 'unknown'),
>          ('K*', 'unknown'),
>          ('__ALL__', 10),
> +        ('__all__', 10)
>      ]
>  
>      def _run_test(self, params, expected):

Additional tests are always a good idea ;-)

> @@ -167,7 +168,7 @@
>          self.assertRaises(AppArmorException, severity.Severity, 'severity_broken.db')
>  
>      def test_nonexistent_db(self):
> -        self.assertRaises(IOError, severity.Severity, 'severity.db.does.not.exist')
> +        self.assertRaises(AppArmorException, severity.Severity, 'severity.db.does.not.exist')

Ah, that means users will get a one-line error message instead of a full
backtrace. Looks like a good idea for such a simple error.

Regards,

Christian Boltz
-- 
Also meine Erfahrung ist, daß man mit den großen Lüftern in der Tat
sehr leise Gehäuse bekommt. Sobald man allerdings den Netzteilkühler
nicht mehr hört, fängt die Festplatte an zu nerven.  Und wenn man die
einigermaßen ruhiggestellt hat, dann nervt der Prozessorkühler. Und
wenn's der nicht ist, dann der Grafikkühler.
Einzige Lösung: zurück zum Taschenrechner...  [Andre Tann in suse-linux]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <https://lists.ubuntu.com/archives/apparmor/attachments/20160608/48b3b3f3/attachment.pgp>