[apparmor] [patch] update aa.py is_skippable_file() according to libapparmor

Kshitij Gupta kgupta8592 at gmail.com
Thu Dec 25 12:55:44 UTC 2014


Hello,

On Thu, Dec 25, 2014 at 6:02 AM, Christian Boltz <apparmor at cboltz.de> wrote:

> Hello,
>
> Am Mittwoch, 24. Dezember 2014 schrieb Kshitij Gupta:
> > On Sun, Dec 7, 2014 at 2:49 AM, Christian Boltz wrote:
> > > this patch updates is_skippable_file() to match all extensions that
> > > are listed in  libapparmor _aa_is_blacklisted() - some extensions
> > > were missing in the python code.
> > >
> > > Also make the code more readable - even if it merges 4 re.search()
> > > into one, and add some testcases.
> > >
> > > Notes:
> > > - the original code additionally ignored *.swp. I didn't include
> > > that ->
> > >   *.swp looks like vim swap files which are also dot files
> >
> > I'm not sure do we want to keep this or not. As stated parser_misc
> > doesn't ignore this.
> > A abc.txt.swp (open by vim) wont be covered under the dot file?
> > Also there are other ways a .swp file maybe created.
>
> vim creates a dot file - "vi abc.txt" creates .abc.txt.swp, and that is
> already covered by the "filename starts with a dot" rule.
>
> I must've missed the first "."

> We'll get a bug report if it is meant to be there.
>
> ;-)
>
> > A user says even mv created a .swp file:
> > http://unix.stackexchange.com/questions/27923/what-causes-swap-files-t
> > o-be-created
>
> Maybe (I never checked), but moving around profiles in GB size range is
> a very rare event ;-) and for smaller files it's unlikely that the tools
> see that file in the second it exists.
>
> Given enough time it will happen for someone at sometime. ;-)

> - the python code ignores README files, but the C code doesn't
> >
> > >   (do we need to add README in the C code?)
> >
> > Somebody else can answer this?
>
> John? Tyler? Steve?
>
> > > [ is_skippable_file.diff ]
> > >
> > > === modified file 'utils/apparmor/aa.py'
> > > --- utils/apparmor/aa.py        2014-11-29 12:40:10 +0000
> > > +++ utils/apparmor/aa.py        2014-12-06 21:12:49 +0000
> > > @@ -2546,13 +2546,16 @@
> > >
> > >          else:
> > >              return False
> > >
> > > -# rpm backup files, dotfiles, emacs backup files should not be
> > > processed -# The skippable files type needs be synced with apparmor
> > > initscript>
> > >  def is_skippable_file(path):
> > > -    """Returns True if filename matches something to be skipped"""
> > > -    if (re.search('(^|/)\.[^/]*$', path) or
> > > re.search('\.rpm(save|new)$', path)
> > > -            or re.search('\.dpkg-(old|new)$', path) or
> > > re.search('\.swp$', path)
> > > -            or path[-1] == '~' or path == 'README'):
> > > +    """Returns True if filename matches something to be skipped
> > > (rpm or dpkg backup files, hidden files etc.)
> > > +        The list of skippable files needs to be synced with
> > > apparmor
> > > initscript and libapparmor _aa_is_blacklisted()
> > > +        path: filename (without directory)"""
> > > +
> > > +    skippable_suffix =
> > > '(\.dpkg-new|\.dpkg-old|\.dpkg-dist|\.dpkg-bak|\.rpmnew|\.rpmsave|\.
> > > orig|\.rej|~)$'
> > > +    skippable_files  = '^(.*/)?(README|\.[^/]*|)$'
> > > # README, dot files and empty filename
> >
> > I suppose the number of calls to this function will be many as a large
> > number of files are parsed/ignored.
>
> Yes, basically number of files in /etc/apparmor.d/ (and subdirectories?)
>
> That are quite some files, but we have places that are called more often
> (for example parsing each line in the profile ;-)
>
> Hence, I said over-optimise ;-)

> > So, might I suggest that we (over?)optimise this?
> >
> > I can think of two ways:
> > 1. Use the regex and use re.compile for all the three regex's outside
> > the function call (It shows considerable speedup as tested by
> > cProfile in Python2.7)
>
> Indeed, re.compile is a good idea.
>
> Note that the first two are only strings that are then merged into
> skippable_files (to keep the regex parts easier readable).
> Only skippable_files is used as regex.
>
> Indeed only one re.compile call is needed.


> > 2. Use simple string methods and leverage os.path submodule. This
> > gives probably performance better than uncompiled regex and only a
> > bit better than compiled versions.
> >
> > An alternative version string based:
> >
> > def is_skippable_file(path):
> >     basename = os.path.basename(path)
> >     if not basename or basename[-1] == '~' or basename[0] == '.' or
> > basename == 'README':
> >         return True
> >     skippable_extension = ['.dpkg-new', '.dpkg-old', '.dpkg-dist',
> > '.dpkg-bak', '.rpmnew', '.rpmsave', '.orig', '.rej']
> >     extension = os.path.splitext(basename)[1]
> >     if extension in skippable_extension:
> >         return True
> >     return False
>
> That gives slightly less readable code IMHO - and the os.path.* calls
> also don't come for free. If it doesn't bring a big performance boost,
> I'd prefer the regex-based way.
>
> BTW: using basename.endswith() could also be an option, and the python
> docs say that it accepts tuplis as suffix since py 2.5.
>
> Sounds like a good idea to use ends with.

> > Letme know if you want to have a look at the script I used to compare
> > the performance.
>
> Yes, that would be interesting.
>
> PFA.

> I'd also be interested in some numbers - how long do the following
> variants take?
>
> a) the "old" code we currently have in bzr (hint: it does 4 re.search)

 0.185 s for 1000 (x13 in each for all cases) calls to functions

b) with my patch applied (only one re.search, so I'd expect it to be
>    faster)
>
 0.160 s for 1000 (x13 in each for all cases) calls to functions

c) with my patch applied + using re.compile for skippable_files
>    c1) with re.compile inside the function (= one re.compile per file)
>
0.144 s for 1000 (x13 in each for all cases) calls to functions

   c2) with re.compile outside the function (= only one re.compile in
>        total)
>
0.065 s for 1000 (x13 in each for all cases) calls to functions

d) with the code using os.path.* you proposed
>
Using os.path.* and in operator:
0.082 s for 1000 (x13 in each for all cases) calls to functions

Using os.path.basename followed by your suggested endswith for basename:
0.052 s for 1000 (x13 in each for all cases) calls to functions

I've attach the test scripts, separate files for all 6 cases.

Though the %improvement may seem big but for all purposes it wont be a huge
improvement, maybe a couple of milliseconds for large number of files. ;-)

>

(If you send me the test script, I can do the tests myself - but I won't
> object if you do them and send the result ;-)
>
> > +    skippable_all    = '%s|%s' % (skippable_suffix, skippable_files)
> >
> > > +
> > >
> > > +    if re.search(skippable_all, path):
> > >          return True
> > >
> > > Also maybe add a "return False" at the end ;-)
>
> Python returns None by default, which is "good enough" ;-)
>
>
OTOH, it never hurts if the code explicitely states what it does, so
> I'll add it in the next patch version.
>
:-)

>

> >  def is_skippable_dir(path):
> > > === modified file 'utils/test/test-aa.py'
> > > --- utils/test/test-aa.py       2014-11-27 22:20:26 +0000
> > > +++ utils/test/test-aa.py       2014-12-06 21:06:15 +0000
> ...
> > The tests look good. I'll hold-back the ACK for the modified
> > version.
> >
> > Please have a look at the suggestions.
> >
> > Merry Christmas. (Apparently you were expecting an ACK under the
> > christmas tree ;-) )
>
> A review with comments is also ok ;-)
>
>
> Regards,
>
> Christian Boltz
> --
> [GUI vs. Command-Line] Einen ähnlichen Streit wird es in 20 Jahren
> auch geben, wenn die "2D-Screenfanatiker" auf die "VR Fans" losgehen
> und wieder ein Streit vom Zaun bricht der an Sinnfreiheit kaum zu
> überbieten ist.   [Phillip Richdale in suse-linux]
>
>
> --
> AppArmor mailing list
> AppArmor at lists.ubuntu.com
> Modify settings or unsubscribe at:
> https://lists.ubuntu.com/mailman/listinfo/apparmor
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.ubuntu.com/archives/apparmor/attachments/20141225/c3d85796/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ext1.py
Type: text/x-python
Size: 1116 bytes
Desc: not available
URL: <https://lists.ubuntu.com/archives/apparmor/attachments/20141225/c3d85796/attachment-0006.py>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ext2.py
Type: text/x-python
Size: 1132 bytes
Desc: not available
URL: <https://lists.ubuntu.com/archives/apparmor/attachments/20141225/c3d85796/attachment-0007.py>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ext3.py
Type: text/x-python
Size: 1104 bytes
Desc: not available
URL: <https://lists.ubuntu.com/archives/apparmor/attachments/20141225/c3d85796/attachment-0008.py>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ext4.py
Type: text/x-python
Size: 1082 bytes
Desc: not available
URL: <https://lists.ubuntu.com/archives/apparmor/attachments/20141225/c3d85796/attachment-0009.py>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ext5.py
Type: text/x-python
Size: 879 bytes
Desc: not available
URL: <https://lists.ubuntu.com/archives/apparmor/attachments/20141225/c3d85796/attachment-0010.py>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ext6.py
Type: text/x-python
Size: 1092 bytes
Desc: not available
URL: <https://lists.ubuntu.com/archives/apparmor/attachments/20141225/c3d85796/attachment-0011.py>


More information about the AppArmor mailing list