[apparmor] [PATCH] parser - fix trailing \\ in regex (was Re: [PATCH] [parsers] allow for nested alternations expressions)
Steve Beattie
steve at nxnw.org
Fri Nov 8 00:48:16 UTC 2013
On Wed, Nov 06, 2013 at 04:17:02PM -0800, John Johansen wrote:
> On 11/06/2013 04:10 PM, Steve Beattie wrote:
> > On Tue, Nov 05, 2013 at 12:29:56PM -0800, John Johansen wrote:
> >> On 11/05/2013 11:33 AM, Steve Beattie wrote:
> >>> On Mon, Nov 04, 2013 at 05:28:22PM -0800, John Johansen wrote:
> >>>> On 11/04/2013 04:34 PM, Steve Beattie wrote:
> >>>>> Well, part of the slowdown was me writing some unit tests for that
> >>>>> function. Here's the patch that does that:
> >>> [SNIP]
> >>>>> + //MY_REGEX_TEST("\\", "\\", ePatternBasic);
> >>>>> + MY_REGEX_TEST("\\\\", "\\\\", ePatternBasic);
> >>>>> + //MY_REGEX_TEST("\\blort", "\\blort", ePatternBasic);
> >>>>> + MY_REGEX_TEST("\\\\blort", "\\\\blort", ePatternBasic);
> >>>>> + //MY_REGEX_TEST("blort\\", "blort\\", ePatternBasic);
> >>>> why are these 3 commented out?
> >>>
> >>> Ah, right, I'd forgotten about these. They're commented out
> >>> because as-is, they fail; however I wasn't sure if that was the
> >>> correct expected output. Basically, what happens is that if whatever
> >>> follows isn't expecting an escape character, then the escaping '\' is
> >>> dropped. Thus, the current behavior is that '\\' becomes '' and both
> >>> '\\blort' and 'blort\\' become 'blort'.
> >>>
> >>> The question is, is this a bug? I think so... but I'm willing to hear
> >>> countering arguments.
> >>>
> >> O_o Its a bug
> >
> > Alright, here's a patch that fixes the issue:
> >
> > Signed-off-by: Steve Beattie <steve at nxnw.org>
> > ---
> > parser/parser_regex.c | 140 +++++++++++++++++++++++++++++++++++++++-----------
> > 1 file changed, 110 insertions(+), 30 deletions(-)
> >
> > Index: b/parser/parser_regex.c
> > ===================================================================
> > --- a/parser/parser_regex.c
> > +++ b/parser/parser_regex.c
> > @@ -326,9 +347,14 @@ static pattern_t convert_aaregex_to_pcre
> > case '(':
> > case ')':
> > STORE("\\", dptr, 1);
> > - // fall through to default
> > + STORE(sptr, dptr, 1);
> > + break;
> >
> > default:
> > + if (bEscape) {
> > + /* just a regular backslash */
> > + STORE("\\", dptr, 1);
> > + }
> > STORE(sptr, dptr, 1);
> > break;
> > } /* switch (*sptr) */
> > @@ -344,6 +370,10 @@ static pattern_t convert_aaregex_to_pcre
> > progname);
> > }
> >
> > + if ((error == e_no_error) && bEscape) {
> > + /* just a regular backslash */
> > + STORE("\\", dptr, 1);
> > + }
>
> hrmmm, I am thinking this should be a bug
You mean a parse error?
> otherwise if we ever expand the set of supported escape characters we
> will have issue.
So, the way it works with perl/pcre[1] is that, if a \ is followed by:
- a punctuation character, it eliminates the specialness of that
character. '\' followed by punctuation in pcre is supposed to
never have a special meaning.
- an alphanumeric character: can have a special meaning, but if it
doesn't, then it just matches the character. Perl is supposed to
give warnings for such usage if warnings are turned on.
- a naked trailing '\' results in an error ("Trailing \ in regex ...")
with both perl and pcregrep
Now, we accept (or intend to accept) pcre style for character classes,
but not generally for our regexs, so it's not a strict requirement that
we match behavior here.
Anyway, the following patch throws an error if an escaping slash is the
trailing character and otherwise just drops it for non-special
characters, and adjusts the relevant test cases.
[1] info gleaned from http://perldoc.perl.org/perlrebackslash.html
except where noted.
Signed-off-by: Steve Beattie <steve at nxnw.org>
---
parser/parser_regex.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
Index: b/parser/parser_regex.c
===================================================================
--- a/parser/parser_regex.c
+++ b/parser/parser_regex.c
@@ -344,6 +344,12 @@ static pattern_t convert_aaregex_to_pcre
progname);
}
+ if ((error == e_no_error) && bEscape) {
+ /* trailing backslash quote */
+ error = e_parse_error;
+ PERROR(_("%s: Regex error: trailing '\\' escape character\n"),
+ progname);
+ }
/* anchor end and terminate pattern string */
if ((error == e_no_error) && anchor) {
STORE("$" , dptr, 1);
@@ -1304,11 +1310,11 @@ static int test_aaregex_to_pcre(void)
MY_REGEX_TEST("/most/basic/test", "/most/basic/test", ePatternBasic);
- MY_REGEX_TEST("\\", "\\", ePatternBasic);
+ MY_REGEX_FAIL_TEST("\\");
MY_REGEX_TEST("\\\\", "\\\\", ePatternBasic);
- MY_REGEX_TEST("\\blort", "\\blort", ePatternBasic);
+ MY_REGEX_TEST("\\blort", "blort", ePatternBasic);
MY_REGEX_TEST("\\\\blort", "\\\\blort", ePatternBasic);
- MY_REGEX_TEST("blort\\", "blort\\", ePatternBasic);
+ MY_REGEX_FAIL_TEST("blort\\");
MY_REGEX_TEST("blort\\\\", "blort\\\\", ePatternBasic);
MY_REGEX_TEST("*", "[^/\\x00]*", ePatternRegex);
MY_REGEX_TEST("blort*", "blort[^/\\x00]*", ePatternRegex);
--
Steve Beattie
<sbeattie at ubuntu.com>
http://NxNW.org/~steve/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <https://lists.ubuntu.com/archives/apparmor/attachments/20131107/c06d6265/attachment.pgp>
More information about the AppArmor
mailing list