[PATCH 3/6] epoll: fix epoll's own poll

Paolo Pisati paolo.pisati at canonical.com
Fri Oct 28 08:07:54 UTC 2011


On 10/27/2011 03:08 PM, Stefan Bader wrote:
>
> @@ -1099,41 +1194,40 @@ retry:
>   */
>  SYSCALL_DEFINE1(epoll_create1, int, flags)
>  {
> -       int error, fd = -1;
> -       struct eventpoll *ep;
> +       int error;
> +       struct eventpoll *ep = NULL;
> 
>         /* Check the EPOLL_* constant for consistency.  */
>         BUILD_BUG_ON(EPOLL_CLOEXEC != O_CLOEXEC);
> 
> -       if (flags & ~EPOLL_CLOEXEC)
> -               return -EINVAL;
> -
>         DNPRINTK(3, (KERN_INFO "[%p] eventpoll: sys_epoll_create(%d)\n",
>                      current, flags));
> 
> +       error = -EINVAL
> +       if (flags & ~EPOLL_CLOEXEC)
> +               goto error_return;
> +
>         /*
> -        * Create the internal data structure ( "struct eventpoll" ).
> +        * Create the internal data structure ("struct eventpoll").
>          */
>         error = ep_alloc(&ep);
> -       if (error < 0) {
> -               fd = error;
> +       if (error < 0)
>                 goto error_return;
> -       }
> 
>         /*
>          * Creates all the items needed to setup an eventpoll file. That is,
>          * a file structure and a free file descriptor.
>          */
> -       fd = anon_inode_getfd("[eventpoll]", &eventpoll_fops, ep,
> -                             flags & O_CLOEXEC);
> -       if (fd < 0)
> +       error = anon_inode_getfd("[eventpoll]", &eventpoll_fops, ep,
> +                                flags & O_CLOEXEC);
> +       if (error < 0)
>                 ep_free(ep);
> 
>  error_return:
>         DNPRINTK(3, (KERN_INFO "[%p] eventpoll: sys_epoll_create(%d) = %d\n",
> -                    current, flags, fd));
> +                    current, flags, error));
> 
> -       return fd;
> +       return error;
>  }
> 
>  SYSCALL_DEFINE1(epoll_create, int, size)
> 
> The hunk above is in the upstream patch and giving it only a quick look, seems
> to be close enough to go into sys_epoll_create.
> We did not backport the whole of the syscall rewrites into hardy as the affected
> architectures were not the ones primarily supported by us. And of course that
> change was a PITA to backport (into Dapper).
> 

i deliberately left this piece out since it didn't buy us anything and:

1) it adds a new constrain on the flags we pass in:

> +       error = -EINVAL
> +       if (flags & ~EPOLL_CLOEXEC)
> +               goto error_return;
> +

and this is a breakage of POLA - all of sudden all binaries using this
call without passing EPOLL_CLOEXEC will get EINVAL

2) it uses a new KAPI for anon_inode_getfd() that it's not there (it
requires another patch at least)

3) if we take out the two points above, the rest is just shuffling
around - not worth the work IMO

-- 
bye,
p.




More information about the kernel-team mailing list