[PATCH 3/6] epoll: fix epoll's own poll

Stefan Bader stefan.bader at canonical.com
Fri Oct 28 08:43:41 UTC 2011


On 28.10.2011 10:07, Paolo Pisati wrote:
> On 10/27/2011 03:08 PM, Stefan Bader wrote:
>>
>> @@ -1099,41 +1194,40 @@ retry:
>>   */
>>  SYSCALL_DEFINE1(epoll_create1, int, flags)
>>  {
>> -       int error, fd = -1;
>> -       struct eventpoll *ep;
>> +       int error;
>> +       struct eventpoll *ep = NULL;
>>
>>         /* Check the EPOLL_* constant for consistency.  */
>>         BUILD_BUG_ON(EPOLL_CLOEXEC != O_CLOEXEC);
>>
>> -       if (flags & ~EPOLL_CLOEXEC)
>> -               return -EINVAL;
>> -
>>         DNPRINTK(3, (KERN_INFO "[%p] eventpoll: sys_epoll_create(%d)\n",
>>                      current, flags));
>>
>> +       error = -EINVAL
>> +       if (flags & ~EPOLL_CLOEXEC)
>> +               goto error_return;
>> +
>>         /*
>> -        * Create the internal data structure ( "struct eventpoll" ).
>> +        * Create the internal data structure ("struct eventpoll").
>>          */
>>         error = ep_alloc(&ep);
>> -       if (error < 0) {
>> -               fd = error;
>> +       if (error < 0)
>>                 goto error_return;
>> -       }
>>
>>         /*
>>          * Creates all the items needed to setup an eventpoll file. That is,
>>          * a file structure and a free file descriptor.
>>          */
>> -       fd = anon_inode_getfd("[eventpoll]", &eventpoll_fops, ep,
>> -                             flags & O_CLOEXEC);
>> -       if (fd < 0)
>> +       error = anon_inode_getfd("[eventpoll]", &eventpoll_fops, ep,
>> +                                flags & O_CLOEXEC);
>> +       if (error < 0)
>>                 ep_free(ep);
>>
>>  error_return:
>>         DNPRINTK(3, (KERN_INFO "[%p] eventpoll: sys_epoll_create(%d) = %d\n",
>> -                    current, flags, fd));
>> +                    current, flags, error));
>>
>> -       return fd;
>> +       return error;
>>  }
>>
>>  SYSCALL_DEFINE1(epoll_create, int, size)
>>
>> The hunk above is in the upstream patch and giving it only a quick look, seems
>> to be close enough to go into sys_epoll_create.
>> We did not backport the whole of the syscall rewrites into hardy as the affected
>> architectures were not the ones primarily supported by us. And of course that
>> change was a PITA to backport (into Dapper).
>>
> 
> i deliberately left this piece out since it didn't buy us anything and:
> 
> 1) it adds a new constrain on the flags we pass in:
> 
>> +       error = -EINVAL
>> +       if (flags & ~EPOLL_CLOEXEC)
>> +               goto error_return;
>> +
> 
> and this is a breakage of POLA - all of sudden all binaries using this
> call without passing EPOLL_CLOEXEC will get EINVAL
> 
> 2) it uses a new KAPI for anon_inode_getfd() that it's not there (it
> requires another patch at least)
> 
> 3) if we take out the two points above, the rest is just shuffling
> around - not worth the work IMO
> 

Ok, I see. Seems that the hardy code (beside if that additional check, that gets
moved around) already looks close to what the upstream code looks after the
change. So it should be ok. Maybe a little note in the cover mail will help next
time.

-Stefan




More information about the kernel-team mailing list