ACK/cmt: [SRU][X/B/E/F][PATCH] Revert "ipc, sem: remove uneeded sem_undo_list lock usage in exit_sem()"

Khaled Elmously khalid.elmously at canonical.com
Mon Feb 24 17:22:00 UTC 2020


On 2020-02-24 14:06:44 , Ioanna Alifieraki wrote:
> BugLink: https://bugs.launchpad.net/bugs/1858834
> 
> This reverts commit a97955844807e327df11aa33869009d14d6b7de0.
> 
> Commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage
> in exit_sem()") removes a lock that is needed.  This leads to a process
> looping infinitely in exit_sem() and can also lead to a crash.  There is
> a reproducer available in [1] and with the commit reverted the issue
> does not reproduce anymore.
> 
> Using the reproducer found in [1] is fairly easy to reach a point where
> one of the child processes is looping infinitely in exit_sem between
> for(;;) and if (semid == -1) block, while it's trying to free its last
> sem_undo structure which has already been freed by freeary().
> 
> Each sem_undo struct is on two lists: one per semaphore set (list_id)
> and one per process (list_proc).  The list_id list tracks undos by
> semaphore set, and the list_proc by process.
> 
> Undo structures are removed either by freeary() or by exit_sem().  The
> freeary function is invoked when the user invokes a syscall to remove a
> semaphore set.  During this operation freeary() traverses the list_id
> associated with the semaphore set and removes the undo structures from
> both the list_id and list_proc lists.
> 
> For this case, exit_sem() is called at process exit.  Each process
> contains a struct sem_undo_list (referred to as "ulp") which contains
> the head for the list_proc list.  When the process exits, exit_sem()
> traverses this list to remove each sem_undo struct.  As in freeary(),
> whenever a sem_undo struct is removed from list_proc, it is also removed
> from the list_id list.
> 
> Removing elements from list_id is safe for both exit_sem() and freeary()
> due to sem_lock().  Removing elements from list_proc is not safe;
> freeary() locks &un->ulp->lock when it performs
> list_del_rcu(&un->list_proc) but exit_sem() does not (locking was
> removed by commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list
> lock usage in exit_sem()").
> 
> This can result in the following situation while executing the
> reproducer [1] : Consider a child process in exit_sem() and the parent
> in freeary() (because of semctl(sid[i], NSEM, IPC_RMID)).
> 
>  - The list_proc for the child contains the last two undo structs A and
>    B (the rest have been removed either by exit_sem() or freeary()).
> 
>  - The semid for A is 1 and semid for B is 2.
> 
>  - exit_sem() removes A and at the same time freeary() removes B.
> 
>  - Since A and B have different semid sem_lock() will acquire different
>    locks for each process and both can proceed.
> 
> The bug is that they remove A and B from the same list_proc at the same
> time because only freeary() acquires the ulp lock. When exit_sem()
> removes A it makes ulp->list_proc.next to point at B and at the same
> time freeary() removes B setting B->semid=-1.
> 
> At the next iteration of for(;;) loop exit_sem() will try to remove B.
> 
> The only way to break from for(;;) is for (&un->list_proc ==
> &ulp->list_proc) to be true which is not. Then exit_sem() will check if
> B->semid=-1 which is and will continue looping in for(;;) until the
> memory for B is reallocated and the value at B->semid is changed.
> 
> At that point, exit_sem() will crash attempting to unlink B from the
> lists (this can be easily triggered by running the reproducer [1] a
> second time).
> 
> To prove this scenario instrumentation was added to keep information
> about each sem_undo (un) struct that is removed per process and per
> semaphore set (sma).
> 
>           CPU0                                CPU1
>   [caller holds sem_lock(sma for A)]      ...
>   freeary()                               exit_sem()
>   ...                                     ...
>   ...                                     sem_lock(sma for B)
>   spin_lock(A->ulp->lock)                 ...
>   list_del_rcu(un_A->list_proc)           list_del_rcu(un_B->list_proc)
> 
> Undo structures A and B have different semid and sem_lock() operations
> proceed.  However they belong to the same list_proc list and they are
> removed at the same time.  This results into ulp->list_proc.next
> pointing to the address of B which is already removed.
> 
> After reverting commit a97955844807 ("ipc,sem: remove uneeded
> sem_undo_list lock usage in exit_sem()") the issue was no longer
> reproducible.
> 
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1694779
> 
> Link: http://lkml.kernel.org/r/20191211191318.11860-1-ioanna-maria.alifieraki@canonical.com
> Fixes: a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()")
> Signed-off-by: Ioanna Alifieraki <ioanna-maria.alifieraki at canonical.com>
> Acked-by: Manfred Spraul <manfred at colorfullife.com>
> Acked-by: Herton R. Krzesinski <herton at redhat.com>
> Cc: Arnd Bergmann <arnd at arndb.de>
> Cc: Catalin Marinas <catalin.marinas at arm.com>
> Cc: <malat at debian.org>
> Cc: Joel Fernandes (Google) <joel at joelfernandes.org>
> Cc: Davidlohr Bueso <dave at stgolabs.net>
> Cc: Jay Vosburgh <jay.vosburgh at canonical.com>
> Cc: <stable at vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
> (cherry picked from commit edf28f4061afe4c2d9eb1c3323d90e882c1d6800)
> ---
>  ipc/sem.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/ipc/sem.c b/ipc/sem.c
> index ec97a7072413..fe12ea8dd2b3 100644
> --- a/ipc/sem.c
> +++ b/ipc/sem.c
> @@ -2368,11 +2368,9 @@ void exit_sem(struct task_struct *tsk)
>  		ipc_assert_locked_object(&sma->sem_perm);
>  		list_del(&un->list_id);
>  
> -		/* we are the last process using this ulp, acquiring ulp->lock
> -		 * isn't required. Besides that, we are also protected against
> -		 * IPC_RMID as we hold sma->sem_perm lock now
> -		 */
> +		spin_lock(&ulp->lock);
>  		list_del_rcu(&un->list_proc);
> +		spin_unlock(&ulp->lock);
>  
>  		/* perform adjustments registered in un */
>  		for (i = 0; i < sma->sem_nsems; i++) {

I think this is needed in the disco kernel as well. Looks good:

Acked-by: Khalid Elmously <khalid.elmously at canonical.com>




More information about the kernel-team mailing list