[PATCH 2/2] drm/i915: Synchronize active and retire callbacks

Sultan Alsawaf sultan.alsawaf at canonical.com
Thu Apr 9 16:59:11 UTC 2020

On Thu, Apr 09, 2020 at 06:40:43PM +0200, Andrea Righi wrote:
> >  	/* After the final retire, the entire struct may be freed */
> > -	if (ref->retire)
> > -		ref->retire(ref);
> > +	if (ref->retire) {
> > +		if (ref->active) {
> > +			bool freed = false;
> > +
> > +			/* Don't race with the active callback, and avoid UaF */
> > +			down_write(&ref->rwsem);
> > +			ref->freed = &freed;
> So, we allocate "freed" in the stack and assign to ref->freed, so
> another thread can potentially access this thread's stack... hm.. I'm
> not sure if this is a safe practice.

ref->retire() isn't asynchronous for any of the functions registered to it, so
no other thread accesses this thread's stack.

> I'm still trying to understand the logic behind this, sorry if I'm
> asking something dumb... but why not doing something simple like this:
> 	down_write(&ref->rwsem);
> 	if (ref->retire)
> 		ref->retire(ref);
> 	up_write(&ref->rwsem);

I actually made this mistake with one of my earlier attempts. `ref->retire()`
will free `ref` on the final put, so putting `up_write(&ref->rwsem);` after it
unconditionally is a use-after-free. And we have no way to access the reference
counter that determines when the final put occurs, because it varies by each
user of the API.

Plus, all the functions involved in the retirement sequence do not return
values, so trying to make `ref->retire()` return a boolean that says whether
`ref` was freed or not would require extensive changes across multiple files.

> Again, why not a simple:
> 	down_read(&ref->rwsem);
> 	if (!atomic_read(&ref->count) && ref->active)
> 		err = ref->active(ref);
> 	up_read(&ref->rwsem);

No reason to hold a lock when we don't need to. There are some users of this API
that do not have a retire callback (i.e., `ref->retire` is NULL), so in that
case we don't need to do any synchronization.

> > +	bool *freed;
> Why does freed need to be a pointer? Why not just a bool?

Because everything inside the `ref` struct can be freed after `ref->retire()`
executes, so accessing `ref` after that would be a use-after-free.

Thanks for the review!


More information about the kernel-team mailing list