Subsurface support, or delegated compositing

Christopher James Halse Rogers raof at
Mon Nov 25 23:29:31 UTC 2013

On Mon, 2013-11-25 at 08:12 +0100, Thomas Voß wrote:
> On Mon, Nov 25, 2013 at 7:51 AM, Christopher James Halse Rogers
> <raof at> wrote:
> > One of the architectural things that I want to get done at the sprint
> > next week is a solid idea of how we want to do nested
> > compositing/out-of-process plugins/subsurfaces - which all seem to me to
> > be aspects of the same problem.
> >
> There is a security relevant issue here, too. Trusted helpers as well
> as input methods should have a way to
> "compose" UIs across process boundaries, too. See
> for an
> introduction to the problem.
> > In order to prime the discussion - and to invite outside contributions -
> > I thought I'd lay out the usecases as I see them before we get to
> > London.
> >
> > There are two conceptual use-cases here -
> > 1) “I want to delegate some of my UI to a third party”, and
> > 2) “I need to do some compositing, and want to do this efficiently”
> >
> > A Unity8 session running under unity-system-compositor falls under (2).
> >
> > A video player playing a YUV stream that might also want to throw some
> > RGB-rendered UI over it is also (2); a video player that has some chrome
> > around a video widget is (1) and (2).
> >
> > The “embed bits of other applications in our window” requested on
> > is firmly in (2).
> >
> > As I see it, there are also two classes of problem:
> > a) How is the rendering loop coordinated between parent and child - does
> > the parent need to swap each time the child has new rendering, or can
> > the child swap independently, or are both modes available? What happens
> > if a child also wishes to embed children of its own?
> >
> > This is the only concern for the type (2) use-case.
> >
> > b) How is input handled? Does the parent need to proxy input events and
> > forward them on? How does enter/leave notification work for the parent
> > and child? Can the child return events to the parent? Etc.
> >
> > This is necessary for the type (1) use-cases, and also seems to be the
> > hairy bit.
> >
> > Weston (but not yet Wayland) currently partially solves (2) with the
> > subsurfaces protocol, which has chosen the “no child rendering appears
> > until the parent swaps” approach, and doesn't handle out of process
> > renderers at all.
> >
> > For full out-of-process rendering, and for type-1 use-cases, my
> > understanding of the current state of the art is that the parent should
> > become a Wayland compositor itself. This seems a bit of a cop-out to me,
> > and doesn't really solve case 2; however, this area is gnarly, so it
> > might prove to be the best solution.
> >
> From my pov, for out-of-process rendering approaches, the rendering
> and input handling should be opaque
> to top/higher-level compositors, with a compositor only requiring
> knowledge of its direct children. However, I would think that we
> should avoid forcing a compositor model on applications like Chrome.
> They already have app-local compositors running, together with event
> handling. What we should offer (and commit to) is a way to stream
> pixel data across process boundaries. Candidate solutions could be:
> (1.)
> We use this extension today on Ubuntu Touch for streaming data from
> decoded video and from the builtin cameras.
> (2.) EGLStream with a texture sink (somewhat in the future).

Right - getting pixel data across process boundaries is the easy bit, as
long as you're ok with requiring the receiver of the pixel data to do
all the work. Which is acceptable (but not great) for Chrome, as they've
already done that.

The interesting part is when you want to optimise.

Say you've got a video player called Totem that wants to display
subtitles on the video; this is fullscreen, running in Unity8, under
unity-system-compositor.

You're running this on your tablet, which as luck would have it has a
YUV plane and an ARGB plane.

In the just-getting-pixel-data-across world, Totem wakes up whenever the
decoder provides it with a frame, composites the YUV frame and its RGB
pixel data onto its surface, and calls SwapBuffers; Unity8 passes
that on to u-s-c, which swaps it to the display (as everything should be
eligible for bypass).
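In toy-model terms (all names here are mine, purely for illustration), that path looks like this: the client does one full-surface blend per decoded frame, and each nested compositor can at best forward the finished buffer untouched:

```python
# Toy model (hypothetical names) of the just-getting-pixel-data-
# across world: Totem blends YUV video + RGB subtitles itself on
# every frame; Unity8 and u-s-c merely forward the result.

def totem_frame(yuv, subtitles):
    # One full-surface blend per video frame, done by the client.
    return f"blend({yuv}+{subtitles})"

def unity8(buffer):
    # Fullscreen surface: eligible for bypass, so pass it through.
    return buffer

def system_compositor(buffer):
    return f"scanout({buffer})"

out = system_compositor(unity8(totem_frame("yuv42", "subs")))
assert out == "scanout(blend(yuv42+subs))"
```

Note the blend happens at the top of the chain, on every video frame, even when the subtitle layer hasn't changed.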

In an *ideal* world, Totem sets up its video stage, specifying the
external image stream for the lower level, and making an overlay surface
for its subtitles. Totem wakes up each time it needs to change the
subtitle surface, calling SwapBuffers on that. U8 passes that straight
to u-s-c, which puts it on the ARGB plane. The decoded image stream gets
passed straight to U8, which passes it straight to u-s-c, which puts it
on the YUV plane.
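The ideal world above implies a per-frame plane-assignment pass in the final compositor. As a rough sketch (the function and names are hypothetical, not Mir API), u-s-c would try to put each surface on a matching hardware plane and only fall back to GPU composition when no plane fits:

```python
# Hypothetical sketch of the plane-assignment pass implied above:
# the system compositor matches each surface to a free hardware
# plane by pixel format, falling back to GPU composition otherwise.

def assign_planes(surfaces, planes):
    """surfaces: list of (name, pixel_format);
    planes: dict mapping format -> number of free planes."""
    assigned, gpu_fallback = {}, []
    for name, fmt in surfaces:
        if planes.get(fmt, 0) > 0:
            planes[fmt] -= 1
            assigned[name] = fmt + "-plane"
        else:
            gpu_fallback.append(name)
    return assigned, gpu_fallback

# Totem's two layers on the tablet described above:
assigned, fallback = assign_planes(
    [("video", "YUV"), ("subtitles", "ARGB")],
    {"YUV": 1, "ARGB": 1},
)
# Both land on hardware planes, so no GPU compositing pass runs,
# and Totem only wakes when the subtitle surface actually changes.
```

The point of the sketch is that only the compositor that owns the display can make this decision, which is why deferring composition matters.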

Basically, the later you're able to do the compositing step, the more
optimisation opportunities there are. This is reasonably important for
us, because our default setup is going to involve nesting U8 under
unity-system-compositor.

More information about the Mir-devel mailing list