New Buffer Semantics Planning

Christopher James Halse Rogers chris at cooperteam.net
Fri Jun 26 04:29:17 UTC 2015


On Fri, Jun 26, 2015 at 2:24 PM, Daniel van Vugt 
<daniel.van.vugt at canonical.com> wrote:
> That's why it's a hard problem to replace BufferQueue -- you have to
> figure out whether 2, 3 or 4 buffers for any given surface is correct
> at the time. Never over-allocate and never under-allocate (which
> causes freezing/deadlocks).
> 
> What I was suggesting is that a frame callback (for non-idle surfaces
> anyway) would allow us to solve most of the hard problems in a
> distributed manner between the client and server processes fairly
> elegantly. And we don't need heuristics; this is all based on
> certainty and logical reasoning.

Ah, yes. A frame callback is indeed a fine idea that neatly bypasses 
all the problems.
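To make the shape of it concrete, here's roughly the client loop I'd 
expect a frame callback to produce. None of these names are real 
libmirclient API - it's only a sketch of the idea:

    // Hypothetical sketch of a frame-callback-driven render loop.
    // FrameClock and both methods are invented for illustration.
    #include <condition_variable>
    #include <mutex>

    struct FrameClock
    {
        std::mutex mutex;
        std::condition_variable cv;
        bool frame_pending = false;

        // Called on the connection thread when the server signals
        // that it has consumed a buffer and will want another.
        void on_frame_callback()
        {
            std::lock_guard<std::mutex> lock{mutex};
            frame_pending = true;
            cv.notify_one();
        }

        // The render loop blocks here instead of guessing when the
        // compositor will next need a frame.
        void wait_for_frame()
        {
            std::unique_lock<std::mutex> lock{mutex};
            cv.wait(lock, [this] { return frame_pending; });
            frame_pending = false;
        }
    };

The client never has to guess the compositor's timing; it just blocks 
until poked, at whatever rate the display is actually running.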

> 
> 
> On 26/06/15 12:16, Christopher James Halse Rogers wrote:
>> 
>> 
>> On Fri, Jun 26, 2015 at 2:04 PM, Daniel van Vugt
>> <daniel.van.vugt at canonical.com> wrote:
>>> Hmm, maybe not. If we assume the server is only communicating with
>>> the client at 60Hz then the client could just do all the dropping
>>> itself and send one frame (the newest completed one) every 16.6ms
>>> when the server asks.
>> 
>> I don't think the server is ever going to ask for a frame.
>> 
>> All the client sees when the server is done with a buffer is that one
>> of their previously submitted buffers changes state from read-only to
>> exclusive-write access. It could possibly use a “has my last submitted
>> buffer become writeable yet” heuristic to guess when the server will
>> actually use a new buffer, but we don't guarantee that (nor can we, as
>> you note with bypass).
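To illustrate what the client sees (every name here is invented, not 
proposed API):

    // The client's view of a buffer is an access mode that flips
    // between read-only (the server is using it) and exclusive-write
    // (the client may render into it).
    enum class BufferAccess { read_only, exclusive_write };

    struct ClientBuffer
    {
        BufferAccess access;
    };

    // The heuristic described above - unreliable, because nothing
    // guarantees *when* the server hands write access back:
    bool server_probably_consumed_last_frame(ClientBuffer const& last_submitted)
    {
        return last_submitted.access == BufferAccess::exclusive_write;
    }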
>> 
>>> 
>>> On 26/06/15 12:01, Daniel van Vugt wrote:
>>> > bypass/overlays: If you look at the current logic you will see that
>>> > the DisplayBuffer holds the previous bypass/overlay buffer until
>>> > _after_ the client has provided the next one. And it must, to avoid
>>> > scan-out artefacts. So the server holds two of them very briefly,
>>> > but only one is held most of the time. Without "predictive bypass"
>>> > as I'm working on right now, that buffer is held for almost two
>>> > frames. With "predictive bypass" it's closer to (though still
>>> > greater than) one frame held. On startup, you're absolutely right
>>> > that only one buffer is required to get bypass/overlays going. So my
>>> > wording was wrong.
>> 
>> Right, but that's fine. If the client has submitted one buffer, and is
>> a candidate for overlay, then it's clear that the old scanout buffer
>> *wasn't* from the client. We hold onto the old scanout buffer and
>> start scanning out of the (single) buffer the client has submitted.
>> 
>> When the client submits a second buffer, the first isn't released
>> until we know it's no longer being scanned out of, but we don't need
>> to have the client's second buffer before scanning out of the first.
>> 
>> We don't need to have two buffers around all the time for overlay; we
>> need to have two buffers around to *switch* overlay buffers. But the
>> fact that we're switching buffers already means that we've got at
>> least two buffers.
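In code form the switch rule is roughly this (types invented purely 
for illustration):

    // The previous scanout buffer is only released once the new one
    // is confirmed on-screen, so two buffers are held only across
    // the switch itself.
    #include <memory>

    struct Buffer {};  // stand-in for the real buffer type

    struct DisplayBufferSketch
    {
        std::shared_ptr<Buffer> on_screen;

        void switch_scanout_to(std::shared_ptr<Buffer> next)
        {
            program_flip_to(next);       // queue the page flip
            wait_for_flip_complete();    // old buffer may be scanned out until here
            on_screen = std::move(next); // dropping our ref releases the old buffer
        }

        void program_flip_to(std::shared_ptr<Buffer> const&) {}
        void wait_for_flip_complete() {}
    };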
>> 
>> This is sort of client-visible behaviour because the client can *see*
>> that the server is holding more than one buffer, but it's the same
>> logic for the client - “Do I have write ownership of a buffer? Yes:
>> render to it. No: wait¹ for one of my buffers to become writeable, or
>> allocate a new one”.
>> 
>> ¹: Potentially “wait” by adding the fence to the GL command stream and
>> submitting rendering commands anyway.
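For footnote ¹, assuming EGL_ANDROID_native_fence_sync and 
EGL_KHR_wait_sync are available (extension-function loading via 
eglGetProcAddress elided), that looks something like:

    #include <EGL/egl.h>
    #include <EGL/eglext.h>

    // Import the buffer's release fence and make the *GPU* wait on
    // it; the CPU carries on recording rendering commands.
    void render_behind_fence(EGLDisplay dpy, int release_fence_fd)
    {
        EGLint const attribs[] = {
            EGL_SYNC_NATIVE_FENCE_FD_ANDROID, release_fence_fd,
            EGL_NONE
        };
        EGLSyncKHR sync =
            eglCreateSyncKHR(dpy, EGL_SYNC_NATIVE_FENCE_ANDROID, attribs);

        eglWaitSyncKHR(dpy, sync, 0);  // queued in the command stream; returns at once

        // ...issue GL rendering into the not-yet-released buffer...

        eglDestroySyncKHR(dpy, sync);
    }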
>> 
>>> 
>>> >
>>> > client wake-up: I may have worded that poorly too. The point is in
>>> > the new world (tm) frame dropping mostly happens in the client (as
>>> > opposed to all in the server like it is today). But some of it still
>>> > needs to happen in the server because you don't want a compositor
>>> > that tries to keep up with a 1000 FPS client by scheduling all of
>>> > those frames on a 60Hz display. It has to drop some.
>>> >
>>> >
>>> > On 26/06/15 11:39, Christopher James Halse Rogers wrote:
>>> >> On Fri, Jun 26, 2015 at 12:39 PM, Daniel van Vugt
>>> >> <daniel.van.vugt at canonical.com> wrote:
>>> >>> I'm curious (but not yet concerned) about how the new plan will
>>> >>> deal with the transitions we have between 2-3-4 buffers, which are
>>> >>> neatly self-contained in the single BufferQueue class right now.
>>> >>> Although, as some responsibilities clearly live on one side and
>>> >>> not the other, maybe things could become conceptually simpler if
>>> >>> we manage them carefully:
>>> >>>
>>> >>>   framedropping: Always implemented in the client process as a
>>> >>> non-blocking acquire. The server just receives new buffers quicker
>>> >>> than usual and needs the smarts to deal with (skip) a high rate of
>>> >>> incoming buffers [1].
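That non-blocking acquire is about this much code (reusing the 
invented ClientBuffer type from the sketch further up):

    #include <vector>

    // Never block: hand back any writeable buffer, or report that
    // there is none so the caller can allocate another (up to a cap)
    // or skip the frame.
    ClientBuffer* try_acquire(std::vector<ClientBuffer>& buffers)
    {
        for (auto& b : buffers)
            if (b.access == BufferAccess::exclusive_write)
                return &b;
        return nullptr;
    }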
>>> >>
>>> >> Clients will need to tell the server at submit_buffer time whether
>>> >> or not this buffer should replace the other buffers in the queue.
>>> >> Different clients will need different behaviour here - the obvious
>>> >> case being a video player that wants to dump a whole bunch of
>>> >> time-stamped buffers on the compositor at once and then go to sleep
>>> >> for a while.
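Something like the following, purely illustrative, is the shape I have 
in mind for the submission side (nothing here is settled protocol):

    // Per-buffer submission mode: replace whatever is still queued
    // (frame dropping), or keep everything and present in order
    // (the timestamped-video case).
    enum class SubmitMode
    {
        replace_queued,
        queue_in_order
    };

    struct SubmitInfo
    {
        SubmitMode mode;
        int64_t target_presentation_time_ns;  // only used for queue_in_order
    };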
>>> >>
>>> >> But in general, yes. The client acquires a bunch of buffers and
>>> >> cycles through them.
>>> >>
>>> >>>   bypass/overlays: Always implemented in the server process,
>>> >>> invisible to the client. The server just can't enable those code
>>> >>> paths until at least two buffers have been received for a surface.
>>> >>
>>> >> I don't think that's the case? Why does the server need two buffers
>>> >> in order to overlay? Even with a single buffer the server always
>>> >> has a buffer available¹.
>>> >>
>>> >> It won't be entirely invisible to the client; we'll probably need
>>> >> to ask the client to reallocate buffers when overlay state changes,
>>> >> at least sometimes.
>>> >>
>>> >>>   client wake-up: Regardless of the model/mode in place, the
>>> >>> client would get woken up at the physical display rate by the
>>> >>> server if it's had a buffer consumed (but not woken otherwise).
>>> >>> More frequent wake-ups for framedropping are the responsibility of
>>> >>> libmirclient itself and need not involve the server doing anything
>>> >>> different.
>>> >>
>>> >> By and large, clients will be woken up by EGL when the relevant
>>> >> fence is triggered.
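That is, assuming EGL_KHR_fence_sync, the client parks on the fence 
itself (dpy and sync as in the usual EGL setup):

    EGLint const status = eglClientWaitSyncKHR(
        dpy, sync, EGL_SYNC_FLUSH_COMMANDS_BIT_KHR, EGL_FOREVER_KHR);
    if (status == EGL_CONDITION_SATISFIED_KHR)
    {
        // our buffer is writeable again; render the next frame
    }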
>>> >>
>>> >> I don't think libmirclient will have any role in waking the client.
>>> >> Unless maybe we want to mess around with
>>> >>
>>> >>> [1] Idea: If the server skipped/dropped _all_ but the newest
>>> >>> buffer it has for each surface on every composite() then that
>>> >>> would eliminate buffer lag and solve the problem of how to replace
>>> >>> dynamic double buffering. Client processes would still only be
>>> >>> woken up at the display rate so vsync-locked animations would not
>>> >>> speed up unnecessarily. Everyone wins -- minimal lag and maximal
>>> >>> smoothness.
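For what it's worth, the server side of that idea is small (types 
invented; the video-player case above would obviously need to opt out 
of it):

    #include <deque>
    #include <memory>

    struct Buffer {};  // stand-in for the real server-side buffer type

    // On each composite(), take only the newest ready buffer for a
    // framedropping surface; clearing the queue releases the older
    // buffers straight back to the client.
    std::shared_ptr<Buffer> take_newest(
        std::deque<std::shared_ptr<Buffer>>& ready)
    {
        if (ready.empty())
            return nullptr;
        auto newest = std::move(ready.back());
        ready.clear();
        return newest;
    }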
>>> >>
>>> >> ¹: The assumption here is that a buffer can be simultaneously
>>> >> scanned out from and textured from. I *think* that's a reasonable
>>> >> assumption, and in the cases where I know it doesn't apply having
>>> >> multiple buffers doesn't help, because it's the buffer *format*
>>> >> that can only be scanned out from, not textured from.
>>> >>



