Improving next_buffer() rpc

Kevin DuBois kevin.dubois at canonical.com
Wed Jul 9 16:29:38 UTC 2014


Alberto pointed out a gap in my suggestion #2/#6, which is that the client
wouldn't have a way of getting ownership of the additional buffers. So
maybe that idea (#2/#6) should become:

rpc exchange_buffer(Buffer) returns (Buffer)
rpc allocate_additional_buffer(Void) returns (Buffer)
rpc remove_additional_buffer(Buffer) returns (Void)

This would still have the submission happen on the exchange call, but would
make additional buffer allocations more explicit than in the #5 idea.
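
For concreteness, here's a rough .proto sketch of that shape. This is only an
illustration: the service name and the Buffer/Void message types are assumed
to line up with the existing protocol definitions, and the exact naming would
need review.

  // sketch only, not a final definition
  service DisplayServer {
    // submit the filled buffer (plus any fence fds) and get the next
    // buffer to render into, in the same round trip
    rpc exchange_buffer(Buffer) returns (Buffer);

    // explicitly grow the set of buffers the client owns; the returned
    // buffer belongs to the client until it is removed or exchanged
    rpc allocate_additional_buffer(Void) returns (Buffer);

    // hand an extra buffer back once the client no longer needs it
    rpc remove_additional_buffer(Buffer) returns (Void);
  }

The nice property is that the common double-buffered path stays a single
exchange_buffer() call, and the extra-buffer rpcs only come into play when a
client actually asks for more buffers.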


On Wed, Jul 9, 2014 at 12:04 PM, Kevin DuBois <kevin.dubois at canonical.com>
wrote:

> Attempting to steer the conversation more towards the near term (kgunn's
> "First"), I'm just trying to make sure that the protocol change is somewhat
> forward-looking without wading into changing the buffer
> distribution/swapping system too much.
>
> Trying to distil the potential directions suggested a bit more:
>
> #1 what we currently have:
> rpc next_buffer(SurfaceId) returns (Buffer)
>
> #2 what I've proposed:
> rpc exchange_buffer(Buffer) returns (Buffer)
>
> #3 "push buffers", the server pushes out buffers to the clients
>
> #4 multiple buffers owned by the client/client allocated buffers
>
> now,
> #3 and #4 are interesting and potentially beneficial, but outside the
> scope of what I want to accomplish [1] with the optimization I'm aiming
> for. It's good to future-proof the protocol for when/if we decide to
> experiment.
>
> #3 strikes me as something I'd want to avoid (especially in the near
> term), because when you're pushing buffers to the client, the server is
> 'driving' the activity of the client processes, which seems strange. Now,
> I'd be fine with the client /making a request/ and then getting the buffer
> back asynchronously, but couldn't think of an advantage to this.
>
> I think #4 could be accommodated by having idea #5:
> #5
> rpc next_buffer(SurfaceId) returns (Buffer);
> rpc submit_buffer(Buffer) returns (Void);
>
> or supplementing idea #2 to make idea #6 (in the future)
> rpc exchange_buffer(Buffer) returns (Buffer)
> rpc request_client_buffer_count(int) returns (Void)
>
> Effectively, #5 removes the implicit release of the client buffer, making
> the submission of the filled buffer explicit. This seems a bit less nice to
> me than #2/#6 from an RAII mindset, so I think I still prefer #2 with a
> future expansion to #6 for multiple client buffers.
>
> So I guess either #2/#6 or #5 would be fine for my immediate purposes; both
> seem acceptable at the current time and have some forward-thinking in them.
> I'm interested whether others share my slight preference for #2/#6.
>
> [1]
> My aim is to let the android platform's clients defer waiting for the gpu
> to finish with the buffer. The client passes the fence and the
> not-yet-completed buffer back to the server without waiting, which requires
> that the fences are waited upon in the composition pass. About a year or so
> ago, doing something similar with internal clients in the first landing of
> unity, we saw some noticeable performance improvements that make us think
> this would be useful in the USC/unity8 setup we have now.
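>
> As a rough illustration of what would travel over the wire, the Buffer
> message could carry the fence alongside the buffer's existing ipc data,
> something like this (the field names here are illustrative, not the real
> message definition):
>
>   message Buffer {
>     optional int32 buffer_id = 1;  // lets the server map the buffer back
>                                    // to the surface that owns it
>     repeated sint32 fd = 2;        // the buffer's fds, as shipped today
>     repeated sint32 fence_fd = 3;  // android sync fence(s), possibly not
>                                    // yet signalled
>   }
>
> The key point is that exchange_buffer() can return as soon as the fence fd
> has been sent, and the compositor waits on the fence during its composition
> pass instead of the client waiting up front.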
>
>
> On Wed, Jul 9, 2014 at 11:39 AM, Kevin Gunn <kevin.gunn at canonical.com>
> wrote:
>
>> First
>> Not sure we're still on topic, necessarily, wrt changing from ids to fds -
>> do we need to conflate that with the double/triple buffering topic?
>> Let's answer this first...
>>
>> Second
>> While we're at it :) triple buffering isn't always a win. In the case of
>> small, frequent renders (as an example, "8x8 pixel square follows my
>> finger"), you'll potentially have 2 extra buffers in the queue that each
>> need their 16ms of fame on the screen: 1 at the session server, 1 at the
>> system server. That can look a little laggy. I'm willing to say in the
>> same breath, though, that this may be lunatic fringe. The win for the
>> triple buffering case is likely more common: spiky render times (14+ms)
>> amongst more normal render times (9-12ms).
>> +1 on giving empty buffers back to the clients to allow them to have a
>> "queue" of empty buffers at their disposal (I'm not sure whether RAOF or
>> duflu is correct that it's "synchronously waiting for a round trip every
>> swap"... can we already have an empty buffer queue on the client side?)
>>
>>
>> On Wed, Jul 9, 2014 at 4:35 AM, Daniel van Vugt <
>> daniel.van.vugt at canonical.com> wrote:
>>
>>> Forgive me for rambling on but I just had an important realisation...
>>>
>>> Our current desire to get back to double buffering is only because the
>>> Mir protocol is synchronously waiting for a round trip every swap, and
>>> somehow I thought that the buffer queue length affected time spent in the
>>> ready_to_composite state. Now I'm not so sure that's true.
>>>
>>> If we changed the protocol to implement parallelism, then in theory,
>>> keeping triple buffering with a fancy zero-latency swap buffers should
>>> perform better than the current protocol that has to wait for a round trip.
>>>
>>> I cannot remember why I thought the length of the buffer queue affected
>>> the time from client-rendering to server-compositing. Perhaps we really do
>>> need to keep triple-buffering always-on so that the performance gain of a
>>> zero-latency client swap-buffers can be achieved...
>>>
>>> In summary, I'm back to thinking any protocol change from next_buffer()
>>> needs to support parallelism and not be so synchronous.
>>>
>>> - Daniel
>>>
>>>
>>>
>>> On 09/07/14 16:08, Daniel van Vugt wrote:
>>>
>>>> Oops. I keep forgetting that the new BufferQueue disallows the compositor
>>>> from owning less than one buffer, so there would no longer be any benefit
>>>> to double-buffered clients from a more concurrent protocol :(
>>>>
>>>> Maybe Kevin's suggestion is just fine then. So long as the server is
>>>> able to figure out the surface(Id) from the Buffer struct.
>>>>
>>>>
>>>> On 09/07/14 15:41, Daniel van Vugt wrote:
>>>>
>>>>> Note that we're working on making double-buffering the default again and
>>>>> triple the exception. In that case fixing LP: #1253868 may seem pointless,
>>>>> but it is surprisingly still relevant, because a fully parallelized design
>>>>> would significantly speed up double buffering too: client swap buffers
>>>>> would no longer have to wait for a round trip before returning and would
>>>>> instead be almost instant.
>>>>>
>>>>>
>>>>> On 09/07/14 10:00, Daniel van Vugt wrote:
>>>>>
>>>>>> Sounds better to just pass buffers around, although I'm not keen on any
>>>>>> change that doesn't make progress on the performance bottleneck LP:
>>>>>> #1253868. The bottleneck is the swapping/exchanging approach which
>>>>>> limits the client to holding only one buffer, so I don't think it's a
>>>>>> good idea for new designs to still have that problem.
>>>>>>
>>>>>> In order to improve parallelism per LP: #1253868 you'd really have to
>>>>>> receive new buffers as soon as they're free, which means getting them as
>>>>>> MirEvents. Then you only need an RPC function to release them back to the
>>>>>> server:
>>>>>>
>>>>>>     rpc release_buffer(Buffer) returns (Void);
>>>>>>
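>>>>>> For illustration, the pushed buffer could arrive as an event message
>>>>>> along these lines (the message and field names here are made up, not
>>>>>> the real protocol):
>>>>>>
>>>>>>     // server -> client, sent as soon as a buffer becomes free for the
>>>>>>     // surface; pairs with the release_buffer() rpc above
>>>>>>     message BufferAvailableEvent {
>>>>>>       optional SurfaceId surface_id = 1;
>>>>>>       optional Buffer buffer = 2;
>>>>>>     }
>>>>>>
>>>>>> That way client swap buffers never blocks on a reply; it just picks up
>>>>>> whichever buffer has already arrived as an event.
>>>>>>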
>>>>>> Keep in mind that the inter-process communication is the bottleneck
>>>>>> here. If you allow a context switch between the server and client, that's
>>>>>> half to one millisecond (see mirping) per RPC round trip. More than double
>>>>>> that for nested servers, and you can see that the protocol delay could be
>>>>>> a significant factor. So I think any protocol enhancement should have
>>>>>> parallelism designed in.
>>>>>>
>>>>>> I also think we need to be careful about not landing any protocol
>>>>>> changes in the RTM candidate series (0.4-0.5), so that the foundation for
>>>>>> RTM is maximally mature (albeit not yet optimal).
>>>>>>
>>>>>> - Daniel
>>>>>>
>>>>>>
>>>>>> On 08/07/14 21:10, Kevin DuBois wrote:
>>>>>>
>>>>>>> Hello mir team,
>>>>>>>
>>>>>>> In order to get the next buffer for the client, we currently have:
>>>>>>>
>>>>>>> rpc next_buffer(SurfaceId) returns (Buffer);
>>>>>>>
>>>>>>> which is problematic for me in working on [1], because it implicitly
>>>>>>> releases the buffer from the client side, whereas for that performance
>>>>>>> improvement I have to send an fd back to the server. So I was thinking
>>>>>>> of adding an rpc method more like:
>>>>>>>
>>>>>>> rpc exchange_buffer(Buffer) returns (Buffer);
>>>>>>>
>>>>>>> This would be sufficient to pass the fence fd back, and the buffer id
>>>>>>> in the Buffer protocol message would be sufficient for the server to
>>>>>>> figure out which surface has sent back its buffer (given the global
>>>>>>> buffer ids we're using).
>>>>>>>
>>>>>>> This does not address the problem noted in:
>>>>>>> https://bugs.launchpad.net/mir/+bug/1253868
>>>>>>> but I think that might be better addressed by having an exchange-type
>>>>>>> rpc call (explicit or implicit) and negotiating/increasing how many
>>>>>>> buffers the client owns by some other means.
>>>>>>>
>>>>>>> This seems like something that could have diverse opinions, so I'm
>>>>>>> hoping to get some input on the protocol change here first.
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Kevin
>>>>>>>
>>>>>>> [1]
>>>>>>> https://blueprints.launchpad.net/ubuntu/+spec/client-1410-mir-performance
>>>>>>>
>>>>>>>
>>>>>>> item:
>>>>>>> "[kdub] fencing improvements for clients add the ipc plumbing"
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>