request for feedback - I have a load balancing patch for dm-raid1.c
Robert Collins
robertc at robertcollins.net
Wed Jul 6 07:13:58 UTC 2011
On Wed, Jul 6, 2011 at 6:58 PM, Stefan Bader <stefan.bader at canonical.com> wrote:
> Should have been a reply-all but it only went to the mailing list, sorry.
No worries; I'm not subscribed so thanks for forwarding to me :)
> -Stefan
>
> -------- Original Message --------
> Subject: Re: request for feedback - I have a load balancing patch for dm-raid1.c
> Date: Tue, 05 Jul 2011 15:44:46 +0200
> From: Stefan Bader <stefan.bader at canonical.com>
> To: kernel-team at lists.ubuntu.com
>
> On 05.07.2011 13:59, Robert Collins wrote:
>> Hi, I recently realised
>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/361733 was
>> affecting me - my desktop machine is set up with an onboard,
>> BIOS-supported RAID configuration which ends up running with dm-raid1 targets.
>>
>> Anyhow, doing a load-balancing heuristic seemed pretty approachable,
>> so I put one together; I looked at reusing the md raid1.c logic but
>> the two implementations are pretty far apart. I did borrow what seemed
>> to be the most significant heuristic - reading from the same target if
>> the prior read was adjacent to it.
>>
>> I didn't do the markup-on-read-complete, because with AHCI tagging and
>> delayed reads it was more complex than I wanted to reason about :).
>>
>> Anyhow, I'm sure this is improvable further, but it seems like an
>> unqualified improvement over the status quo: it load-balances reads,
>> sequential IO is still plenty fast, and it makes it a little clearer
>> for folks wanting to hack on this what's going on.
>>
>> I would love a review for correctness and concurrency safety, if
>> someone has the time.
>>
>> my patch: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/361733/+attachment/2192144/+files/raid1-loadbalance.patch
>>
>> -Rob
>>
>
> Hi Robert,
>
> I had a quick look at your patch, so here is what comes to mind. Generally,
> the printks will probably not be liked much. Even when downgraded to debug
> level they will still be emitted quite often, and that uses more CPU, slows
> down processing, and could cause the logs to grow.
> The idea of sector distance is not bad, but maybe a combination of simply not
> switching paths on every request and using a merge function would be preferred
> (dm-crypt and dm-stripe both do this). There is also dm-mpath, which was
> changed from using bios to using requests and may offer a similar benefit.
> In the end, any read sent to one path should, if possible, be as large as
> possible. Writes would benefit from that as well, since the drives could
> optimize.
>
> But as a minimally intrusive, minimal-effort approach, it may help to switch
> the mirror path only every x requests, to get a sort of grouping of them...
Yeah, I could see those things helping. I've now made the printks
disabled by default, the way other dm- modules do it; I will refresh the
patch after any other feedback.
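Concretely, the change is along these lines (illustrative only, not the
exact hunk from the patch; the message text differs):

-       printk(KERN_INFO "dm-raid1: balancing read to mirror %u\n", i);
+       DMDEBUG("balancing read to mirror %u", i);

DMDEBUG only produces output when device-mapper debugging is enabled,
so that should take care of the CPU and log-growth concerns.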
The current code results in merges happening after the mirror
selection. I think even with a merge function we'd still need the
mapping - so the question in my mind is: if we had a merge fn, would
the mapping be a lot simpler?
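For reference, the shape of the per-bio read selection I do at the
moment is roughly this (a sketch only; the function and field names
here - choose_read_mirror, last_read_sector - are made up for
illustration, and the real state, locking and failed-leg handling are
in the attached patch):

/* Pick a mirror for a read starting at 'sector': prefer the mirror
 * whose previous read ended exactly where this one begins, so
 * sequential streams stay on one spindle; otherwise take the mirror
 * with the smallest seek distance. */
static struct mirror *choose_read_mirror(struct mirror_set *ms,
                                         sector_t sector)
{
        struct mirror *m, *best = NULL;
        sector_t best_dist = ~(sector_t) 0;
        unsigned i;

        for (i = 0; i < ms->nr_mirrors; i++) {
                sector_t dist;

                m = ms->mirror + i;

                /* adjacent to the previous read: keep sequential IO here */
                if (m->last_read_sector == sector)
                        return m;

                dist = (m->last_read_sector > sector) ?
                        m->last_read_sector - sector :
                        sector - m->last_read_sector;
                if (dist < best_dist) {
                        best_dist = dist;
                        best = m;
                }
        }

        return best;
}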
As to whether a merge fn would make the mapping simpler: it might, I
guess. If the merged IOs are big enough (specifically, a cylinder at a
time) then you can elevator-seek cylinder to cylinder in a round-robin
fashion and get concurrent reading going on - great for kernel
readahead. On the other hand, I'm not sure how to make sure that
happens, and if the merge is too big we'd end up just dispatching
everything to one drive anyway.
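For concreteness, the dm-crypt/dm-stripe style hook would look
something like this for the mirror target (purely hypothetical - there
is no mirror_merge today, and this is modelled on stripe_merge() /
crypt_merge(); it also shows the awkward part, namely that the merge
callback has to commit to a leg before the block layer has merged
anything, so the selection logic does not go away):

/* Hypothetical .merge hook for dm-raid1.  We have to pick a mirror
 * here just to know which queue to ask, which is why a merge fn alone
 * does not remove the mapping decision. */
static int mirror_merge(struct dm_target *ti, struct bvec_merge_data *bvm,
                        struct bio_vec *biovec, int max_size)
{
        struct mirror_set *ms = ti->private;
        struct mirror *m = &ms->mirror[0];      /* placeholder choice */
        struct request_queue *q = bdev_get_queue(m->dev->bdev);

        if (!q->merge_bvec_fn)
                return max_size;

        bvm->bi_bdev = m->dev->bdev;
        bvm->bi_sector = m->offset + (bvm->bi_sector - ti->begin);

        return min(max_size, q->merge_bvec_fn(q, bvm, biovec));
}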
A larger concern is when concurrent reads are happening from two
sources: there the optimal thing is to satisfy one source's reads from
one disk and the other source's from the other disk.
I also think we eventually need something to prevent one drive being
ignored just because someone reads sector 2 billion or so. I don't
think it's needed in a first cut (without it we just degrade to the
current behaviour).
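If it did turn out to matter, something as simple as this might be
enough (not in the patch; last_read_mirror and consecutive_reads would
be new, made-up state in struct mirror_set, and in-sync/failure checks
on the mirror we rotate to are omitted):

/* Hypothetical anti-starvation cap: even if one mirror keeps winning
 * the distance test, force a switch after it has taken N reads in a
 * row, so a single far-away reader can never monopolise one leg. */
#define DM_RAID1_MAX_CONSECUTIVE_READS 128

static struct mirror *cap_consecutive_reads(struct mirror_set *ms,
                                            struct mirror *chosen)
{
        if (chosen != ms->last_read_mirror) {
                ms->consecutive_reads = 0;
        } else if (++ms->consecutive_reads > DM_RAID1_MAX_CONSECUTIVE_READS) {
                /* rotate to the next leg and start counting again */
                unsigned next = (chosen - ms->mirror + 1) % ms->nr_mirrors;

                chosen = ms->mirror + next;
                ms->consecutive_reads = 0;
        }

        ms->last_read_mirror = chosen;
        return chosen;
}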
Anyhow, we're deep into tradeoff territory here; what I was most
worried about was style and correctness, and your comments on the
printk issue have helped a lot with that.
Thanks!
-Rob