[Bug 1057054] Re: poor performance after upgrade to Precise
Peter Petrakis
peter.petrakis at canonical.com
Wed Sep 26 22:53:50 UTC 2012
Hmm, that's interesting. Does that mean that scsi_dh_rdac is loaded after reboot?
Please verify.
Yes, it is available with the lucid kernel. Also note that if you were to reference
your vendor documentation, it would probably recommend that you load the
rdac driver (it has been around for a long time, actually).
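To verify, a quick sketch (assumes a Linux box with /proc/modules and modprobe; adjust as needed):

```shell
# Is the RDAC device handler module loaded right now?
if grep -q '^scsi_dh_rdac' /proc/modules; then
    echo "scsi_dh_rdac is loaded"
else
    echo "scsi_dh_rdac is NOT loaded; try: sudo modprobe scsi_dh_rdac"
fi

# To make it stick across reboots, list it in /etc/modules:
#   echo scsi_dh_rdac | sudo tee -a /etc/modules
```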
NOTE: multipath is basically the same on every distro, so instructions for RH
provided by your vendor are just as good for Ubuntu etc. However, kernels
change frequently, which is why vendors and others need to be involved.
Let's assume that rdac is never loaded and life is good on lucid. That leaves
multipath itself and the Linux kernel, both of which have jumped dramatically
between lucid and precise. Since there are no real regression tests for SAN
hardware, and the vendors aren't pitching in, it's entirely possible that
something weird like this could occur.
It's my opinion that if you're using ALUA, it's up to you to determine whether
an additional device handler is necessary. What may have happened is that multipath
in lucid was biasing the primary storage controller and forcing a trespass
unbeknownst to you. This would have made ALUA irrelevant and ping-ponged your
LUNs behind the RAID for a period of time; it also means you weren't using both your
storage processors like you intended. That would have been a bona fide bug in lucid.
It's likely that it was rectified in precise considering the outcome. Going back and
finding it is an academic exercise as no matter what I find, it'll probably break
lucid. RDAC must be loaded.
[ALUA architecture example]
http://virtualgeek.typepad.com/virtual_geek/2009/09/a-couple-important-alua-and-srm-notes.html
As for the kernel, it does what it's told; something with that horrendous an
impact would likely have affected your non-SAN disks as well, and if it affected
only one, that would be doubly weird :) So it likely comes down to how the IO was
queued to begin with, and multipathd is responsible for that.
Between the two multipath-tools versions, your SAN actually got a codified config,
meaning you don't need to provide your own if you don't want to; it's built in.
0.4.8 didn't have this.
[libmultipath/hwtable.c]
	{
		/* DELL MD3000 */
		.vendor        = "DELL",
		.product       = "MD3000",
		.getuid        = DEFAULT_GETUID,
		.features      = "2 pg_init_retries 50",
		.hwhandler     = "1 rdac",
		.selector      = DEFAULT_SELECTOR,
		.pgpolicy      = GROUP_BY_PRIO,
		.pgfailback    = -FAILBACK_IMMEDIATE,
		.rr_weight     = RR_WEIGHT_NONE,
		.no_path_retry = 15,
		.minio         = DEFAULT_MINIO,
		.checker_name  = RDAC,
		.prio_name     = PRIO_RDAC,
		.prio_args     = NULL,
	},
Your config is overriding any member you defined; the rest are coming through
from the built-in, like minio.
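For illustration, a device stanza in /etc/multipath.conf for the same vendor/product
takes precedence for whichever attributes you set, while unset ones fall through to
the built-in. A minimal sketch (the values here just mirror the built-in entry above;
check multipath.conf(5) for your version's exact keyword names):

```
devices {
	device {
		vendor           "DELL"
		product          "MD3000"
		hardware_handler "1 rdac"
		path_grouping_policy group_by_prio
		prio             rdac
		path_checker     rdac
		failback         immediate
		features         "2 pg_init_retries 50"
		no_path_retry    15
	}
}
```

Any attribute you omit here still comes from the hwtable.c defaults, which is why a
partial stanza can silently change behavior relative to what you think you configured.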
BTW you might wish to double check exactly what your SAN can do. Active/Active
isn't what it used to be and is really "dual active".
http://gestaltit.com/all/tech/storage/stephen/multipath-activepassive-dual-active-activeactive/
--
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to multipath-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1057054
Title:
poor performance after upgrade to Precise
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1057054/+subscriptions