[Bug 1057054] Re: poor performance after upgrade to Precise

Peter Petrakis peter.petrakis at canonical.com
Wed Sep 26 22:53:50 UTC 2012


Hmm, that's interesting. Does that mean that after a reboot scsi_dh_rdac is loaded?
Please verify.

Yes, it is available with the lucid kernel. Also note that if you were to reference
your vendor documentation, it would probably recommend that you load the
rdac driver (it's been around for a long time, actually).
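As a sketch of how to verify and persist this (assuming a typical Ubuntu install; these are standard module-management commands, not something specific to this bug report):

```shell
# Check whether the rdac device handler is currently loaded
lsmod | grep scsi_dh_rdac

# Load it by hand for the current boot (requires root)
sudo modprobe scsi_dh_rdac

# Make it load automatically on every subsequent boot
echo scsi_dh_rdac | sudo tee -a /etc/modules
```

After a reboot, running `lsmod | grep scsi_dh_rdac` again confirms the setting stuck.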

NOTE: multipath is basically the same on every distro, so instructions for RHEL
provided by your vendor are just as applicable to Ubuntu. However, kernels
change frequently, which is why vendors and others need to be involved.

Let's assume that rdac is never loaded and life is good on lucid. That leaves
multipath itself and the Linux kernel, both of which have jumped dramatically
between lucid and precise. Since there are no real regression tests for SAN
hardware, and the vendors aren't pitching in, it's entirely possible that something
weird like this could occur.

It's my opinion that if you're using ALUA, it's up to you to determine whether
an additional device handler is necessary. What may have happened is that multipath
in lucid was biasing the primary storage controller and forcing a trespass
unbeknownst to you. This would have made ALUA irrelevant and ping-ponged your
LUNs behind the RAID for a period of time; it also means you weren't using both of
your storage processors as you intended. That would have been a bona fide bug in lucid.
It's likely that it was rectified in precise, considering the outcome. Going back and
finding it is an academic exercise: no matter what I find, it'll probably break
lucid. RDAC must be loaded.

[ALUA architecture example]
http://virtualgeek.typepad.com/virtual_geek/2009/09/a-couple-important-alua-and-srm-notes.html

As for the kernel, it does what it's told; something with that horrendous an
impact would likely have affected your non-SAN disks as well, and if it affected only
one, that would be doubly weird :) So it likely comes down to how the IO was queued
to begin with, and multipathd is responsible for that.

Between the two multipath versions, your SAN actually got a codified config, meaning
you don't need to provide your own if you don't want to; it's built in. 0.4.8 didn't
have this.

[libmultipath/hwtable.c]
        {
                /* DELL MD3000 */
                .vendor        = "DELL",
                .product       = "MD3000",
                .getuid        = DEFAULT_GETUID,
                .features      = "2 pg_init_retries 50",
                .hwhandler     = "1 rdac",
                .selector      = DEFAULT_SELECTOR,
                .pgpolicy      = GROUP_BY_PRIO,
                .pgfailback    = -FAILBACK_IMMEDIATE,
                .rr_weight     = RR_WEIGHT_NONE,
                .no_path_retry = 15,
                .minio         = DEFAULT_MINIO,
                .checker_name  = RDAC,
                .prio_name     = PRIO_RDAC,
                .prio_args     = NULL,
        },

Your config overrides only the members you defined; the rest, like minio, come
through from the built-in entry.
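For illustration only (a hypothetical /etc/multipath.conf fragment, not taken from your system), this is what that per-member override behavior looks like: only the members you set replace the built-in values, and everything else falls through to the hwtable entry shown above.

```
devices {
        device {
                vendor          "DELL"
                product         "MD3000"
                # These two members override the built-in entry...
                prio            rdac
                no_path_retry   30
                # ...while unset members (features, hardware handler,
                # path checker, minio, etc.) keep their built-in values.
        }
}
```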

BTW, you might wish to double-check exactly what your SAN can do. Active/Active
isn't what it used to be and is really "dual active".

http://gestaltit.com/all/tech/storage/stephen/multipath-activepassive-dual-active-activeactive/

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to multipath-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1057054

Title:
  poor performance after upgrade to Precise

Status in “multipath-tools” package in Ubuntu:
  Invalid

Bug description:
  I had a Lucid x64 server working with a Dell MD3000i with 4 paths, and it
  worked as expected.  I added the "prio rdac" line to the conf file,
  then upgraded to Precise, removed the old mpath_rdac line, and
  rebooted one more time, just to be sure.  I did this based on a section
  in https://help.ubuntu.com/12.04/serverguide/serverguide.pdf

  as a "sanity check" test, I'm doing 'pv < /dev/mapper/dellsas1 >
  /dev/null' (friendly names enabled).  On Lucid i'd get about 100MB/s
  After upgrading to Precise I get an almost solid 768kB/s.  If I
  instead use the 4 underlying /dev/sd* devices, 2 give errors as
  expected, and 2 run at about 100MB/s as expected so iscsi seems to be
  working correctly and multipath not.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1057054/+subscriptions




More information about the foundations-bugs mailing list