Is multipath essential on Server 20.04?

R C cjvijf at gmail.com
Fri Oct 2 20:50:44 UTC 2020



On 10/2/20 1:58 PM, Liam Proven wrote:
> On Fri, 2 Oct 2020 at 20:50, R C <cjvijf at gmail.com> wrote:
>> multipath is needed for (optimizing) io when there are virtual storage
>> devices that use multiple hardware/actual devices (hence multipath), for
>> example when you are using lustre, scality, san, jbod, ZFS or 'any array'
>>
>> since you are using ZFS, in/on an array, it would be a good idea to use it.
>>
>> You don't NEED to use it, it only makes io more efficient and probably
>> faster.
>>
>> Also, you can shut down some of the cores on your RPI cpu, you DON'T
>> really need those either.
> Thanks for the comments.
>
> My impression was that it was for SAN connections -- things like
> iSCSI, and maybe Ceph. I don't know much about Lustre/Glustre, and
> I've never even heard of Scality before. JBOD? Just a Bunch of Disks?
> Really? How so?
>
> ZFS I'm using, yes, which is why I was concerned. If I end up dropping
> ZFS then I'll probably replace it with a plain old MDRAID. Both are
> multi-device filesystems, so I figured they'd need the kernel
> devicemapper, but I don't see why ZFS on a single box that's not
> connecting to anything else should need it. How come?
>
> (Obligatory "please bottom-post on the list" comment, too...)
>
I wonder why that "bottom posting" thing is such an issue; there was a time 
when no one really cared....    but anyway:


I used to develop large-scale storage systems, like Lustre. Scality 
builds large "object stores", basically object-based storage systems; 
I did some R&D work for that.


Basically, when you have multiple devices show up (/dev/sd*) and they are 
used in a virtual device, it can be beneficial to use multipath, 
especially as the number of physical devices goes up.


I assume you built a RAID pool with your ZFS setup and use multiple 
actual drives in that pool. When writing to the pool, a virtual 
device, at some point a stripe needs to end up on a physical device 
somewhere. (And vice versa: when reading from a pool, a stripe needs to 
come from a physical device somewhere.)
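On a running system you can see exactly which physical devices back a 
pool; the pool name below ("tank") is just an example, substitute your own:

```shell
# List the vdevs and the physical drives behind the pool
# ("tank" is a hypothetical pool name)
zpool status tank

# Show every block device the kernel sees, including virtual
# devices (md, dm, zd*) and the drives behind them
lsblk
```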

That means you have multiple paths to a virtual device, at least one for 
each actual drive in that pool. That's what you'd want multipath for, 
especially under load, when you are hammering that ZFS filesystem. It 
becomes even more important when you put a storage system on top 
of ZFS, for example Lustre on top of ZFS pools, where each pool 
consists of multiple actual drives/devices.
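You can also check whether any drive on the box actually presents more 
than one path; these commands assume multipath-tools is installed, and if 
the first one prints nothing, no device is multipathed:

```shell
# List multipath devices the daemon has assembled; empty output
# means no device is reachable over more than one path
sudo multipath -ll

# Compare WWIDs: two /dev/sdX nodes sharing one wwn-* symlink
# would indicate two paths to the same physical device
ls -l /dev/disk/by-id/
```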

Yes, JBODs are big boxes that contain nothing but a boatload of drives, 
and they all show up as devices/drives in /dev. Typically you'd 
group the drives into a bunch of pools (arrays), often with ZFS, and then 
put a storage system on top of these virtual drives, the ZFS pools. 
(It is done both for compression and for redundancy; JBODs are used 
for building filesystems many PBs in size.)
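For example, turning a handful of JBOD drives into one redundant, 
compressed pool takes two commands; the device names and pool name here 
are hypothetical examples:

```shell
# Build a raidz2 pool (survives two drive failures) from six
# JBOD drives; device names are examples only
sudo zpool create tank raidz2 /dev/sdb /dev/sdc /dev/sdd \
                              /dev/sde /dev/sdf /dev/sdg

# Enable compression on the pool (lz4 is cheap and usually a win)
sudo zfs set compression=lz4 tank
```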

I don't think you need multipath simply because you have multiple 
drives; multipath is for optimizing I/O on a virtual device 
consisting of more than one actual physical drive/device, where you have 
multiple actual paths to the physical drives. Especially with parallel 
I/O it becomes a big deal.


More information about the ubuntu-users mailing list