Red Hat Cluster Suite / Handling of shutdown and mounting of GFS

Chris Joelly chris-m-lists at joelly.net
Fri Sep 5 10:51:42 UTC 2008


On Fri, Sep 05, 2008, Ante Karamatic wrote:
> On Thu, 4 Sep 2008 23:55:47 +0200
> Chris Joelly <chris-m-lists at joelly.net> wrote:
> 
> > The cluster with 2 nodes is running, but I don't know how to remove
> > one node the correct way. I can move the active service (just an IP
> > address for now) to the second node and then want to remove the other
> > node from the running cluster. cman_tool leave remove should be used
> > for this, as recommended in the RH documentation. But if I try
> > that I get the error message:
> > 
> > root at store02:/etc/cluster# cman_tool leave remove
> > cman_tool: Error leaving cluster: Device or resource busy
> 
> A two-node cluster in RHCS is a special case and people should avoid it.
> I have one cluster with two nodes and I just hate it. Split brain is
> very common (and only possible) in a two-node cluster.

ack. But the error I get has nothing to do with split brain. And I'm
trying to figure out which device rhcs is using, and why it therefore
cannot remove the node. The service that was hosted on store02 was
successfully moved to store01, so could this be a mistake in cman_tool?
But how can I find this device? Using strace and lsof I was not able to
track it down :-/
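
Maybe the "busy" simply means that some other cluster daemon (rgmanager,
clvmd, or the gfs mounts) is still joined to cman and holding it open.
As a next step I will try a shutdown order roughly like the following
before the leave; the init script names are just what I think the Ubuntu
packages install, so they may well differ:

  /etc/init.d/rgmanager stop    # stop the resource manager first
  /etc/init.d/gfs-tools stop    # unmount the GFS filesystems (or umount by hand)
  /etc/init.d/clvm stop         # stop clvmd if the cluster-aware VG is active
  cman_tool services            # check that no subsystem is still joined
  cman_tool leave remove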

> > The only way to get out of this problem is to restart the whole
> > cluster, which brings down the service(s) and results in unnecessary
> > fencing... Is there a known way to remove one node from the cluster
> > without bringing down the whole cluster?
> 
> I've managed to bring one down, but as soon as it's back up, the whole
> rhcs gets unusable. Reboot helps :/

Does that mean that the whole rhcs stuff is rather useless? Or may I
assume that rhcs in RHEL or CentOS is much better integrated and tested
than in Ubuntu Server, and therefore worth the subscription costs at
Red Hat or the switch to CentOS?

> > Another strange thing comes up when I try to use GFS:
> > 
> > I have configured DRBD on top of a backing HW RAID10 device, use LVM2
> > to build a cluster-aware VG, and on top of that use LVs with GFS
> > across the two cluster nodes.
> > 
> > Listing the GFS filesystems without noauto in fstab doesn't get them
> > mounted at boot by /etc/init.d/gfs-tools. I think this is due to the
> > order in which the sysv init scripts are started. All RHCS stuff is
> > started from within rcS, and drbd is started from within rc2. I read
> > the section of the debian-policy to figure out whether rcS is meant
> > to run before rc2, but this isn't mentioned in the policy. So I
> > assume that drbd is started in rc2 after rcS, which would mean that
> > no filesystem on top of drbd can be mounted at boot time...
> > Can anybody confirm this?
> 
> I also use GFS on top of DRBD, and your observations are correct. But
> you really don't want GFS mounted before DRBD is up :D If there's a
> filesystem on DRBD, the DRBD device must be primary before you try to
> mount the filesystem. If this DRBD node was out of sync for a long
> time, becoming primary can take a while.
> 
> This is why I don't set up nodes to boot up automatically. I'd rather
> connect to the freshly booted node, start the DRBD sync and then
> manually mount the filesystem and start rhcs.

ack. I'm glad to see that my conclusions are not far from reality ;)
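
If I understand the ordering right, the manual bring-up on one node
would roughly be the following (untested so far; the DRBD resource, VG
and mount point names are only placeholders, and again the init script
names on Ubuntu may differ):

  /etc/init.d/cman start        # join the cluster (fenced, dlm)
  drbdadm up r0                 # "r0" is a placeholder resource name
  cat /proc/drbd                # wait until the device is UpToDate
  drbdadm primary r0            # primary on both nodes for primary/primary
  /etc/init.d/clvm start        # clvmd, so the cluster-aware VG can be activated
  vgchange -ay vg_cluster       # placeholder VG name
  mount -t gfs /dev/vg_cluster/lv_data /srv/data
  /etc/init.d/rgmanager start   # only now let rgmanager start the services

The matching fstab entry would then be kept out of the automatic
boot-time mounts with noauto:

  /dev/vg_cluster/lv_data  /srv/data  gfs  noauto,defaults  0 0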

> These things should be easier once we put upstart in use.

upstart? aha. sounds interesting... never heard of this before.

> > The reason why I try to mount a GFS filesystem at boot time is that
> > I want to build cluster services on top of it, and those services
> > (more than one) rely on one fs. A better solution would be to define
> > a shared GFS filesystem resource which could be used across more than
> > one cluster service, with the cluster taking care that the filesystem
> > is only mounted once...
> > Can this be achieved with RHCS?
> 
> You can have the same filesystem mounted on both nodes at the same
> time: DRBD primary-primary + GFS on top of it.

This is the way I use DRBD-LVM2-GFS on my 2-node cluster. But as I
understand cluster.conf and system-config-cluster, I have to define
resources per service. If, e.g., I want to create 2 services which both
rely on the same GFS mount and are expected to run on the same node,
then I don't know how to share this GFS resource. Does the resource
manager notice that the GFS resource is already mounted from starting
service1 on node1 when it decides to bring up service2 on node1 too?
Or, e.g., if I set up the cluster so that each node is the failover node
for the other, the services have a GFS resource defined which would
trigger a GFS mount that is already there on the failover node. Would
that 'double' mount cause a failed start of the failed-over service?
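
What I had pictured in cluster.conf is a single clusterfs resource in
the global resources block that both services only reference. Something
like the following sketch is what I have in mind (untested; the
attribute names are what I remember from the RHCS documentation, and
names like "gfsdata", the device path and the IP address are only
placeholders):

  <cman two_node="1" expected_votes="1"/>  <!-- two-node special case -->
  <rm>
    <resources>
      <clusterfs name="gfsdata" fstype="gfs" mountpoint="/srv/data"
                 device="/dev/vg_cluster/lv_data" force_unmount="0"/>
    </resources>
    <service name="service1" autostart="1">
      <ip address="192.168.0.10" monitor_link="1"/>
      <clusterfs ref="gfsdata"/>
    </service>
    <service name="service2" autostart="1">
      <clusterfs ref="gfsdata"/>
    </service>
  </rm>

But I still don't know whether rgmanager is clever enough to notice that
the filesystem is already mounted when it starts the second service on
the same node, or whether I should keep the GFS mount outside of
rgmanager entirely and only manage the IP addresses as cluster resources.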

Chris

-- 
"The greatest proof that intelligent life other that humans exists in
 the universe is that none of it has tried to contact us!"




