RFC: Ubuntu HA resource-agents supportability

Dan Streetman dan.streetman at canonical.com
Tue Apr 7 19:54:11 UTC 2020


On Tue, Mar 31, 2020 at 1:09 AM Rafael David Tinoco
<rafaeldtinoco at ubuntu.com> wrote:
>
> Hello,
>
> As many of you know, I'm currently revamping the Ubuntu High Availability
> packages.
>
> For 20.04, the packages considered HA (or HA-related) are:
>
> - Core packages:
>
>   - libqb
>   - kronosnet
>   - corosync
>   - pacemaker
>   - resource-agents
>   - fence-agents
>   - crmsh
>   - cluster-glue
>   - drbd-utils
>   - dlm
>   - gfs2-utils
>
> - "Deprecated" packages:
>
>   - heartbeat
>   - keepalived
>   - ocfs2-tools
>
> - Not in "main" packages:
>
>   - pcs (will likely replace crmsh in the near future)
>   - csync2
>   - corosync-qdevice
>   - fence-virt
>   - sbd
>   - booth
>
> - Related packages:
>
>   - multipath-tools
>   - open-iscsi
>   - sg3-utils
>   - targetcli-fb
>   - tgt (we're trying to deprecate in favor of LIO)
>   - lvm2
>
> For now, until Beta Freeze, we've been trying to catch up with the latest
> upstream releases. From now on, I'm going through the existing open bugs and
> addressing them with the latest upstream fixes plus any other fix needed
> (done for kronosnet, with an FFE opened, and for corosync, where I'm about
> to merge fixes).
>
> The next step is to document in the Server Guide all supported scenarios for
> HA-related packages. The intention is to describe the exact set of scenarios
> in which we know the clustering software behaves correctly AND the scenarios
> we cannot support.
>
> OBS: This includes whether an odd number of nodes/votes is needed, whether
> proper fencing mechanisms are needed (and which fencing mechanisms to
> support) AND, finally, which *resource agents* to support.
>
> I'll probably ask for other feedback soon but, for the moment, I'm asking for
> comments on the list of resource agents below. I tried to split them up and
> explain what the resources are used for and whether they are supported in
> Ubuntu or not (or whether the related managed service is in [main] or in
> [universe]).
>
> So please take some time to provide feedback on this list, including whether
> we should move resources from one category to another. *NOTE* that I'm not
> giving the "fence agents" list yet. That will be another list.
>
> I'm particularly interested in feedback from @jamespage and @ddstreet, as
> they probably have good intel about resource usage, BUT anyone is welcome to
> provide comments!
>
> Thank you very much in advance!

I added a few comments below; otherwise all the categories look
reasonable to me, thanks!

>
> #### RFC: Ubuntu HA resource-agents supportability
>
> #
> ## FULLY SUPPORTED (managed service is likely in main or is important enough)
> #
>
>     # trivial agents
>
> Delay                   - test resource for introducing delay
> MailTo                  - sends email to a sysadmin whenever a takeover occurs
> ClusterMon              - runs crm_mon periodically to write an HTML page
> HealthCPU               - measures CPU idling and updates #health-cpu attr
> HealthIOWait            - measures CPU iowait and updates #health-iowait attr
> HealthSMART             - checks drive SMART status, updates #health-smart attr
>
>     # services
>
> apache                  - apache web server instance
> dovecot                 - dovecot IMAP/POP3 server instance
> dhcpd                   - chrooted ISC dhcp server instance
> mysql                   - MySQL database instance
> mysql-proxy             - MySQL proxy instance
> named                   - bind/named server instance
> nfsnotify               - nfs sm-notify reboot notifications daemon
> nfsserver               - nfs server resource
> nginx                   - Nginx web/proxy server instance
> postfix                 - postfix mail server instance
> rabbitmq-cluster        - cloned rabbitmq cluster instance
> remote                  - pacemaker remote resource agent
> rsyncd                  - rsyncd instance
> rsyslog                 - rsyslogd instance
> slapd                   - stand-alone LDAP daemon instance
> Squid                   - squid proxy server instance
> vsftpd                  - vsftpd server instance
>
>     # storage
>
> Raid1                   - software RAID (MD) devices on shared storage
> iscsi                   - local iscsi initiator and its conns to targets
> iSCSILogicalUnit        - iSCSI logical units
> iSCSITarget             - iSCSI target export agent (implementation: tgt, lio)
> LVM                     - LVM volume as an HA resource
> LVM-activate            - LVM activation/deact work for a given VG
>                           (lvmlockd+LVM-activate OR clvm+LVM-activate)
> Filesystem              - filesystem on a shared storage medium
> symlink                 - symbolic link
> ZFS                     - ZFS pools import/export
>
>     # locking & reservations
>
> controld                - distributed lock manager for clustered FSs
> clvm                    - clvmd daemon (cluster logical vol manager)

Was clvm dropped from lvm2?
https://launchpad.net/ubuntu/+source/lvm2/2.03.02-2ubuntu1
I haven't used clustered lvm myself; maybe it was just rolled into lvm2.

> lvmlockd                - agent manages the lvmlockd daemon.
> mpathpersist            - SCSI persistent reservations on mpath devs
> sg_persist              - master/slave resource for SCSI3 reservations
>
>     # networking
>
> Route                   - network routes
> iface-bridge            - bridge network interfaces
> iface-vlan              - vlan network interfaces
> IPaddr2                 - virtual IPv4 and IPv6 addresses
> ipsec                   - ipsec tunnels for VIPs
> IPsrcaddr               - preferred source address modification
> IPv6addr                - IPv6 aliases
> conntrackd              - conntrackd instance
> SendArp                 - send gratuitous ARP for IP address
> VIPArip                 - virtual IP address through RIP2
> ifspeed                 - monitor action runs -> updates CIB with if speed
>
>     # virtualization
>
> VirtualDomain           - manages virtual domains through libvirt
>                           (virtual machine only)
>
>     # containers
>
> docker                  - docker container resource agent

as cpaelzer said, docker itself shouldn't be in the fully supported list.

> lxc                     - allows LXC containers to be managed by the cluster

presumably, this includes lxc and lxd?

>
> #
> ## BEST EFFORT SUPPORT (managed service is likely in universe or is interesting)
> #
>
>   # trivial agents
>
> anything                - generic agent to manage virtually *anything*
> Dummy                   - testing dummy resource agent (template for RA writers)
> AudibleAlarm            - audible beeps at interval
> Stateful                - example agent that supports two states
> WinPopup                - sends a SMB notification msg (popup) to a host
>
>   # services
>
> asterisk                - asterisk PBX
> CTDB                    - clustered samba (needs an underlying clustered FS)
> dnsupdate               - ip take-over via dynamic dns updates
> exportfs                - nfs exports (not the nfs server)

wouldn't this be fully supported?

> fio                     - fio instance
> galera                  - galera instance
> garbd                   - galera arbitrator instance
> jboss                   - JBoss application server instance
> jira                    - JIRA server instance
> kamailio                - kamailio SIP proxy/registrar instance
> mariadb                 - MariaDB master/slave instance
> nagios                  - nagios instance
> ovsmonitor              - clone resource to monitor network bonds on diff nodes
> pgagent                 - pgagent instance
> pgsql                   - pgsql database instance

shouldn't this be in fully supported?

also Brett (I added to cc) brought up that resource-agents-paf might
be worth considering supporting:
https://launchpad.net/ubuntu/+source/resource-agents-paf

> pound                   - pound reverse proxy load-bal server instance
> proftpd                 - proftpd instance
> Pure-FTPd               - pure-ftpd instance
> redis                   - redis server instance (supports master/slave replicas)
> syslog-ng               - syslog-ng instance
> tomcat                  - tomcat servlet environment instance
> varnish                 - varnish instance
>
>     # storage
>
> AoEtarget               - ata over ethernet
>
>     # networking
>
> IPaddr                  - virtual IPv4 addresses
> ocf:pacemaker:ping      - records in CIB number of nodes host can connect to
> portblock               - temporarily block/unblock access to tcp/udp ports
>
>     # openstack
>
> openstack-cinder-volume - attach cinder vol to an instance (os-info <->)
> openstack-floating-ip   - move a floating IP from an instance to another

I would expect both these to be in the fully supported category?

>
>     # registration (CIB)
>
> lxd-info                - records nr of running lxd containers in CIB
> machine-info            - records various node attributes in CIB
> NodeUtilization         - cpu, host mem, hypervisor mem etc... into CIB
> openstack-info          - records attributes of a node into CIB
> SysInfo                 - records various node attributes into CIB
> SystemHealth            - monitors health of system using IPMI
> attribute               - sets node attr one way when started and vice-versa
>
> #
> ## COMMUNITY SUPPORT (bugs opened here will be forwarded to upstream directly)
> #
>
>     # services
>
> SphinxSearchDaemon      - sphinx search daemon
> Xinetd                  - start/stop services managed by xinetd
> zabbixserver            - zabbix server instance
>
>     # storage
>
> o2cb                    - oracle cluster filesystem userspace daemon (oracle)
> sfex                    - excl access to shared storage using SF-EX
>
>     # virtualization
>
> aliyun-vpc-move-ip      - move ip within a vpc of the aliyun ecs (alibaba)
> awseip                  - manages aws elastic IP address (aws)
> awsvip                  - manages aws secondary private ip addresses (aws)
> aws-vpc-move-ip         - move ip within a vpc of the aws ec2 (aws)
> aws-vpc-route53         - update route53 vpc record for aws ec2 (aws)
> azure-events            - monitor for scheduled events for azure vm (azure)
> azure-lb                - answers azure load balancer health probe req (azure)
> gcp-vpc-move-ip         - floating ip address within a GCP VPC (google)
> ManageVE                - openVZ virtual environment (virtuozzo)
> minio                   - minio server instance
> podman                  - creates/launches podman containers
> rkt                     - creates/launches container based on supplied image
>
> #
> ## UNSUPPORTED (Ubuntu does not support it)
> #
>
> db2                     - manages IBM DB2 LUW databases (IBM)
> eDir88                  - Novell eDirectory directory server instance (novell)
> ICP                     - ICP vortex clustered host drive (intel)
> ids                     - IBM informix dynamic server (IDS) (IBM)
> SAPDatabase             - SAP database (of any type) instance agent (SAP)
> SAPInstance             - SAP application server instances agent (SAP)
> ServeRAID               - enables/disables shared serveRAID merge groups (IBM)
> ManageRAID              - raid devices (/etc/conf.d/HB-ManageRAID)
> oraasm                  - oracle asm agent, uses ohasd for asm disk grp (oracle)
> oracle                  - oracle database instance (oracle)
> oralsnr                 - oracle TNS listener (oracle)
> sybaseASE               - sybase ASE failover instance (Sybase)
> vdo-vol                 - https://bugs.launchpad.net/ubuntu/+bug/1869825
> WAS                     - websphere application server instance (IBM)
> WAS6                    - websphere application server instance (IBM)
> Xen                     - xen unprivileged domains

as cpaelzer mentioned, Xen should probably move up to the 'best
effort' section; this was just moved out of main in focal.

>
> #
> ## DEPRECATED (do not use)
> #
>
> Evmsd                   - clustered evms vol mgmt (evms is not maintained)
> EvmsSCC                 - clustered evms vol mgmt (evms is not maintained)
> LinuxSCSI               - enables/disables scsi devs through kernel scsi hotplug
> scsi2reservation        - SCSI-2 reservation agent (depends on "scsi_reserve")
> ocf:heartbeat:pingd     - monitors connectivity to specific hosts
> ocf:pacemaker:pingd     - replaced by pacemaker:ping (this is broken)
> vmware                  - control vmware server 2.0 virtual machines (2009)
>
>


