[Bug 1885430] Re: [Bionic/Stein] Ceilometer-agent fails to collect metrics after restart

Jorge Niedbalski 1885430 at bugs.launchpad.net
Thu Mar 18 21:26:45 UTC 2021


---> Installed version

root at juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# dpkg -l |grep -i ceilometer
ii  ceilometer-agent-compute             1:12.1.1-0ubuntu1~cloud1                    all          ceilometer compute agent
ii  ceilometer-common                    1:12.1.1-0ubuntu1~cloud1                    all          ceilometer common files
ii  python3-ceilometer                   1:12.1.1-0ubuntu1~cloud1                    all          ceilometer python libraries


I ran through two cases:

1) Service restart
2) Reboot

---> Service restart case


root at juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status ceilometer-agent-compute
● ceilometer-agent-compute.service - Ceilometer Agent Compute
   Loaded: loaded (/lib/systemd/system/ceilometer-agent-compute.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2021-03-18 21:20:01 UTC; 2min 35s ago
 Main PID: 27650 (ceilometer-poll)
    Tasks: 6 (limit: 4702)
   CGroup: /system.slice/ceilometer-agent-compute.service
           ├─27650 ceilometer-polling: master process [/usr/bin/ceilometer-polling --config-file=/etc/ceilometer/ceilometer.conf --polling-namespaces compute --log-file=/var/log/cei
           └─27735 ceilometer-polling: AgentManager worker(0)

Mar 18 21:20:01 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Stopped Ceilometer Agent Compute.
Mar 18 21:20:01 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started Ceilometer Agent Compute.
Mar 18 21:20:03 juju-bf8c6a-lm-ceilometer-7 ceilometer-agent-compute[27650]: Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT
root at juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status nova-compute
● nova-compute.service - OpenStack Compute
   Loaded: loaded (/lib/systemd/system/nova-compute.service; disabled; vendor preset: enabled)
   Active: active (running) since Thu 2021-03-18 18:46:56 UTC; 2h 35min ago
 Main PID: 2199 (nova-compute)
    Tasks: 22 (limit: 4702)
   CGroup: /system.slice/nova-compute.service
           └─2199 /usr/bin/python3 /usr/bin/nova-compute --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf --log-file=/var/log/nova/nova-compute.log

Mar 18 18:46:56 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started OpenStack Compute.


--

root at juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl stop nova-compute
root at juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl disable nova-compute.service
Synchronizing state of nova-compute.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install disable nova-compute
root at juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status nova-compute
● nova-compute.service - OpenStack Compute
   Loaded: loaded (/lib/systemd/system/nova-compute.service; disabled; vendor preset: enabled)
   Active: inactive (dead) since Thu 2021-03-18 21:23:30 UTC; 7s ago
 Main PID: 2199 (code=exited, status=0/SUCCESS)

Mar 18 18:46:56 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started OpenStack Compute.
Mar 18 21:23:24 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Stopping OpenStack Compute...
Mar 18 21:23:30 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Stopped OpenStack Compute.
root at juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# 


root at juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status ceilometer-agent-compute
● ceilometer-agent-compute.service - Ceilometer Agent Compute
   Loaded: loaded (/lib/systemd/system/ceilometer-agent-compute.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Thu 2021-03-18 21:23:24 UTC; 29s ago
 Main PID: 761 (code=exited, status=0/SUCCESS)

Mar 18 21:23:13 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started Ceilometer Agent Compute.
Mar 18 21:23:14 juju-bf8c6a-lm-ceilometer-7 ceilometer-agent-compute[761]: Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT".
Mar 18 21:23:24 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Stopping Ceilometer Agent Compute...
Mar 18 21:23:24 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Stopped Ceilometer Agent Compute.



root at juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# /etc/init.d/ceilometer-agent-compute restart
[ ok ] Restarting ceilometer-agent-compute (via systemctl): ceilometer-agent-compute.service.
root at juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status ceilometer-agent-compute
● ceilometer-agent-compute.service - Ceilometer Agent Compute
   Loaded: loaded (/lib/systemd/system/ceilometer-agent-compute.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2021-03-18 21:24:13 UTC; 2s ago
 Main PID: 1549 (ceilometer-poll)
    Tasks: 6 (limit: 4702)
   CGroup: /system.slice/ceilometer-agent-compute.service
           ├─1549 ceilometer-polling: master process [/usr/bin/ceilometer-polling --config-file=/etc/ceilometer/ceilometer.conf --polling-namespaces compute --log-file=/var/log/ceil
           └─1604 ceilometer-polling: AgentManager worker(0)

Mar 18 21:24:13 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started Ceilometer Agent Compute.
Mar 18 21:24:14 juju-bf8c6a-lm-ceilometer-7 ceilometer-agent-compute[1549]: Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT"
root at juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status nova-compute
● nova-compute.service - OpenStack Compute
   Loaded: loaded (/lib/systemd/system/nova-compute.service; disabled; vendor preset: enabled)
   Active: active (running) since Thu 2021-03-18 21:24:13 UTC; 6s ago
 Main PID: 1548 (nova-compute)
    Tasks: 22 (limit: 4702)
   CGroup: /system.slice/nova-compute.service
           └─1548 /usr/bin/python3 /usr/bin/nova-compute --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf --log-file=/var/log/nova/nova-compute.log

Mar 18 21:24:13 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started OpenStack Compute.

---> Reboot testing


root at juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl disable nova-compute.service
Synchronizing state of nova-compute.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install disable nova-compute
root at juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl stop nova-compute
root at juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# reboot


root at juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# uptime
 21:25:44 up 0 min,  1 user,  load average: 1.60, 0.38, 0.13
root at juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status nova-compute
● nova-compute.service - OpenStack Compute
   Loaded: loaded (/lib/systemd/system/nova-compute.service; disabled; vendor preset: enabled)
   Active: active (running) since Thu 2021-03-18 21:25:32 UTC; 13s ago
 Main PID: 3099 (nova-compute)
    Tasks: 22 (limit: 4702)
   CGroup: /system.slice/nova-compute.service
           └─3099 /usr/bin/python3 /usr/bin/nova-compute --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf --log-file=/var/log/nova/nova-compute.log

Mar 18 21:25:32 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started OpenStack Compute.
root at juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status ceilometer-agent-compute
● ceilometer-agent-compute.service - Ceilometer Agent Compute
   Loaded: loaded (/lib/systemd/system/ceilometer-agent-compute.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2021-03-18 21:25:32 UTC; 15s ago
 Main PID: 3100 (ceilometer-poll)
    Tasks: 6 (limit: 4702)
   CGroup: /system.slice/ceilometer-agent-compute.service
           ├─3100 ceilometer-polling: master process [/usr/bin/ceilometer-polling --config-file=/etc/ceilometer/ceilometer.conf --polling-namespaces compute --log-file=/var/log/ceil
           └─3229 ceilometer-polling: AgentManager worker(0)

Mar 18 21:25:32 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started Ceilometer Agent Compute.
Mar 18 21:25:35 juju-bf8c6a-lm-ceilometer-7 ceilometer-agent-compute[3100]: Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT"
root at juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# 
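
In the reboot output above, nova-compute and ceilometer-agent-compute both report "active (running) since Thu 2021-03-18 21:25:32 UTC". A minimal sketch for making that comparison without reading the timestamps by eye could look like the following. The helper name within_seconds and the 5-second tolerance are illustrative assumptions, and the commented usage assumes GNU date and a systemd new enough to support "systemctl show --value".

```shell
#!/bin/sh
# Hypothetical helper (not part of the original verification):
# succeed if two epoch timestamps are at most $3 seconds apart (default 5).
within_seconds() {
    a=$1
    b=$2
    tol=${3:-5}
    diff=$((a - b))
    # Take the absolute value of the difference.
    if [ "$diff" -lt 0 ]; then
        diff=$((0 - diff))
    fi
    [ "$diff" -le "$tol" ]
}

# Example usage on the compute node (assumes GNU date):
# nova=$(date -d "$(systemctl show -p ActiveEnterTimestamp --value nova-compute)" +%s)
# ceil=$(date -d "$(systemctl show -p ActiveEnterTimestamp --value ceilometer-agent-compute)" +%s)
# within_seconds "$nova" "$ceil" 5 && echo "units became active together"
```

This only compares activation times; it says nothing about why the units started together.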




** Tags removed: verification-stein-needed
** Tags added: verification-stein-done

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceilometer in Ubuntu.
https://bugs.launchpad.net/bugs/1885430

Title:
  [Bionic/Stein] Ceilometer-agent fails to collect metrics after restart

Status in OpenStack ceilometer-agent charm:
  Confirmed
Status in Ubuntu Cloud Archive:
  Fix Committed
Status in Ubuntu Cloud Archive stein series:
  Fix Committed
Status in Ubuntu Cloud Archive train series:
  Fix Committed
Status in Ubuntu Cloud Archive ussuri series:
  Fix Committed
Status in Ubuntu Cloud Archive victoria series:
  Fix Committed
Status in ceilometer package in Ubuntu:
  Fix Released
Status in ceilometer source package in Focal:
  Fix Committed
Status in ceilometer source package in Groovy:
  Fix Committed
Status in ceilometer source package in Hirsute:
  Fix Released

Bug description:
  Bionic/Stein - stable 20.05 charms
  Juju 2.7.6

  I am aware of: https://bugs.launchpad.net/charm-ceilometer-agent/+bug/1850846
  I decided to open a new bug since there was no activity on the previous one and it expired.

  After rebooting my cloud (rack-by-rack), I got into a situation where
  I could not collect memory.usage from VMs anymore.

  Looking into: openstack metric resource --type instance <ID>
  I could not see memory.usage there.

  I accessed the ceilometer-agent unit and could see that the services were in active/running status, but the following log entries were present:
  Jun 27 22:34:09 sgdemr0114bp033 ceilometer-agent-compute[2244]: Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT".                                       
  Jun 27 22:34:09 sgdemr0114bp033 ceilometer-agent-compute[2244]: libvirt: XML-RPC error : Failed to connect socket to '/var/run/libvirt/libvirt-sock-ro': No such file or directory                               
  Jun 27 22:34:09 sgdemr0114bp033 ceilometer-agent-compute[2244]: message repeated 33 times: [ libvirt: XML-RPC error : Failed to connect socket to '/var/run/libvirt/libvirt-sock-ro': No such file or directory] 

  
  Running stat on that socket under /var/run shows:
  stat /var/run/libvirt/libvirt-sock-ro
    File: /var/run/libvirt/libvirt-sock-ro
    Size: 0               Blocks: 0          IO Block: 4096   socket
  Device: 17h/23d Inode: 1289        Links: 1
  Access: (0777/srwxrwxrwx)  Uid: (    0/    root)   Gid: (  118/ libvirt)
  Access: 2020-06-28 14:28:47.292838669 +0000
  Modify: 2020-06-27 22:34:11.010520529 +0000
  Change: 2020-06-27 22:34:11.010520529 +0000
   Birth: -

  
  So I guess there is a race condition here: libvirt opens the socket after ceilometer-agent-compute has already tried to connect to it, and the agent then gives up and stops working.

  Restarting it restores memory.usage back to normal.
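
As a rough illustration of the manual workaround described above (not the packaged fix), one could wait for the libvirt read-only socket to appear before restarting the agent. The helper name wait_for_socket and the timeout values are hypothetical; the socket path is the one from the log messages.

```shell
#!/bin/sh
# Hypothetical helper: poll once per second until PATH is a UNIX socket.
# Returns 0 when the socket exists, 1 if TIMEOUT seconds (default 30) pass first.
wait_for_socket() {
    path=$1
    timeout=${2:-30}
    waited=0
    while [ ! -S "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example usage on an affected node (run as root):
# wait_for_socket /var/run/libvirt/libvirt-sock-ro 60 \
#     && systemctl restart ceilometer-agent-compute
```

The test uses -S rather than -e so that a stale regular file at the same path is not mistaken for a ready socket.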

  However, I still cannot see all the metrics as shown in:
  https://bugzilla.redhat.com/show_bug.cgi?id=1437927



