[Bug 1885430] Re: [Bionic/Stein] Ceilometer-agent fails to collect metrics after restart

Jorge Niedbalski 1885430 at bugs.launchpad.net
Thu Mar 11 21:57:56 UTC 2021


With the proposed patch on stein:

--- nova-compute disabled / no requires  --- machine rebooted


root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status nova-compute
● nova-compute.service - OpenStack Compute
   Loaded: loaded (/lib/systemd/system/nova-compute.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
   
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status ceilometer-agent-compute
● ceilometer-agent-compute.service - Ceilometer Agent Compute
   Loaded: loaded (/lib/systemd/system/ceilometer-agent-compute.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2021-03-11 21:54:08 UTC; 57s ago
 Main PID: 851 (ceilometer-poll)
    Tasks: 6 (limit: 4702)
   CGroup: /system.slice/ceilometer-agent-compute.service
           ├─ 851 ceilometer-polling: master process [/usr/bin/ceilometer-polling --config-file=/etc/ceilometer/ceilometer.conf --polling-namespaces compute --log-file=/var/log/ceil
           └─3114 ceilometer-polling: AgentManager worker(0)

Mar 11 21:54:08 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started Ceilometer Agent Compute.
Mar 11 21:54:25 juju-bf8c6a-lm-ceilometer-7 ceilometer-agent-compute[851]: Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT".


--- nova-compute disabled / no required --- machine rebooted


ubuntu@juju-bf8c6a-lm-ceilometer-7:~$ sudo su
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# uptime
 21:56:25 up 0 min,  1 user,  load average: 1.67, 0.41, 0.14
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status nova-compute
● nova-compute.service - OpenStack Compute
   Loaded: loaded (/lib/systemd/system/nova-compute.service; disabled; vendor preset: enabled)
   Active: active (running) since Thu 2021-03-11 21:56:07 UTC; 20s ago
 Main PID: 2743 (nova-compute)
    Tasks: 22 (limit: 4702)
   CGroup: /system.slice/nova-compute.service
           └─2743 /usr/bin/python3 /usr/bin/nova-compute --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf --log-file=/var/log/nova/nova-compute.log

Mar 11 21:56:07 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started OpenStack Compute.
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status ceilometer-agent-compute
● ceilometer-agent-compute.service - Ceilometer Agent Compute
   Loaded: loaded (/lib/systemd/system/ceilometer-agent-compute.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2021-03-11 21:56:00 UTC; 32s ago
 Main PID: 861 (ceilometer-poll)
    Tasks: 6 (limit: 4702)
   CGroup: /system.slice/ceilometer-agent-compute.service
           ├─ 861 ceilometer-polling: master process [/usr/bin/ceilometer-polling --config-file=/etc/ceilometer/ceilometer.conf --polling-namespaces compute --log-file=/var/log/ceil
           └─1583 ceilometer-polling: AgentManager worker(0)

Mar 11 21:56:00 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started Ceilometer Agent Compute.
Mar 11 21:56:05 juju-bf8c6a-lm-ceilometer-7 ceilometer-agent-compute[861]: Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT".

-- 
https://bugs.launchpad.net/bugs/1885430

Title:
  [Bionic/Stein] Ceilometer-agent fails to collect metrics after restart

Status in OpenStack ceilometer-agent charm:
  Confirmed
Status in Ubuntu Cloud Archive:
  Triaged
Status in Ubuntu Cloud Archive stein series:
  Triaged
Status in Ubuntu Cloud Archive train series:
  Fix Committed
Status in Ubuntu Cloud Archive ussuri series:
  Triaged
Status in Ubuntu Cloud Archive victoria series:
  Triaged
Status in ceilometer package in Ubuntu:
  Triaged
Status in ceilometer source package in Focal:
  Triaged
Status in ceilometer source package in Groovy:
  Triaged
Status in ceilometer source package in Hirsute:
  Triaged

Bug description:
  Bionic/Stein - stable 20.05 charms
  Juju 2.7.6

  I am aware of: https://bugs.launchpad.net/charm-ceilometer-agent/+bug/1850846
  I decided to open a new bug because the previous one saw no activity and expired.

  After rebooting my cloud (rack-by-rack), I got into a situation where
  I could not collect memory.usage from VMs anymore.

  Looking at the output of: openstack metric resource --type instance <ID>
  I could not see memory.usage listed.

  I logged in to the ceilometer-agent unit and could see that the services were in active/running status, but the following log entries were present:
  Jun 27 22:34:09 sgdemr0114bp033 ceilometer-agent-compute[2244]: Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT".                                       
  Jun 27 22:34:09 sgdemr0114bp033 ceilometer-agent-compute[2244]: libvirt: XML-RPC error : Failed to connect socket to '/var/run/libvirt/libvirt-sock-ro': No such file or directory                               
  Jun 27 22:34:09 sgdemr0114bp033 ceilometer-agent-compute[2244]: message repeated 33 times: [ libvirt: XML-RPC error : Failed to connect socket to '/var/run/libvirt/libvirt-sock-ro': No such file or directory] 

  
  Running stat on that socket under /var/run shows:
  stat /var/run/libvirt/libvirt-sock-ro
    File: /var/run/libvirt/libvirt-sock-ro
    Size: 0               Blocks: 0          IO Block: 4096   socket
  Device: 17h/23d Inode: 1289        Links: 1
  Access: (0777/srwxrwxrwx)  Uid: (    0/    root)   Gid: (  118/ libvirt)
  Access: 2020-06-28 14:28:47.292838669 +0000
  Modify: 2020-06-27 22:34:11.010520529 +0000
  Change: 2020-06-27 22:34:11.010520529 +0000
   Birth: -

  
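  So I guess there is a race condition here: libvirt creates the socket after ceilometer-agent-compute has already tried to connect to it (the errors were logged at 22:34:09, while stat reports the socket's modify/change time as 22:34:11), and the agent then gives up and stops working.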

  Restarting the agent brings memory.usage back to normal.
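
The race described above can be illustrated with a small retry loop. This is only a sketch of the general technique (wait until the socket accepts connections instead of failing on the first attempt); it is not the ceilometer code and not the patch proposed for this bug. The function name wait_for_unix_socket and its timeout/interval parameters are hypothetical, chosen for illustration.

```python
import socket
import time


def wait_for_unix_socket(path, timeout=60.0, interval=1.0):
    """Return True once a UNIX stream socket at `path` accepts a connection,
    or False if `timeout` seconds elapse first (hypothetical helper)."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
                sock.connect(path)
            return True
        except OSError:
            # Covers FileNotFoundError (socket file not created yet) and
            # ConnectionRefusedError (file exists but nothing is listening).
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Called as wait_for_unix_socket("/var/run/libvirt/libvirt-sock-ro", timeout=30.0), a client would keep retrying for up to 30 seconds rather than giving up when the socket is created a couple of seconds late, as in the log above.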

  However, I still cannot see all the metrics as shown in:
  https://bugzilla.redhat.com/show_bug.cgi?id=1437927
