[Bug 1885430] Re: [Bionic/Stein] Ceilometer-agent fails to collect metrics after restart

Corey Bryant 1885430 at bugs.launchpad.net
Tue Mar 9 20:44:40 UTC 2021


I agree with Drew that adding Should-Start: nova-compute makes sense for
ceilometer-agent. It is also a fairly non-invasive change, since it
doesn't block ceilometer-agent from starting when nova-compute is not
present, in case that setup ever makes sense.

** Also affects: ceilometer (Ubuntu)
   Importance: Undecided
       Status: New

** Also affects: ceilometer (Ubuntu Hirsute)
   Importance: Undecided
       Status: New

** Also affects: ceilometer (Ubuntu Focal)
   Importance: Undecided
       Status: New

** Also affects: ceilometer (Ubuntu Groovy)
   Importance: Undecided
       Status: New

** Changed in: ceilometer (Ubuntu Focal)
       Status: New => Triaged

** Changed in: ceilometer (Ubuntu Groovy)
       Status: New => Triaged

** Changed in: ceilometer (Ubuntu Hirsute)
       Status: New => Triaged

** Changed in: ceilometer (Ubuntu Focal)
   Importance: Undecided => Medium

** Changed in: ceilometer (Ubuntu Groovy)
   Importance: Undecided => Medium

** Changed in: ceilometer (Ubuntu Hirsute)
   Importance: Undecided => Medium

** Also affects: cloud-archive
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/train
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/victoria
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/stein
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/ussuri
   Importance: Undecided
       Status: New

** Changed in: cloud-archive/stein
   Importance: Undecided => Medium

** Changed in: cloud-archive/stein
       Status: New => Triaged

** Changed in: cloud-archive/train
   Importance: Undecided => Medium

** Changed in: cloud-archive/train
       Status: New => Triaged

** Changed in: cloud-archive/ussuri
   Importance: Undecided => Medium

** Changed in: cloud-archive/ussuri
       Status: New => Triaged

** Changed in: cloud-archive/victoria
   Importance: Undecided => Medium

** Changed in: cloud-archive/victoria
       Status: New => Triaged

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceilometer in Ubuntu.
https://bugs.launchpad.net/bugs/1885430

Title:
  [Bionic/Stein] Ceilometer-agent fails to collect metrics after restart

Status in OpenStack ceilometer-agent charm:
  Confirmed
Status in Ubuntu Cloud Archive:
  Triaged
Status in Ubuntu Cloud Archive stein series:
  Triaged
Status in Ubuntu Cloud Archive train series:
  Triaged
Status in Ubuntu Cloud Archive ussuri series:
  Triaged
Status in Ubuntu Cloud Archive victoria series:
  Triaged
Status in ceilometer package in Ubuntu:
  Triaged
Status in ceilometer source package in Focal:
  Triaged
Status in ceilometer source package in Groovy:
  Triaged
Status in ceilometer source package in Hirsute:
  Triaged

Bug description:
  Bionic/Stein - stable 20.05 charms
  Juju 2.7.6

  I am aware of: https://bugs.launchpad.net/charm-ceilometer-agent/+bug/1850846
  I decided to open a new bug because the previous one had no activity and expired.

  After rebooting my cloud (rack-by-rack), I got into a situation where
  I could not collect memory.usage from VMs anymore.

  Looking at the output of: openstack metric resource --type instance <ID>
  I could not see memory.usage there.

  I accessed the ceilometer-agent unit and could see the services were in active/running status, but the following log entries were present:
  Jun 27 22:34:09 sgdemr0114bp033 ceilometer-agent-compute[2244]: Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT".                                       
  Jun 27 22:34:09 sgdemr0114bp033 ceilometer-agent-compute[2244]: libvirt: XML-RPC error : Failed to connect socket to '/var/run/libvirt/libvirt-sock-ro': No such file or directory                               
  Jun 27 22:34:09 sgdemr0114bp033 ceilometer-agent-compute[2244]: message repeated 33 times: [ libvirt: XML-RPC error : Failed to connect socket to '/var/run/libvirt/libvirt-sock-ro': No such file or directory] 

  
  Running stat on that socket under /var/run shows:
  stat /var/run/libvirt/libvirt-sock-ro
    File: /var/run/libvirt/libvirt-sock-ro
    Size: 0               Blocks: 0          IO Block: 4096   socket
  Device: 17h/23d Inode: 1289        Links: 1
  Access: (0777/srwxrwxrwx)  Uid: (    0/    root)   Gid: (  118/ libvirt)
  Access: 2020-06-28 14:28:47.292838669 +0000
  Modify: 2020-06-27 22:34:11.010520529 +0000
  Change: 2020-06-27 22:34:11.010520529 +0000
   Birth: -

  
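  So I guess there is a race condition here: libvirt opens the socket after ceilometer-agent-compute has already tried to connect to it, and the agent then gives up and stops working. The socket's Modify/Change time above (22:34:11) is two seconds later than the logged connection failures (22:34:09), which fits that ordering.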

  Restarting ceilometer-agent-compute restores memory.usage.
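
  To illustrate the suspected race, here is a minimal Python sketch of waiting
  for a Unix socket to become connectable before using it, instead of giving up
  on the first failure. It is not how ceilometer-agent-compute or libvirt
  behaves; the function name wait_for_unix_socket and its timeout/interval
  parameters are hypothetical and chosen only for illustration.

```python
import os
import socket
import stat
import time


def wait_for_unix_socket(path, timeout=30.0, interval=0.5):
    """Return True once `path` is a Unix socket that accepts connections.

    Hypothetical helper: polls until the socket exists and a connect()
    succeeds, or returns False when `timeout` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # The path must exist and be a socket, not a regular file.
            if stat.S_ISSOCK(os.stat(path).st_mode):
                with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
                    s.connect(path)
                return True
        except OSError:
            # Covers "No such file or directory" and "Connection refused":
            # the socket is not there yet, or nothing is listening on it yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Example use (path taken from the log above):
#   if not wait_for_unix_socket("/var/run/libvirt/libvirt-sock-ro"):
#       raise SystemExit("libvirt read-only socket never became available")
```

  Ordering the services at boot is the cleaner fix; a retry like this only
  shows why a single failed connection attempt at startup is fragile.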

  However, I still cannot see all the metrics as shown in:
  https://bugzilla.redhat.com/show_bug.cgi?id=1437927

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-ceilometer-agent/+bug/1885430/+subscriptions
