[Bug 1909399] Re: Exception during removal of OSD blacklist entries

Thorbjørn Weidemann 1909399 at bugs.launchpad.net
Mon Jan 4 19:54:24 UTC 2021


Thanks for responding :-)

I am not a Ceph expert either.

Steps to reproduce:
Short version: On a running Ceph cluster with at least one OSD blacklist entry, install ceph-iscsi.

Long version:
Installing Ceph is complicated, so here is a way to do it with ansible. I know this is a lot, but believe me: this is the EASY way.

Normally you would install a Ceph cluster on at least 3 servers, but
the steps below set it up on a single machine.

Start with a fresh install of Ubuntu 20.04.1 LTS server. It can be in a VM, but make sure you have 3 extra disks attached and at least 4GB RAM. Below I assume the install is on sda, and that sdb, sdc and sdd are the attached blank disks (I have used 10GB for each).
Make sure the install is up to date:
thw@ff-ceph-4:~$ sudo apt update
thw@ff-ceph-4:~$ sudo apt dist-upgrade
thw@ff-ceph-4:~$ sudo reboot

Here I use the default admin user thw and assume the hostname is ff-ceph-4.
Make sure ff-ceph-4 resolves to the external IP address of the local machine, e.g. by adding it to /etc/hosts.

Make sure user thw can sudo any command without a password by running sudo visudo and adding the line:
thw    ALL=(ALL) NOPASSWD:ALL

thw@ff-ceph-4:~$ sudo apt install ansible git

Create user ansible with password ansible:
thw@ff-ceph-4:~$ sudo adduser ansible

thw@ff-ceph-4:~$ su - ansible

Get the Ceph ansible playbooks:
ansible@ff-ceph-4:~$ git clone https://github.com/ceph/ceph-ansible.git
ansible@ff-ceph-4:~$ cd ceph-ansible
ansible@ff-ceph-4:~$ git checkout stable-5.0

Copy the attached all.yml to the group_vars dir in ceph-ansible. I advise you to diff it against all.yml.sample to see what I have changed. Make sure the monitor_interface: line lists the name of your network interface, and that public_network: is the network that interface is on.
Copy the attached inventory to the current dir (/home/ansible/ceph-ansible).
NOTE: I could only attach ONE file, so the inventory will be attached in a new comment below.

Make sure user ansible can log in to the thw account with an ssh key, then run the playbook:
ansible@ff-ceph-4:~$ ssh-keygen
ansible@ff-ceph-4:~$ ssh-copy-id thw@ff-ceph-4
ansible@ff-ceph-4:~$ cp site.yml.sample site.yml
ansible@ff-ceph-4:~$ ansible-playbook -i inventory site.yml

This will take a few minutes to run.
If something goes wrong see if you can fix it, and re-run the ansible-playbook command.

At the end you should hopefully have a running ceph-"cluster".
Go back to the thw user (or add the ansible user to the sudoers list) and run:
thw@ff-ceph-4:~$ sudo ceph -s

Line 3 should read:
    health: HEALTH_OK
    
To reproduce the bug, you should have some blacklist entries. I have them on a new install - I don't know why. Check with:
thw@ff-ceph-4:~$ sudo ceph osd blacklist ls

If there are entries listed, fine.
If not, create an entry with:
thw@ff-ceph-4:~$ sudo ceph osd blacklist add <ip-address-of-your-host>

Now:
thw@ff-ceph-4:~$ sudo ceph osd pool create rbd
thw@ff-ceph-4:~$ sudo rbd pool init rbd
thw@ff-ceph-4:~$ sudo apt install ceph-iscsi

You should now see the exceptions in the journalctl output.


** Attachment added: "variables for ansible playbook"
   https://bugs.launchpad.net/ubuntu/+source/ceph-iscsi/+bug/1909399/+attachment/5449256/+files/all.yml

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph-iscsi in Ubuntu.
https://bugs.launchpad.net/bugs/1909399

Title:
  Exception during removal of OSD blacklist entries

Status in ceph-iscsi package in Ubuntu:
  Incomplete

Bug description:
  Ubuntu 20.04, official Ubuntu packages, ceph 15.2.7-0ubuntu0.20.04.1, ceph-iscsi 3.4-0ubuntu2
  See stacktrace below.
  Looking at the code I'm guessing this is related to https://stackoverflow.com/questions/33054527/typeerror-a-bytes-like-object-is-required-not-str-when-writing-to-a-file-in
  i.e. binary vs. text.
  I'm not a Python programmer, but from the Stack Overflow article it would seem that the output of subprocess.check_output is treated as binary (bytes) in Python 3, and a backwards-compatible fix would be to .encode() the string constants "un-blacklisting" and "isn't blacklisted".
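
  A minimal sketch of the suspected mismatch, not the actual ceph-iscsi code: the function names and the sample message below are made up for illustration, and the bytes value only stands in for what subprocess.check_output returns under Python 3. It shows the failing str-in-bytes test next to two possible fixes (bytes constants, as suggested above, or decoding the output once).

```python
# Hypothetical stand-in for the bytes returned by subprocess.check_output()
# under Python 3. The message text is illustrative, not captured from ceph.
result = b"un-blacklisting 192.168.1.61:6819/881"


def removal_ok_broken(result):
    # Same shape as the failing test in the traceback: str needle, bytes haystack.
    return ("un-blacklisting" in result) or ("isn't blacklisted" in result)


def removal_ok_bytes(result):
    # Option 1: encode the constants so bytes are compared with bytes.
    return ("un-blacklisting".encode() in result) or \
           ("isn't blacklisted".encode() in result)


def removal_ok_decoded(result):
    # Option 2: decode the command output once, then compare str with str.
    text = result.decode("utf-8", errors="replace")
    return ("un-blacklisting" in text) or ("isn't blacklisted" in text)


try:
    removal_ok_broken(result)
    raised = False
except TypeError as exc:
    # Python 3 refuses to search for a str inside a bytes object.
    raised = True
    print("TypeError:", exc)

print("bytes constants:", removal_ok_bytes(result))
print("decoded output: ", removal_ok_decoded(result))
```

  Option 1 should also run unchanged on Python 2, where .encode() on a plain string yields a str again; option 2 keeps the constants readable but assumes the command output is UTF-8.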

  Dec 26 09:19:09 ff-ceph-2 rbd-target-api[8886]: Started the configuration object watcher
  Dec 26 09:19:09 ff-ceph-2 rbd-target-api[8886]: Processing osd blacklist entries for this node
  Dec 26 09:19:09 ff-ceph-2 rbd-target-api[8886]: Checking for config object changes every 1s
  Dec 26 09:19:10 ff-ceph-2 rbd-target-api[8886]: Removing blacklisted entry for this host : 192.168.1.61:6819/881
  Dec 26 09:19:12 ff-ceph-2 rbd-target-api[8886]: Traceback (most recent call last):
  Dec 26 09:19:12 ff-ceph-2 rbd-target-api[8886]:   File "/usr/bin/rbd-target-api", line 2952, in <module>
  Dec 26 09:19:12 ff-ceph-2 rbd-target-api[8886]:     main()
  Dec 26 09:19:12 ff-ceph-2 rbd-target-api[8886]:   File "/usr/bin/rbd-target-api", line 2862, in main
  Dec 26 09:19:12 ff-ceph-2 rbd-target-api[8886]:     osd_state_ok = ceph_gw.osd_blacklist_cleanup()
  Dec 26 09:19:12 ff-ceph-2 rbd-target-api[8886]:   File "/usr/lib/python3/dist-packages/ceph_iscsi_config/gateway.py", line 110, in osd_blacklist_cleanup
  Dec 26 09:19:12 ff-ceph-2 rbd-target-api[8886]:     rm_ok = self.ceph_rm_blacklist(blacklist_entry.split(' ')[0])
  Dec 26 09:19:12 ff-ceph-2 rbd-target-api[8886]:   File "/usr/lib/python3/dist-packages/ceph_iscsi_config/gateway.py", line 46, in ceph_rm_blacklist
  Dec 26 09:19:12 ff-ceph-2 rbd-target-api[8886]:     if ("un-blacklisting" in result) or ("isn't blacklisted" in result):
  Dec 26 09:19:12 ff-ceph-2 rbd-target-api[8886]: TypeError: a bytes-like object is required, not 'str'
  Dec 26 09:19:12 ff-ceph-2 systemd[1]: rbd-target-api.service: Main process exited, code=exited, status=1/FAILURE
  Dec 26 09:19:12 ff-ceph-2 systemd[1]: rbd-target-api.service: Failed with result 'exit-code'.
  Dec 26 09:19:12 ff-ceph-2 systemd[1]: rbd-target-api.service: Scheduled restart job, restart counter is at 1.
  Dec 26 09:19:12 ff-ceph-2 systemd[1]: Stopped Ceph iscsi target configuration API.
  Dec 26 09:19:12 ff-ceph-2 systemd[1]: Started Ceph iscsi target configuration API.

  ProblemType: Bug
  DistroRelease: Ubuntu 20.04
  Package: ceph-iscsi (not installed)
  ProcVersionSignature: Ubuntu 5.4.0-58.64-generic 5.4.73
  Uname: Linux 5.4.0-58-generic x86_64
  ApportVersion: 2.20.11-0ubuntu27.14
  Architecture: amd64
  CasperMD5CheckResult: pass
  Date: Sun Dec 27 12:17:49 2020
  InstallationDate: Installed on 2020-12-24 (2 days ago)
  InstallationMedia: Ubuntu-Server 20.04.1 LTS "Focal Fossa" - Release amd64 (20200731)
  SourcePackage: ceph-iscsi
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph-iscsi/+bug/1909399/+subscriptions


