[Archived] Shared Storage (Ceph) Jewel

While Docker Swarm is great for keeping containers running (and restarting those that fail), it does nothing for persistent storage. This means that if you actually want your containers to persist data across restarts (hint: you do!), you need to provide shared storage to every Docker node.


This is a companion discussion topic for the original entry at https://geek-cookbook.funkypenguin.co.nz/ha-docker-swarm/shared-storage-ceph/

I installed the iso Centos Atomic image. /etc/ceph exists, but /var/lib/ceph does not. Is this expected?

Yes, you’re right. /var/lib/ceph doesn’t exist on a CentOS Atomic host, since the server components of Ceph aren’t included. (Just /etc/ceph in preparation for a client install).

It’s safe to mkdir /var/lib/ceph to resolve.
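For example, something like this on each node (the bootstrap-osd subdirectory is the one the keyring step below writes into):

sudo mkdir -p /var/lib/ceph/bootstrap-osd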

I’ve updated the recipe to reflect this :wink:

I tried to grab the keyring, but get an error:

$ sudo ceph auth get client.bootstrap-osd -o /var/lib/ceph/bootstrap-osd/ceph.keyring
2017-09-28 14:15:18.259983 7fc56c6de700 0 -- 192.168.2.11:0/1017529 >> 192.168.2.11:6789/0 pipe(0x7fc568067010 sd=4 :44144 s=1 pgs=0 cs=0 l=1 c=0x7fc56805da10).connect protocol feature mismatch, my 83ffffffffffff < peer 481dff8eea4fffb missing 400000000000000

Have you got iptables rules to permit any traffic within the swarm?
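For example (a rough sketch, assuming your nodes all sit on 192.168.2.0/24, and using the standard Ceph ports: 6789 for the MONs, 6800-7300 for the OSDs/MDS), something like this on each node would do it:

sudo iptables -A INPUT -p tcp -s 192.168.2.0/24 -m multiport --dports 6789,6800:7300 -j ACCEPT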

Yes. And is the keyring command supposed to be run on one node, or all nodes?

Could it be a version difference between the Ceph version installed on the base OS and the one in the Docker container?

[ggilley@orange ~]$ ceph --version
ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)

[ggilley@orange ~]$ sudo docker exec -it 56f4c5e2b727 bash
root@orange:/# ceph --version
ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)

Greg

Mm… I think you’re right… http://ceph.com/planet/feature-set-mismatch-error-on-ceph-kernel-client/ shows that this error is due to mismatched features between your client and server. Try the workaround described in Re: [ceph-users] How to change setting for tunables "require_feature_tunables5" ?
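From memory, the gist of that workaround is to drop your CRUSH tunables back to a profile the older client understands, something like:

ceph osd crush tunables hammer

(Run that from wherever you have an admin keyring, e.g. inside the mon container.)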

Okay, I’ll give it a try. Is it necessary with the more recent Ceph builds? I’m looking at Docker Hub.

 Greg

Well, Ceph 12.2 is the most recent Ceph release (they announced 12.2.1, a minor bugfix version, just a few days ago).

The problem, as I see it, is that CentOS Atomic 7 ships Ceph client packages which are too old. When I wrote this recipe, Ceph was on v11, and there was no compatibility issue with CentOS Atomic.

If the workaround works, I’d stick with it. When Atomic eventually catches up, you could remove the workaround, but it won’t impact your use case.

I mean: is the step to get the keyring necessary with later Ceph deployments?

BTW, the first command in the workaround fails in the same way. Is there a way to upgrade the Ceph client on CentOS Atomic 7?

  Greg

Aah, yes. Getting the keyring will still be required. You’ll need it when you mount CephFS via the MDS.
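For illustration, the eventual mount looks something like this (the MON IP, mountpoint and secret here are placeholders; the secret comes out of ceph.client.admin.keyring):

sudo mount -t ceph 192.168.2.11:6789:/ /mnt/cephfs -o name=admin,secret=<key from ceph.client.admin.keyring>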

I think there may be a way to change CentOS Atomic, but since it’s designed to be immutable, it’s a PITA. I haven’t done it personally. Something about adding overlays.

A safer workaround might be to use an older version of the ceph container; anything from version 11 should do. You’d have to blow away your MONs and recreate them, but if you don’t have any OSDs yet, that’s probably not a big deal?
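Untested sketch, but launching the older MON with the ceph/daemon image looks something like this (MON_IP and CEPH_PUBLIC_NETWORK need to match your own network):

sudo docker run -d --net=host -v /etc/ceph:/etc/ceph -v /var/lib/ceph:/var/lib/ceph -e MON_IP=192.168.2.11 -e CEPH_PUBLIC_NETWORK=192.168.2.0/24 --name="ceph-mon" --restart=always ceph/daemon:tag-build-master-kraken-centos-7 mon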

Okay, I’ll blow the MONs away and start over. BTW, I see these files in /etc/ceph:
ceph.client.admin.keyring ceph.conf ceph.mon.keyring

You can probably re-use the keyrings and .conf; I doubt there’s anything in there which will be specific to Ceph v12…

Okay, I cleaned everything and installed the Kraken Docker image. The keyring command works fine, as you expected. My question is: do I need to run that command on each node?

   Greg

Yes, you need to dump the keys on each node. You’ll need them for the Ceph clients on each node to connect to the MDS and mount the CephFS volume.
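i.e. the same command from earlier, repeated on every node (each node needs the admin keyring in /etc/ceph to be able to talk to the MONs):

sudo ceph auth get client.bootstrap-osd -o /var/lib/ceph/bootstrap-osd/ceph.keyring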

Not getting very far :-/

When I try to run the OSD daemon, I get “Waiting for /dev/nvme1n1p2 to show up” in the logs.

I pass in /dev/nvme1n1, which should be the whole disk according to lsblk:

   nvme1n1                                     259:4    0 953.9G  0 disk 

Any ideas? Thanks,

  Greg

That sounds right - what’s the full docker command you gave it when launching the OSD?

sudo docker run -d --net=host --privileged=true --pid=host -v /etc/ceph:/etc/ceph -v /var/lib/ceph:/var/lib/ceph -v /dev/:/dev/ -e OSD_DEVICE=/dev/nvme1n1 -e OSD_TYPE=disk --name="ceph-osd" --restart=always ceph/daemon:tag-build-master-kraken-centos-7 osd

Long shot, but have you tried the “zapping” command?
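For reference, with the ceph/daemon image the zap I mean looks something like this (it destroys whatever is on /dev/nvme1n1, so only run it against a disk you’re happy to lose):

sudo docker run --rm --privileged=true -v /dev/:/dev/ -e OSD_DEVICE=/dev/nvme1n1 ceph/daemon:tag-build-master-kraken-centos-7 zap_device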

Yes, many times :slight_smile:

  Greg