Hi, I am totally new to this topic, so go gentle on me.
I am trying to create a one-node cluster just to see if I've got the basics right.
(and do I really need 3 cluster hosts? If so I might have to give up now, as I only have two)
I get this error (and yes, I did do the secret export):
root@docker01:/etc# mount -a
mount error: no mds server is up or the cluster is laggy
Hmm, I've not tried with a single node - but it does look unhappy about only having 1 OSD. I'd still have expected the MDS to start, though - what are the specs of the node you're using? If it's too old/slow, you may be bumping into some default IO limits…
You could also try checking the logs of the MDS container, in case that gives a clue?
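To dig into why the MDS isn't coming up, the cluster status and the MDS container's logs are the first places to look. A rough sketch, assuming a cephadm/docker deployment - the container name pattern below is an example, so list the containers first to find the real one:

```shell
# Overall health and filesystem/MDS state (run inside "cephadm shell"
# if ceph-common isn't installed on the host):
ceph -s
ceph fs status

# cephadm runs each daemon in its own container, named roughly
# ceph-<fsid>-mds.<name>.<host> - list them to find the MDS one:
docker ps --filter "name=mds"
docker logs --tail 50 <mds-container-name>
```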
Thanks for the pointers, will try again soon. I tore down the docker host completely to start again, in case I made some mess the first time through.
The docker host is actually a Debian VM running on a Synology VM host. The data volume is an iSCSI volume on the VM host mapped into the Debian VM.
I could see how that could cause issues if it expects extremely high IO, but mapping the iSCSI volume into Windows to test speed showed it was reasonably fast.
tl;dr: I lost the logs in the process of tearing it all down.
If you're doing a 1-node cluster, you don't really even need Ceph. Just create /var/data on the host and have the Docker containers access it there. Ceph is for keeping data highly available between nodes. So, unless you're setting this up now and meaning to add more nodes to the Swarm later, I think you could just ignore Ceph and move on.
The _netdev mount option (which waits for the network before mounting) doesn't seem to be available in newer distros. It didn't work for me on Ubuntu 20.04, and it doesn't work on the latest 18.04 as far as I can tell. You can replace _netdev with x-systemd.automount in fstab, which waits for remote-fs.target. I'm not sure whether waiting for remote-fs.target or network.target is the better option (if you manually wrote a .mount file instead of using fstab to generate it, you could instead specify to wait on the network specifically), but x-systemd.automount seemed to work for me. If you don't do this, (depending on your system) your mount may fail at boot because the network isn't up before systemd tries to mount cephfs.
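For reference, a sketch of the fstab line with that substitution made - the monitor names are the placeholders from the recipe, so adjust for your own cluster:

```
# /etc/fstab - x-systemd.automount defers the mount until first access,
# by which point the network should be up.
raphael,donatello,leonardo:/ /var/data ceph name=admin,noatime,x-systemd.automount 0 0
```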
Also, perhaps consider moving the ceph-common section to the start, considering that things like ceph orch will require it to be installed. You'll also need either docker or podman installed before bootstrapping with cephadm, so maybe that should be mentioned as well, especially considering the docker setup comes after the ceph setup…
Wanted to share something I learned. I've been trying to figure out why my ceph installation kept failing on fresh, vanilla instances of Ubuntu Server 20.04. Ceph kept insisting /var/lib/ceph was mounted as a read-only fs, despite the fact that it also sometimes wrote to it. Turns out if you choose to install docker during the OS installation using Ubuntu's fancy new installer, it installs the snap version of docker - which doesn't play nice with ceph, and also doesn't produce very helpful error messages. Removing it and installing docker normally (Install Docker Engine on Ubuntu | Docker Documentation) fixes it right up, though.
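For anyone hitting the same thing, the fix looks roughly like this - a sketch based on Docker's Ubuntu install docs, so double-check the current instructions there before pasting:

```shell
# Remove the snap-packaged docker the Ubuntu installer added
sudo snap remove docker

# Install Docker Engine from Docker's own apt repository instead
sudo apt-get update
sudo apt-get install -y ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
  | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
  https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" \
  | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io
```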
I'm trying to follow this recipe and I'm stuck at mounting the filesystem.
Everything works fine until:
mkdir /var/data
MYNODES="<node1>,<node2>,<node3>" # Add your own nodes here, comma-delimited
MYHOST=`ip route get 1.1.1.1 | grep -oP 'src \K\S+'`
echo -e "
# Mount cephfs volume \n
raphael,donatello,leonardo:/ /var/data ceph name=admin,noatime,_netdev 0 0" >> /etc/fstab
mount -a
MYNODES and MYHOST are never used afterwards - so probably obsolete?
The ninja turtles are probably your nodes. But anyway, I can't add the ceph mount line on any node because there is no ceph installed on the nodes - or did I miss something?
I'm also having issues with this recipe on Ubuntu 20.04 - which may be related to ceph making changes, because none of the ceph commands work outside of cephadm shell anymore. This is the result for any of them: [errno 13] RADOS permission denied (error connecting to the cluster)
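In case it helps: errno 13 from the host usually means the host-side ceph CLI can't find a config/keyring, since cephadm keeps those inside its containers by default. A hedged sketch of getting host commands working again - paths are the defaults, and it assumes ceph-common is installed on the host:

```shell
# The host CLI needs /etc/ceph/ceph.conf and an admin keyring.
# Regenerate both from the running cluster via cephadm:
cephadm shell -- ceph config generate-minimal-conf | sudo tee /etc/ceph/ceph.conf
cephadm shell -- ceph auth get client.admin | sudo tee /etc/ceph/ceph.client.admin.keyring
sudo chmod 600 /etc/ceph/ceph.client.admin.keyring

ceph -s   # should now connect instead of failing with errno 13
```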