Deploying Ceph in lab 11.1 fails

I'm trying to do the ceph-deploy installation in lab 11.1, step 4 under 'Deploy a monitor', but it looks like the ceph-jewel repo is not valid:
[rdo-cc][INFO ] Running command: sudo yum -y install epel-release
[rdo-cc][DEBUG ] Loaded plugins: fastestmirror, priorities
[rdo-cc][DEBUG ] Determining fastest mirrors
[rdo-cc][DEBUG ] * base: mirror.fra10.de.leaseweb.net
[rdo-cc][DEBUG ] * epel: mirror.de.leaseweb.net
[rdo-cc][DEBUG ] * extras: mirror.checkdomain.de
[rdo-cc][DEBUG ] * updates: centosmirror.netcup.net
[rdo-cc][WARNIN] http://mirror.centos.org/centos/7/storage/x86_64/ceph-jewel/repodata/repomd.xml: [Errno 14] HTTP Error 503 - Service Unavailable
[rdo-cc][WARNIN] Trying other mirror.
[rdo-cc][WARNIN] http://mirror.centos.org/centos/7/storage/x86_64/ceph-jewel/repodata/repomd.xml: [Errno 14] HTTP Error 503 - Service Unavailable
[rdo-cc][WARNIN] Trying other mirror.
[rdo-cc][WARNIN] http://mirror.centos.org/centos/7/storage/x86_64/ceph-jewel/repodata/repomd.xml: [Errno 14] HTTP Error 503 - Service Unavailable
[rdo-cc][WARNIN] Trying other mirror.
[rdo-cc][WARNIN] http://mirror.centos.org/centos/7/storage/x86_64/ceph-jewel/repodata/repomd.xml: [Errno 14] HTTP Error 503 - Service Unavailable
[rdo-cc][WARNIN] Trying other mirror.
[rdo-cc][WARNIN] http://mirror.centos.org/centos/7/storage/x86_64/ceph-jewel/repodata/repomd.xml: [Errno 14] HTTP Error 503 - Service Unavailable
[rdo-cc][WARNIN] Trying other mirror.
[rdo-cc][WARNIN] http://mirror.centos.org/centos/7/storage/x86_64/ceph-jewel/repodata/repomd.xml: [Errno 14] HTTP Error 503 - Service Unavailable
[rdo-cc][WARNIN] Trying other mirror.
[rdo-cc][WARNIN] http://mirror.centos.org/centos/7/storage/x86_64/ceph-jewel/repodata/repomd.xml: [Errno 14] HTTP Error 503 - Service Unavailable
[rdo-cc][WARNIN] Trying other mirror.
[rdo-cc][WARNIN] http://mirror.centos.org/centos/7/storage/x86_64/ceph-jewel/repodata/repomd.xml: [Errno 14] HTTP Error 503 - Service Unavailable
[rdo-cc][WARNIN] Trying other mirror.
[rdo-cc][WARNIN] http://mirror.centos.org/centos/7/storage/x86_64/ceph-jewel/repodata/repomd.xml: [Errno 14] HTTP Error 503 - Service Unavailable
[rdo-cc][WARNIN] Trying other mirror.
[rdo-cc][WARNIN] http://mirror.centos.org/centos/7/storage/x86_64/ceph-jewel/repodata/repomd.xml: [Errno 14] HTTP Error 503 - Service Unavailable
[rdo-cc][WARNIN] Trying other mirror.
[rdo-cc][WARNIN]
[rdo-cc][WARNIN]
[rdo-cc][WARNIN] One of the configured repositories failed (CentOS-7 - Ceph Jewel),
[rdo-cc][WARNIN] and yum doesn't have enough cached data to continue. At this point the only
[rdo-cc][WARNIN] safe thing yum can do is fail. There are a few ways to work "fix" this:
[rdo-cc][WARNIN]
[rdo-cc][WARNIN] 1. Contact the upstream for the repository and get them to fix the problem.
[rdo-cc][WARNIN]
[rdo-cc][WARNIN] 2. Reconfigure the baseurl/etc. for the repository, to point to a working
[rdo-cc][WARNIN] upstream. This is most often useful if you are using a newer
[rdo-cc][WARNIN] distribution release than is supported by the repository (and the
[rdo-cc][WARNIN] packages for the previous distribution release still work).
[rdo-cc][WARNIN]
[rdo-cc][WARNIN] 3. Run the command with the repository temporarily disabled
[rdo-cc][WARNIN] yum --disablerepo=centos-ceph-jewel ...
[rdo-cc][WARNIN]
[rdo-cc][WARNIN] 4. Disable the repository permanently, so yum won't use it by default. Yum
[rdo-cc][WARNIN] will then just ignore the repository until you permanently enable it
[rdo-cc][WARNIN] again or use --enablerepo for temporary usage:
[rdo-cc][WARNIN]
[rdo-cc][WARNIN] yum-config-manager --disable centos-ceph-jewel
[rdo-cc][WARNIN] or
[rdo-cc][WARNIN] subscription-manager repos --disable=centos-ceph-jewel
[rdo-cc][WARNIN]
[rdo-cc][WARNIN] 5. Configure the failing repository to be skipped, if it is unavailable.
[rdo-cc][WARNIN] Note that yum will try to contact the repo. when it runs most commands,
[rdo-cc][WARNIN] so will have to try and fail each time (and thus. yum will be be much
[rdo-cc][WARNIN] slower). If it is a very temporary problem though, this is often a nice
[rdo-cc][WARNIN] compromise:
[rdo-cc][WARNIN]
[rdo-cc][WARNIN] yum-config-manager --save --setopt=centos-ceph-jewel.skip_if_unavailable=true
[rdo-cc][WARNIN]
[rdo-cc][WARNIN] failure: repodata/repomd.xml from centos-ceph-jewel: [Errno 256] No more mirrors to try.
[rdo-cc][WARNIN] http://mirror.centos.org/centos/7/storage/x86_64/ceph-jewel/repodata/repomd.xml: [Errno 14] HTTP Error 503 - Service Unavailable
[rdo-cc][WARNIN] http://mirror.centos.org/centos/7/storage/x86_64/ceph-jewel/repodata/repomd.xml: [Errno 14] HTTP Error 503 - Service Unavailable
[rdo-cc][WARNIN] http://mirror.centos.org/centos/7/storage/x86_64/ceph-jewel/repodata/repomd.xml: [Errno 14] HTTP Error 503 - Service Unavailable
[rdo-cc][WARNIN] http://mirror.centos.org/centos/7/storage/x86_64/ceph-jewel/repodata/repomd.xml: [Errno 14] HTTP Error 503 - Service Unavailable
[rdo-cc][WARNIN] http://mirror.centos.org/centos/7/storage/x86_64/ceph-jewel/repodata/repomd.xml: [Errno 14] HTTP Error 503 - Service Unavailable
[rdo-cc][WARNIN] http://mirror.centos.org/centos/7/storage/x86_64/ceph-jewel/repodata/repomd.xml: [Errno 14] HTTP Error 503 - Service Unavailable
[rdo-cc][WARNIN] http://mirror.centos.org/centos/7/storage/x86_64/ceph-jewel/repodata/repomd.xml: [Errno 14] HTTP Error 503 - Service Unavailable
[rdo-cc][WARNIN] http://mirror.centos.org/centos/7/storage/x86_64/ceph-jewel/repodata/repomd.xml: [Errno 14] HTTP Error 503 - Service Unavailable
[rdo-cc][WARNIN] http://mirror.centos.org/centos/7/storage/x86_64/ceph-jewel/repodata/repomd.xml: [Errno 14] HTTP Error 503 - Service Unavailable
[rdo-cc][WARNIN] http://mirror.centos.org/centos/7/storage/x86_64/ceph-jewel/repodata/repomd.xml: [Errno 14] HTTP Error 503 - Service Unavailable
[rdo-cc][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: yum -y install epel-release
[[email protected] ceph-cluster]$
Comments
-
Hello,
Which step was this issue? I see you wrote 11.1, step 4, but I show step 4 as being timedatectl. It would help if you could show the command you ran in addition to the error. When I just tried to install ceph-deploy, in step 3, I also saw the HTTP 503 errors, but the install succeeded despite them.
Regards,
-
@serewicz said:
Hello, Which step was this issue? I see you wrote 11.1, step 4, but I show step 4 as being timedatectl. It would help if you could show the command you ran in addition to the error. When I just tried to install ceph-deploy, in step 3, I also saw the HTTP 503 errors, but the install succeeded despite them.
Regards,
Hi, it was step 4 of deploying a monitor. The command is:
[[email protected] ceph-cluster]$ ceph-deploy install --release luminous \
rdo-cc storage1 storage2 storage3
This command fails fairly close to the beginning with the above errors, because it seems to be unable to install epel-release (I assume because epel-release is actually in the ceph-jewel repo).
All the steps leading up to this one succeeded. The earlier yum steps showed the same ceph-jewel repo errors but still completed, whereas this step fails after the errors. I tried running yum -y install epel-release as a separate command and got the same error.
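In case anyone else hits the same 503s, the workarounds yum itself prints can be applied by hand on rdo-cc; a minimal sketch, assuming the repo id really is centos-ceph-jewel as shown in the output:
# temporarily skip the unreachable repo while installing epel-release
sudo yum --disablerepo=centos-ceph-jewel -y install epel-release
# or mark the repo so later yum runs do not abort when it is unavailable
sudo yum-config-manager --save --setopt=centos-ceph-jewel.skip_if_unavailable=true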
-
Hello,
Thank you. I have just tried these steps and did not have any errors. There are a few warnings, but that is typical.
I see a few mentions of Jewel in your previous post's output. I think there may be a typo or missing character in your start-ceph.repo file, which is why you are seeing messages about Jewel rather than Luminous. The most common typo is e17 (e-seventeen) instead of el7 (e-ell-seven), which is what it should be. Could you paste your start-ceph.repo file here? I'll copy it and see if I get the same errors.
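For reference, a Luminous repo stanza for el7 usually looks roughly like the following; treat it as a sketch rather than the exact lab file, since the repo id and file name in the handout may differ:
[ceph-luminous]
name=Ceph Luminous packages for el7
baseurl=https://download.ceph.com/rpm-luminous/el7/$basearch
enabled=1
gpgcheck=1
gpgkey=https://download.ceph.com/keys/release.asc
The important part is the el7 component of the baseurl, which is exactly what an e17 typo would break.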
Regards,
-
@serewicz said:
Hello, Thank you. I have just tried these steps and did not have any errors. There are a few warnings, but that is typical.
I see a few mentions of Jewel in your previous post's output. I think there may be a typo or missing character in your start-ceph.repo file, which is why you are seeing messages about Jewel rather than Luminous. The most common typo is e17 (e-seventeen) instead of el7 (e-ell-seven), which is what it should be. Could you paste your start-ceph.repo file here? I'll copy it and see if I get the same errors.
Regards,
I logged back into the cluster, and since it had been some time, it had been reset. So I went through the steps again just like yesterday, but this time I did not get any errors (not even the initial repo errors from before).
However, now I am getting to step 1 of "Deploy OSD nodes for the cluster" and have a couple of problems.
The command 'ceph-deploy osd create --data /dev/xvdb storage1' fails with a message about /dev/xvdb not existing.
I logged into storage1 and did an lvmdiskscan and see that the devices are /dev/vda and /dev/vdb, so I assume /dev/vdb is the correct one since it is 30G.
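A quick way to confirm which device is the empty 30G data disk (assuming lsblk is available on storage1):
# on storage1: list block devices with size, type and mount point
lsblk
# the unpartitioned, unmounted ~30G disk (here /dev/vdb) is the one to hand to ceph-deploy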
I tried rerunning the command as:
ceph-deploy osd create --data /dev/vdb storage1
This time it gets further but:
[[email protected] ceph-cluster]$ ceph-deploy osd create --data /dev/vdb storage1
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /bin/ceph-deploy osd create --data /dev/vdb storage1
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] bluestore : None
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7fddb117a128>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] fs_type : xfs
[ceph_deploy.cli][INFO ] block_wal : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] journal : None
[ceph_deploy.cli][INFO ] subcommand : create
[ceph_deploy.cli][INFO ] host : storage1
[ceph_deploy.cli][INFO ] filestore : None
[ceph_deploy.cli][INFO ] func :
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] zap_disk : False
[ceph_deploy.cli][INFO ] data : /dev/vdb
[ceph_deploy.cli][INFO ] block_db : None
[ceph_deploy.cli][INFO ] dmcrypt : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] dmcrypt_key_dir : /etc/ceph/dmcrypt-keys
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] debug : False
[ceph_deploy.osd][DEBUG ] Creating OSD on cluster ceph with data device /dev/vdb
[storage1][DEBUG ] connection detected need for sudo
[storage1][DEBUG ] connected to host: storage1
[storage1][DEBUG ] detect platform information from remote host
[storage1][DEBUG ] detect machine type
[storage1][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO ] Distro info: CentOS Linux 7.5.1804 Core
[ceph_deploy.osd][DEBUG ] Deploying osd to storage1
[storage1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[storage1][DEBUG ] find the location of an executable
[storage1][INFO ] Running command: sudo /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/vdb
[storage1][WARNIN] No data was received after 300 seconds, disconnecting...
[storage1][INFO ] checking OSD status...
[storage1][DEBUG ] find the location of an executable
[storage1][INFO ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
[storage1][WARNIN] No data was received after 300 seconds, disconnecting...
[ceph_deploy.osd][DEBUG ] Host storage1 is now ready for osd use.
But the actual OSD is not ready:
[[email protected] ceph-cluster]$ ceph -s
  cluster:
    id:     4165bd5f-f38d-4c6b-b1e3-287f800435b8
    health: HEALTH_OK

  services:
    mon: 1 daemons, quorum rdo-cc
    mgr: rdo-cc(active)
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0B
    usage:   0B used, 0B / 0B avail
    pgs:

I tried the same thing on storage2 and got the same timeouts and the same end result.
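When ceph-volume stalls like this, the OSD node's own logs are usually the quickest place to look; a sketch, assuming the default log location:
# on storage1: ceph-volume keeps its own log
sudo tail -n 50 /var/log/ceph/ceph-volume.log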
-
I tried running the command:
sudo /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/vdb
directly on storage1 to see what happens:
[[email protected] ~]$ sudo /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/vdb
Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new e5682c7a-ada1-48a9-bfb2-d2ab4a46aba0
stderr: 2018-09-21 18:41:51.464896 7fda7cbf6700 0 monclient(hunting): authenticate timed out after 300
stderr: 2018-09-21 18:41:51.465000 7fda7cbf6700 0 librados: client.bootstrap-osd authentication error (110) Connection timed out
stderr: [errno 110] error connecting to the cluster
--> RuntimeError: Unable to create a new OSD id
Not sure if this is actually related to the problem running the ceph-deploy command from rdo-cc or not.
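The 'monclient(hunting): authenticate timed out' line suggests storage1 cannot reach the monitor on rdo-cc at all. A quick connectivity check, assuming the default monitor port 6789 and that nc is installed on storage1:
# on storage1: check that the monitor port on rdo-cc is reachable
nc -zv rdo-cc 6789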
-
@MichaelVonderbecke said:
I tried running the command:
sudo /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/vdb
directly on storage1 to see what happens:
[[email protected] ~]$ sudo /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/vdb
Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new e5682c7a-ada1-48a9-bfb2-d2ab4a46aba0
stderr: 2018-09-21 18:41:51.464896 7fda7cbf6700 0 monclient(hunting): authenticate timed out after 300
stderr: 2018-09-21 18:41:51.465000 7fda7cbf6700 0 librados: client.bootstrap-osd authentication error (110) Connection timed out
stderr: [errno 110] error connecting to the cluster
--> RuntimeError: Unable to create a new OSD id
Not sure if this is actually related to the problem running the ceph-deploy command from rdo-cc or not.
The timeout is caused by the iptables rules on rdo-cc.
Quick fix: on rdo-cc, run sudo iptables -F.
Long fix: create an iptables rule that lets the Ceph traffic through; see the sketch below.
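A minimal sketch of such rules, assuming the default Ceph ports (an example, not the lab's official solution):
# on rdo-cc: allow the monitor port and the OSD/mgr port range
sudo iptables -I INPUT -p tcp --dport 6789 -j ACCEPT
sudo iptables -I INPUT -p tcp --dport 6800:7300 -j ACCEPT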
Thanks!
Vipinsagar
-
@vipinsagar said:
@MichaelVonderbecke said:
I tried running the command:
sudo /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/vdb
directly on storage1 to see what happens:
[[email protected] ~]$ sudo /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/vdb
Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new e5682c7a-ada1-48a9-bfb2-d2ab4a46aba0
stderr: 2018-09-21 18:41:51.464896 7fda7cbf6700 0 monclient(hunting): authenticate timed out after 300
stderr: 2018-09-21 18:41:51.465000 7fda7cbf6700 0 librados: client.bootstrap-osd authentication error (110) Connection timed out
stderr: [errno 110] error connecting to the cluster
--> RuntimeError: Unable to create a new OSD id
Not sure if this is actually related to the problem running the ceph-deploy command from rdo-cc or not.
The timeout is caused by the iptables rules on rdo-cc.
Quick fix: on rdo-cc, run sudo iptables -F.
Long fix: create an iptables rule that lets the Ceph traffic through.
Thanks!
Vipinsagar
This did fix the problem, although I found it strange, because sudo iptables -L showed no rules in any chains, so I'm not sure why sudo iptables -F would have actually fixed anything.
-
It could be that the default policy of a chain had somehow been changed. Flushing with iptables -F removes all the rules and leaves the chains wide open again. I'll continue to investigate the issue.
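One way to see what the chains actually contain, including the default policies and packet counters:
# on rdo-cc: list all rules with chain policies and counters
sudo iptables -L -n -v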
Thanks for posting the fix!
Regards,