How to solve the ceph-mgr error “clock skew detected”

Nov 11, 2024

The clock skew error message indicates that Ceph Monitors’ clocks are not synchronized. Clock synchronization is important because Ceph Monitors depend on time precision and behave unpredictably if their clocks are not synchronized.

The mon_clock_drift_allowed parameter determines what disparity between the clocks is tolerated. By default, this parameter is set to 0.05 seconds.

Important: Do not change the default value of mon_clock_drift_allowed without previous testing. Changing this value might affect the stability of the Ceph Monitors and the Ceph Storage Cluster in general.
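As a rough illustration of how the mon_clock_drift_allowed threshold is applied, the sketch below compares a measured skew (in seconds) against the allowed drift. The function name skew_exceeds is hypothetical, and the 0.05 default is simply the value quoted above. On a live cluster the configured value can usually be read with `ceph config get mon mon_clock_drift_allowed`.

```shell
# skew_exceeds SKEW [ALLOWED]: succeed (exit 0) when |SKEW| is greater than ALLOWED.
# Both values are in seconds; ALLOWED defaults to 0.05, the default quoted above.
# (Hypothetical helper, not part of Ceph.)
skew_exceeds() {
    awk -v s="$1" -v a="${2:-0.05}" 'BEGIN { if (s < 0) s = -s; exit !(s > a) }'
}

# Example: a mon that is 0.12 s off exceeds the default tolerance.
if skew_exceeds 0.12; then
    echo "skew above mon_clock_drift_allowed"
fi
```

The comparison is done in awk because POSIX shell arithmetic does not handle fractional values.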

Possible causes of the clock skew error include network problems or problems with chrony Network Time Protocol (NTP) synchronization, if that is configured. In addition, time synchronization may not work reliably on Ceph Monitors deployed on virtual machines.

How to resolve the MON clock skew issue in OCS 4.x

From the bastion server, open a shell in the rook-ceph-operator pod:

oc rsh -n openshift-storage $(oc get pods -n openshift-storage -o name -l app=rook-ceph-operator)

From that shell, export the OpenShift Storage configuration:

sh-5.1$ export CEPH_ARGS='-c /var/lib/rook/openshift-storage/openshift-storage.config'

Then run the ceph status command:

sh-5.1$ ceph -s
  cluster:
    id:     496246f1-f423-4366-8543-2ac1fa5bbbf5
    health: HEALTH_WARN
            clock skew detected on mon.b, mon.c

  services:
    mon: 3 daemons, quorum a,b,c (age 2d)
    mgr: a(active, since 2d)
    mds: 1/1 daemons up, 1 hot standby
    osd: 3 osds: 3 up (since 4w), 3 in (since 5M)
    rgw: 1 daemon active (1 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   12 pools, 169 pgs
    objects: 6.73k objects, 22 GiB
    usage:   63 GiB used, 1.4 TiB / 1.5 TiB avail
    pgs:     169 active+clean

  io:
    client:   853 B/s rd, 129 KiB/s wr, 1 op/s rd, 12 op/s wr

The output of ceph -s shows that one or more mons are out of time sync (here, mon.b and mon.c).
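To pick the affected monitors out of the status text without reading it by eye, something like the following could be used. This is a minimal sketch: skewed_mons is a hypothetical helper name, and it assumes the warning line has the same "clock skew detected on mon.X, mon.Y" shape as in the output above.

```shell
# skewed_mons: read `ceph -s` text on stdin and print each monitor named
# in a "clock skew detected on ..." line, one per line.
# (Hypothetical helper; assumes the warning format shown in the output above.)
skewed_mons() {
    sed -n 's/.*clock skew detected on //p' | tr ',' '\n' | tr -d ' ' | grep '^mon\.'
}

# Example with the warning line from the status output above.
echo "            clock skew detected on mon.b, mon.c" | skewed_mons
```

Inside the operator pod this would be run as `ceph -s | skewed_mons`. The `ceph time-sync-status` command can also be used to see the skew the monitors report for each other.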

Resolution

Manually force chrony to sync the clocks with the following steps.
Connect to the ODF (or Ceph) node that is reporting the clock skew and temporarily turn off SELinux enforcement. For ODF, you can either use ‘oc debug node/’ or use ssh (if keys are configured for the core user).

NOTE: Make sure you turn SELinux back on afterwards. You do NOT want to leave it off for an extended period of time.

Temporarily set SELinux to permissive mode:

$ setenforce 0

Then force an immediate time adjustment with the chronyc makestep command and restart chronyd:

$ chronyc -a makestep

$ systemctl stop chronyd; systemctl start chronyd; systemctl enable chronyd

Now re-enable SELinux enforcement:

$ setenforce 1
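Because it is easy to forget the final setenforce 1, the steps above could be wrapped in one function that always switches SELinux back. The following is only a sketch under some assumptions: resync_clock and the RUN variable are hypothetical names, `systemctl restart` stands in for the separate stop and start, and SELinux is assumed to have been enforcing beforehand. Setting RUN=echo gives a dry run that prints the commands instead of executing them.

```shell
# resync_clock: run the chrony resync with SELinux permissive, then switch
# SELinux back to enforcing even if one of the middle steps fails.
# RUN is a hypothetical hook: RUN=echo prints the commands instead of running them.
resync_clock() {
    rc=0
    # Switch SELinux to permissive; give up if that is not possible.
    ${RUN:-} setenforce 0 || return 1
    # Step the clock immediately, then restart and enable chronyd.
    ${RUN:-} chronyc -a makestep || rc=$?
    ${RUN:-} systemctl restart chronyd || rc=$?
    ${RUN:-} systemctl enable chronyd || rc=$?
    # Assumes SELinux was enforcing before this function was called.
    ${RUN:-} setenforce 1 || rc=$?
    return "$rc"
}

# Dry run: only prints the commands that would be executed.
RUN=echo resync_clock
```

For a real run, call resync_clock as root on the affected node with RUN unset.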

The root cause is that the ODF nodes are unable to sync with the NTP servers.
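One way to check whether a node is actually tracking its NTP source is to look at the "System time" line of `chronyc tracking`. The helper below is a sketch: chrony_offset is a hypothetical name, and the sample input only imitates the format chronyc prints, with a made-up value.

```shell
# chrony_offset: read `chronyc tracking` output on stdin and print the
# "System time" offset in seconds. The direction is not included; chronyc
# reports it separately as "fast" or "slow".
# (Hypothetical helper, not part of chrony.)
chrony_offset() {
    awk -F': ' '/^System time/ { split($2, f, " "); print f[1] }'
}

# Illustrative input in the format chronyc prints; the value is made up.
printf 'System time     : 0.000006523 seconds slow of NTP time\nLeap status     : Normal\n' | chrony_offset
```

On a node this would be run as `chronyc tracking | chrony_offset`. Running `chronyc sources -v` additionally shows whether the configured NTP servers are reachable at all.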

Reference: https://access.redhat.com/solutions/5244631 (this article gives more detailed steps)

Written by Danang Priabada