[openstack-dev] [Fuel][Bugs] Time sync problem when testing.

Stanislaw Bogatkin sbogatkin at mirantis.com
Wed Jan 27 08:21:32 UTC 2016


Yes, I created a custom ISO with debug output. It didn't help, so I
built another one with strace.
On Jan 27, 2016 00:56, "Alex Schultz" <aschultz at mirantis.com> wrote:

> On Tue, Jan 26, 2016 at 2:16 PM, Stanislaw Bogatkin
> <sbogatkin at mirantis.com> wrote:
> > When the stratum is too high, ntpdate detects this and always writes it
> > to its log. In our case there is simply no log entry: ntpdate sends the
> > first packet, gets an answer, and that's all. So I don't think fudging
> > will save us. Also, it's a really bad approach to fudge a server which
> > doesn't have a real clock onboard.
>
> Do you have debug output from ntpdate somewhere? I'm not finding
> it in the bugs or in any of the snapshots for the failures. I did
> find one snapshot with the -v change that didn't have any response
> information, so maybe it's the other problem, where network
> connectivity isn't working correctly or the responses are
> getting dropped somewhere?
>
> -Alex
>
> >
> > On Tue, Jan 26, 2016 at 10:41 PM, Alex Schultz <aschultz at mirantis.com>
> > wrote:
> >>
> >> On Tue, Jan 26, 2016 at 11:42 AM, Stanislaw Bogatkin
> >> <sbogatkin at mirantis.com> wrote:
> >> > Hi guys,
> >> >
> >> > for some time we have had a bug [0] with ntpdate. It doesn't reproduce
> >> > 100% of the time, but it breaks our BVT and swarm tests. It isn't clear
> >> > exactly where the root of the problem is. To understand this better,
> >> > some verbosity was added to the ntpdate output, but in the logs we can
> >> > see only that the packet exchange between ntpdate and the server was
> >> > started and never completed.
> >> >
> >>
> >> So when I've hit this in my local environments there is usually one of
> >> two possible causes for this: 1) lack of network connectivity, so the ntp
> >> server never responds, or 2) the stratum is too high.  My assumption is
> >> that we're running into #2 because of our revert-resume in testing.
> >> When we resume, the ntp server on the master may take a while to
> >> become stable. This sync in the deployment uses the fuel master for
> >> synchronization so if the stratum is too high, it will fail with this
> >> lovely useless error.  My assumption is that this happens
> >> because we aren't using a set of internal ntp servers but are rather
> >> relying on the standard ntp.org pools.  So when the master is being
> >> resumed it's struggling to find a good enough set of servers so it
> >> takes a while to sync. This then causes these deployment tasks to fail
> >> because the master has not yet stabilized (might also be geolocation
> >> related).  We could address this either by fudging the stratum on the
> >> master server in the configs or possibly by introducing our own more
> >> stable local ntp servers. I have a feeling fudging the stratum might
> >> be better when we only use the master in our ntp configuration.
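
One way to tell the two causes above apart is to query the server directly
and look at the stratum and leap fields in the reply header. What follows is
a minimal sketch, assuming a plain SNTP-style query over UDP port 123. The
function names are illustrative and are not part of Fuel or ntpdate, and this
is not how ntpdate itself is implemented.

```python
import socket
import struct

NTP_PORT = 123
UNSYNCHRONIZED_STRATUM = 16  # RFC 5905: stratum 16 means "unsynchronized"


def build_request() -> bytes:
    """Build a 48-byte client request: LI=0, VN=3, Mode=3 (client)."""
    first_byte = (0 << 6) | (3 << 3) | 3
    return bytes([first_byte]) + bytes(47)


def parse_reply(packet: bytes) -> dict:
    """Extract leap indicator, version, mode and stratum from a reply header."""
    if len(packet) < 48:
        raise ValueError("NTP packet must be at least 48 bytes")
    first_byte, stratum = struct.unpack("!BB", packet[:2])
    return {
        "leap": first_byte >> 6,
        "version": (first_byte >> 3) & 0x7,
        "mode": first_byte & 0x7,
        "stratum": stratum,
    }


def is_usable(reply: dict) -> bool:
    """Treat leap == 3 (alarm) or a stratum of 0 or >= 16 as unusable."""
    return reply["leap"] != 3 and 0 < reply["stratum"] < UNSYNCHRONIZED_STRATUM


def query(host: str, timeout: float = 2.0) -> dict:
    """Send one request; a socket timeout here points at cause 1 (no reply)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (host, NTP_PORT))
        packet, _ = sock.recvfrom(512)
    return parse_reply(packet)
```

Calling query() against the master right after a resume and passing the
result to is_usable() would show whether the server answers at all and, if it
does, whether it still reports itself as unsynchronized.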
> >>
> >> > As this bug is a blocker, I propose merging [1] to better understand
> >> > what's going on. I created a custom ISO with this patchset and ran
> >> > about 10 BVT tests on it, with absolutely no luck. So, if we merge
> >> > this, we will catch the problem much faster and understand the root
> >> > cause.
> >> >
> >>
> >> I think we should merge the increased logging patch anyway because
> >> it'll be useful in troubleshooting but we also might want to look into
> >> getting an ntp peers list added into the snapshot.
> >>
> >> > I appreciate your answers, folks.
> >> >
> >> >
> >> > [0] https://bugs.launchpad.net/fuel/+bug/1533082
> >> > [1] https://review.openstack.org/#/c/271219/
> >> > --
> >> > with best regards,
> >> > Stan.
> >> >
> >>
> >> Thanks,
> >> -Alex
> >>
> >>
> >> __________________________________________________________________________
> >> OpenStack Development Mailing List (not for usage questions)
> >> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
> >
> >
> >
> > --
> > with best regards,
> > Stan.
> >
> >