[openstack-dev] [Fuel][Bugs] Time sync problem when testing.
Stanislaw Bogatkin
sbogatkin at mirantis.com
Tue Jan 26 21:16:49 UTC 2016
When the stratum is too high, ntpdate detects this and always writes it
to its log. In our case there is simply nothing in the log: ntpdate sends
the first packet, gets an answer, and that's all. So I don't think fudging
will save us. Also, it's a really bad approach to fudge a server which
doesn't have a real clock onboard.
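
To check what the master actually answers, the header of its reply can be
decoded by hand. Below is a minimal Python sketch, not the check ntpdate
itself performs: the function names (build_client_request, describe_reply,
query) and the "usable" rule (leap indicator not 3, stratum between 1 and
15) are assumptions for illustration, based on the NTP packet header
layout in RFC 5905.

```python
import socket

NTP_PORT = 123
PACKET_SIZE = 48  # size of an NTP header without extensions


def build_client_request() -> bytes:
    """Build a bare client request: LI=0, VN=4, Mode=3 (client)."""
    # First byte is 0b00_100_011 = 0x23; the rest of the header is zeroed.
    return bytes([0x23]) + bytes(PACKET_SIZE - 1)


def describe_reply(packet: bytes) -> dict:
    """Decode the leap indicator, version, mode and stratum of a reply."""
    if len(packet) < PACKET_SIZE:
        raise ValueError("short NTP packet: %d bytes" % len(packet))
    first = packet[0]
    leap = first >> 6            # 3 means "clock not synchronized"
    version = (first >> 3) & 0x7
    mode = first & 0x7           # 4 means "server"
    stratum = packet[1]          # 0 is unspecified, 16 is unsynchronized
    # Assumed rule of thumb: a reply is only worth syncing from if the
    # server claims to be synchronized and reports a stratum of 1..15.
    usable = leap != 3 and 1 <= stratum <= 15
    return {
        "leap": leap,
        "version": version,
        "mode": mode,
        "stratum": stratum,
        "usable": usable,
    }


def query(host: str, timeout: float = 5.0) -> dict:
    """Send one request to host and describe the first reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_client_request(), (host, NTP_PORT))
        data, _ = sock.recvfrom(512)
    return describe_reply(data)
```

Calling query() against the master right after a resume would show whether
it answers at all and, if it does, with which stratum and leap indicator.
A timeout would point to the connectivity case rather than the stratum
case.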
On Tue, Jan 26, 2016 at 10:41 PM, Alex Schultz <aschultz at mirantis.com>
wrote:
> On Tue, Jan 26, 2016 at 11:42 AM, Stanislaw Bogatkin
> <sbogatkin at mirantis.com> wrote:
> > Hi guys,
> >
> > for some time we have had a bug [0] with ntpdate. It doesn't reproduce
> > 100% of the time, but it breaks our BVT and swarm tests. The exact root
> > of the problem has not been located. To better understand it, some
> > verbosity was added to the ntpdate output, but in the logs we can only
> > see that the packet exchange between ntpdate and the server was started
> > and never completed.
> >
>
> So when I've hit this in my local environments there is usually one of
> two possible causes: 1) lack of network connectivity, so the ntp
> server never responds, or 2) the stratum is too high. My assumption is
> that we're running into #2 because of our revert-resume in testing.
> When we resume, the ntp server on the master may take a while to
> become stable. This sync in the deployment uses the fuel master for
> synchronization, so if the stratum is too high, it will fail with this
> lovely useless error. My assumption on what is happening is that we
> aren't using a set of internal ntp servers but rather relying on the
> standard ntp.org pools, so when the master is being resumed it
> struggles to find a good enough set of servers and takes a while to
> sync. This then causes these deployment tasks to fail because the
> master has not yet stabilized (might also be geolocation related). We
> could address this either by fudging the stratum on the master server
> in the configs or possibly by introducing our own more stable local
> ntp servers. I have a feeling fudging the stratum might be better
> when we only use the master in our ntp configuration.
>
> > As this bug is a blocker, I propose merging [1] to better understand
> > what's going on. I created a custom ISO with this patchset and ran
> > about 10 BVT tests on it, with absolutely no luck. So if we merge
> > this, we will catch the problem much faster and understand the root
> > cause.
> >
>
> I think we should merge the increased logging patch anyway because
> it'll be useful in troubleshooting but we also might want to look into
> getting an ntp peers list added into the snapshot.
>
> > I appreciate your answers, folks.
> >
> >
> > [0] https://bugs.launchpad.net/fuel/+bug/1533082
> > [1] https://review.openstack.org/#/c/271219/
> > --
> > with best regards,
> > Stan.
> >
>
> Thanks,
> -Alex
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
--
with best regards,
Stan.