[kolla][nova][neutron] Access to VMs is slow when running on a remote compute host

Giuseppe Sannino km.giuseppesannino at gmail.com
Wed Jul 17 07:41:28 UTC 2019


Hi Radoslaw, all,
I set GSSAPIAuthentication to "no" in addition to UseDNS. No luck.

One thing I want to share, which may point away from the network:
I disabled the execution of the motd scripts during the login phase via
"chmod -x /etc/update-motd.d/*".
This reduced the login time from ~12 s to ~3 s.
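
To narrow down which motd script is responsible, each one can be timed
individually. The following is only a rough sketch, not something I ran on
the affected VM: the function name time_scripts is made up for illustration,
and it assumes the scripts have their executable bit restored and are run
as root, roughly as pam_motd would run them.

```python
import os
import subprocess
import sys
import time


def time_scripts(directory):
    """Run each executable file in `directory`; return (name, seconds) pairs, slowest first."""
    timings = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip sub-directories and files without the executable bit (e.g. after chmod -x).
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            continue
        start = time.monotonic()
        try:
            subprocess.run([path], stdout=subprocess.DEVNULL,
                           stderr=subprocess.DEVNULL, check=False)
        except OSError:
            continue  # not runnable (e.g. missing interpreter), ignore it
        timings.append((name, time.monotonic() - start))
    return sorted(timings, key=lambda item: item[1], reverse=True)


# Usage (assumed invocation): python3 time_motd.py /etc/update-motd.d
if __name__ == "__main__" and len(sys.argv) > 1:
    for name, seconds in time_scripts(sys.argv[1]):
        print(f"{seconds:7.2f}s  {name}")
```

If one script dominates the total, it can be traced further (for example
with strace) to see whether it is waiting on disk or on something else.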

To me, it looks like the issue is with read/write access to the guest VM
filesystem, which takes quite a while.
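
One way to check this hypothesis is to sample iowait inside the guest while
a login is in progress, as Sean suggested earlier in the thread. The sketch
below is an assumption about how that could be done, based on the standard
layout of the aggregate "cpu" line in /proc/stat on Linux; the function
names are mine.

```python
import sys
import time


def parse_cpu_line(stat_text):
    """Return (iowait_ticks, total_ticks) from the aggregate "cpu" line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:]]
            # Field order: user nice system idle iowait irq softirq steal guest guest_nice.
            # guest and guest_nice are already counted in user and nice, so sum only the first eight.
            return values[4], sum(values[:8])
    raise ValueError("no aggregate cpu line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two /proc/stat snapshots."""
    io0, total0 = parse_cpu_line(before)
    io1, total1 = parse_cpu_line(after)
    elapsed = total1 - total0
    return 100.0 * (io1 - io0) / elapsed if elapsed > 0 else 0.0


def read_stat():
    with open("/proc/stat") as f:
        return f.read()


# Usage (assumed invocation): python3 iowait.py 15
# Start it in the guest, then open a new SSH session during the interval.
if __name__ == "__main__" and len(sys.argv) > 1:
    first = read_stat()
    time.sleep(float(sys.argv[1]))
    print(f"iowait: {iowait_percent(first, read_stat()):.1f}%")
```

A value close to zero during a slow login would argue against the disk being
the bottleneck; a high value would support it.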

One more thing: in the tcpdump trace taken during the SSH login, I can see
that when the procedure gets stuck (at the "pledge: network" line in the ssh
debug log), it is the server (i.e. the guest OS) that takes time to reply
with a "Server" packet. However, I can see a TCP ACK sent back from the
server. This makes me fairly sure that, from a networking point of view,
things are handled with no delay.
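
For anyone who wants to repeat the check, the pause can be located
mechanically from a text dump of the capture, e.g. one produced with
"tcpdump -tt -n -r capture.pcap > capture.txt" (the file names are
placeholders). This is a small sketch that only assumes each packet line
starts with the epoch timestamp that -tt prints; the function name is made
up for illustration.

```python
import sys


def largest_gap(lines):
    """Return (gap_seconds, line_before, line_after) for the longest pause between packets."""
    best = None
    prev_ts, prev_line = None, None
    for line in lines:
        line = line.strip()
        parts = line.split(None, 1)
        try:
            ts = float(parts[0])
        except (IndexError, ValueError):
            continue  # skip blank lines and hex-dump/continuation lines
        if prev_ts is not None:
            gap = ts - prev_ts
            if best is None or gap > best[0]:
                best = (gap, prev_line, line)
        prev_ts, prev_line = ts, line
    return best


# Usage (assumed invocation): python3 gap.py capture.txt
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as f:
        result = largest_gap(f)
    if result:
        gap, before, after = result
        print(f"longest pause: {gap:.3f}s\n  before: {before}\n  after:  {after}")
```

If the line before the longest pause is the server's bare ACK and the line
after it is the server's next data packet, the time is being spent inside
the guest rather than on the wire.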

/G


On Mon, 15 Jul 2019 at 19:27, Radosław Piliszek <radoslaw.piliszek at gmail.com>
wrote:

> For completeness - always also ensure that 'GSSAPIAuthentication' is set
> to 'no' because in default config it might require DNS lookups too.
> (Obviously you can run GSSAPIAuthentication and avoid DNS lookups by
> configuring GSSAPI appropriately ;-) ).
>
> Kind regards,
> Radek
>
> On Mon, 15 Jul 2019 at 19:14, Giuseppe Sannino <km.giuseppesannino at gmail.com>
> wrote:
>
>> Hi Alex,
>> yeah, it was the first suspect, also based on what I found searching the
>> internet.
>> I currently have "UseDNS" set to "no" but the issue still persists.
>>
>> /G
>>
>> On Mon, 15 Jul 2019 at 18:46, Alex Schultz <aschultz at redhat.com> wrote:
>>
>>>
>>> On Mon, Jul 15, 2019 at 10:40 AM Giuseppe Sannino <
>>> km.giuseppesannino at gmail.com> wrote:
>>>
>>>> Hi Sean,
>>>> the ssh to localhost is slow as well.
>>>> "telnet localhost" is also slow.
>>>>
>>>>
>>> Are you having DNS issues? Historically, if you have UseDNS enabled and
>>> your DNS servers are bad, it can just be slow to connect as it tries to
>>> do the reverse lookup.
>>>
>>>
>>>> /Giuseppe
>>>>
>>>> On Mon, 15 Jul 2019 at 18:18, Sean Mooney <smooney at redhat.com> wrote:
>>>>
>>>>> On Mon, 2019-07-15 at 16:29 +0200, Giuseppe Sannino wrote:
>>>>> > Hi!
>>>>> > first of all, thanks for the fast replies. I do appreciate that.
>>>>> >
>>>>> > I did some more tests trying to figure out the issue.
>>>>> > - Set UseDNS to "no" in sshd_config  => Issue persists
>>>>> > - Installed and configured Telnet => Telnet login is slow as well
>>>>> >
>>>>> > Nothing specific popped up in "top" or "auth.log". I can see sshd
>>>>> > taking some CPU for a short while, but nothing more than that.
>>>>> >
>>>>> > Once logged in, the VM is not too slow. The CLI doesn't get stuck or
>>>>> > similar.
>>>>> > One thing worth mentioning: the disk write throughput seems a bit
>>>>> > slow, 67 MB/s compared with around 318 MB/s on another VM running on
>>>>> > a "datacenter" OpenStack installation.
>>>>> Unless you see iowait in the guest, it's likely not related to the disk
>>>>> speed.
>>>>> You might be able to improve the disk performance by changing the
>>>>> cache mode, but unless you are seeing iowait that is just an
>>>>> optimisation to try later.
>>>>>
>>>>> When you are logged into the VM, have you tried ssh again via localhost
>>>>> to determine whether the long login time is related to the network or
>>>>> the VM?
>>>>>
>>>>> If it's related to the network, it will be fast over localhost.
>>>>> If it's related to the VM, e.g. because of disk, CPU load, memory load
>>>>> or ssh server configuration, then the local ssh will be slow.
>>>>>
>>>>> >
>>>>> > The Cinder Volume Docker container is running on the Compute Host and
>>>>> > Cinder is using the filesystem as backend.
>>>>> >
>>>>> > BR
>>>>> > /Giuseppe
>>>>> >
>>>>> > On Fri, 12 Jul 2019 at 17:41, Slawek Kaplonski <skaplons at redhat.com> wrote:
>>>>> >
>>>>> > > Hi,
>>>>> > >
>>>>> > > I suspect a name resolution problem. Can you check whether you
>>>>> > > also see such a delay when running e.g. "sudo" commands after you
>>>>> > > ssh to the instance?
>>>>> > >
>>>>> > > > On 12 Jul 2019, at 16:23, Brian Haley <haleyb.dev at gmail.com> wrote:
>>>>> > > >
>>>>> > > >
>>>>> > > >
>>>>> > > > On 7/12/19 9:57 AM, Giuseppe Sannino wrote:
>>>>> > > > > Hi community,
>>>>> > > > > I need your help, tips, and advice.
>>>>> > > > > *> Environment <*
>>>>> > > > > I have deployed OpenStack "Stein" using the latest kolla-ansible
>>>>> > > > > with the following deployment topology:
>>>>> > > > > 1) OS Controller running as a VM in a "cloud" location
>>>>> > > > > 2) OS Compute running on a bare-metal server in a remote (wrt
>>>>> > > > >    the OS Controller) location
>>>>> > > > > 3) Network node running on the Compute host
>>>>> > > > > As per the above info, the Controller and Compute run on two
>>>>> > > > > different networks.
>>>>> > > > > Kolla-Ansible is not really designed for such a scenario, but
>>>>> > > > > after manipulating the globals.yml and the inventory files
>>>>> > > > > (basically I had to move node-specific network settings from
>>>>> > > > > the globals to the inventory file), the deployment eventually
>>>>> > > > > works fine.
>>>>> > > > > *> Problem <*
>>>>> > > > > I have no specific issue working with this deployment except the
>>>>> > > > > following:
>>>>> > > > > "SSH connection to the VM is quite slow".
>>>>> > > > > It takes around 20 seconds for me to log into the VM (Ubuntu,
>>>>> > > > > CentOS, whatever).
>>>>> > > >
>>>>> > > > But once logged in, things are OK?  For example, an scp stalls the
>>>>> > > > same way, but the transfer is fast?
>>>>> > > >
>>>>> > > > > *> Observations <*
>>>>> > > > >  * Except for the slowness during the SSH login, I don't have any
>>>>> > > > >    further specific issue working with this environment
>>>>> > > > >  * With the Network on the Compute I can turn the OS controller
>>>>> > > > >    off with no impact on the VM. Still the connection is slow
>>>>> > > > >  * I tried different types of images (Ubuntu, CentOS, Windows),
>>>>> > > > >    always with the same result.
>>>>> > > > >  * SSH connection is slow even if I try to log into the VM from
>>>>> > > > >    within the IP namespace
>>>>> > > > > From the ssh -vvv, I can see that the authentication gets stuck
>>>>> > > > > here:
>>>>> > > > > debug1: Authentication succeeded (publickey).
>>>>> > > > > Authenticated to *****
>>>>> > > > > debug1: channel 0: new [client-session]
>>>>> > > > > debug3: ssh_session2_open: channel_new: 0
>>>>> > > > > debug2: channel 0: send open
>>>>> > > > > debug3: send packet: type 90
>>>>> > > > > debug1: Requesting no-more-sessions@openssh.com
>>>>> > > > > debug3: send packet: type 80
>>>>> > > > > debug1: Entering interactive session.
>>>>> > > > > debug1: pledge: network
>>>>> > > > > > > > > > 10 to 15 seconds later
>>>>> > > >
>>>>> > > > What is sshd doing at this time?  Have you tried enabling debug or
>>>>> > > > running tcpdump when a new connection is attempted?  At first
>>>>> > > > glance I'd say it's a DNS issue since it eventually succeeds; the
>>>>> > > > logs would help to point in a direction.
>>>>> > > >
>>>>> > > > -Brian
>>>>> > > >
>>>>> > > >
>>>>> > > > > debug3: receive packet: type 80
>>>>> > > > > debug1: client_input_global_request: rtype hostkeys-00@openssh.com want_reply 0
>>>>> > > > > debug3: receive packet: type 91
>>>>> > > > > debug2: callback start
>>>>> > > > > debug2: fd 3 setting TCP_NODELAY
>>>>> > > > > debug3: ssh_packet_set_tos: set IP_TOS 0x10
>>>>> > > > > debug2: client_session2_setup: id 0
>>>>> > > > > debug2: channel 0: request pty-req confirm 1
>>>>> > > > > Have you ever experienced such an issue?
>>>>> > > > > Any suggestion?
>>>>> > > > > Many thanks
>>>>> > > > > /Giuseppe
>>>>> > >
>>>>> > > —
>>>>> > > Slawek Kaplonski
>>>>> > > Senior software engineer
>>>>> > > Red Hat
>>>>> > >
>>>>> > >
>>>>>
>>>>>