[kolla][nova][neutron] Access to VMs is slow when running on a remote compute host
Hi community,
I need your help, tips, and advice.

*> Environment <*
I have deployed OpenStack "Stein" using the latest kolla-ansible on the following topology:
1) OS Controller running as a VM at a "cloud" location
2) OS Compute running on a bare-metal server at a location remote from the OS Controller
3) Network node running on the Compute host
As per the above, the Controller and Compute run on two different networks. Kolla-Ansible is not really designed for such a scenario, but after adjusting globals.yml and the inventory files (basically I had to move node-specific network settings from globals.yml into the inventory file), the deployment eventually works fine.

*> Problem <*
I have no specific issue working with this deployment except the following: "SSH connection to the VM is quite slow."
It takes around 20 seconds for me to log into the VM (Ubuntu, CentOS, whatever).

*> Observations <*
- Except for the slowness during the SSH login, I don't have any further specific issue working with this environment
- With the Network node on the Compute host I can turn the OS Controller off with no impact on the VM; the connection is still slow
- I tried different types of images (Ubuntu, CentOS, Windows), always with the same result
- The SSH connection is slow even if I try to log into the VM from within the IP namespace

From the ssh -vvv output, I can see that the session gets stuck here:

debug1: Authentication succeeded (publickey).
Authenticated to *****
debug1: channel 0: new [client-session]
debug3: ssh_session2_open: channel_new: 0
debug2: channel 0: send open
debug3: send packet: type 90
debug1: Requesting no-more-sessions@openssh.com
debug3: send packet: type 80
debug1: Entering interactive session.
debug1: pledge: network

[10 to 15 seconds later]

debug3: receive packet: type 80
debug1: client_input_global_request: rtype hostkeys-00@openssh.com want_reply 0
debug3: receive packet: type 91
debug2: callback start
debug2: fd 3 setting TCP_NODELAY
debug3: ssh_packet_set_tos: set IP_TOS 0x10
debug2: client_session2_setup: id 0
debug2: channel 0: request pty-req confirm 1

Have you ever experienced such an issue? Any suggestions?

Many thanks
/Giuseppe
On 7/12/19 9:57 AM, Giuseppe Sannino wrote:
> *> Problem <*
> "SSH connection to the VM is quite slow."
> It takes around 20 seconds for me to log into the VM (Ubuntu, CentOS, whatever).
But once logged in, things are OK? For example, does an scp stall the same way while the transfer itself is fast?
> [10 to 15 seconds later]
What is sshd doing at this time? Have you tried enabling debug or running tcpdump when a new connection is attempted? At first glance I'd say it's a DNS issue, since it eventually succeeds; the logs would help to point in a direction.

-Brian
On Fri, 12 Jul 2019 at 15:24, Brian Haley <haleyb.dev@gmail.com> wrote:
> What is sshd doing at this time? Have you tried enabling debug or running tcpdump when a new connection is attempted? At first glance I'd say it's a DNS issue since it eventually succeeds, the logs would help to point in a direction.
+1 - a ~30s timeout on SSH login is normally a DNS issue.
Hi,
I suspect a problem with name resolution. Can you check whether you also see such a delay when running e.g. "sudo" commands after you ssh to the instance?
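A slow "sudo" usually means the guest cannot resolve its own hostname. A minimal sketch of that check, run inside the VM (the suggested /etc/hosts entry is just an example):

```shell
#!/bin/sh
# If sudo also stalls, the guest is probably failing to resolve
# its own hostname. Check whether it maps to a local entry:
h=$(hostname 2>/dev/null || uname -n)
if getent hosts "$h" >/dev/null 2>&1; then
    echo "$h resolves locally - hostname lookup is not the problem"
else
    echo "$h does not resolve - consider adding it to /etc/hosts, e.g.: 127.0.1.1 $h"
fi
```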
—
Slawek Kaplonski
Senior software engineer
Red Hat
Hi!
First of all, thanks for the fast replies. I do appreciate that.

I did some more tests trying to figure out the issue:
- Set UseDNS to "no" in sshd_config => issue persists
- Installed and configured Telnet => Telnet login is slow as well
From "top" and "auth.log" nothing specific popped up. I can see sshd taking some CPU for a short while, but nothing more than that.
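Since both ssh and telnet stall, the resolver configuration inside the guest is worth a look. A sketch (paths are the standard glibc ones):

```shell
#!/bin/sh
# Both ssh and telnet stalling points away from sshd itself and
# toward name resolution. Dump the guest's resolver setup:
resolv=$(cat /etc/resolv.conf 2>/dev/null || echo "(no resolv.conf)")
hosts_line=$(grep '^hosts' /etc/nsswitch.conf 2>/dev/null || echo "(no nsswitch.conf)")
echo "resolv.conf:"
echo "$resolv"
echo "hosts lookup order: $hosts_line"
```

An unreachable nameserver listed in resolv.conf would explain a fixed multi-second delay on any login-time lookup.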
Once logged in, the VM is not too slow; the CLI doesn't get stuck or similar. One thing worth mentioning: the write throughput on the disk seems a bit slow, 67 MB/s versus around 318 MB/s for another VM running on a "datacenter" OpenStack installation. The cinder-volume container is running on the Compute host, and Cinder is using the filesystem as backend.

BR
/Giuseppe
On Mon, 2019-07-15 at 16:29 +0200, Giuseppe Sannino wrote:
> One thing worthwhile to mention, it seems like the writing throughput on the disk is a bit slow: 67MB/s wrt around 318MB/s of another VM running on a "datacenter" Openstack installation.

Unless you see iowait in the guest, it's likely not related to disk speed. You might be able to improve the disk performance by changing the cache mode, but unless you are seeing iowait that is just an optimisation to try later.
When you are logged into the VM, have you tried ssh-ing again via localhost to determine whether the long login time is related to the network or to the VM? If it's related to the network, it will be fast over localhost; if it's related to the VM, e.g. because of disk, CPU load, memory load, or ssh server configuration, then the local ssh will be slow too.
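To separate a genuinely slow disk from everything else, a crude write test inside the guest (a sketch; the 100 MB size and temp path are arbitrary choices, not from the thread):

```shell
#!/bin/sh
# Rough write-throughput check: write 100 MB with an fdatasync at
# the end so the page cache does not inflate the number. Watch the
# %wa (iowait) column in top or vmstat while this runs.
testfile=/tmp/ddtest.$$
dd if=/dev/zero of="$testfile" bs=1M count=100 conv=fdatasync 2>&1 | tail -n1
rm -f "$testfile"
```

The last line of dd's output reports the effective MB/s, comparable to the 67 MB/s figure above.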
Hi Sean,
the ssh to localhost is slow as well. "telnet localhost" is also slow.

/Giuseppe
On Mon, Jul 15, 2019 at 10:40 AM Giuseppe Sannino <km.giuseppesannino@gmail.com> wrote:
> Hi Sean,
> the ssh to localhost is slow as well. "telnet localhost" is also slow.
Are you having DNS issues? Historically, if you have UseDNS enabled and your DNS servers are bad, it can just be slow to connect as sshd tries to do the reverse lookup.
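One way to see whether that reverse (PTR) lookup is the culprit is to time it directly inside the guest. A sketch (127.0.0.1 is a stand-in; use the address your SSH client actually connects from):

```shell
#!/bin/sh
# Time the reverse lookup sshd would perform for a connecting
# client. A multi-second result here would match the observed
# 10-20 second login delay.
ip=127.0.0.1
start=$(date +%s)
getent hosts "$ip" >/dev/null 2>&1
end=$(date +%s)
elapsed=$((end - start))
echo "reverse lookup of $ip took ${elapsed}s"
```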
Hi Alex,
yeah, it was the first suspect, also based on various research on the internet. I currently have UseDNS set to "no", but the issue still persists.

/G
On Mon, Jul 15, 2019 at 10:40 AM Giuseppe Sannino < km.giuseppesannino@gmail.com> wrote:
Hi Sean, the ssh to localhost is slow as well. "telnet localhost" is also slow.
Are you having dns issues? Historically if you have UseDNS set to true and your dns servers are bad it can just be slow to connect as it tries to do the reverse lookup.
/Giuseppe
On Mon, 15 Jul 2019 at 18:18, Sean Mooney <smooney@redhat.com> wrote:
On Mon, 2019-07-15 at 16:29 +0200, Giuseppe Sannino wrote:
Hi! first of all, thanks for the fast replies. I do appreciate that.
I did some more test trying to figure out the issue. - Set UseDNS to "no" in sshd_config => Issue persists - Installed and configured Telnet => Telnet login is slow as well
From the "top" or "auth.log"nothing specific popped up. I can sshd taking some cpu for a short while but nothing more than that.
Once logged in the VM is not too slow. CLI doesn't get stuck or similar. One thing worthwhile to mention, it seems like the writing throughput on the disk is a bit slow: 67MB/s wrt around 318MB/s of another VM running on a "datacenter" Openstack installation. unless you see iowait in the guest its likely not related to the disk speed. you might be able to improve the disk performace by changeing the chache mode but unless you are seeing io wait that is just an optimisation to try later.
when you are logged into the vm have you tried ssh again via localhost to determin if the long login time is related to the network or the vm.
if its related to the network it will be fast over localhost if its related to the vm, e.g. because of disk, cpu load, memory load or ssh server configuration then the local ssh will be slow.
The Cinder Volume docker is running on the Compute Host and Cinder is
using
the filesystem as backend.
BR /Giuseppe
On Fri, 12 Jul 2019 at 17:41, Slawek Kaplonski <skaplons@redhat.com> wrote:
Hi,
I suspect some problems with name resolution. Can you check whether you also see such a delay when running e.g. "sudo" commands after you ssh to the instance?

> On 12 Jul 2019, at 16:23, Brian Haley <haleyb.dev@gmail.com> wrote:
>
> On 7/12/19 9:57 AM, Giuseppe Sannino wrote:
>> "SSH connection to the VM is quite slow". It takes around 20 seconds
>> for me to log into the VM (Ubuntu, CentOS, whatever).
>
> But once logged in, things are OK? For example, an scp stalls the same
> way, but the transfer is fast?
>
>> [ssh -vvv output and observations trimmed; see the original post above]
>> 10 to 15 seconds later
>
> What is sshd doing at this time? Have you tried enabling debug or
> running tcpdump when a new connection is attempted? At first glance I'd
> say it's a DNS issue since it eventually succeeds; the logs would help
> to point in a direction.
>
> -Brian

—
Slawek Kaplonski
Senior software engineer
Red Hat
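Brian's DNS theory is easy to test from inside the guest itself. Below is a minimal sketch (Python; the `127.0.0.1` address is just a placeholder for the client IP that sshd would look up, and the 5-second budget is an arbitrary threshold) that times the same kind of reverse lookup sshd performs when UseDNS is enabled:

```python
import socket
import time

def timed_reverse_lookup(ip, budget=5.0):
    """Time a reverse DNS lookup of `ip`; return (name_or_None, seconds).

    sshd with UseDNS enabled performs exactly this kind of lookup on the
    client address before the session opens, so an unreachable resolver
    shows up here as a multi-second stall.
    """
    start = time.monotonic()
    try:
        name = socket.gethostbyaddr(ip)[0]
    except socket.herror:
        name = None  # no PTR record; a fast failure is fine
    except OSError:
        name = None  # resolver unreachable; this is the slow case
    elapsed = time.monotonic() - start
    if elapsed > budget:
        print(f"reverse lookup of {ip} took {elapsed:.1f}s - suspect DNS")
    return name, elapsed

name, elapsed = timed_reverse_lookup("127.0.0.1")
print(name, round(elapsed, 3))
```

If this stalls for roughly the same 10-15 seconds as the SSH login, the resolver configured in the guest's /etc/resolv.conf is the likely culprit.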
For completeness - always also ensure that 'GSSAPIAuthentication' is set to 'no', because in the default config it might require DNS lookups too. (Obviously you can run GSSAPIAuthentication and avoid DNS lookups by configuring GSSAPI appropriately ;-) )

Kind regards,
Radek

pon., 15 lip 2019 o 19:14 Giuseppe Sannino <km.giuseppesannino@gmail.com> napisał(a):

> Hi Alex,
> yeah, it was the first suspect, also based on various research on the
> internet. I currently have "UseDNS" set to no, but the issue still
> persists.
>
> /G
>
> On Mon, 15 Jul 2019 at 18:46, Alex Schultz <aschultz@redhat.com> wrote:
>
>> Are you having DNS issues? Historically, if you have UseDNS set to true
>> and your DNS servers are bad, it can be slow to connect as it tries to
>> do the reverse lookup.
>>
>> On Mon, Jul 15, 2019 at 10:40 AM Giuseppe Sannino
>> <km.giuseppesannino@gmail.com> wrote:
>>
>>> Hi Sean,
>>> the ssh to localhost is slow as well. "telnet localhost" is also slow.
>>>
>>> /Giuseppe
>>>
>>> On Mon, 15 Jul 2019 at 18:18, Sean Mooney <smooney@redhat.com> wrote:
>>>
>>>> On Mon, 2019-07-15 at 16:29 +0200, Giuseppe Sannino wrote:
>>>>> Hi! First of all, thanks for the fast replies. I do appreciate that.
>>>>> I did some more tests trying to figure out the issue:
>>>>> - Set UseDNS to "no" in sshd_config => issue persists
>>>>> - Installed and configured telnet => telnet login is slow as well
>>>>> From "top" or "auth.log" nothing specific popped up. I can see sshd
>>>>> taking some CPU for a short while, but nothing more than that.
>>>>> Once logged in, the VM is not too slow; the CLI doesn't get stuck or
>>>>> similar. One thing worth mentioning: the write throughput on the disk
>>>>> seems a bit slow, 67 MB/s vs. around 318 MB/s for another VM running
>>>>> on a "datacenter" OpenStack installation.
>>>>
>>>> Unless you see iowait in the guest, it's likely not related to the
>>>> disk speed. You might be able to improve the disk performance by
>>>> changing the cache mode, but unless you are seeing iowait that is just
>>>> an optimisation to try later.
>>>>
>>>> When you are logged into the VM, have you tried ssh again via
>>>> localhost to determine whether the long login time is related to the
>>>> network or to the VM? If it's related to the network, it will be fast
>>>> over localhost; if it's related to the VM, e.g. because of disk, CPU
>>>> load, memory load or ssh server configuration, then the local ssh will
>>>> be slow too.
>>>>
>>>>> The Cinder volume docker is running on the compute host and Cinder is
>>>>> using the filesystem as backend.
>>>>>
>>>>> BR
>>>>> /Giuseppe
>>>>>
>>>>> [earlier quotes from Slawek Kaplonski and Brian Haley trimmed]
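For reference, the two sshd_config knobs discussed in this thread would look like the fragment below (a sketch of the relevant lines only; sshd must be restarted for changes to take effect):

```
# /etc/ssh/sshd_config (in the guest VM)

# Skip the reverse DNS lookup of the connecting client address
UseDNS no

# Skip GSSAPI/Kerberos negotiation, which may also trigger DNS lookups
GSSAPIAuthentication no
```

As the thread shows, these settings did not resolve this particular case, but they rule out the most common causes of a fixed multi-second SSH login stall.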
Hi Radoslaw, all,
I applied the GSSAPIAuthentication setting along with the UseDNS one. No luck.

One thing I want to share, and that maybe points in a "no network" direction: I disabled the execution of the motd scripts during the login phase via "chmod -x /etc/update-motd.d/*", and that reduced the login time from ~12 s down to ~3 s. To me, it looks like the issue is with read/write access to the guest VM filesystem, which takes quite a while.

One more thing: in the tcpdump trace taken during the SSH login, I can see that when the procedure gets stuck (at that "pledge: network" log in ssh) it is the server (so the guest OS) that takes time to reply with a server packet. However, I can see a TCP ACK sent back from the server immediately. This makes me quite sure that, from a networking point of view, things are handled with no delay.

/G

On Mon, 15 Jul 2019 at 19:27, Radosław Piliszek <radoslaw.piliszek@gmail.com> wrote:

> For completeness - always also ensure that 'GSSAPIAuthentication' is set
> to 'no', because in the default config it might require DNS lookups too.
>
> [rest of quoted thread trimmed]
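Giuseppe's motd finding can be narrowed down per script rather than disabling them all at once. A sketch of a helper for this (Python; `time_scripts` is a hypothetical name, and /etc/update-motd.d is Ubuntu's convention for motd scripts, which run on every interactive login):

```python
import os
import subprocess
import time

def time_scripts(directory):
    """Run each executable file in `directory` and record its wall time.

    Pointing this at /etc/update-motd.d inside the guest shows which
    motd script is responsible for most of the login delay.
    """
    timings = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            continue
        start = time.monotonic()
        # Discard output; only the elapsed time matters here.
        subprocess.run([path], stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL)
        timings[name] = time.monotonic() - start
    return timings

# Example usage on an Ubuntu guest:
# for name, secs in time_scripts("/etc/update-motd.d").items():
#     print(f"{name}: {secs:.2f}s")
```

A script that spends seconds here (e.g. one that phones home for update or news checks) would explain most of the ~12 s to ~3 s difference Giuseppe observed.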
On 2019-07-15 16:29:47 +0200 (+0200), Giuseppe Sannino wrote:
[...]
> Once logged in, the VM is not too slow; the CLI doesn't get stuck or
> similar. One thing worth mentioning: the write throughput on the disk
> seems a bit slow, 67 MB/s vs. around 318 MB/s for another VM running
> on a "datacenter" OpenStack installation.
[...]

Have you checked dmesg in the guest instance to see if there is any I/O problem reported by the kernel? The login process will block on updating /var/log/wtmp or similar, so if writes to whatever backing store that lives on are delayed, that can explain the symptom.
--
Jeremy Stanley
Ciao Jeremy,
dmesg reports no error on the guest. syslog and auth.log look clean as well.

/G

On Mon, 15 Jul 2019 at 18:30, Jeremy Stanley <fungi@yuggoth.org> wrote:

> Have you checked dmesg in the guest instance to see if there is any I/O
> problem reported by the kernel? The login process will block on updating
> /var/log/wtmp or similar, so if writes to whatever backing store that
> lives on are delayed, that can explain the symptom.
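Jeremy's stalled-writes theory can be quantified in the guest with a small synchronous-write probe. A sketch (Python; the sizes are arbitrary choices, and the fsync-per-chunk pattern is an assumption meant to approximate small synchronous login bookkeeping writes such as the wtmp update, rather than the large buffered writes a plain copy would measure):

```python
import os
import tempfile
import time

def sync_write_throughput(nbytes=4 * 1024 * 1024, chunk=64 * 1024):
    """Write `nbytes` to a temp file with an fsync after each chunk;
    return the achieved throughput in MB/s.
    """
    buf = b"\0" * chunk
    with tempfile.NamedTemporaryFile() as f:
        start = time.monotonic()
        written = 0
        while written < nbytes:
            f.write(buf)
            f.flush()
            os.fsync(f.fileno())  # force each chunk to the backing store
            written += chunk
        elapsed = time.monotonic() - start
    return (written / (1024 * 1024)) / elapsed

print(f"{sync_write_throughput():.1f} MB/s")
```

If this number is dramatically lower inside the VM than on the compute host itself, the filesystem-backed Cinder volume path (rather than the network) is the bottleneck, which would fit the motd and tcpdump observations above.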
participants (8)
- Alex Schultz
- Brian Haley
- Giuseppe Sannino
- Jeremy Stanley
- Mark Goddard
- Radosław Piliszek
- Sean Mooney
- Slawek Kaplonski