<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Hi!<br>
</p>
<br>
<div class="moz-cite-prefix">On 26/02/18 12:53, Jorge Luiz Correa
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAE2bT_04m2COtrDuEVAKuAMzh+JqEY4RgTj9pJqZ_NSG+2jURA@mail.gmail.com">
<div dir="ltr">
<div>I would like some help identifying (and correcting) a problem
with instance metadata during boot. My environment is a
Mitaka installation on Ubuntu 16.04 LTS, with 1 controller,
1 network node and 5 compute nodes. I'm using classic OVS as
the network setup. <br>
<br>
The problem occurs after some period of time in some projects
(not all projects at the same time). When booting an Ubuntu Cloud
Image with cloud-init, instances lose their connection to the
metadata API and don't get their information, such as key pairs and
cloud-init scripts. <br>
<br>
<span style="font-family:monospace,monospace">[ 118.924311]
cloud-init[932]: 2018-02-23 18:27:05,003 -
url_helper.py[WARNING]: Calling '<a
href="http://169.254.169.254/2009-04-04/meta-data/instance-id"
moz-do-not-send="true">http://169.254.169.254/2009-04-04/meta-data/instance-id</a>'
failed [101/120s]: request error
[HTTPConnectionPool(host='169.254.169.254', port=80): Max
retries exceeded with url: /2009-04-04/meta-data/instance-id
(Caused by
ConnectTimeoutError(<requests.packages.urllib3.connection.HTTPConnection
object at 0x7faabcd6fa58>, 'Connection to 169.254.169.254
timed out. (connect timeout=50.0)'))]<br>
[ 136.959361] cloud-init[932]: 2018-02-23 18:27:23,038 -
url_helper.py[WARNING]: Calling '<a
href="http://169.254.169.254/2009-04-04/meta-data/instance-id"
moz-do-not-send="true">http://169.254.169.254/2009-04-04/meta-data/instance-id</a>'
failed [119/120s]: request error
[HTTPConnectionPool(host='169.254.169.254', port=80): Max
retries exceeded with url: /2009-04-04/meta-data/instance-id
(Caused by
ConnectTimeoutError(<requests.packages.urllib3.connection.HTTPConnection
object at 0x7faabcd7f240>, 'Connection to 169.254.169.254
timed out. (connect timeout=17.0)'))]<br>
[ 137.967469] cloud-init[932]: 2018-02-23 18:27:24,040 -
DataSourceEc2.py[CRITICAL]: Giving up on md from ['<a
href="http://169.254.169.254/2009-04-04/meta-data/instance-id"
moz-do-not-send="true">http://169.254.169.254/2009-04-04/meta-data/instance-id</a>']
after 120 seconds<br>
[ 137.972226] cloud-init[932]: 2018-02-23 18:27:24,048 -
url_helper.py[WARNING]: Calling '<a
href="http://192.168.0.7/latest/meta-data/instance-id"
moz-do-not-send="true">http://192.168.0.7/latest/meta-data/instance-id</a>'
failed [0/120s]: request error
[HTTPConnectionPool(host='192.168.0.7', port=80): Max
retries exceeded with url: /latest/meta-data/instance-id
(Caused by
NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection
object at 0x7faabcd7fc18>: Failed to establish a new
connection: [Errno 111] Connection refused',))]<br>
[ 138.974223] cloud-init[932]: 2018-02-23 18:27:25,053 -
url_helper.py[WARNING]: Calling '<a
href="http://192.168.0.7/latest/meta-data/instance-id"
moz-do-not-send="true">http://192.168.0.7/latest/meta-data/instance-id</a>'
failed [1/120s]: request error
[HTTPConnectionPool(host='192.168.0.7', port=80): Max
retries exceeded with url: /latest/meta-data/instance-id
(Caused by
NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection
object at 0x7faabcd7fa58>: Failed to establish a new
connection: [Errno 111] Connection refused',))]</span><br>
<br>
After giving up on 169.254.169.254, it tries 192.168.0.7, which is the
DHCP address for the project. <br>
<br>
I've checked that neutron-l3-agent is running, without errors.
On the compute node where the VM is running, the agents and vswitch are
running. I checked the namespace of a problematic project
and saw an iptables rule redirecting traffic from <a
href="http://169.254.169.254:80" moz-do-not-send="true">169.254.169.254:80</a>
to <a href="http://0.0.0.0:9697" moz-do-not-send="true">0.0.0.0:9697</a>,
and there is a process neutron-ns-metadata_proxy_ID that has
that port open. So it looks like the metadata proxy is running
fine. But, as we can see in the logs, there is a timeout. <br>
<br>
</div>
</div>
</blockquote>
<br>
Did you check whether port 80 is listening inside the DHCP namespace with
"ip netns exec NAMESPACE netstat -punta"? <br>
<br>
We recently hit something similar: the ns-proxy and the
metadata-agent were both up, but port 80 was missing inside the
namespace. A restart fixed it, but there were no logs of a failure
anywhere, so your problem may be similar.<br>
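<br>
If you want to run that check repeatedly (for example, to catch the moment
the port disappears), something like the following could work. It is only a
rough sketch, assuming Python 3.7+, root on the network node, and the usual
net-tools column layout for netstat; the function names
listening_tcp_ports and namespace_listens_on are made up for illustration
and are not part of any OpenStack tooling.<br>
<pre wrap="">
```python
import subprocess


def listening_tcp_ports(netstat_output):
    """Return the set of TCP ports in LISTEN state found in netstat text."""
    ports = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Keep only TCP rows that carry a LISTEN state.
        if not fields or not fields[0].startswith("tcp"):
            continue
        if "LISTEN" not in fields:
            continue
        # Assumed layout: the fourth column is the local address,
        # e.g. 0.0.0.0:80 or :::80. Slicing avoids an IndexError
        # on unexpectedly short rows.
        for local_address in fields[3:4]:
            port_text = local_address.rsplit(":", 1)[-1]
            if port_text.isdigit():
                ports.add(int(port_text))
    return ports


def namespace_listens_on(namespace, port):
    """Run netstat inside a network namespace (needs root) and check a port."""
    result = subprocess.run(
        ["ip", "netns", "exec", namespace, "netstat", "-punta"],
        capture_output=True,
        text=True,
        check=True,
    )
    return port in listening_tcp_ports(result.stdout)
```
</pre>
Calling namespace_listens_on("NAMESPACE", 80) with your real namespace name
in place of NAMESPACE should return False when the listener is
gone, which is the state we saw before the restart.<br>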
<br>
<blockquote type="cite"
cite="mid:CAE2bT_04m2COtrDuEVAKuAMzh+JqEY4RgTj9pJqZ_NSG+2jURA@mail.gmail.com">
<div dir="ltr">Restarting all services on the network node sometimes
solves the problem. In some cases I have to restart services on
the controller node (nova-api). Everything then works fine for some
time before the problems start again. <br>
<div><br>
Where can I look to try to find the cause of the
problem?<br>
<br>
I appreciate any help. Thank you!<br>
<br clear="all">
<div>
<div class="gmail_signature">
<div dir="ltr">- JLC</div>
</div>
</div>
</div>
</div>
<br>
</blockquote>
<br>
</body>
</html>