[Openstack] [Glance] Snapshot Saving problem

Heiko Krämer info at honeybutcher.de
Fri Jan 10 13:17:44 UTC 2014


Hiho guys,

I'm running in a very strange problem with glance.
In my production system (Havana 2013.3) will stuck a snapshot uploading
in "saving" state every time.

Tested:
* I can create images via CLI and Dashboard with glance API-Version 1 and 2.
* All components (registry, api, keystone) are reachable and working
* I can see the image meta informations in the glance db
* I can see the clean snapshot file on the glance store host with 0 bytes
* The snapshot image file is created correctly by qemu-img
* URL are tested from compute node with CURL and reachable
* API and Registry receive the HTTP requests to storing snapshot image


I can see after some seconds in the compute log file:

c0e5c8f] [instance: 78ffa117-b615-4575-952e-a520c3560354] Cleaning up
image 00d916d1-7795-42a7-8911-53e3906a904d decorated_function
/usr/lib/python2.7/dist-packages/nova/compute/manager.py:313
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354] Traceback (most recent call last):
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]   File
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 309, in
decorated_function
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]     *args, **kwargs)
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]   File
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2293,
in snapshot_instance
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]     task_states.IMAGE_SNAPSHOT)
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]   File
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2324,
in _snapshot_instance
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]     update_task_state)
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]   File
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line
1426, in snapshot
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]     image_file)
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]   File
"/usr/lib/python2.7/dist-packages/nova/image/glance.py", line 395, in update
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]    
_reraise_translated_image_exception(image_id)
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]   File
"/usr/lib/python2.7/dist-packages/nova/image/glance.py", line 393, in update
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]     image_id, **image_meta)
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]   File
"/usr/lib/python2.7/dist-packages/nova/image/glance.py", line 212, in call
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]     return getattr(client.images,
method)(*args, **kwargs)
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]   File
"/usr/lib/python2.7/dist-packages/glanceclient/v1/images.py", line 291,
in update
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]     'PUT', url, headers=hdrs,
body=image_data)
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]   File
"/usr/lib/python2.7/dist-packages/glanceclient/common/http.py", line
272, in raw_request
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]     return self._http_request(url,
method, **kwargs)
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]   File
"/usr/lib/python2.7/dist-packages/glanceclient/common/http.py", line
206, in _http_request
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]     conn.send('%x\r\n%s\r\n' %
(len(chunk), chunk))
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]   File
"/usr/lib/python2.7/httplib.py", line 790, in send
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]     self.sock.sendall(data)
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]   File
"/usr/lib/python2.7/dist-packages/eventlet/green/OpenSSL/SSL.py", line
101, in sendall
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]     tail += self.send(data[tail:])
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]   File
"/usr/lib/python2.7/dist-packages/eventlet/green/OpenSSL/SSL.py", line
79, in write
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354]     return self.fd.write(data)
2014-01-10 09:24:41.834 17967 TRACE nova.compute.manager [instance:
78ffa117-b615-4575-952e-a520c3560354] SysCallError: (110, 'Connection
timed out')

2014-01-10 09:24:41.841 17967 DEBUG stevedore.extension [-] found
extension EntryPoint.parse('file = nova.image.download.file')
_load_plugins /usr/lib/python2.7/dist-packages/stevedore/extension.py:84
2014-01-10 09:24:41.844 17967 DEBUG stevedore.extension [-] found
extension EntryPoint.parse('file = nova.image.download.file')
_load_plugins /usr/lib/python2.7/dist-packages/stevedore/extension.py:84
2014-01-10 09:24:41.847 17967 DEBUG glanceclient.common.http [-] curl -i
-X DELETE -H 'X-Service-Catalog: [{"endpoints_links": [], "endpoints":
[{"adminURL":
"http://storage.local:8776/v1/88b8a024aaf3472e8881ea99cc0e5c8f",
"region": "prod", "publicURL":
"http://storage.local:8776/v1/88b8a024aaf3472e8881ea99cc0e5c8f", "id":
"745e0d81437b44379eac4f6d667b3db8", "internalURL":
"http://storage.local:8776/v1/88b8a024aaf3472e8881ea99cc0e5c8f"}],
"type": "volume", "name": "cinder"}]' -H 'X-Identity-Status: Confirmed'
-H 'X-Roles: admin' -H 'User-Agent: python-glanceclient' -H
'X-Tenant-Id: 88b8a024aaf3472e8881ea99cc0e5c8f' -H 'X-User-Id:
0069f28089c042fcb64350e4d51cafaf' -H 'X-Auth-Token:
05f54c1251fb44af8c50f08367d4d5e0' -H 'Content-Type:
application/octet-stream' -k
https://images.local:9292/v1/images/00d916d1-7795-42a7-8911-53e3906a904d
log_curl_request
/usr/lib/python2.7/dist-packages/glanceclient/common/http.py:140
2014-01-10 09:24:42.219 17967 DEBUG glanceclient.common.http [-]
HTTP/1.1 200 OK
date: Fri, 10 Jan 2014 09:24:42 GMT
content-length: 0
content-type: text/html; charset=UTF-8
x-openstack-request-id: req-554e6bd1-3bb8-4be1-9172-5e6cb3610c92
 log_http_response
/usr/lib/python2.7/dist-packages/glanceclient/common/http.py:150



.local domains have valid DNS entries!

I've now investigated some hours and i can't find the problem. With
strace on the glance-api process i see this (ATTENTION! Not the same
image ID but is on all tests the same result):

write(2, "2014-01-10 11:29:59.805 31095 DEBUG
glance.registry.client.v1.client [a6c2e470-cce4-4007-b753-87333a80d44b
386f04ddb39046bdbb222baf95746961 5895ddf86da846a2988ea37cf3bf8bd6]
Registry request GET /images/74780068-814a-4fcf-80a8-a726704ed03d HTTP
200 request id req-9fedf6da-5a59-432a-8d3c-affddd819b23 do_request
/usr/lib/python2.7/dist-packages/glance/registry/client/v1/client.py:115\n",
385) = 385
recvfrom(17, "{\"image\": {\"status\": \"active\", \"created_at\":
\"2013-12-03T13:33:14.395943\", \"properties\": {}, \"name\":
\"default-12.04\", \"deleted\": false, \"container_format\": \"bare\",
\"min_ram\": 0, \"disk_format\": \"qcow2\", \"location_data\":
[{\"url\":
\"file:///var/lib/glance/images/74780068-814a-4fcf-80a8-a726704ed03d\",
\"metadata\": {}}], \"updated_at\": \"2013-12-03T13:33:18.796661\",
\"owner\": \"b33bf3927d4e449a98cec4a883148110\", \"protected\": false,
\"location\":
\"file:///var/lib/glance/images/74780068-814a-4fcf-80a8-a726704ed03d\",
\"checksum\": \"03ca187afd8046c85d8bc0243a0827b6\", \"min_disk\": 0,
\"is_public\": true, \"deleted_at\": null, \"id\":
\"74780068-814a-4fcf-80a8-a726704ed03d\", \"size\": 479002624}}", 674,
0, NULL, NULL) = 674
close(17)                               = 0
write(16,
"\27\3\2\3\260\2426\251\235\2008s\7s\26\330\177\337\365\30\303lyr\36\202\25\210\241\332)8m\203N\210\267\221\244M\r\17\334\f<^\301\331\20\245\300\235\321\30u\330\n\237-\233\342YR\0\305C\356\323\317uf\211\323A\212WGNH9{:L\335\210i2\331O\244\216r*\211\250[\270\325\371P\246\322\234\343\265\227K\313s\362\343L\305'\275\24\312\275,\220\26z9\326\336\303\223*\257\202\351\245\312\334\275\222\246\376\221B\tK_\361\326s#\16\300\20\375\220\315\367A\305x<\26\244\255W,\1\304\317\211\256\341q\365\342F$_j\247\22j\344\220#\234\6\251!\244\240\200$`)\255\354\276\1c\307\6\0\323LF|\252o\254@\242\246\266\350\37\252\371\32\1\331\24\361\354\236\321R\17\204\34\0\355\231\271eb\226{Y\316\317\237\317\367\17+\351\365\215\272\0231\252\343v\357\0103\306w\257|\346T\340\f\17%\221\370R\371\350\201\31W\350\244\306\247\226\375\f\22_\235\26\200@<\\\305K\263\3\222\225`d\356\356\345z\326\301\275\241\225!\3\24\203Em\256\265\272\23\305Pl\10r\322I\26\366\274\251\344\336\3YmWP\2354\243\32\376\3323\200h\254\5p3F\201\277I\271u\211\335X\2321\r\252\2]\2100*9\205S\207\373\"\30&\254\343\325\35]\327\235;\301K=c\263\256\3\305\277\341\30A%\20`\n\30\317il\361\251m;*\275\337Y\3623\364\332\313\r[jYo\377\276+\256LLyk\302\261\372l\255@\323\2571\304\221\362\5rZ\367\213`\273\363\231lK5\33467\317\263\374v/\310c\357\262E\222ID\317\265\211\377\34\266-\262U\336\247eY\242!\22\321\342\337\345\216\300\271s\f\256\312\221\234\306\5\"\6*C1\314\276{\271\243VI\306\233\225M\335D\232\321\16\303\336\207zX\350\201\314\221\213h+\7\373\233\222\"\21\265\363\244\302{\3275\270(b*T\240\35\366\217\212\307sM\213\355\222\353\307S4I\374\360\271\332\4n\0\20\4\373\270t}\243~GI\7\2521\5\237\363\355J\36\352e,M\3432B\327IQ\30tPC@\210Vj\223m\1\267\210\251P\213\247z\34S\315\25O\276\347\332W<C<;\336\371\326\35}\277\32d\347\212y\241\335{\244\302\0232\234-'\1\323S\346\276\335\"\344\365\363g\363&\263\23\225C\264Wjg\275\0023\350\300\304\237\242\\\4\2011\274\347\373\346\\3\276,\306Jg\361U\22zT\275*\r]\373\33*\247L\222a`p\367>\37\357\237\3561]\237\320D\357Ua\312y\227<\33\263gn\274\330P\204\354:a\31\367\16\265\367c\252\261\"\177\32\222\326~\304\315\37\346:\214\375/\300\214Z\212\331BA\352\253\34\6S\355B\246*\320\22\250?D\251c\2602\212\300\376l\272rA\25de:X=\262\377<\327\301\26\26\311C7\243-K2\2363\244\265Kp\377\316\322u\3\272\222\207\354x\323\230\311O\330l\r\t\23^\370l\0\22v\203\263\271\201\371\356=)\21\270M\253P*\276\205\363\214\360\344u\237\255\221w\253\301\376\264\4\221\231\245v[\32\2127{\f\371\371\6I\270\240\275#\3074I\302\331Y\373\366\222u\351\315\302)\255\312\332Z\335\234\223\5\214\235\256\233\305
\206W\277\206\376%\321\212\4\342f;\351C\370\1\230;\214V\33j\227\313}",
949) = 949
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
read(16, 0x36b7973, 5)                  = -1 EAGAIN (Resource
temporarily unavailable)
poll([{fd=16, events=POLLIN|POLLPRI|POLLERR|POLLHUP}, {fd=12,
events=POLLIN|POLLPRI|POLLERR|POLLHUP}, {fd=14,
events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 3, 29758) = 1 ([{fd=16,
revents=POLLIN}])
read(16, "", 5)                         = 0
shutdown(16, 2 /* send and receive */)  = 0
close(16)                               = 0
poll([{fd=12, events=POLLIN|POLLPRI|POLLERR|POLLHUP}, {fd=14,
events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 29755) = 1 ([{fd=12,
revents=POLLIN}])
accept(12, 0x7fff3ec69c40, [16])        = -1 EAGAIN (Resource
temporarily unavailable)
poll([{fd=12, events=POLLIN|POLLPRI|POLLERR|POLLHUP}, {fd=14,
events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 29696) = 1 ([{fd=12,
revents=POLLIN}])
accept(12, 0x7fff3ec69c40, [16])        = -1 EAGAIN (Resource
temporarily unavailable)
poll([{fd=12, events=POLLIN|POLLPRI|POLLERR|POLLHUP}, {fd=14,
events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 29647) = 1 ([{fd=12,
revents=POLLIN}])
accept(12, 0x7fff3ec69c40, [16])        = -1 EAGAIN (Resource
temporarily unavailable)
poll([{fd=12, events=POLLIN|POLLPRI|POLLERR|POLLHUP}, {fd=14,
events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 26685) = 1 ([{fd=12,
revents=POLLIN}])
accept(12, 0x7fff3ec69c40, [16])        = -1 EAGAIN (Resource
temporarily unavailable)
poll([{fd=12, events=POLLIN|POLLPRI|POLLERR|POLLHUP}, {fd=14,
events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 26061) = 1 ([{fd=12,
revents=POLLIN}])
accept(12, 0x7fff3ec69c40, [16])        = -1 EAGAIN (Resource
temporarily unavailable)
poll([{fd=12, events=POLLIN|POLLPRI|POLLERR|POLLHUP}, {fd=14,
events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 23858) = 1 ([{fd=12,
revents=POLLIN}])
accept(12, 0x7fff3ec69c40, [16])        = -1 EAGAIN (Resource
temporarily unavailable)
poll([{fd=12, events=POLLIN|POLLPRI|POLLERR|POLLHUP}, {fd=14,
events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 21968) = 1 ([{fd=12,
revents=POLLIN}])
poll([{fd=12, events=POLLIN|POLLPRI|POLLERR|POLLHUP}, {fd=14,
events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 2058) = 0 (Timeout)
poll([{fd=12, events=POLLIN|POLLPRI|POLLERR|POLLHUP}, {fd=14,
events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 197) = 0 (Timeout)
poll([{fd=12, events=POLLIN|POLLPRI|POLLERR|POLLHUP}, {fd=14,
events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 1) = 0 (Timeout)


All seems to be working,
* start instances
* glance image up/down load
* migrate instances
* cinder volume CRUD


Only Instance snapshorts are not working -.-


Anyone have a hint or an idea?

Cheers and Thanks
Heiko






More information about the Openstack mailing list