<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">
Interesting feedback :)
<div class="">This looks like the 'get-me-a-network' issue [1]: an implementation detail, chosen for technical or business reasons, that breaks the tests and compatibility.</div>
<div class="">We will probably run into other topics like this one.</div>
<div class=""><br class="">
</div>
<div class="">Do we already speak about providing different level for each program? I'm not speaking about guidelines which target the releases. I'm speaking about:</div>
<div class="">- OpenStack Powered compute containing</div>
<div class=""><span class="Apple-tab-span" style="white-space:pre"></span>- OpenStack Powered compute basics (get-me-a-network, get-me-a-compute, get-me-a-storage)</div>
<div class=""><span class="Apple-tab-span" style="white-space:pre"></span>- OpenStack Powered compute advanced (boot-from-volume, boot-from-image, resize-a-instance, get-a-floating-ip, ...)</div>
<div class=""><br class="">
</div>
<div class="">Maybe we should add another level between basics and advanced.</div>
<div class=""><br class="">
</div>
<div class="">[1] <a href="http://specs.openstack.org/openstack/neutron-specs/specs/liberty/get-me-a-network.html" class="">
http://specs.openstack.org/openstack/neutron-specs/specs/liberty/get-me-a-network.html</a><br class="">
<div class="">
<div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">
<div style="color: rgb(0, 0, 0); letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">
<div class=""><br class="Apple-interchange-newline">
--</div>
<div class="">Jean-Daniel Bonnetot</div>
<div class=""><a href="http://www.ovh.com" class="">http://www.ovh.com</a></div>
<div class="">@pilgrimstack</div>
</div>
<br class="Apple-interchange-newline">
</div>
<br class="Apple-interchange-newline" style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">
<br class="Apple-interchange-newline">
</div>
<br class="">
<div>
<blockquote type="cite" class="">
<div class="">Le 2 sept. 2016 à 08:47, Zhenyu Zheng <<a href="mailto:zhengzhenyulixi@gmail.com" class="">zhengzhenyulixi@gmail.com</a>> a écrit :</div>
<br class="Apple-interchange-newline">
<div class="">
<div dir="ltr" class="">Hi All,
<div class=""><br class="">
</div>
<div class="">We are deploying Public Cloud platform based on OpensStack in EU, we are now working on DefCore certificate for our public cloud platform and we meet some problems;</div>
<div class=""><br class="">
</div>
<div class="">OpenStack(Nova) supports both "boot from Image" and "boot from Volume" when launching instances; When we talk about large scale commercial deployments such as Public Cloud, the reliability of the service is been considered as the key factor; </div>
<div class=""><br class="">
</div>
<div class="">When we use "boot from Image" we can have two kinds of deployments: 1. Nova-compute with no shared storage backend; 2. Nova-compute with shared storage backend. As for case 1, the system disk created from the image will be created on the local
disk of the host that nova-compute is on, and the reliability of the userdata is considered low and it will be very hard to manage this large amount of disks from different hosts all over the deployment, thus it can be considered not commercially ready for
large scale deployments. As for case 2, the problem of reliability and manage can be solved, but new problems are introduced - the resource usage and capacity amounts tracking being incorrect, this has been an known issue[1] in Nova for a long time and the
Nova team is trying to solve the problem by introducing a new "resource provider" architecture [2], this new architecture will need few releases to be fully functional, thus case 2 is also considered to be not commercially ready.</div>
<div class=""><br class="">
</div>
<div class="">For the reasons I listed above, we have chosen to use "boot from Volume" to be the only way of booting instance in our Public Cloud, by doing this, we can overcome the above mentioned cons and get other benefits such as:</div>
<div class=""><br class="">
</div>
<div class="">
<div class="">Resiliency - Cloud Block Storage is a persistent volume, users can retain it after the server is deleted. Users can then use the volume to create a new server. </div>
<div class="">Flexibility - User can have control over the size and type (SSD or SATA) of volume that used to boot the server. This control enables users to fine-tune the storage to the needs of your operating system or application.</div>
<div class="">Improvements in managing and recovering from server outages</div>
<div class="">Unified volume management</div>
</div>
<div class=""><br class="">
</div>
<div class="">Only support "boot from Volume" brings us problems when pursuing the DefCore certificate:</div>
<div class=""><br class="">
</div>
<div class="">we have tests that trying to get instance list filtered by "image_id" which is None for volume booted instances:</div>
<div class=""><br class="">
</div>
<div class="">
<div class="">tempest.api.compute.servers.test_create_server.ServersTestJSON.test_verify_server_details</div>
<div class="">tempest.api.compute.servers.test_create_server.ServersTestManualDisk.test_verify_server_details</div>
<div class="">tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_detailed_filter_by_image</div>
<div class="">tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_filter_by_image</div>
</div>
<div class=""><br class="">
</div>
<div class=""> - The detailed information for instances booted from volumes does not contain informations about image_id, thus the test cases filter instance by image id cannot pass.</div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class="">we also have tests like this:</div>
<div class=""><br class="">
</div>
<div class="">
<div class="">tempest.api.compute.images.test_images.ImagesTestJSON.test_delete_saving_image</div>
</div>
<div class=""><br class="">
</div>
<div class="">
<div class=""> - This test tests creating an image for an instance, and delete the created instance snapshot during the image status of “saving”. As for instances booted from images, the snapshot status flow will be: queued->saving->active. But for instances
booted from volumes, the action of instance snapshotting is actually an volume snapshot action done by cinder, the image saved in glance will only have the link to the created cinder volume snapshot, and the image status will be directly change to “active”,
as the logic in this test will wait for the image status in glance change to “saving”, so it cannot pass for volume booted instances.</div>
</div>
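<div class="">The two status flows described above can be sketched as follows. This is only an illustration under the assumptions stated in the message: the status lists and the helper saw_saving_before_active are hypothetical and do not reproduce the actual Tempest waiter.</div>
<pre class="">
```python
def saw_saving_before_active(statuses):
    """Return True if "saving" is observed before the image turns "active"."""
    for status in statuses:
        if status == "saving":
            return True
        if status == "active":
            # The image became active without ever being seen as saving.
            return False
    return False


# Status flows as described in the message (illustrative values only).
IMAGE_BOOTED_FLOW = ["queued", "saving", "active"]
VOLUME_BOOTED_FLOW = ["active"]

image_booted_ok = saw_saving_before_active(IMAGE_BOOTED_FLOW)
volume_booted_ok = saw_saving_before_active(VOLUME_BOOTED_FLOW)
```
</pre>
<div class="">A test that insists on seeing “saving” therefore has nothing to wait for when the instance was booted from a volume.</div>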
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class="">Also:</div>
<div class=""><br class="">
</div>
<div class="">
<div class="">test_attach_volume.AttachVolumeTestJSON.test_list_get_volume_attachments</div>
</div>
<div class=""><br class="">
</div>
<div class=""> - This test attaches one volume to an instance and then counts the number of attachments for that instance, the expected count was hardcoded to be 1. As for volume booted instances, the system disk is already an attachment, so the actual count
of attachment will be 2, and the test fails.</div>
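<div class="">One possible direction for a workaround, sketched here with a hypothetical helper (expected_attachment_count is not part of Tempest): compute the expected number of attachments from how the instance was booted instead of hardcoding 1.</div>
<pre class="">
```python
def expected_attachment_count(booted_from_volume, extra_volumes):
    """Number of attachments to expect on an instance.

    A volume-booted instance already has its system disk attached,
    so it starts at 1 instead of 0 (as described in the message).
    """
    root_attachments = 1 if booted_from_volume else 0
    return root_attachments + extra_volumes


image_booted_count = expected_attachment_count(False, 1)
volume_booted_count = expected_attachment_count(True, 1)
```
</pre>
<div class="">With one extra volume attached, this gives 1 for an image-booted instance and 2 for a volume-booted one, matching the counts described above.</div>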
<div class=""><br class="">
</div>
<div class="">And finally:</div>
<div class=""><br class="">
</div>
<div class="">
<div class="">tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_rebuild_server</div>
<div class=""> tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_rebuild_deleted_server</div>
<div class="">tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_rebuild_non_existent_server</div>
</div>
<div class=""><br class="">
</div>
<div class=""> - Rebuilding action is not supported when the instance is created via volume. </div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class="">All those tests mentioned above are not friendly to "boot from Volume" instances, we hope we can have some workarounds about the above mentioned tests, as the problem that is having with "boot from Image" is really stopping us using it and it
will also be good for DefCore if we can figure out how to deal with this two types of instance creation.</div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class="">References:</div>
<div class="">[1] Bugs related to resource usage reporting and calculation:</div>
<div class=""><br class="">
</div>
<div class="">* Hypervisor summary shows incorrect total storage (Ceph)</div>
<div class=""> <a href="https://bugs.launchpad.net/nova/+bug/1387812" class="">https://bugs.launchpad.net/nova/+bug/1387812</a></div>
<div class="">* rbd backend reports wrong 'local_gb_used' for compute node</div>
<div class=""> <a href="https://bugs.launchpad.net/nova/+bug/1493760" class="">https://bugs.launchpad.net/nova/+bug/1493760</a></div>
<div class="">* nova hypervisor-stats shows wrong disk usage with shared storage</div>
<div class=""> <a href="https://bugs.launchpad.net/nova/+bug/1414432" class="">https://bugs.launchpad.net/nova/+bug/1414432</a></div>
<div class="">* report disk consumption incorrect in nova-compute</div>
<div class=""> <a href="https://bugs.launchpad.net/nova/+bug/1315988" class="">https://bugs.launchpad.net/nova/+bug/1315988</a></div>
<div class="">* VMWare: available disk spaces(hypervisor-list) only based on a single</div>
<div class=""> datastore instead of all available datastores from cluster</div>
<div class=""> <a href="https://bugs.launchpad.net/nova/+bug/1347039" class="">https://bugs.launchpad.net/nova/+bug/1347039</a></div>
<div class=""><br class="">
</div>
<div class="">[2] BP about solving resource usage reporting and calculation with a generic resource pool (resource provider):</div>
<div class=""><br class="">
</div>
<div class=""><a href="https://git.openstack.org/cgit/openstack/nova-specs/tree/specs/newton/approved/generic-resource-pools.rst" class="">https://git.openstack.org/cgit/openstack/nova-specs/tree/specs/newton/approved/generic-resource-pools.rst</a><br class="">
</div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class="">Thanks,</div>
<div class=""><br class="">
</div>
<div class="">Kevin Zheng</div>
</div>
_______________________________________________<br class="">
Defcore-committee mailing list<br class="">
<a href="mailto:Defcore-committee@lists.openstack.org" class="">Defcore-committee@lists.openstack.org</a><br class="">
http://lists.openstack.org/cgi-bin/mailman/listinfo/defcore-committee<br class="">
</div>
</blockquote>
</div>
<br class="">
</div>
</body>
</html>