[openstack-dev] [Infra][nova][magnum] Jenkins failed quite often for "Cannot set up guest memory 'pc.ram': Cannot allocate memory"
Clark Boylan
cboylan at sapwetik.org
Tue Dec 15 22:40:38 UTC 2015
On Sun, Dec 13, 2015, at 10:51 AM, Clark Boylan wrote:
> On Sat, Dec 12, 2015, at 02:16 PM, Hongbin Lu wrote:
> > Hi,
> >
> > As Kai Qiang mentioned, magnum gate recently had a bunch of random
> > failures, which occurred on creating a nova instance with 2G of RAM.
> > According to the error message, it seems that the hypervisor tried to
> > allocate memory to the nova instance but couldn’t find enough free memory
> > in the host. However, by adding a few “nova hypervisor-show XX” before,
> > during, and right after the test, it showed that the host has 6G of free
> > RAM, which is far more than 2G. Here is a snapshot of the output [1]. You
> > can find the full log here [2].
> If you look at the dstat log
> http://logs.openstack.org/07/244907/5/check/gate-functional-dsvm-magnum-k8s/5305d7a/logs/screen-dstat.txt.gz
> the host has nowhere near 6GB of free memory; it has less than 2GB. I
> think you are actually just running out of memory.
> >
> > Another observation is that most of the failures happened on a node with
> > name “devstack-trusty-ovh-*” (You can verify it by entering a query [3]
> > at http://logstash.openstack.org/ ). It seems that the jobs will be fine
> > if they are allocated to a node other than “ovh”.
> I have just done a quick spot check of the total memory on
> devstack-trusty hosts across HPCloud, Rackspace, and OVH using `free -m`
> and the results are 7480, 7732, and 6976 megabytes respectively. Despite
> using 8GB flavors in each case there is variation and OVH comes in on
> the low end for some reason. I am guessing that you fail here more often
> because the other hosts give you just enough extra memory to boot these
> VMs.
To follow up on this, we seem to have tracked it down to how the Linux
kernel restricts memory at boot when you don't have a contiguous chunk
of system memory. We have worked around this by increasing the memory
restriction to 9023M at boot, which gets OVH in line with Rackspace and
slightly increases available memory on HPCloud (because it actually has
more of it).
You should see this fix in action after image builds complete tomorrow
(they start at around 1400 UTC).
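To put the spot-check numbers in perspective, here is a rough sketch of the arithmetic. The totals are the `free -m` values quoted above; the 8192 MB nominal size is my assumption for what an "8GB flavor" means, and the function names are only illustrative.

```python
# Totals reported by `free -m` in the spot check (megabytes).
TOTAL_MB = {"HPCloud": 7480, "Rackspace": 7732, "OVH": 6976}

# Assumption: an "8GB" flavor nominally provides 8 * 1024 MB.
NOMINAL_MB = 8 * 1024


def shortfall_vs_flavor(total_mb, nominal_mb=NOMINAL_MB):
    """Return how many MB of the nominal flavor size the guest OS cannot see."""
    return nominal_mb - total_mb


def gap_to_best(totals):
    """Return, per provider, how many MB it trails the best provider by."""
    best = max(totals.values())
    return {name: best - mb for name, mb in totals.items()}


if __name__ == "__main__":
    gaps = gap_to_best(TOTAL_MB)
    for name, mb in sorted(TOTAL_MB.items(), key=lambda kv: kv[1]):
        print(
            f"{name}: total={mb} MB, "
            f"missing vs flavor={shortfall_vs_flavor(mb)} MB, "
            f"behind best={gaps[name]} MB"
        )
```

By this arithmetic OVH sees 756 MB less than Rackspace, which is a large fraction of the headroom a 2G guest needs on a host that is already busy.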
>
> We will have to look into why OVH has less memory despite using flavors
> that should be roughly equivalent.
> >
> > Any hints to debug this issue further? Suggestions are greatly
> > appreciated.
> >
> > [1] http://paste.openstack.org/show/481746/
> > [2]
> > http://logs.openstack.org/48/256748/1/check/gate-functional-dsvm-magnum-swarm/56d79c3/console.html
> > [3] https://review.openstack.org/#/c/254370/2/queries/1521237.yaml
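For debugging this further, one option is to check what the host kernel itself reports rather than what `nova hypervisor-show` reports. The following is a minimal sketch, assuming the standard `/proc/meminfo` layout; the helper names are hypothetical, and summing `MemFree`, `Buffers`, and `Cached` is only a rough approximation of what the kernel could hand to a new guest.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Values are in kB for lines that carry a "kB" unit; lines that do not
    start with a number after the colon are skipped.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def reclaimable_mb(meminfo):
    """Rough estimate (MB) of memory that is free or easily reclaimed.

    Assumption: MemFree + Buffers + Cached is a good enough proxy here.
    """
    kb = sum(meminfo.get(key, 0) for key in ("MemFree", "Buffers", "Cached"))
    return kb // 1024


def guest_fits(meminfo, guest_mb):
    """Return True if the rough estimate covers a guest of guest_mb MB."""
    return reclaimable_mb(meminfo) >= guest_mb
```

On a test node this could be used as `guest_fits(parse_meminfo(open("/proc/meminfo").read()), 2048)` right before the instance is booted. A `False` result would point at the host genuinely being short of memory rather than at a scheduling problem.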