[Ocata][Heat] Strange error returned after stack creation failure - raw template with id xxx not found

Zane Bitter zbitter at redhat.com
Fri Aug 14 00:37:36 UTC 2020


On 24/07/20 10:59 am, Laurent Dumont wrote:
> Hey Zane,
> 
> Thank you so much for the details - super interesting. We've worked with 
> the Vendor to try and reproduce while we had our logs for Heat turned to 
> DEBUG. Unfortunately, all of the creations they have attempted since 
> have worked. It first failed 4 times out of 5 and has since worked...

Interesting - sounds like a timing issue, but I haven't spotted any code 
that looks like it could fail by going too fast.

> It's one of those problems! We'll keep trying to reproduce. Just to be 
> sure, the actual yaml is stored in the DB and then accessed to create 
> the actual Heat resources?

Yep, correct. It's stored and the ID is passed in the RPC message here:

https://opendev.org/openstack/heat/src/branch/master/heat/engine/resources/stack_resource.py#L308
https://opendev.org/openstack/heat/src/branch/master/heat/engine/resources/stack_resource.py#L372-L374
https://opendev.org/openstack/heat/src/branch/master/heat/engine/resources/stack_resource.py#L336-L337

and then when the other engine receives the create_stack RPC message it 
uses the stored template instead of one passed in the message like you 
would get from a create call initiated via the ReST API:

https://opendev.org/openstack/heat/src/branch/master/heat/engine/service.py#L847-L851
https://opendev.org/openstack/heat/src/branch/master/heat/engine/service.py#L731-L732
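
To make the suspected failure mode concrete, here is a minimal, self-contained sketch of the sequence described in this thread: the parent stores the template, sends only the ID, deletes the row when the RPC call errors out (here, a timeout), and an engine later picks up the original message. This is not Heat's code. Every name in it (TemplateStore, parent_create_child, engine_create_stack, simulate) is hypothetical, and the "RPC" is just an in-memory list standing in for the message queue.

```python
# Toy model only: none of these classes or functions exist in Heat.


class TemplateNotFound(Exception):
    pass


class RpcTimeout(Exception):
    pass


class TemplateStore:
    """Stand-in for the raw template table in the database."""

    def __init__(self):
        self._rows = {}
        self._next_id = 1

    def store(self, template):
        template_id = self._next_id
        self._next_id += 1
        self._rows[template_id] = template
        return template_id

    def get(self, template_id):
        if template_id not in self._rows:
            raise TemplateNotFound(
                "raw template with id %d not found" % template_id)
        return self._rows[template_id]

    def delete(self, template_id):
        self._rows.pop(template_id, None)


def parent_create_child(store, queue, template, stack_name, rpc_times_out):
    """Store the template, send only its ID, clean up if the call fails."""
    template_id = store.store(template)
    # The message is already on the queue even if the caller gives up waiting.
    queue.append({"stack_name": stack_name, "template_id": template_id})
    try:
        if rpc_times_out:
            raise RpcTimeout("no reply from engine")
    except RpcTimeout:
        # The call returned with an error, so the stored template is removed.
        store.delete(template_id)
        raise
    return template_id


def engine_create_stack(store, message):
    """The receiving engine loads the stored template by ID."""
    template = store.get(message["template_id"])
    return "created %s with %d resource(s)" % (
        message["stack_name"], len(template["resources"]))


def simulate(rpc_times_out):
    store = TemplateStore()
    queue = []
    template = {"resources": {"server": {"type": "OS::Nova::Server"}}}
    try:
        parent_create_child(store, queue, template, "potato-0", rpc_times_out)
    except RpcTimeout:
        pass  # the parent resource would go to CREATE_FAILED here
    # Some engine eventually picks up the original message.
    message = queue.pop(0)
    try:
        return engine_create_stack(store, message)
    except TemplateNotFound as exc:
        return str(exc)


if __name__ == "__main__":
    print(simulate(rpc_times_out=False))
    print(simulate(rpc_times_out=True))
```

In the timeout case the late-arriving message refers to a row that no longer exists, which is the same shape as the error in the original report. Whether that is what actually happened in this deployment is still a guess until the logs show an RPC timeout.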

- ZB

> 
> Thanks!
> 
> On Wed, Jul 22, 2020 at 3:46 PM Zane Bitter <zbitter at redhat.com> wrote:
> 
>     On 21/07/20 8:03 pm, Laurent Dumont wrote:
>      > Hi!
>      >
>      > We are currently troubleshooting a Heat stack issue where one of the
>      > stacks (one of 25 or so) is failing to be created properly (seemingly
>      > randomly).
>      >
>      > The actual error returned by Heat is quite strange and Google has been
>      > quite sparse in terms of references.
>      >
>      > The actual error looks like the following (I've sanitized some of the
>      > names):
>      >
>      > Resource CREATE failed: resources.potato: Resource CREATE failed:
>      > resources[0]: raw template with id 22273 not found
> 
>     When creating a nested stack, rather than just calling the RPC method to
>     create a new stack, Heat stores the template in the database first and
>     passes the ID in the RPC message.[1] (It turns out that by doing it this
>     way we can save massive amounts of memory when processing a large tree
>     of nested stacks.) My best guess is that this message indicates that the
>     template row has been deleted by the time the other engine goes to look
>     at it.
> 
>     I don't see how you could have got an ID like 22273 without the template
>     having been successfully stored at some point.
> 
>     The template is only supposed to be deleted if the RPC call returns with
>     an error.[2] The only way I can think of for that to happen before an
>     attempt to create the child stack is if the RPC call times out, but the
>     original message is eventually picked up by an engine. I would check
>     your logs for RPC timeouts and consider increasing them.
> 
>     What does the status_reason look like at one level above in the tree?
>     That should indicate the first error that caused the template to be
>     deleted.
> 
>      >     heat resource-list STACK_NAME_HERE -n 50
>      >     +---------------------+----------------------+-------------------------+-----------------+----------------------+--------------------------------+
>      >     | resource_name       | physical_resource_id | resource_type           | resource_status | updated_time         | stack_name                     |
>      >     +---------------------+----------------------+-------------------------+-----------------+----------------------+--------------------------------+
>      >     | potato              | RESOURCE_ID_HERE     | OS::Heat::ResourceGroup | CREATE_FAILED   | 2020-07-18T19:52:10Z | nested_stack_1_STACK_NAME_HERE |
>      >     | potato_server_group | RESOURCE_ID_HERE     | OS::Nova::ServerGroup   | CREATE_COMPLETE | 2020-07-21T19:52:10Z | nested_stack_1_STACK_NAME_HERE |
>      >     | 0                   |                      | potato1.yaml            | CREATE_FAILED   | 2020-07-18T19:52:12Z | nested_stack_2_STACK_NAME_HERE |
>      >     | 1                   |                      | potato1.yaml            | INIT_COMPLETE   | 2020-07-18T19:52:12Z | nested_stack_2_STACK_NAME_HERE |
>      >     +---------------------+----------------------+-------------------------+-----------------+----------------------+--------------------------------+
>      >
>      >
>      > The template itself is pretty simple and attempts to create a
>      > ServerGroup and 2 VMs (as part of the ResourceGroup). My feeling is that
>      > the creation of one of those machines fails and Heat gets a little kooky
>      > and returns an error that might not be the actual root cause. I would
>      > have expected the VM to show up in the resource list but I just see the
>      > source "yaml".
> 
>     It's clear from the above output that the scaled unit of the resource
>     group is in fact a template (not an OS::Nova::Server), and the error is
>     occurring trying to create a stack from that template (potato1.yaml) -
>     before Heat even has a chance to start creating the server.
> 
>      > Has anyone seen something similar in the past?
> 
>     Nope.
> 
>     cheers,
>     Zane.
> 
>     [1] https://opendev.org/openstack/heat/src/branch/master/heat/engine/resources/stack_resource.py#L367-L384
>     [2] https://opendev.org/openstack/heat/src/branch/master/heat/engine/resources/stack_resource.py#L335-L342
> 
> 



