<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p>Hi Gorka and Renat</p>
    <p><br>
    </p>
    <p>Thank you for your suggestions, and sorry for forgetting the
      [mistral] subject prefix.<br>
    </p>
    <p><br>
    </p>
    <p>>Renat:<br>
      >workflow should probably be responsible for
      tracking a status of an operation. <br>
    </p>
    <p>>Gorka:<br>
      >Instead of a sleep, which may get you through this issue but
      fall into a<br>
      >different one and won't return the right status code, you
      should<br>
      >probably have a loop checking the status of the backup and
      return a non<br>
      >zero status code if it ends up in "error" state.
    </p>
    <p>Gorka's idea sounds good.<br>
    </p>
    <p>If you look at Jose Castro's snapshot workflow, you will find
      a similar snippet:<br>
    </p>
    <p>   
      #<a class="moz-txt-link-freetext" href="https://techblog.web.cern.ch/techblog/post/scheduled-snapshots/">https://techblog.web.cern.ch/techblog/post/scheduled-snapshots/</a><br>
         
#<a class="moz-txt-link-freetext" href="https://gitlab.cern.ch/cloud-infrastructure/mistral-workflows/raw/master/workflows/instance_snapshot.yaml">https://gitlab.cern.ch/cloud-infrastructure/mistral-workflows/raw/master/workflows/instance_snapshot.yaml</a>
      | sed -e 's%action_region: "cern"%action_region: "ch-zh1"%'
      >instance_snapshot.yaml<br>
    </p>
    <p>    stop_instance:<br>
            description: 'Stops the instance for consistency'<br>
            action: nova.servers_stop<br>
            input:<br>
              server: <% $.instance %><br>
              action_region: <% $.action_region %><br>
            on-success:<br>
              - wait_for_stop_instance<br>
            on-error:<br>
              - error_task<br>
      <br>
          wait_for_stop_instance:<br>
            description: 'Waits until the instance is shutoff to
      continue'<br>
            action: nova.servers_find<br>
            input:<br>
              id: <% $.instance %><br>
              status: 'SHUTOFF'<br>
              action_region: <% $.action_region %><br>
            retry:<br>
              delay: 5<br>
              count: 40<br>
            on-success:<br>
              - check_boot_source<br>
            on-error:<br>
              - error_task<br>
    </p>
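    <p>The same wait-and-check logic, applied to a backup as Gorka
      suggests, could be sketched in Python roughly as follows. This is
      only an illustration: get_status is a hypothetical callable that
      would look up the backup status, the status names "available" and
      "error" are assumptions, and delay/count simply mirror the retry
      values in the snippet above.</p>
<pre>
```python
import time
from typing import Callable


def wait_for_status(get_status: Callable[[], str],
                    ok: str = "available",
                    failed: str = "error",
                    delay: float = 5.0,
                    count: int = 40,
                    sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll get_status() until it reports `ok` or `failed`.

    Returns 0 on success, 1 if the failed state is reached and 2 if the
    status never settles within `count` checks, so the result can be
    passed straight to sys.exit().
    """
    for attempt in range(count):
        status = get_status()
        if status == ok:
            return 0
        if status == failed:
            return 1
        # Not finished yet: wait before the next check (skip after the last one).
        if attempt != count - 1:
            sleep(delay)
    return 2


if __name__ == "__main__":
    # Simulated status sequence standing in for a real backup lookup.
    states = iter(["creating", "creating", "available"])
    print(wait_for_status(lambda: next(states), delay=0))
```
</pre>
    <p>Unlike a fixed sleep, this returns as soon as the operation
      settles and reports a failure explicitly.</p>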
    <p><br>
    </p>
    <p>>We’ve discussed a more generic solution in the past for
      similar situations but it seems to be virtually impossible to find
      it.</p>
    <p>OK, so it looks like this issue cannot be fixed with a small
      bugfix. <br>
      It would require a feature extension.</p>
    <p>I can imagine that quite a few API calls across the different
      OpenStack services are asynchronous, and Mistral would have to
      check the progress of each one in a different, ad hoc manner.<br>
      That would make such a feature quite expensive to implement in
      Mistral.</p>
    <p>It would be great if every service returned a job_id in a
      standard form for each async call.<br>
      Mistral would then be able to track them in a uniform way.<br>
      This would also allow the OpenStack client to run in sync or async
      mode, according to the user's needs.<br>
    </p>
    <p>But such a design requirement really needs to be set on day one;
      it is likely too late to change all OpenStack services...</p>
    <p><br>
    </p>
    <p>However, there is a minor enhancement that could be made:<br>
      let the user specify whether a cron trigger should auto-delete
      itself after its last execution.<br>
    </p>
    <p>Keeping expired cron triggers could be nice for:<br>
      - avoiding race conditions such as the one with swift/radosgw<br>
      - allowing the user to edit and reschedule an expired cron trigger<br>
    </p>
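    <p>To make the proposal a bit more concrete, here is a minimal
      sketch of the decision I have in mind. Everything in it is
      hypothetical: CronTrigger is a stand-in record and not Mistral's
      model, and auto_delete and expired are invented names for the
      proposed option and state.</p>
<pre>
```python
from dataclasses import dataclass


@dataclass
class CronTrigger:
    """Hypothetical stand-in for a cron trigger record."""
    name: str
    remaining_executions: int
    auto_delete: bool = True   # hypothetical option; default keeps today's behaviour
    expired: bool = False      # hypothetical state for a kept, finished trigger


def after_execution(trigger: CronTrigger) -> str:
    """Decide what happens to a trigger once one execution has finished.

    Returns "keep", "delete" or "expire".
    """
    trigger.remaining_executions -= 1
    if trigger.remaining_executions > 0:
        return "keep"
    if trigger.auto_delete:
        # Current behaviour: trigger and trust are removed right away.
        return "delete"
    # Proposed behaviour: keep trigger and trust so the user can reschedule.
    trigger.expired = True
    return "expire"


if __name__ == "__main__":
    trigger = CronTrigger("nightly-backup", remaining_executions=1,
                          auto_delete=False)
    print(after_execution(trigger), trigger.expired)
```
</pre>
    <p>With auto_delete switched off, the trust would outlive the last
      execution, so a background copy still running at that point would
      not lose its authorization.</p>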
    <p>What do you think?<br>
    </p>
    <p><br>
    </p>
    <p>Best Regards</p>
    <p>Francois<br>
    </p>
    <div class="moz-cite-prefix">On 9/24/19 8:36 AM, Renat Akhmerov
      wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:4f779c2f-e43e-4f1a-a3a2-44a4e5515ef7@Spark">
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
      <title></title>
      <div name="messageBodySection">
        <div dir="auto">Hi!
          <div dir="auto"><br>
          </div>
          <div dir="auto">I would kindly ask you to add [mistral] into
            the subject of the emails related to Mistral. I just saw
            this thread accidentally (since I can’t read everything) and
            missed it in the first place.</div>
          <div dir="auto"><br>
          </div>
          <div dir="auto">On the issue itself… So yes, the discovery you
            made makes perfect sense. I agree that a workflow should
            probably be
            responsible for tracking a status of an operation. We’ve
            discussed a more generic solution in the past for similar
            situations but it seems to be virtually impossible to find
            it. If you have some ideas, please share. We can discuss it.</div>
          <div dir="auto"><br>
          </div>
        </div>
      </div>
      <div name="messageSignatureSection"><br>
        <div class="matchFont">Thanks<br>
          <br>
          Renat Akhmerov<br>
          @Nokia</div>
      </div>
      <div name="messageReplySection">On 23 Sep 2019, 14:41 +0700, Gorka
        Eguileor <a class="moz-txt-link-rfc2396E" href="mailto:geguileo@redhat.com"><geguileo@redhat.com></a>, wrote:<br>
        <blockquote type="cite" class="spark_quote" style="margin: 5px
          5px; padding-left: 10px; border-left: thin solid #1abc9c;">On
          20/09, Francois Scheurer wrote:<br>
          <blockquote type="cite" class="spark_quote" style="margin: 5px
            5px; padding-left: 10px; border-left: thin solid #e67e22;">Hi
            Gorka<br>
            <br>
            <br>
            <blockquote type="cite" class="spark_quote" style="margin:
              5px 5px; padding-left: 10px; border-left: thin solid
              #3498db;">Then I assume you prefer the Swift backup driver
              over the Ceph one<br>
              because you are using one of the OpenStack releases that
              had trouble with<br>
              Incremental Backups on the Ceph backup driver.<br>
            </blockquote>
            <br>
            <br>
            You are probably right. But I cannot answer that because I
            was not involved<br>
            in that decision.<br>
            <br>
            <br>
            Ok in the radosgw logs I see this:<br>
            <br>
            <br>
            2019-09-20 15:40:06.805529 7f19edb9b700 20
token_id=gAAAAABdhNauRvNev5P90ovX7_cb5_4MkY1tg5JHFpAH8JL-_0vDs06lHW5F9Iphua7fxCWTxxdL-0fRzhR8We_nN6Hx9z3FTWcTXLUMtIUPe0WMKQgW6JkUTP8RwSjAfF4W04OztEg3VAUGN_5gWRlBX-KT9uypnEszadG1yA7gpjkCokNnD8oaIeE6arvs_EjfJib51rao<br>
            2019-09-20 15:40:06.805664 7f19edb9b700 20 sending request
            to<br>
            <a class="moz-txt-link-freetext" href="https://keystone.service.stage.ewcs.ch/v3/auth/tokens">https://keystone.service.stage.ewcs.ch/v3/auth/tokens</a><br>
            2019-09-20 15:40:06.805803 7f19edb9b700 20 ssl verification
            is set to off<br>
            2019-09-20 15:40:07.235356 7f19edb9b700 20 sending request
            to<br>
            <a class="moz-txt-link-freetext" href="https://keystone.service.stage.ewcs.ch/v3/auth/tokens">https://keystone.service.stage.ewcs.ch/v3/auth/tokens</a><br>
            2019-09-20 15:40:07.235404 7f19edb9b700 20 ssl verification
            is set to off<br>
            2019-09-20 15:40:07.267091 7f19edb9b700  5 Failed keystone
            auth from<br>
            <a class="moz-txt-link-freetext" href="https://keystone.service.stage.ewcs.ch/v3/auth/tokens">https://keystone.service.stage.ewcs.ch/v3/auth/tokens</a> with
            404<br>
            BTW: our radosgw is configured to delegate user
            authentication to keystone.<br>
            <br>
            In keystone logs I see this:<br>
            <br>
            2019-09-20 15:40:07.218 24 INFO keystone.token.provider<br>
            [req-21b2f11c-9e67-4487-af05-420acfb65ace - - - - -] Token
            being processed:<br>
            token.user_id [f7c7296949f84a4387c5172808a0965b],<br>
            token.expires_at[2019-09-21T13:40:07.000000Z],<br>
            token.audit_ids[[u'hFweMPCrSO2D00rNcRNECw']],
            token.methods[[u'password']],<br>
            token.system[None], token.domain_id[None],<br>
            token.project_id[4120792f50bc4cf2b4f97c4546462f06],
            token.trust_id[None],<br>
            token.federated_groups[None],
            token.identity_provider_id[None],<br>
            token.protocol_id[None],<br>
token.access_token_id[None],token.application_credential_id[None].<br>
            2019-09-20 15:40:07.257 21 INFO keystone.common.wsgi<br>
            [req-9f858abb-68f9-42cf-b71a-f1cafca91844
            f7c7296949f84a4387c5172808a0965b<br>
            4120792f50bc4cf2b4f97c4546462f06 - default default] GET<br>
            <a class="moz-txt-link-freetext" href="http://keystone.service.stage.ewcs.ch/v3/auth/tokens">http://keystone.service.stage.ewcs.ch/v3/auth/tokens</a><br>
            2019-09-20 15:40:07.265 21 WARNING keystone.common.wsgi<br>
            [req-9f858abb-68f9-42cf-b71a-f1cafca91844
            f7c7296949f84a4387c5172808a0965b<br>
            4120792f50bc4cf2b4f97c4546462f06 - default default] Could
            not find trust:<br>
            934ed82d2b14413899023da0bee6a953.: TrustNotFound: Could not
            find trust:<br>
            934ed82d2b14413899023da0bee6a953.<br>
            <br>
            <br>
            So what happens is the following:<br>
            <br>
            1. when the user creates the cron trigger, mistral creates a
            trust<br>
            2. when the cron trigger executes the workflow, openstack
            creates a<br>
            volume snapshot (an rbd image), then copies it to swift (rgw),
            then<br>
            deletes the snapshot<br>
            3. when the execution finishes, if the cron trigger has no
            remaining<br>
            executions scheduled, then mistral removes the cron trigger
            and the trust<br>
            <br>
            The problem is a race condition: apparently the copying of the
            snapshot to<br>
            swift runs in the background and mistral removes the trust
            before the<br>
            operation completes...<br>
            <br>
            That explains the error in keystone and also the cron
            trigger execution<br>
            result which is "success" even if the resulting backup is
            actually "failed".<br>
            <br>
            <br>
            To test this theory I set up the same cron trigger with more
            than one<br>
            scheduled execution and the backups were suddenly created
            correctly ;-).<br>
            <br>
            <br>
            So something needs to be done in the code to deal with this
            race condition.<br>
            <br>
            In the meantime, I will try to put a sleep action after the
            'create backup'<br>
            action.<br>
            <br>
          </blockquote>
          <br>
          Hi,<br>
          <br>
          Congrats on figuring out the issue. :-)<br>
          <br>
          Instead of a sleep, which may get you through this issue but
          fall into a<br>
          different one and won't return the right status code, you
          should<br>
          probably have a loop checking the status of the backup and
          return a non<br>
          zero status code if it ends up in "error" state.<br>
          <br>
          Cheers,<br>
          Gorka.<br>
          <br>
          <blockquote type="cite" class="spark_quote" style="margin: 5px
            5px; padding-left: 10px; border-left: thin solid #e67e22;"><br>
            Best Regards<br>
            <br>
            Francois<br>
            <br>
            <br>
            On 9/20/19 4:02 PM, Gorka Eguileor wrote:<br>
            <blockquote type="cite" class="spark_quote" style="margin:
              5px 5px; padding-left: 10px; border-left: thin solid
              #3498db;">On 20/09, Francois Scheurer wrote:<br>
              <blockquote type="cite" class="spark_quote" style="margin:
                5px 5px; padding-left: 10px; border-left: thin solid
                #d35400;">Hi Gorka<br>
                <br>
                <br>
                We have a swift endpoint set up on openstack, which
                points to our ceph<br>
                radosgw backend<br>
                <br>
                Radosgw provides s3 & swift.<br>
                <br>
                So the swift logs here are actually the radosgw logs.<br>
                <br>
              </blockquote>
              Hi,<br>
              <br>
              OK, thanks for the clarification.<br>
              <br>
              Then I assume you prefer the Swift backup driver over the
              Ceph one<br>
              because you are using one of the OpenStack releases that
              had trouble<br>
              with Incremental Backups on the Ceph backup driver.<br>
              <br>
              Cheers,<br>
              Gorka.<br>
              <br>
              <br>
              <blockquote type="cite" class="spark_quote" style="margin:
                5px 5px; padding-left: 10px; border-left: thin solid
                #d35400;">Cheers<br>
                <br>
                Francois<br>
                <br>
                <br>
                <br>
                On 9/20/19 2:46 PM, Gorka Eguileor wrote:<br>
                <blockquote type="cite" class="spark_quote"
                  style="margin: 5px 5px; padding-left: 10px;
                  border-left: thin solid #34495e;">On 20/09, Francois
                  Scheurer wrote:<br>
                  <blockquote type="cite" class="spark_quote"
                    style="margin: 5px 5px; padding-left: 10px;
                    border-left: thin solid #2ecc71;">Dear Gorka and
                    Hervé<br>
                    <br>
                    <br>
                    Thanks for your hints.<br>
                    <br>
                    I have set the debug log level on radosgw.<br>
                    <br>
                    I will retest now and post here the results.<br>
                    <br>
                    <br>
                    Cheers<br>
                    <br>
                    Francois<br>
                  </blockquote>
                  Hi,<br>
                  <br>
                  Sorry, I may have missed something in the
                  conversation, weren't you<br>
                  using Swift?<br>
                  <br>
                  I think you need to see the Swift logs as well, since
                  that's the API<br>
                  service that complained about the authorization.<br>
                  <br>
                  Cheers,<br>
                  Gorka.<br>
                  <br>
                  <blockquote type="cite" class="spark_quote"
                    style="margin: 5px 5px; padding-left: 10px;
                    border-left: thin solid #2ecc71;"><br>
                    <br>
                    --<br>
                    <br>
                    <br>
                    EveryWare AG<br>
                    François Scheurer<br>
                    Senior Systems Engineer<br>
                    Zurlindenstrasse 52a<br>
                    CH-8003 Zürich<br>
                    <br>
                    tel: +41 44 466 60 00<br>
                    fax: +41 44 466 60 10<br>
                    mail: <a class="moz-txt-link-abbreviated" href="mailto:francois.scheurer@everyware.ch">francois.scheurer@everyware.ch</a><br>
                    web: <a class="moz-txt-link-freetext" href="http://www.everyware.ch">http://www.everyware.ch</a><br>
                  </blockquote>
                </blockquote>
              </blockquote>
              <br>
            </blockquote>
          </blockquote>
          <br>
          <br>
          <br>
        </blockquote>
      </div>
    </blockquote>
    <pre class="moz-signature" cols="72">-- 


EveryWare AG
François Scheurer
Senior Systems Engineer
Zurlindenstrasse 52a
CH-8003 Zürich

tel: +41 44 466 60 00
fax: +41 44 466 60 10
mail: <a class="moz-txt-link-abbreviated" href="mailto:francois.scheurer@everyware.ch">francois.scheurer@everyware.ch</a>
web: <a class="moz-txt-link-freetext" href="http://www.everyware.ch">http://www.everyware.ch</a> </pre>
  </body>
</html>