[Openstack-operators] [openstack-dev] [Octavia] [Kolla] SSL errors polling amphorae and missing tenant network interface

Tobias Urdin tobias.urdin at binero.se
Thu Oct 25 14:00:06 UTC 2018


Might as well throw it out here.

After a lot of troubleshooting we were able to narrow our issue down to
our test environment running qemu virtualization. We moved our compute
node to hardware and used kvm full virtualization instead.

We could properly reproduce the issue: generating a CSR from a private
key and then trying to verify that CSR would fail, complaining about
"Signature did not match the certificate request".

We suspect qemu floating point emulation caused this. The same OpenSSL
function that validates a CSR is also used when validating the SSL
handshake, which is what caused our issue.
After going through the whole stack, we now have Octavia working without
any issues.
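
For anyone who wants to run the same CSR round trip on a suspect compute
node or guest, here is a minimal sketch in Python. It assumes the openssl
binary is on PATH. The function name, the 2048-bit key size and the
/CN=csr-selftest subject are placeholders picked for illustration and are
not anything Octavia itself uses.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def csr_roundtrip_ok():
    """Generate a key and a CSR with the openssl CLI, then verify the CSR.

    Returns True if openssl reports "verify OK", False if it does not,
    and None if the check could not be run at all.
    """
    openssl = shutil.which("openssl")
    if openssl is None:
        return None  # no openssl binary on PATH, nothing to test

    with tempfile.TemporaryDirectory() as tmp:
        key = Path(tmp) / "test.key"
        csr = Path(tmp) / "test.csr"
        try:
            # Throwaway RSA key (the key size is an arbitrary choice).
            subprocess.run(
                [openssl, "genrsa", "-out", str(key), "2048"],
                check=True, capture_output=True, text=True,
            )
            # CSR signed with that key (the subject is a placeholder).
            subprocess.run(
                [openssl, "req", "-new", "-key", str(key),
                 "-subj", "/CN=csr-selftest", "-out", str(csr)],
                check=True, capture_output=True, text=True,
            )
        except (subprocess.CalledProcessError, OSError):
            return None  # could not even create the key or the CSR

        verify = subprocess.run(
            [openssl, "req", "-in", str(csr), "-noout", "-verify"],
            capture_output=True, text=True,
        )
        # Look at the message rather than the exit status, since the exit
        # status on a failed verify may differ between OpenSSL releases.
        return "verify OK" in (verify.stdout + verify.stderr)
```

On a healthy host csr_roundtrip_ok() should return True. A False result on
a freshly generated key and CSR would point at the host rather than at the
certificates.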

Best regards
Tobias

On 10/23/2018 04:31 PM, Tobias Urdin wrote:
> Hello Erik,
>
> Could you specify the DNs you used for all certificates, just so that I
> can rule it out on my side?
> You can redact anything sensitive; I just want to get a feel for how
> it's configured.
>
> Best regards
> Tobias
>
> On 10/22/2018 04:47 PM, Erik McCormick wrote:
>> On Mon, Oct 22, 2018 at 4:23 AM Tobias Urdin <tobias.urdin at binero.se> wrote:
>>> Hello,
>>>
>>> I've been having a lot of issues with SSL certificates myself, and I'm
>>> on my second attempt now at getting it working.
>>>
>>> Previously I spent a lot of time walking through every line in the
>>> DevStack plugin and fixing my config options, and used the certificate
>>> generation script [1], and it still didn't work.
>>>
>>> When I got the "invalid padding" issue it was because of the DN I used
>>> for the CA and the certificate IIRC.
>>>
>>>    > 19:34 < tobias-urdin> 2018-09-10 19:43:15.312 15032 WARNING
>>> octavia.amphorae.drivers.haproxy.rest_api_driver [-] Could not connect
>>> to instance. Retrying.: SSLError: ("bad handshake: Error([('rsa
>>> routines', 'RSA_padding_check_PKCS1_type_1', 'block type is not 01'),
>>> ('rsa routines', 'RSA_EAY_PUBLIC_DECRYPT', 'padding check failed'),
>>> ('SSL routines', 'ssl3_get_key_exchange', 'bad signature')],)",)
>>>    > 19:47 < tobias-urdin> after a quick google "The problem was that my
>>> CA DN was the same as the certificate DN."
>>>
>>> IIRC that solved it, but I don't remember for certain since I've come
>>> at this from so many different angles by now.
>>>
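
One rough way to check for the condition quoted above (CA DN identical to
the certificate DN) is to compare the subject and issuer lines printed by
`openssl x509 -noout -subject -issuer` for the certificate in question.
The sketch below only does that string comparison. The helper name and
the sample DNs are made up for illustration.

```python
def subject_equals_issuer(openssl_output):
    """Return True if the subject and issuer lines carry the same DN.

    Expects the text printed by `openssl x509 -noout -subject -issuer`.
    """
    fields = {}
    for line in openssl_output.splitlines():
        name, sep, value = line.partition("=")
        name = name.strip().lower()
        if sep and name in ("subject", "issuer"):
            # Collapse whitespace so spacing differences do not matter.
            fields[name] = " ".join(value.split())
    if len(fields) != 2:
        raise ValueError("expected one subject line and one issuer line")
    return fields["subject"] == fields["issuer"]


# Made-up examples, not taken from any real deployment.
CLASHING = (
    "subject=C = SE, O = Example, CN = octavia\n"
    "issuer=C = SE, O = Example, CN = octavia\n"
)
DISTINCT = (
    "subject=C = SE, O = Example, CN = octavia-client\n"
    "issuer=C = SE, O = Example, CN = octavia-ca\n"
)

print(subject_equals_issuer(CLASHING))
print(subject_equals_issuer(DISTINCT))
```

A True result for a certificate that was supposedly issued by a separate
CA suggests the CA and the certificate were created with the same DN.
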
>>> Here is my IRC log history from the #openstack-lbaas channel, perhaps
>>> it can help you out:
>>> http://paste.openstack.org/show/732575/
>>>
>> Tobias, I owe you a beer. This was precisely the issue. I'm deploying
>> Octavia with kolla-ansible. It only deploys a single CA. After hacking
>> the templates and playbook to incorporate a separate server CA, the
>> amphorae now load and provision the required namespace. I'm adding a
>> kolla tag to the subject of this in hopes that someone might want to
>> take on changing this behavior in the project. Hopefully after I get
>> through Upstream Institute in Berlin I'll be able to do it myself if
>> nobody else wants to do it.
>>
>> For certificate generation, I extracted the contents of
>> octavia_certs_install.yml (which sets up the directory structure,
>> openssl.cnf, and the client CA), and octavia_certs.yml (which creates
>> the server CA and the client certificate) and mashed them into a
>> separate playbook just for this purpose. At the end I get:
>>
>> ca_01.pem - Client CA Certificate
>> ca_01.key - Client CA Key
>> ca_server_01.pem - Server CA Certificate
>> cakey.pem - Server CA Key
>> client.pem - Concatenated Client Key and Certificate
>>
>> If it would help to have the playbook, I can stick it up on github
>> with a huge "This is a hack" disclaimer on it.
>>
>>> -----
>>>
>>> Sorry for hijacking the thread but I'm stuck as well.
>>>
>>> I've in the past tried to generate the certificates with [1] but now
>>> moved on to using the openstack-ansible way of generating them [2]
>>> with some modifications.
>>>
>>> Right now I'm just getting: Could not connect to instance. Retrying.:
>>> SSLError: [SSL: BAD_SIGNATURE] bad signature (_ssl.c:579)
>>> from the amphorae. I haven't got any further, but I've eliminated a lot
>>> of stuff in between.
>>>
>>> I tried deploying Octavia on Ubuntu with python3 just to make sure there
>>> wasn't an issue with CentOS and its OpenSSL version, since it tends to
>>> lag behind.
>>> Checking the amphora with openssl s_client [3] gives the same error,
>>> although the verification itself is successful. I just don't understand
>>> what the bad signature part is about; from browsing some OpenSSL code it
>>> seems to be related to RSA signatures somehow.
>>>
>>> 140038729774992:error:1408D07B:SSL routines:ssl3_get_key_exchange:bad
>>> signature:s3_clnt.c:2032:
>>>
>>> So I've basically ruled out Ubuntu (openssl-1.1.0g) and CentOS
>>> (openssl-1.0.2k) being the problem, and ruled out signing_digest, so I'm
>>> back to something related to the certificates, the communication between
>>> the endpoints, or whatever actually responds inside the amphora
>>> (gunicorn IIUC?). Since it is the "verify" functions that raise the bad
>>> signature error, I would assume it's the generated certificate that the
>>> amphora presents that is causing it.
>>>
>>> I'll have to continue the troubleshooting inside the amphora. I've used
>>> the test-only amphora image before, but have now built my own one that
>>> uses the amphora-agent from the actual stable branch, with the same
>>> issue (bad signature).
>>>
>>> For completeness, these are the config options set for the certificates
>>> in octavia.conf and which file each was copied from [4]; again, it is a
>>> replication of what openstack-ansible does.
>>>
>>> Appreciate any feedback or help :)
>>>
>>> Best regards
>>> Tobias
>>>
>>> [1]
>>> https://github.com/openstack/octavia/blob/master/bin/create_certificates.sh
>>> [2] http://paste.openstack.org/show/732483/
>>> [3] http://paste.openstack.org/show/732486/
>>> [4] http://paste.openstack.org/show/732487/
>>>
>>> On 10/20/2018 01:53 AM, Michael Johnson wrote:
>>>> Hi Erik,
>>>>
>>>> Sorry to hear you are still having certificate issues.
>>>>
>>>> Issue #2 is probably caused by issue #1. Since we hot-plug the tenant
>>>> network for the VIP, one of the first steps after the worker connects
>>>> to the amphora agent is finishing the required configuration of the
>>>> VIP interface inside the network namespace on the amphora.
>>>>
>> Thanks for the hint on the workflow of this. I hadn't gotten deep
>> enough into the code to find that yet, but I suspected it was blocking
>> since the namespace never got created either. Thanks
>>
>>>> If I remember correctly, you are attempting to configure Octavia with
>>>> the dual CA option (which is good for non-development use).
>>>>
>>>> This is what I have for notes:
>>>>
>>>> [certificates] gets the following:
>>>>     cert_generator = local_cert_generator
>>>>     ca_certificate = server CA's "server.pem" file
>>>>     ca_private_key = server CA's "server.key" file
>>>>     ca_private_key_passphrase = pass phrase for ca_private_key
>>>> [controller_worker]
>>>>     client_ca = Client CA's ca_cert file
>>>> [haproxy_amphora]
>>>>     client_cert = Client CA's client.pem file (I think with its key
>>>>     concatenated is what rm_work said the other day)
>>>>     server_ca = Server CA's ca_cert file
>>>>
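
As a rough illustration of the notes above, the sketch below parses an
octavia.conf fragment with Python's configparser, reports any of the
listed options that are missing, and flags the case where the server CA
and the client CA point at the same file (a single-CA layout rather than
the dual CA option). All paths and the passphrase are hypothetical
placeholders, and comparing paths is only a heuristic: two different
paths could still hold the same CA.

```python
import configparser

# Hypothetical octavia.conf fragment; every value below is a placeholder.
SAMPLE_CONF = """
[certificates]
cert_generator = local_cert_generator
ca_certificate = /etc/octavia/certs/server_ca.cert.pem
ca_private_key = /etc/octavia/certs/server_ca.key.pem
ca_private_key_passphrase = not-a-real-passphrase

[controller_worker]
client_ca = /etc/octavia/certs/client_ca.cert.pem

[haproxy_amphora]
client_cert = /etc/octavia/certs/client.cert-and-key.pem
server_ca = /etc/octavia/certs/server_ca.cert.pem
"""

# Options named in the notes above, grouped by config section.
REQUIRED = {
    "certificates": ["cert_generator", "ca_certificate", "ca_private_key",
                     "ca_private_key_passphrase"],
    "controller_worker": ["client_ca"],
    "haproxy_amphora": ["client_cert", "server_ca"],
}


def check_dual_ca(conf_text):
    """Return a list of problems found in the certificate-related options."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    problems = []
    for section, options in REQUIRED.items():
        for option in options:
            if not parser.get(section, option, fallback="").strip():
                problems.append(f"[{section}] {option} is not set")
    server_ca = parser.get("certificates", "ca_certificate", fallback="")
    client_ca = parser.get("controller_worker", "client_ca", fallback="")
    if server_ca and server_ca == client_ca:
        problems.append("server CA and client CA point at the same file")
    return problems


for problem in check_dual_ca(SAMPLE_CONF):
    print(problem)
```

With the sample fragment nothing is printed. Pointing client_ca at the
server CA file makes the last check fire.
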
>> This is all very helpful. It's a bit difficult to know what goes where
>> the way the documentation is written presently. For something that's
>> going to be the de facto standard for load balancing, we as a community
>> need to do a better job of documenting how to set up, configure, and
>> manage this in production. I'm trying to capture my lessons learned
>> and processes as I go to help with that if I can.
>>
>> -Erik
>>
>>>> That said, I can probably run through this and write something up next
>>>> week that is more step-by-step/detailed.
>>>>
>>>> Michael
>>>>
>>>> On Fri, Oct 19, 2018 at 2:31 PM Erik McCormick
>>>> <emccormick at cirrusseven.com> wrote:
>>>>> Apologies for cross-posting, but in the event that these might be
>>>>> worth filing as bugs, I wanted the Octavia devs to see it as well...
>>>>>
>>>>> I've been wrestling with getting Octavia up and running and have
>>>>> become stuck on two issues. I'm hoping someone has run into these
>>>>> before. My google-fu has come up empty.
>>>>>
>>>>> Issue 1:
>>>>> When the Octavia controller tries to poll the amphora instance, it
>>>>> tries repeatedly and eventually fails. The error on the controller
>>>>> side is:
>>>>>
>>>>> 2018-10-19 14:17:39.181 26 ERROR
>>>>> octavia.amphorae.drivers.haproxy.rest_api_driver [-] Connection
>>>>> retries (currently set to 300) exhausted.  The amphora is unavailable.
>>>>> Reason: HTTPSConnectionPool(host='10.7.0.112', port=9443): Max retries
>>>>> exceeded with url: /0.5/plug/vip/10.250.20.15 (Caused by
>>>>> SSLError(SSLError("bad handshake: Error([('rsa routines',
>>>>> 'RSA_padding_check_PKCS1_type_1', 'invalid padding'), ('rsa routines',
>>>>> 'rsa_ossl_public_decrypt', 'padding check failed'), ('asn1 encoding
>>>>> routines', 'ASN1_item_verify', 'EVP lib'), ('SSL routines',
>>>>> 'tls_process_server_certificate', 'certificate verify
>>>>> failed')],)",),)): SSLError: HTTPSConnectionPool(host='10.7.0.112',
>>>>> port=9443): Max retries exceeded with url: /0.5/plug/vip/10.250.20.15
>>>>> (Caused by SSLError(SSLError("bad handshake: Error([('rsa routines',
>>>>> 'RSA_padding_check_PKCS1_type_1', 'invalid padding'), ('rsa routines',
>>>>> 'rsa_ossl_public_decrypt', 'padding check failed'), ('asn1 encoding
>>>>> routines', 'ASN1_item_verify', 'EVP lib'), ('SSL routines',
>>>>> 'tls_process_server_certificate', 'certificate verify
>>>>> failed')],)",),))
>>>>>
>>>>> On the amphora side I see:
>>>>> [2018-10-19 17:52:54 +0000] [1331] [DEBUG] Error processing SSL request.
>>>>> [2018-10-19 17:52:54 +0000] [1331] [DEBUG] Invalid request from
>>>>> ip=::ffff:10.7.0.40: [SSL: SSL_HANDSHAKE_FAILURE] ssl handshake
>>>>> failure (_ssl.c:1754)
>>>>>
>>>>> I've generated certificates both with the script in the Octavia git
>>>>> repo, and with the Openstack Ansible playbook. I can see that they are
>>>>> present in /etc/octavia/certs.
>>>>>
>>>>> I'm using the Kolla (Queens) containers for the control plane so I'm
>>>>> sure I've satisfied all the python library constraints.
>>>>>
>>>>> Issue 2:
>>>>> I'm not sure how it gets configured, but the tenant network interface
>>>>> (ens6) never comes up. I can spawn other instances on that network
>>>>> with no issue, and I can see that Neutron has the port attached to the
>>>>> instance. However, in the instance this is all I get:
>>>>>
>>>>> ubuntu at amphora-33e0aab3-8bc4-4fcb-bc42-b9b36afb16d4:~$ ip a
>>>>> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
>>>>> group default qlen 1
>>>>>        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>>>>        inet 127.0.0.1/8 scope host lo
>>>>>           valid_lft forever preferred_lft forever
>>>>>        inet6 ::1/128 scope host
>>>>>           valid_lft forever preferred_lft forever
>>>>> 2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast
>>>>> state UP group default qlen 1000
>>>>>        link/ether fa:16:3e:30:c4:60 brd ff:ff:ff:ff:ff:ff
>>>>>        inet 10.7.0.112/16 brd 10.7.255.255 scope global ens3
>>>>>           valid_lft forever preferred_lft forever
>>>>>        inet6 fe80::f816:3eff:fe30:c460/64 scope link
>>>>>           valid_lft forever preferred_lft forever
>>>>> 3: ens6: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group
>>>>> default qlen 1000
>>>>>        link/ether fa:16:3e:89:a2:7f brd ff:ff:ff:ff:ff:ff
>>>>>
>>>>> There's no evidence of the interface anywhere else including udev rules.
>>>>>
>>>>> Any help with either or both issues would be greatly appreciated.
>>>>>
>>>>> Cheers,
>>>>> Erik
>>>>>
>>>>> __________________________________________________________________________
>>>>> OpenStack Development Mailing List (not for usage questions)
>>>>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



