[openstack-dev] [tc] supporting Go

Robert Collins robertc at robertcollins.net
Tue May 10 20:46:26 UTC 2016


On 11 May 2016 at 06:10, Hayes, Graham <graham.hayes at hpe.com> wrote:
> On 10/05/2016 01:01, Gregory Haynes wrote:

> The way this component works makes it quite difficult to make any major
> improvement.
>
> MiniDNS (the component) takes data and sends a zone transfer every time
> a recordset gets updated. That is a full (AXFR) zone transfer, so every
> record in the zone gets sent to each of the DNS servers that end users
> can hit.
>
> This can be quite a large number - ns[1-6].example.com. may well be
> tens or hundreds of servers behind anycast IPs and load balancers.
>
> In many cases, internal zones (or even external zones) can be quite
> large - I have seen zones that are 200-300Mb. If a zone is high traffic

I presume you mean MB?

> (like say cloud.example.com. where a record is added / removed for
> each boot / destroy, or the reverse DNS zones for a cloud), there can
> be a lot of data sent out from this component.
>
> We are a small development team, and after looking at our options, and
> judging the amount of developer hours we had available, a different
> language was the route we decided on. I was going to go implement a few
> POCs and see what was most suitable.

Out of interest, what was the problem you had/have with Python here?
Sending a few GB of data at wire speeds on a TCP link is pretty
shallow for Python, though not having had my hands on a 40Gbps NIC I
can't personally say whether it's still the case there.

I guess my fundamental question is: is this a domain problem, or a
Python problem? If the problem is 'we need to send 300MB to 100
servers in < 5 seconds', which is 30GB (240Gb) of traffic - you're going to
need a 240Gb/5s == 48Gbps NIC, or you're going to need distributed
workers to shard that workload across machines.
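
To make the arithmetic explicit, here is a rough sketch. The function
name and parameters are mine, purely for illustration, and it assumes
decimal units (1 MB = 10^6 bytes) and ignores protocol overhead:

```python
def required_gbps(zone_mb, servers, seconds):
    """Return the link speed in Gbps needed to push one zone to every server.

    Assumes decimal units (1 MB = 1,000,000 bytes) and no TCP/DNS overhead.
    """
    total_bits = zone_mb * 1_000_000 * 8 * servers  # bytes -> bits, times fan-out
    return total_bits / seconds / 1e9               # bits per second -> Gbps


if __name__ == "__main__":
    # 300MB zone, 100 servers, 5 second budget
    print(required_gbps(300, 100, 5))
```

Halving the fan-out per machine (i.e. sharding across two workers)
halves the per-NIC requirement, which is the point of the second
option.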

If the problem is 'designate's memory use is blowing way up when we
try to do this' - that might be a very straightforward fix (use
memoryviews and zero-copy IO).
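
By memoryviews I mean something along these lines. This is only a
sketch of the general technique, not Designate's code: the helper name
and chunk size are made up, and it assumes the serialized zone is
already sitting in one bytes-like buffer:

```python
import socket


def send_zero_copy(sock, payload, chunk_size=65536):
    """Send a bytes-like payload over a connected socket.

    Hypothetical helper: slicing a memoryview yields another view onto
    the same buffer, so a partial send never copies the remaining bytes
    the way payload[sent:] on a bytes object would.
    """
    view = memoryview(payload)
    sent = 0
    while sent < len(view):
        # send() may accept fewer bytes than offered; advance by what it took.
        sent += sock.send(view[sent:sent + chunk_size])
    return sent
```

The same buffer can then be handed to each of the N connections in
turn, so memory use stays at roughly one copy of the zone rather than
one copy per server.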

I guess what I'm wondering is whether there is a low hanging fix, and
as an observer I have absolutely no insight into the problem you've
been having - I'd like to know more... is there a bug report perhaps?

My *fear* is that the underlying problem has nothing to do with Python
and can rear its head in any language - and that perhaps the idioms
being used in Designate (or OpenStack as a whole) are driving whatever
specific problem you've got?

I know that our current programming model is missing a lot of the
easily-correct improvements that have been created in the last few
decades (because our basic abstraction is the thread) - how much does
that factor in, do you think?

-Rob

-- 
Robert Collins <rbtcollins at hpe.com>
Distinguished Technologist
HP Converged Cloud


