Open Stack

Tue May 23 20:09:35 UTC 2017

On 05/23/2017 03:16 PM, Edward Leafe wrote:
> On May 23, 2017, at 1:43 PM, Jay Pipes <jaypipes at gmail.com> wrote:
> 
>> [1] Witness the join constructs in Golang in Kubernetes as they work around etcd not being a relational data store:
> 
> 
> Maybe it’s just me, but I found that Go code more understandable than some of the SQL we are using in the placement engine. :)
> 
> I assume that the SQL in a relational engine is faster than the same thing in code, but is that difference significant? For extremely large data sets I think that the database processing may be rate limiting, but is that the case here? Sometimes it seems that we are overly obsessed with optimizing data handling when the amount of data is relatively small. A few million records should be fast enough using just about anything.

When you write your app fresh and put some data into it, a few hundred 
rows: not at all.  Pull it all into memory and sort/filter all you want; 
SQL is too hard.  Push it to production!  Works great.  Send the 
customer your bill.

6 months later.   Customer has 10K rows.   The tools their contractor 
wrote seem a little sticky.    Not sure when that happened?

A year later.  Customer is at 300K rows, nowhere near "a few million" 
records.  Application regularly crashes when asked to search and filter 
results.  That's because the Python interpreter uses a fair amount of 
memory for a result set, multiplied by the overhead of a Python 
object() / dict() per row == 100s / 1000s of megs of memory to have 
300,000 objects in memory all at once.  Multiply by dozens of threads / 
processes handling concurrent requests, and the Python interpreter 
rarely returns memory to the OS.  Then add the latency of fetching 300K 
rows over the wire and converting them to objects.  Concurrent requests 
pile up because they're slower; == more processes, == more memory.

New contractor is called in to rewrite the whole thing in MongoDB.   Now 
it's fast again!   Proceed to chapter 2, "So you decided to use 
MongoDB...."   :)
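
To make the story above concrete, here is a minimal sketch of the two 
approaches, using the standard-library sqlite3 module as a stand-in for 
a real networked database.  The table name, columns, row counts and 
function names are all made up for illustration; nothing here comes 
from the placement engine or any OpenStack code.  It only shows how 
many rows the application has to materialize in each case, not actual 
timings or memory figures.

```python
import sqlite3


def build_db(n_rows):
    """Create a throwaway in-memory table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, owner TEXT, size INTEGER)"
    )
    conn.executemany(
        "INSERT INTO items (owner, size) VALUES (?, ?)",
        ((f"owner{i % 100}", i % 1000) for i in range(n_rows)),
    )
    conn.execute("CREATE INDEX ix_items_owner ON items (owner)")
    conn.commit()
    return conn


def filter_in_python(conn, owner, limit):
    """Fetch every row, build one dict per row, then filter and sort in Python."""
    rows = [
        {"id": r[0], "owner": r[1], "size": r[2]}
        for r in conn.execute("SELECT id, owner, size FROM items")
    ]
    matches = [r for r in rows if r["owner"] == owner]
    matches.sort(key=lambda r: (-r["size"], r["id"]))
    # Return how many rows were materialized, plus the ids of the top results.
    return len(rows), [r["id"] for r in matches[:limit]]


def filter_in_sql(conn, owner, limit):
    """Let the database filter, sort and limit; only the answer is fetched."""
    cur = conn.execute(
        "SELECT id FROM items WHERE owner = ? ORDER BY size DESC, id LIMIT ?",
        (owner, limit),
    )
    ids = [r[0] for r in cur]
    return len(ids), ids


if __name__ == "__main__":
    conn = build_db(300_000)
    print(filter_in_python(conn, "owner7", 5)[0], "rows materialized in Python")
    print(filter_in_sql(conn, "owner7", 5)[0], "rows materialized with SQL")
```

Both functions return the same ids; the difference is that the first 
holds one dict per table row in memory for every request, while the 
second only ever holds `limit` rows.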

[openstack-dev] [tc] Active or passive role with our database layer
