[openstack-dev] [watcher] Api and Decision Engine integration - design question

Kaczynski, Tomasz tomasz.kaczynski at intel.com
Tue Apr 19 06:42:02 UTC 2016


Hi Vincent,
Thanks for your comments :)

I don't want to start any "religious" discussion, just a comment about scalability. If you look at what other software companies are doing these days to achieve scalability, you'll notice that they tend to split their monolithic systems (if they had one) into smaller, independent parts, often in the form of microservices. Microservices isolate common functionality, can be scaled horizontally, and can be deployed separately without affecting the whole system. Having a central DB is asking for trouble: any change in the DB schema means you need to redeploy the whole system. The larger the system, the bigger the problems. The DB becomes a bottleneck and doesn't scale well (also because of its pull rather than push model).

So I'd say, if you really care about scalability, I'd rather push the system towards a microservices architecture and not put a central DB in place, which, by the way, becomes a single point of failure.

There is a great book by the way if anyone is interested in learning more about microservices:
http://shop.oreilly.com/product/0636920033158.do

Just my 2 cents :)

Tomasz

-----Original Message-----
From: Vincent FRANÇOISE [mailto:Vincent.FRANCOISE at b-com.com] 
Sent: Monday, April 18, 2016 4:02 PM
To: openstack-dev at lists.openstack.org
Subject: Re: [openstack-dev] [watcher] Api and Decision Engine integration - design question

Hi Tomasz,

Overall, I don't really have any strong opinion on this. I just think that if we do it with option 1, it may become quite hard to make Watcher more scalable in the long run if we need to. That's why I would tend to choose option 2. Also, it's not so easy for me to evaluate how much work the DB sync would require compared to what I did for the strategies and goals (see https://review.openstack.org/#/c/305965/2/watcher/decision_engine/sync.py),
so take my answer with a grain of salt :)


On 15/04/2016 10:10, Kaczynski, Tomasz wrote:
> Hi guys,
> 
> I’m implementing the Watcher Scoring Module. As part of that, I need 
> to expose the information about Scoring Engines through the API/Python CLI.
> 
>  
> 
> The scoring engine list might be quite dynamic. Although the scoring 
> engines will be pluggable through the stevedore plug-in model, a 
> single plug-in might contain one or more scoring engines. In some 
> scenarios this list will be static – a plug-in developer will just 
> expose a few algorithms and that’s it. But in some other scenarios, the 
> scoring engines might be implemented as external web services for 
> example and there might be an on-going development process on data 
> models, which will result in multiple scoring engines in multiple 
> versions, which might change quite frequently (e.g. a few times a day).
> 
>  
> 
> Of course, the responsibility for handling all of that is entirely on 
> the scoring engine plug-in developer. But it would be good to keep the 
> scoring engine abstraction layer clean and simple, hiding all of these 
> details.
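
To make the idea of a clean abstraction layer more concrete, here is a minimal sketch of what it could look like. All class and method names (ScoringEngine, ScoringEngineContainer, get_scoring_engine_list, list_engines) are assumptions made for illustration, not an agreed Watcher interface, and the stevedore discovery step is deliberately left out. A static plug-in returns a fixed list; a dynamic one could build the list on each call (e.g. by asking an external web service).

```python
import abc


class ScoringEngine(abc.ABC):
    """Hypothetical base class for a single scoring engine."""

    @abc.abstractmethod
    def get_name(self):
        """Return a unique name identifying this engine."""

    @abc.abstractmethod
    def get_description(self):
        """Return a human-readable description."""

    @abc.abstractmethod
    def calculate_score(self, features):
        """Compute a score from a list of numeric features."""


class ScoringEngineContainer(abc.ABC):
    """Hypothetical factory: one plug-in may expose several engines."""

    @classmethod
    @abc.abstractmethod
    def get_scoring_engine_list(cls):
        """Return the engines available right now (may change over time)."""


class MeanEngine(ScoringEngine):
    def get_name(self):
        return "mean"

    def get_description(self):
        return "Arithmetic mean of the features"

    def calculate_score(self, features):
        return sum(features) / len(features)


class MaxEngine(ScoringEngine):
    def get_name(self):
        return "max"

    def get_description(self):
        return "Largest feature value"

    def calculate_score(self, features):
        return max(features)


class StaticContainer(ScoringEngineContainer):
    """The simple case: a fixed list of engines."""

    @classmethod
    def get_scoring_engine_list(cls):
        return [MeanEngine(), MaxEngine()]


def list_engines(containers):
    """Flatten all containers into a {name: description} mapping."""
    return {
        engine.get_name(): engine.get_description()
        for container in containers
        for engine in container.get_scoring_engine_list()
    }
```

With this shape, whatever ends up feeding the Api / CLI only needs list_engines(); how a plug-in produces its list stays hidden behind the container.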
> 
>  
> 
> And here comes the problem:
> 
> Somehow the dynamic list of scoring engines has to be passed from 
> Decision Engine (where the Scoring Engine abstraction layer will be
> sitting) to the Api / CLI. There are currently 2 options on the table 
> for how this could be done:
> 
>  
> 
> Option 1:
> 
> Allow Api to call Decision Engine directly through existing RPC Api 
> (currently using messaging transport).
> 
>  
> 
> Option 2:
> 
> Let Decision Engine keep Scoring Engine information synced in the DB 
> so that Watcher Api can simply query for this information as required.
> 
>  
> 
> Pros and cons of each option:
> 
> Option 1:
> 
> -          Good: Simpler implementation and no need for keeping DB in sync.
> 
> -          Good: No risk of data inconsistency. Nothing is being cached,
> data is always accurate. Decision Engine is a single source of truth.
> 
> -          Good: Scoring Engine Plug-in creates a simple stevedore
> plug-in, implements scoring engine classes, implements a factory class 
> returning scoring engines and that’s all.
> 
> -          Good: Supports also more complicated scenarios with dynamic
> scoring engine list – encapsulated in the factory class.
> 
> -          Bad: Dependency on Decision Engine – it needs to be up and
> running. Can be mitigated by caching the last response from Decision 
> Engine – if DE RPC Api is not responding, the last known data could be 
> returned.
> 
> -          Bad: Not sure how reliable/performant RPC over messaging
> transport is. Need to test.
> 
> -          Bad: Might have scalability issues (I believe there is only
> one Decision Engine instance, please confirm!). But this might be at 
> least partially mitigated by caching on the Watcher Api level (e.g. if 
> the last data was retrieved less than X minutes ago, no need to query 
> Decision Engine). In the context that this information is only used by 
> Strategy developers to actually implement strategies using some 
> Scoring Engines, it might be perfectly fine to cache data for longer 
> periods of time (1 hour or more).
I confirm there is only one DE process whereas the API can have as many workers as you want (configurable).
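
The two caching mitigations mentioned for option 1 (serve cached data while it is younger than X, and fall back to the last known response when the DE does not answer) could be combined roughly as follows. This is only a sketch: CachedEngineList and the fetch callable are hypothetical names, and fetch stands in for whatever RPC call the Api would make to the Decision Engine.

```python
import time


class CachedEngineList:
    """Hypothetical Api-side cache around a call to the Decision Engine."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch          # callable performing the real lookup
        self._ttl = ttl_seconds      # how long cached data counts as fresh
        self._clock = clock          # injectable for testing
        self._value = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        is_fresh = (
            self._fetched_at is not None
            and now - self._fetched_at < self._ttl
        )
        if is_fresh:
            # Recent enough: no need to query the Decision Engine.
            return self._value
        try:
            value = self._fetch()
        except Exception:
            if self._fetched_at is None:
                # Nothing cached yet, so there is nothing to fall back to.
                raise
            # Decision Engine unavailable: return the last known data.
            return self._value
        self._value = value
        self._fetched_at = now
        return self._value
```

Note that with several API workers each worker would hold its own copy of this cache, so different workers could briefly return different lists.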
> 
> Option 2:
> 
> -          Good: Watcher Api decoupled from Decision Engine. Can work
> even if DE is not working or busy.
> 
> -          Good: In case of Watcher this option should scale better.
> Decision Engine typically has only one instance and is not subject to 
> horizontal scalability (please confirm my understanding!).
> 
> -          Bad: More complicated implementation. For dynamic scenarios
> (adding scoring engines on the fly) requires some sort of notification 
> mechanism, so that the DB will stay in sync. Can be done by exposing 
> event handling in scoring engine abstraction layer, but it’s 
> an unnecessary complication for simple cases with static data. But can be 
> mitigated by using helper classes enforcing DB sync without actually 
> exposing any events in the abstract classes (so if plug-in needs to 
> sync DB, it calls some helper method, all others just do nothing).
> 
I agree on the difficulty here, but we can implement this incrementally so we wouldn't have to handle all these cases straight away.
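
As a rough illustration of the "helper method" idea, the sync step itself can be a plain reconciliation between what the plug-ins report and what is stored. This is a sketch under assumptions: sync_scoring_engines is a hypothetical name, and a dict stands in for the DB table so the example stays self-contained.

```python
def sync_scoring_engines(db, discovered):
    """Reconcile stored engines with the ones plug-ins currently report.

    db:         dict mapping engine name -> description (stand-in for the
                DB table; a real implementation would go through the DB API)
    discovered: dict mapping engine name -> description from the plug-ins

    Returns three sorted lists of names: (created, updated, removed).
    """
    stored_names = set(db)
    found_names = set(discovered)

    created = sorted(found_names - stored_names)
    removed = sorted(stored_names - found_names)
    updated = sorted(
        name for name in stored_names & found_names
        if db[name] != discovered[name]
    )

    for name in created + updated:
        db[name] = discovered[name]
    for name in removed:
        del db[name]

    return created, updated, removed
```

A static plug-in would only trigger this once at Decision Engine start-up; a dynamic plug-in could call the same helper whenever its list changes, which is the part that can be added later.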
> -          Bad: Potential issues with data consistency. If there is a
> problem or a bug in the sync code, it might be hard to recover from 
> the problem without Watcher redeployment.
> 
> -          Bad: Any change in the DB structure might require changes to
> all the parties and even the existing plug-ins.
> 
Fair point, but how many fields are needed in the DB, and aren't some of them going to be something like serialized JSON containing parameters that may vary depending on the plugin? In such a case, this would be extensible out of the box.
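
To show what I mean by a serialized JSON field, here is a small sketch using sqlite3 from the standard library. The table and column names (scoring_engines, metainfo) and the helper functions are made up for the example and are not a proposed Watcher schema. The point is that plug-in-specific parameters live in one text column, so adding a new parameter does not require a schema change.

```python
import json
import sqlite3


def create_schema(conn):
    # Fixed columns for what every engine has; one free-form JSON column
    # for parameters that vary from plug-in to plug-in.
    conn.execute(
        "CREATE TABLE scoring_engines ("
        "name TEXT PRIMARY KEY, description TEXT, metainfo TEXT)"
    )


def save_engine(conn, name, description, metainfo):
    conn.execute(
        "INSERT OR REPLACE INTO scoring_engines VALUES (?, ?, ?)",
        (name, description, json.dumps(metainfo)),
    )


def load_engine(conn, name):
    row = conn.execute(
        "SELECT name, description, metainfo FROM scoring_engines "
        "WHERE name = ?",
        (name,),
    ).fetchone()
    if row is None:
        return None
    return {
        "name": row[0],
        "description": row[1],
        "metainfo": json.loads(row[2]),
    }
```

The trade-off is that the Api cannot filter or validate on those parameters without parsing the JSON first.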
>  
> 
> My preference is to go with option 1 because of the simpler 
> implementation and no problems with data consistency. Nothing needs to 
> be purged, synced, data is always accurate. If Decision Engine is not 
> working, there is a bigger problem anyway (but there is a mitigation 
> by caching the DE last response).
> 
>  
> 
> I hope I managed to explain the concept and the problem.
> 
>  
> 
> I appreciate your opinion about that!
> 
>  
> 
> Kind regards,
> 
> Tomasz
> 
>  
> 
> ---------------------------------------------------------------------
> *Intel Technology Poland sp. z o.o.
> *ul. Słowackiego 173 | 80-298 Gdańsk | Sąd Rejonowy Gdańsk Północ | 
> VII Wydział Gospodarczy Krajowego Rejestru Sądowego - KRS 101882 | NIP
> 957-07-52-316 | Kapitał zakładowy 200.000 PLN.
> 
> Ta wiadomość wraz z załącznikami jest przeznaczona dla określonego 
> adresata i może zawierać informacje poufne. W razie przypadkowego 
> otrzymania tej wiadomości, prosimy o powiadomienie nadawcy oraz trwałe 
> jej usunięcie; jakiekolwiek przeglądanie lub rozpowszechnianie jest 
> zabronione.
> This e-mail and any attachments may contain confidential material for 
> the sole use of the intended recipient(s). If you are not the intended 
> recipient, please contact the sender and delete all copies; any review 
> or distribution by others is strictly prohibited.
> 
> 
> 
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
