[openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

Robert Collins robertc at robertcollins.net
Mon Aug 19 07:12:11 UTC 2013


On 19 August 2013 18:35, Jay Pipes <jaypipes at gmail.com> wrote:

>> http://www.codinghorror.com/blog/2006/06/object-relational-mapping-is-the-vietnam-of-computer-science.html
>>
>> There is no proper use of an ORM.
>
>
> I'm not a super-fan of ORMs, Robert. I'm not sure why you're insisting on
> taking me down this road...

Sorry, not sure how we ended up here ;)

> All I'm saying is that we should be careful not to swap one set of problems
> for another. I say this because I've seen the Nova data-access code develop
> from its very earliest days, up to this point. I've seen the horrors of
> trying to mask an object approach on top of a non-relational data store,
> witnessed numerous attempts to rewrite the way that connection pooling and
> session handling is done, and in general just noticed the tension between
> the two engineering factions that want to keep things agnostic towards
> backend storage and at the same time make the backend storage perform and
> scale adequately.

Ah! Ok, completely agree: playing flip-flop on problem sets would be a
poor outcome.

> I'm not sure why you are being so aggressive about this topic. I certainly
> am not being aggressive about my responses -- just cautioning that the
> existing codebase has seen its fair share of refactoring, some of which has
> been a failure and had to be reverted. I would hate to jump into a frenzy to
> radically change the way that the data access code works in Nova without a
> good discussion.

I didn't intend to be aggressive - sorry - super sorry in fact. I've
been burnt by months of effort turning around problem codebases where
the ORM was a significant cause of the problems.


>>> But then I guarantee somebody is gonna spend a bunch of time writing an
>>> object-oriented API to the model objects because the ORM is very useful
>>> for
>>> the data modification part of the DB interaction.
>>
>>
>> !cite - seriously...
>
>
> ? I give an example below... a cautionary tale if you will, about one
> possible consequence of "getting rid of the ORM".

I think what I really meant here is 'you say months, but if we're
writing an object-orientated API surely we'd just use one of the
mapping techniques available in SQLAlchemy..'

>> This strawman is one way that it might be written. Given that a
>> growing set of our projects have non-SQL backends, this doesn't look
>> like the obvious way to phrase it to me.
>
>
> I'm using the SQLAlchemy Core API above, with none of the SQLAlchemy ORM
> code... which is (I thought), what you were proposing we do? How is that a
> strawman argument? :(

So what is in my head is that we have two layers:
business logic
storage logic

And the thing I don't like about the ORM approach is that our business
logic objects are storage logic objects. Even though we don't use a
domain model (http://martinfowler.com/eaaCatalog/domainModel.html), we
can easily trigger late evaluation when traversing collections. That
matters in particular because we have large numbers of developers who
are likely not going to be holding the entire problem domain in their
heads; the resulting churn in code and design tends to throw things out
again and again over time. And we have IMO too much business logic in
the db/sqlalchemy/api.py files scattered around.

So, what I'd like to see is something where the storage layer and
logic layer are more thoroughly decoupled: only return plain ol Python
objects from the DB layer; but within that layer I wouldn't object to
an ORM being used; secondly I'd like to make sure we don't end up
making business decisions in the storage layer, because that makes it
harder to port to a different storage layer, such as the nova
conductor.

So the business logic layer for adding a fixed IP would be something like:
i = business.Instance.find(blah=blah)
ip = business.FixedIP(blah=blah)
i.fixed_ips.append(ip)
storage.save(i)

i and ip would be plain ol python objects
storage.save would have the same semantics as an RPC call - it could
do a transaction itself, but there's no holding transactions between
calls to save.
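
A minimal sketch of that split, under loud assumptions: the Storage
class, its find_instance method, the table layout and the use of the
stdlib sqlite3 module are all hypothetical stand-ins chosen so the
example runs on its own. They are not Nova's or Glance's actual code,
and an ORM could equally sit inside Storage without changing what
callers see.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import List, Optional


# Business layer: plain Python objects with no session or connection attached.
@dataclass
class FixedIP:
    address: str


@dataclass
class Instance:
    name: str
    fixed_ips: List[FixedIP] = field(default_factory=list)


# Storage layer (hypothetical): the only code that knows about the database.
class Storage:
    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        with self._conn:
            self._conn.execute(
                "CREATE TABLE IF NOT EXISTS instances (name TEXT PRIMARY KEY)"
            )
            self._conn.execute(
                "CREATE TABLE IF NOT EXISTS fixed_ips ("
                "address TEXT PRIMARY KEY, instance_name TEXT NOT NULL)"
            )

    def save(self, instance: Instance) -> None:
        # RPC-like semantics: one transaction per call, committed (or rolled
        # back on error) before returning. Nothing is held open between calls.
        with self._conn:
            self._conn.execute(
                "INSERT OR IGNORE INTO instances (name) VALUES (?)",
                (instance.name,),
            )
            self._conn.execute(
                "DELETE FROM fixed_ips WHERE instance_name = ?",
                (instance.name,),
            )
            self._conn.executemany(
                "INSERT INTO fixed_ips (address, instance_name) VALUES (?, ?)",
                [(ip.address, instance.name) for ip in instance.fixed_ips],
            )

    def find_instance(self, name: str) -> Optional[Instance]:
        row = self._conn.execute(
            "SELECT name FROM instances WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            return None
        ips = self._conn.execute(
            "SELECT address FROM fixed_ips WHERE instance_name = ? "
            "ORDER BY address",
            (name,),
        ).fetchall()
        # Everything is loaded eagerly, so traversing fixed_ips later cannot
        # trigger a hidden query.
        return Instance(name=row[0], fixed_ips=[FixedIP(a) for (a,) in ips])


# Business logic for adding a fixed IP, mirroring the four lines above.
storage = Storage()
storage.save(Instance(name="vm-1"))
i = storage.find_instance("vm-1")
ip = FixedIP(address="10.0.0.5")
i.fixed_ips.append(ip)
storage.save(i)
```

The caller only ever touches dataclasses, so swapping sqlite3 for
another backend (or for an RPC hop) would leave the business logic
untouched.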

This is very close to:

>>> instead of this:
>>>
>>> i = Instance(blah=blah)
>>> ip = FixedIp(blah=blah)
>>> i.fixed_ips.append(ip)
>>> session.add(i)
>>> session.commit()

But there is no ORM exposed to the developers working with the storage
API - it's contained.

>>> And so you've thrown the baby out with the bathwater and made more work
>>> for
>>> everyone.
>>
>>
>> Perhaps; or perhaps we've avoided a raft of death-by-thousand-cuts
>> bugs across the project.
>
>
> Could just as easily introduce the same bugs by radically redesigning the
> data access code without first considering all sides of the problem domain.

Totally!

Again, sorry for the tone before, I can only claim a) been burnt in
the past and b) a week or so of reduced sleep thanks to baby :(.

-Rob

-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud


