[openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

Flavio Percoco flavio at redhat.com
Tue Aug 20 10:20:38 UTC 2013


On 20/08/13 00:15 -0700, Mark Washenberger wrote:
>
>        2) I highly caution folks who think a No-SQL store is a good storage
>        solution for any of the data currently used by Nova, Glance (registry),
>        Cinder (registry), Ceilometer, and Quantum. All of the data stored and
>        manipulated in those projects is HIGHLY relational data, and not
>        objects/documents. Switching to use a KVS for highly relational data is
>        a terrible decision. You will just end up implementing joins in your
>        code...
>
>
>
>    +10000
>
>    FWIW, I'm a huge fan of NoSQL technologies but I couldn't agree more
>    here.
>
>
>
>I have to say I'm kind of baffled by this sentiment (expressed here and
>elsewhere in the thread.) I'm not a NoSQL expert, but I hang out with a few and
>I'm pretty confident Glance at least is not that relational. We do two types of
>joins in glance. The first, like image properties, is basically just an
>implementation detail of the sql driver. It's not core to the application. Any
>NoSQL implementation will simply completely denormalize those properties into
>the image record. (And honestly, so might an optimized SQL implementation. . .)
>
>The second type of join, image_members, is basically just a hack to solve the
>problem created because the glance api offers several simultaneous implicit
>"views" of images. Specifically, when you list images in glance, you are seeing
>a union of three views: public images, images you own, and images shared with
>you. IMO it's actually a more scalable and sensible solution to make these views
>more explicit and independent in the API and code, taking a lesson from
>filesystems which have to scale to a lot of metadata (notice how visibility is
>generally an attribute of a directory, not of regular files in your typical
>Unix FS?). And to solve this problem in SQL now we still have to do a
>server-side union, which is a bit sad. But even before we can refactor the API
>(v3 anyone?) I don't see it as unworkably slow for a NoSQL driver to track
>these kinds of views.

You make really good points here but I don't fully agree.

I don't think the issue is translating Glance's models to NoSQL, or
the performance of NoSQL databases; I'm pretty sure we could benefit in
some areas, but not in all of them. To me, and this is what my comment
was referring to, it has more to do with what kind of data we're
actually handling, the guarantees we should provide, and how those
guarantees are implemented now.

There are a couple of things that would worry me about hypothetical
support for NoSQL, but the one I'd consider most critical is
migrations. Some could question whether we'd really need them with
NoSQL databases, but we do: using a schemaless database doesn't mean
we don't have a schema. Migrations are not trivial for some NoSQL
databases and, most probably, each driver would have to have its own
implementation.
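
To illustrate why a schemaless store still needs migrations, here is a
minimal sketch of one common approach: each document carries a schema
version and is upgraded lazily when it is read. Everything in it (the
field names, the two upgrade steps, the version numbers) is a
hypothetical example and not Glance code; a real driver would also have
to decide when to write the upgraded document back.

```python
# Hypothetical sketch: none of these names or fields come from Glance.
CURRENT_VERSION = 2


def _v0_to_v1(doc):
    # Assumed change: v1 replaced a boolean "public" with a "visibility" string.
    doc = dict(doc)
    doc["visibility"] = "public" if doc.pop("public", False) else "private"
    return doc


def _v1_to_v2(doc):
    # Assumed change: v2 stores properties inside the image record itself.
    doc = dict(doc)
    doc.setdefault("properties", {})
    return doc


# Maps a stored version to the function that upgrades it by one step.
MIGRATIONS = {0: _v0_to_v1, 1: _v1_to_v2}


def upgrade(doc):
    """Bring a stored document up to CURRENT_VERSION, one step at a time."""
    version = doc.get("schema_version", 0)
    while version < CURRENT_VERSION:
        doc = MIGRATIONS[version](doc)
        version += 1
        doc["schema_version"] = version
    return doc
```

Even in this toy form, the upgrade chain is code that every driver
would have to carry and test on its own.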

>The bigger concern to me is that Glance seems a bit trigger-happy with indexes.
>But I'm confident we're in a similar boat there: performance in NoSQL won't be
>that terrible for the most important use cases, and a later refactoring can put
>us on a more sustainable track in the long run. 

I'm not worried about this, though. 


>>> All I'm saying is that we should be careful not to swap one set of
>>> problems for another.
>
>> My 2 cents: I am in agreement with Jay.  I am leery of NoSQL being a
>> direct sub in and I fear that this effort can be adding a large workload
>> for little benefit.
>
>The goal isn't really to replace sqlalchemy completely. I'm hoping I can create
>a space where multiple drivers can operate efficiently without introducing bugs
>(i.e. pull all that business logic out of the driver!) I'll be very interested
>to see if people can, after such a refactoring, try out some more storage
>approaches, such as dropping the sqlalchemy orm in favor of its generic engine
>support or direct sql execution, as well as NoSQL what-have-you. We don't have
>to make all of the drivers live in the project, so it really can be a good
>place for interested parties to experiment.

And this is exactly what I'm concerned about. There's a lot of
business logic implemented at the driver level right now, which makes
it really difficult (impossible?) to even think about using a NoSQL
database. However, I'm not even sure that moving that business logic
to a higher level would be the "go-time" for new NoSQL drivers.

As mentioned already, this might result in app-level implementations
(of joins, for example) that shouldn't be there.
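
As a rough illustration of what an app-level implementation could look
like, the sketch below rebuilds the image listing discussed above (the
union of public images, images you own, and images shared with you) on
top of two plain key-value mappings. The data layout, the field names,
and the list_visible_images function are assumptions made up for this
example; they are not taken from any Glance driver.

```python
# Hypothetical in-memory stand-ins for two key-value "tables".
images = {
    "img-1": {"owner": "alice", "is_public": True},
    "img-2": {"owner": "alice", "is_public": False},
    "img-3": {"owner": "bob", "is_public": False},
    "img-4": {"owner": "carol", "is_public": False},
}

# Tenant -> ids of images that other tenants have shared with it.
image_members = {"alice": {"img-3"}}


def list_visible_images(tenant):
    """Return the union of three views: public, owned, and shared."""
    shared = image_members.get(tenant, set())
    visible = set()
    # Without a server-side join, the application scans and filters itself.
    for image_id, image in images.items():
        if image["is_public"] or image["owner"] == tenant or image_id in shared:
            visible.add(image_id)
    return sorted(visible)
```

The membership lookup and the filtering both live in application code
here, which is the kind of logic a SQL backend would express as a join
or a union.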

Again, I'm not questioning NoSQL capabilities here (I'm a huge fan of
NoSQL technologies); what I'd question is whether they are the best
tool for this task. This is something that should be evaluated on a
per-module basis, and I obviously don't have complete knowledge of
every module.

Cheers,
FF

-- 
@flaper87
Flavio Percoco


