[openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

Mark Washenberger mark.washenberger at markwash.net
Tue Aug 20 16:42:56 UTC 2013


On Tue, Aug 20, 2013 at 3:20 AM, Flavio Percoco <flavio at redhat.com> wrote:

> On 20/08/13 00:15 -0700, Mark Washenberger wrote:
>
>>
>>        2) I highly caution folks who think a No-SQL store is a good storage
>>        solution for any of the data currently used by Nova, Glance (registry),
>>        Cinder (registry), Ceilometer, and Quantum. All of the data stored and
>>        manipulated in those projects is HIGHLY relational data, and not
>>        objects/documents. Switching to use a KVS for highly relational data is
>>        a terrible decision. You will just end up implementing joins in your
>>        code...
>>
>>
>>
>>    +10000
>>
>>    FWIW, I'm a huge fan of NoSQL technologies but I couldn't agree more
>>    here.
>>
>>
>>
>> I have to say I'm kind of baffled by this sentiment (expressed here and
>> elsewhere in the thread). I'm not a NoSQL expert, but I hang out with a few and
>> I'm pretty confident Glance at least is not that relational. We do two types of
>> joins in glance. The first, like image properties, is basically just an
>> implementation detail of the sql driver. It's not core to the application. Any
>> NoSQL implementation will simply completely denormalize those properties into
>> the image record. (And honestly, so might an optimized SQL implementation...)
>>
>> The second type of join, image_members, is basically just a hack to solve the
>> problem created because the glance api offers several simultaneous implicit
>> "views" of images. Specifically, when you list images in glance, you are seeing
>> a union of three views: public images, images you own, and images shared with
>> you. IMO it's actually a more scalable and sensible solution to make these views
>> more explicit and independent in the API and code, taking a lesson from
>> filesystems which have to scale to a lot of metadata (notice how visibility is
>> generally an attribute of a directory, not of regular files in your typical
>> Unix FS?). And to solve this problem in SQL now we still have to do a
>> server-side union, which is a bit sad. But even before we can refactor the API
>> (v3 anyone?) I don't see it as unworkably slow for a NoSQL driver to track
>> these kinds of views.
>>
>
> You make really good points here but I don't fully agree.
>

Thanks for your measured response. I wrote my previous response a bit late
at night for me and I hope I wasn't rude :-/

>
> I don't think the issue is actually translating Glance's models to
> NoSQL or NoSQL db's performance, I'm pretty sure we could benefit in some
> areas but not all of them. To me, and that's what my comment was referring
> to, this is more related to  what kind of data we're actually
> treating, the guarantees we should provide and how they are
> implemented now.
>
> There are a couple of things that would worry me about hypothetical
> support for NoSQL, but I guess one that I'd consider very critical is
> migrations. Some could ask whether we'd really need them
> when talking about NoSQL databases, but we do. Using a
> schemaless database wouldn't mean we don't have a schema. Migrations
> are not trivial for some NoSQL databases; plus, this would mean
> drivers, most probably, would have to have their own implementation.


I definitely think different drivers will need their own migrations. While
playing around with this refactoring, I created a "Migrator" interface and
made instantiating an appropriate migrator object part of the driver
interface. But I was definitely concerned about a number of things here.
First off, is it just too confusing to have multiple migrations? The
migration sequences will definitely need to be different per driver. How do
we support cross-driver migrations?
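
As a rough illustration of the "Migrator" idea described above, and not the
actual Glance code, such an interface might look something like the sketch
below. Every name in it (Migrator, Driver, get_migrator, the in-memory
classes, the step names) is hypothetical and chosen only for the example.

```python
import abc


class Migrator(abc.ABC):
    """Hypothetical per-driver migration interface (illustrative only)."""

    @abc.abstractmethod
    def current_version(self) -> int:
        """Return the schema version the backing store is currently at."""

    @abc.abstractmethod
    def upgrade(self, target: int) -> None:
        """Apply this driver's own migration steps up to `target`."""


class Driver(abc.ABC):
    """Hypothetical driver interface that hands out its own migrator."""

    @abc.abstractmethod
    def get_migrator(self) -> Migrator:
        """Instantiate a migrator appropriate for this driver."""


class InMemoryMigrator(Migrator):
    """Toy migrator: records which named steps have been applied."""

    def __init__(self, steps):
        self._steps = list(steps)  # this driver's own migration sequence
        self._version = 0
        self.applied = []

    def current_version(self) -> int:
        return self._version

    def upgrade(self, target: int) -> None:
        if not (self._version <= target <= len(self._steps)):
            raise ValueError("target version out of range: %r" % target)
        for step in self._steps[self._version:target]:
            self.applied.append(step)  # a real driver would run the step here
            self._version += 1


class InMemoryDriver(Driver):
    """Toy driver whose migration sequence is specific to itself."""

    def __init__(self, steps):
        self._migrator = InMemoryMigrator(steps)

    def get_migrator(self) -> Migrator:
        return self._migrator


driver = InMemoryDriver(["create_images", "add_image_members"])
migrator = driver.get_migrator()
migrator.upgrade(2)
```

Each driver carries its own step list here, which matches the point that
migration sequences differ per driver. The sketch does not attempt to answer
the cross-driver migration question.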


>
>
>> The bigger concern to me is that Glance seems a bit trigger-happy with indexes.
>> But I'm confident we're in a similar boat there: performance in NoSQL won't be
>> that terrible for the most important use cases, and a later refactoring can put
>> us on a more sustainable track in the long run.
>>
>
> I'm not worried about this, though.
>

Okay, that is reassuring.


>
>>>> All I'm saying is that we should be careful not to swap one set of
>>>> problems for another.
>>>>
>>>
>>> My 2 cents: I am in agreement with Jay.  I am leery of NoSQL being a
>>> direct sub in and I fear that this effort can be adding a large workload
>>> for little benefit.
>>>
>>
>> The goal isn't really to replace sqlalchemy completely. I'm hoping I can create
>> a space where multiple drivers can operate efficiently without introducing bugs
>> (i.e. pull all that business logic out of the driver!) I'll be very interested
>> to see if people can, after such a refactoring, try out some more storage
>> approaches, such as dropping the sqlalchemy orm in favor of its generic engine
>> support or direct sql execution, as well as NoSQL what-have-you. We don't have
>> to make all of the drivers live in the project, so it really can be a good
>> place for interested parties to experiment.
>>
>
> And this is exactly what I'm concerned about. There's a lot of
> business logic implemented at the driver level right now which makes
> it really difficult (impossible?) to even think about using a NoSQL
> database. However, I'm not even sure that taking BL to a higher level
> would be the "go-time" for new NoSQL drivers.
> As mentioned already, this might end up in app-level implementations
> that shouldn't be there.
>

I appreciate this concern. I do think, though, that moving the BL out of the
driver is worth doing for its own sake as well.


>
> Again, I'm not arguing NoSQL capabilities in this matter (I'm a huge
> fan of NoSQL technologies); what I'd argue is whether they are the
> best tool for this task. This is something that should be evaluated on
> a per-module basis, which I obviously don't have a complete knowledge
> of.
>

I think it's possible that some folks just really want the HA and
reliability attributes of NoSQL. I feel compelled to support this desire in
some form or another. But I don't want to create a mess for the project in
doing so, so I appreciate your concerns.


>
> Cheers,
>
> FF
>
> --
> @flaper87
> Flavio Percoco
>