[Openstack] [OpenStack][Swift] Fast way of uploading 200GB of 200KB files to Swift

Leander Bessa Beernaert leanderbb at gmail.com
Mon Jan 14 17:01:26 UTC 2013


I currently have 4 machines running 10 clients each, with each client
uploading 1/40th of the data. More than 40 simultaneous clients start to
severely affect Keystone's ability to handle these operations.
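
In case it helps, the upload logic of each client is roughly equivalent to
the sketch below (python-swiftclient; the Keystone endpoint, credentials,
container name, and data directory are placeholders for my setup):

    import os
    import sys

    from swiftclient.client import Connection

    shard = int(sys.argv[1])        # which of the 40 shards this client owns
    num_shards = int(sys.argv[2])   # 40 in my case

    # Placeholder Keystone endpoint and credentials.
    conn = Connection(authurl='http://keystone:5000/v2.0',
                      user='tenant:user', key='secret',
                      auth_version='2.0')

    for i, name in enumerate(sorted(os.listdir('data'))):
        if i % num_shards != shard:
            continue  # this file belongs to another client
        with open(os.path.join('data', name), 'rb') as f:
            conn.put_object('mycontainer', name, contents=f)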


On Mon, Jan 14, 2013 at 4:58 PM, Chuck Thier <cthier at gmail.com> wrote:

> That should be fine, but it doesn't have any way of reporting stats
> currently.  You could use tools like ifstat to look at how much
> bandwidth you are using.  You can also look at how much CPU the swift
> tool is using.  Depending on how your data is set up, you could run
> several swift-client processes in parallel until you max out either
> your network or CPU.  I would start with one client first, until you
> max it out, then move on to the next.
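>
> If you don't have ifstat handy, sampling /proc/net/dev gives you the
> same number (an untested sketch; assumes Linux and an interface named
> eth0):
>
>     import time
>
>     def tx_bytes(iface='eth0'):
>         # transmitted bytes is the 9th counter after "iface:"
>         with open('/proc/net/dev') as f:
>             for line in f:
>                 if line.strip().startswith(iface + ':'):
>                     return int(line.split(':')[1].split()[8])
>
>     before = tx_bytes()
>     time.sleep(10)
>     after = tx_bytes()
>     print('%.1f MB/s out' % ((after - before) / 10.0 / 1024 / 1024))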
>
> --
> Chuck
>
> On Mon, Jan 14, 2013 at 10:45 AM, Leander Bessa Beernaert
> <leanderbb at gmail.com> wrote:
> > I'm currently using the swift client to upload files; would you
> > recommend another approach?
> >
> >
> > On Mon, Jan 14, 2013 at 4:43 PM, Chuck Thier <cthier at gmail.com> wrote:
> >>
> >> Using swift stat probably isn't the best way to determine cluster
> >> performance, as those stats are updated asynchronously and could be
> >> delayed quite a bit while you are heavily loading the cluster.  It
> >> also might be worthwhile to use a tool like swift-bench to test your
> >> cluster to make sure it is properly set up before loading data into
> >> the system.
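> >>
> >> A minimal swift-bench config looks something like the following (I'm
> >> writing the option names from memory, so check them against your
> >> swift-bench version; the endpoint and credentials are just examples):
> >>
> >>     [bench]
> >>     auth = http://localhost:8080/auth/v1.0
> >>     user = test:tester
> >>     key = testing
> >>     concurrency = 10
> >>     # ~200KB objects, to match your real workload
> >>     object_size = 204800
> >>     num_objects = 1000
> >>     num_gets = 0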
> >>
> >> --
> >> Chuck
> >>
> >> On Mon, Jan 14, 2013 at 10:38 AM, Leander Bessa Beernaert
> >> <leanderbb at gmail.com> wrote:
> >> > I'm getting around 5-6.5 GB of bytes written to Swift per day. I
> >> > calculated this by calling "swift stat && sleep 60s && swift stat"
> >> > and extrapolating a daily rate from the difference between the two
> >> > byte counts.
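> >> >
> >> > For example (with made-up byte counts in place of the real ones):
> >> >
> >> >     # hypothetical Bytes values from the two "swift stat" calls
> >> >     bytes_before = 1234000000
> >> >     bytes_after = 1238500000
> >> >     per_second = (bytes_after - bytes_before) / 60.0
> >> >     gb_per_day = per_second * 86400 / 1024 ** 3
> >> >     print('%.1f GB/day' % gb_per_day)  # -> ~6.0 GB/day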
> >> >
> >> > Currently I'm resetting Swift with a node size of 64, since 90% of
> >> > the files are less than 70KB in size. I think that might help.
> >> >
> >> >
> >> > On Mon, Jan 14, 2013 at 4:34 PM, Chuck Thier <cthier at gmail.com> wrote:
> >> >>
> >> >> Hey Leander,
> >> >>
> >> >> Can you post what performance you are getting?  If they are all
> >> >> sharing the same GigE network, you might also check that the links
> >> >> aren't being saturated, as it is pretty easy to saturate a link
> >> >> pushing 200KB files around.
> >> >>
> >> >> --
> >> >> Chuck
> >> >>
> >> >> On Mon, Jan 14, 2013 at 10:15 AM, Leander Bessa Beernaert
> >> >> <leanderbb at gmail.com> wrote:
> >> >> > Well, I've fixed the node size and disabled all the replicator
> >> >> > and auditor processes. However, it is even slower now than it was
> >> >> > before :/. Any suggestions?
> >> >> >
> >> >> >
> >> >> > On Mon, Jan 14, 2013 at 3:23 PM, Leander Bessa Beernaert
> >> >> > <leanderbb at gmail.com> wrote:
> >> >> >>
> >> >> >> Ok, thanks for all the tips/help.
> >> >> >>
> >> >> >> Regards,
> >> >> >>
> >> >> >> Leander
> >> >> >>
> >> >> >>
> >> >> >> On Mon, Jan 14, 2013 at 3:21 PM, Robert van Leeuwen
> >> >> >> <Robert.vanLeeuwen at spilgames.com> wrote:
> >> >> >>>
> >> >> >>> > Allow me to rephrase.
> >> >> >>> > I've read somewhere (can't remember where) that it would be
> >> >> >>> > faster to upload files if they were uploaded to separate
> >> >> >>> > containers. This was suggested for a standard Swift
> >> >> >>> > installation with a certain replication factor.
> >> >> >>> > Since I'll be uploading the files with the replicators turned
> >> >> >>> > off, does it really matter if I insert a group of them in
> >> >> >>> > separate containers?
> >> >> >>>
> >> >> >>> My guess is this concerns the SQLite database load distribution.
> >> >> >>> So yes, it still matters.
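> >> >> >>>
> >> >> >>> A cheap way to spread that load is to derive the container from
> >> >> >>> a hash of the object name, e.g. (a sketch; the container prefix
> >> >> >>> and shard count are arbitrary):
> >> >> >>>
> >> >> >>>     import hashlib
> >> >> >>>
> >> >> >>>     NUM_CONTAINERS = 16  # arbitrary shard count
> >> >> >>>
> >> >> >>>     def container_for(name):
> >> >> >>>         # spread objects (and thus container DB updates)
> >> >> >>>         # evenly over NUM_CONTAINERS containers
> >> >> >>>         h = int(hashlib.md5(name.encode('utf-8')).hexdigest(), 16)
> >> >> >>>         return 'data_%02d' % (h % NUM_CONTAINERS)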
> >> >> >>>
> >> >> >>> Just to be clear: turning the replicators off does not matter at
> >> >> >>> ALL when putting files into a healthy cluster.
> >> >> >>> Files will be "replicated" / put on all required nodes at the
> >> >> >>> moment the put request is made.
> >> >> >>> The put request will only return an OK when there is a quorum
> >> >> >>> writing the file (the file is stored on more than half of the
> >> >> >>> required object nodes).
> >> >> >>> The replicator daemons do not have anything to do with this.
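> >> >> >>>
> >> >> >>> (With the usual 3 replicas that means 2 writes must succeed:
> >> >> >>>
> >> >> >>>     replicas = 3
> >> >> >>>     quorum = replicas // 2 + 1  # simple majority
> >> >> >>>     print(quorum)               # -> 2
> >> >> >>>
> >> >> >>> so a single slow or failed object node will not fail the PUT.)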
> >> >> >>>
> >> >> >>> Cheers,
> >> >> >>> Robert
> >> >> >>>