Would anyone happen to know of any data science platforms that can run on OpenStack? I was looking at Pivotal, Pachyderm but they don't run on OpenStack ☹
On Thu, 2020-01-23 at 20:48 +0000, Ruchi Rajasekhar wrote:
Would anyone happen to know of any data science platforms that can run on OpenStack? I was looking at Pivotal, Pachyderm but they don't run on OpenStack ☹
Assuming you mean https://pivotal.io/, checking their docs it seems to be supported on OpenStack: https://docs.pivotal.io/platform/2-8/plan/openstack/openstack_ref_arch.html
Their install guide is here: https://docs.pivotal.io/platform/2-8/customizing/openstack.html
Pachyderm seems to mainly target deployment on Kubernetes. Even the local on-prem guide https://docs.pachyderm.com/latest/deploy-manage/deploy/on_premises/ assumes that you will deploy Kubernetes, so you can always just deploy Kubernetes on OpenStack and then deploy Pachyderm on that, but it does not look like they tried to make it easy to install without Kubernetes. In fact, that is more or less the first line in their deployment overview: "Pachyderm runs on Kubernetes and is backed by an object store of your choice." https://docs.pachyderm.com/latest/deploy-manage/deploy/
Since it was never intended to run on anything other than Kubernetes, your best bet is to deploy Kubernetes first, on OpenStack or bare metal, and then deploy Pachyderm as normal.
I am guessing you are asking about Pivotal Greenplum, based on the data science question. If so, yes, they mention OpenStack support in the datasheet:
https://content.pivotal.io/datasheets/pivotal-greenplum
Michael
On Thu, Jan 23, 2020 at 1:20 PM Sean Mooney <smooney@redhat.com> wrote:
On Thu, 2020-01-23 at 20:48 +0000, Ruchi Rajasekhar wrote:
Would anyone happen to know of any data science platforms that can run on OpenStack? I was looking at Pivotal, Pachyderm but they don't run on OpenStack ☹
Assuming you mean https://pivotal.io/, checking their docs it seems to be supported on OpenStack: https://docs.pivotal.io/platform/2-8/plan/openstack/openstack_ref_arch.html
Their install guide is here: https://docs.pivotal.io/platform/2-8/customizing/openstack.html
Pachyderm seems to mainly target deployment on Kubernetes. Even the local on-prem guide https://docs.pachyderm.com/latest/deploy-manage/deploy/on_premises/ assumes that you will deploy Kubernetes, so you can always just deploy Kubernetes on OpenStack and then deploy Pachyderm on that, but it does not look like they tried to make it easy to install without Kubernetes. In fact, that is more or less the first line in their deployment overview: "Pachyderm runs on Kubernetes and is backed by an object store of your choice." https://docs.pachyderm.com/latest/deploy-manage/deploy/
Since it was never intended to run on anything other than Kubernetes, your best bet is to deploy Kubernetes first, on OpenStack or bare metal, and then deploy Pachyderm as normal.
On Thu, Jan 23, 2020 at 3:54 PM Ruchi Rajasekhar <RRajasekhar@misoenergy.org> wrote:
Would anyone happen to know of any data science platforms that can run on OpenStack? I was looking at Pivotal, Pachyderm but they don't run on OpenStack ☹
if you don't mind adding another layer, i know that several data science platforms are creating Kubernetes tooling. it might be worth investigating one of the Kubernetes-on-OpenStack deployment options (Magnum, maybe others?) and then layering a data science platform on top of that.
good luck!
peace o/
Which data science platforms are you considering?
We may run some of them at CERN. We generally use Kubernetes (via Magnum) as the underlying provisioning engine, with autoscaling up/down now available in Train. Our Spark environments are provisioned likewise.
Tim
At StackHPC, we have deployed Pangeo, JupyterHub and Kubeflow on K8s clusters deployed using Magnum. Kubeflow comprises a lot of components, so it is likely there is something in there for those doing data science work.
Best
Bharat
On 24 Jan 2020, at 18:09, Tim Bell <Tim.Bell@cern.ch> wrote:
Which data science platforms are you considering?
We may run some of them at CERN. We generally use Kubernetes (via Magnum) as the underlying provisioning engine, with autoscaling up/down now available in Train. Our Spark environments are provisioned likewise.
Tim
On 24 Jan 2020, at 15:48, Michael McCune <elmiko@redhat.com> wrote:
On Thu, Jan 23, 2020 at 3:54 PM Ruchi Rajasekhar <RRajasekhar@misoenergy.org> wrote: Would anyone happen to know of any data science platforms that can run on OpenStack? I was looking at Pivotal, Pachyderm but they don't run on OpenStack ☹
if you don't mind adding another layer, i know that several data science platforms are creating Kubernetes tooling. it might be worth investigating one of the Kubernetes-on-OpenStack deployment options (Magnum, maybe others?) and then layering a data science platform on top of that.
good luck!
peace o/
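For anyone following the "Kubernetes on OpenStack via Magnum, then the platform on top" route suggested in this thread, here is a rough sketch of what the first step might look like. It only builds the `openstack coe cluster create` command line and prints it; nothing is executed. The cluster name, template name and keypair are hypothetical placeholders, and the autoscaling label names (`auto_scaling_enabled`, `min_node_count`, `max_node_count`) are my recollection of the Train-era Magnum labels, so please check them against the Magnum user guide for your release.

```python
import shlex


def magnum_cluster_create_cmd(name, template, node_count=2, master_count=1,
                              keypair=None, labels=None):
    """Build (but do not run) an `openstack coe cluster create` command."""
    argv = [
        "openstack", "coe", "cluster", "create",
        "--cluster-template", template,
        "--master-count", str(master_count),
        "--node-count", str(node_count),
    ]
    if keypair:
        argv += ["--keypair", keypair]
    if labels:
        # Magnum expects labels as a single comma-separated key=value string.
        pairs = ",".join(f"{k}={v}" for k, v in sorted(labels.items()))
        argv += ["--labels", pairs]
    argv.append(name)
    return argv


# Hypothetical names; the label keys are an assumption, see the note above.
cmd = magnum_cluster_create_cmd(
    "ds-cluster",
    "k8s-template",
    node_count=3,
    keypair="mykey",
    labels={
        "auto_scaling_enabled": "true",
        "min_node_count": 1,
        "max_node_count": 5,
    },
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Once the cluster is up and you have its kubeconfig, the data science platform (Pachyderm, Kubeflow, JupyterHub and so on) would be installed on it following that project's own Kubernetes instructions.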
participants (6)
- Bharat Kunwar
- Michael Johnson
- Michael McCune
- Ruchi Rajasekhar
- Sean Mooney
- Tim Bell