Hi Eugen,
thanks for your help.
Over the last 2 weeks, I have set up a small cluster with a few servers (Dell T350, so really small servers), completely changing how storage and image management work, and using ephemeral disks because for most projects that's enough. With that, I can deploy 30 Windows 10 or Server 2019 instances in a few seconds. I now have to distribute the resources of the future cluster properly, that is to say the control, network, compute, and storage roles, across bare-metal servers or VMs... but I am happy, I am moving forward.
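For readers following along: a batch of instances like that can be requested in a single Nova call rather than a loop, which lets the scheduler place them all at once. A minimal sketch, where the image, flavor, and network names are hypothetical placeholders:

```shell
# Boot 30 identical ephemeral instances in one request.
# "win10-ephemeral", "m1.win", and "lab-net" are placeholder names.
# --min/--max make Nova treat the batch as one scheduling request;
# instances are named lab-win10-1 ... lab-win10-30 automatically.
openstack server create \
  --image win10-ephemeral \
  --flavor m1.win \
  --network lab-net \
  --min 30 --max 30 \
  lab-win10
```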
thanks again

Franck VEDEL




On 19 March 2024 at 08:05, Eugen Block <eblock@nde.ag> wrote:

Hi Franck, sorry for my late response, the last days have been quite busy.
Indeed, several minutes to spawn a new VM is long; as others already suggested, it makes sense to verify where that time is spent.
I don't have much insight into budget and hardware planning, so I can't really help with that. But if HA is not a requirement, you could go for a hyperconverged setup and colocate all the services. That would require some powerful servers; I'm not sure that fits your budget, though. Some of our customers have hardware vendors with a catalog of configurations for different use cases (storage, hypervisors, etc.) to choose from. Do you have such an option as well?
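One common culprit worth checking first: with a Ceph (RBD) image backend, qcow2 images in Glance are converted to raw on every boot, which can dominate the spawn time, while raw images allow near-instant copy-on-write clones. A sketch of the check and a one-time conversion, with "win10" as a placeholder image name:

```shell
# Is the image stored as qcow2? If so, every boot pays a conversion cost.
openstack image show win10 -f value -c disk_format

# One-time fix: convert to raw and upload as a new image.
qemu-img convert -O raw win10.qcow2 win10.raw
openstack image create --disk-format raw --container-format bare \
  --file win10.raw win10-raw
```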

Quoting Franck VEDEL <franck.vedel@univ-grenoble-alpes.fr>:

Hello, and thanks for your help. It's very interesting to read your response.

First, excuse my English… Google Translate helps me a lot…

I understand the difficulty of the question (building a new cluster): it's hard to grasp the different solutions for configuring a cluster, there are so many parameters.
What is certain is that I don't need HA. Currently, in my 3-node cluster, 2 nodes act as controller and network nodes, all 3 are compute nodes, and one handles storage (with an iSCSI bay).
All vCPUs are in use, so I need to delete some projects before starting a new lab with students.

I just tried to build a test OpenStack cluster with 6 nodes, with a Ceph cluster colocated on the same servers as OpenStack. Ceph is used with Cinder. Instance creation is slow: for example, creating a 20 GB Windows instance (with its volume on the Ceph cluster) takes 6 minutes, so if I put 30 students in parallel doing this operation, it is very long, too long. If I don't use a volume, it's the same, because the Ceph cluster has a "vms" pool in use for ephemeral disks.
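For reference, the usual way this is wired up (both ephemeral disks and volumes on Ceph) looks roughly like the fragment below; the pool names "vms" and "volumes", the "cinder" Ceph user, and the backend section name are assumptions matching a typical deployment, not your exact configuration:

```ini
# nova.conf on compute nodes: ephemeral disks stored directly in RBD
[libvirt]
images_type = rbd
images_rbd_pool = vms
images_rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_user = cinder
rbd_secret_uuid = <libvirt secret uuid>

# cinder.conf: RBD volume backend
[ceph]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = volumes
rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_user = cinder
rbd_secret_uuid = <libvirt secret uuid>
```

With this layout, boot times depend heavily on whether the Glance image is raw (clone) or qcow2 (full convert-and-copy on every boot).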

On my current production cluster, the same instance creation without a volume (ephemeral disk) is fast, but I don't have enough disk (800 GB) on each server, and no possibility to add disks.
What I need is a solution that lets me quickly create instances (including Windows) of 20 to 40 GB on ephemeral storage, but for certain projects I also need to create images from snapshots, so I also need a solution with volumes.
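The snapshot-to-image part of that workflow works for ephemeral instances too; a sketch with placeholder instance and image names:

```shell
# Snapshot a running (ephemeral) instance into a Glance image.
# "my-instance" and "project-a-base" are placeholder names.
openstack server image create --name project-a-base my-instance

# New ephemeral instances can then boot from the snapshot image.
openstack server create --image project-a-base --flavor m1.win \
  --network lab-net new-instance
```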

In short... it's still complicated, because I do all this on top of my regular work, and I don't have as much time as I would like for it.
Let's imagine a budget of 100,000 euros. How would you build a cluster for 250 students doing labs on building network configurations and creating and connecting instances (so nothing complicated), where instance creation needs to be fast? How many nodes, and what distribution of roles? Just to get some ideas…


Franck VEDEL




On 11 March 2024 at 14:04, Eugen Block <eblock@nde.ag> wrote:

Hello Franck,

it's not an easy question to answer; I'll just write up a few of my thoughts.
In general, Ceph is a good idea for OpenStack, yes. But you have to keep in mind that when a server fails in a 3-node cluster, the cluster stays degraded, as there is no recovery target left until the server comes back online. So my recommendation would be at least 4 nodes for a "real" production Ceph cluster, though 3 can work as long as your infrastructure is stable enough (no regular power outages or anything).
Colocating Ceph and OpenStack on the same hardware can work (I read about it every once in a while), but it means more services become unavailable (or degraded) in case of maintenance or failure. I'm also not sure how deployment tools like kolla-ansible deal with it; I've never installed such a mixed infrastructure. If you colocated the compute service with the Ceph servers, you would have to migrate VMs every time a Ceph server needs maintenance, or they become unavailable if a server fails unexpectedly (and you'd have to tweak the database to migrate them to a different compute node). So from a maintenance/failure point of view, colocation is not the best idea.
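To make the maintenance burden concrete, draining a colocated node before taking it down would look roughly like this; "node3" is a placeholder hostname, and live migration assumes shared (e.g. Ceph-backed) storage:

```shell
# Stop scheduling new VMs onto the node.
openstack compute service set --disable node3 nova-compute

# Live-migrate every instance off the node.
for vm in $(openstack server list --host node3 --all-projects -f value -c ID); do
  openstack server migrate --live-migration "$vm"
done

# Keep Ceph from rebalancing data while the node's OSDs are briefly down.
ceph osd set noout
# ... perform maintenance, then: ceph osd unset noout
```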

We had a single-control-node OpenStack running for years without any incident, but updating was disruptive, of course, at least for self-service networks; provider networks are directly available on the compute nodes, so most of the infrastructure was not impacted. We then added a second control node, with a Galera tie-breaker, to get an HA cluster.
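For context, the tie-breaker in a two-node Galera setup is typically the arbitrator daemon garbd, which joins the quorum votes without storing data, so it can run on a small third machine. A sketch, with the cluster name and controller addresses as placeholders:

```shell
# Galera arbitrator: votes in quorum decisions but holds no database.
# "openstack_galera", ctl1, and ctl2 are placeholder names.
garbd --group openstack_galera \
      --address "gcomm://ctl1:4567,ctl2:4567" \
      --daemon
```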

The question is what your requirements actually are with regard to (high) availability. How is your current setup with 3 nodes laid out? Are all 3 nodes both control and compute nodes? What is the current storage backend, the local filesystem of the compute nodes?

Is it an option to buy more (smaller) nodes so that you could have a dedicated ceph cluster?

Regards,
Eugen

Quoting Franck VEDEL <franck.vedel@univ-grenoble-alpes.fr>:

Good morning,
I currently have an OpenStack cluster made up of 3 nodes, an iSCSI bay (10 TB), 576 GB of RAM, 10 TB, and 288 vCPUs.
This cluster is used by around 150 students, but it is reaching its limits.
Having obtained a budget to set up a larger cluster, I am wondering about the choice of the number of nodes, their roles (how many controllers, network, compute, etc.), and above all which storage solution.
Let's imagine a budget to buy 6 servers with good capacities: is Ceph storage (with Cinder and RBD?) on the OpenStack cluster nodes the right choice? Do we need 3 servers for a Ceph cluster and 3 for the OpenStack part (in which case I lose capacity for the compute part)? I don't know what the right choices are and, above all, I am a little afraid of going in the wrong direction.
Could any of you guide me, or give me links to sites that could help me (and that I haven't already seen)?
Thanks in advance.

Franck VEDEL
Dép. Réseaux Informatiques & Télécoms
IUT1 - Univ GRENOBLE Alpes
0476824462
Internships, Work-study, Employment.