OpenShift on vSphere - Part 1: Preparation

As many know, VMware offers its own Kubernetes solution called vSphere Kubernetes Service (VKS), which relies on the vSphere Supervisor installed in a vCenter. Although it is a great product for integrating Kubernetes into a VMware environment, it has a lot of dependencies, and many things need to be considered before installing it from a host, network and storage perspective.

There are easier solutions for quickly starting up some Kubernetes clusters, like native K8s, K3s or Rancher, as I have already detailed here: https://blog.redlab.li/rancher-on-vsphere-part-1/

There is also another solution that many have probably heard of called OpenShift, by a small little company called Red Hat. This can also be installed on vSphere with some scripts, where it automatically creates the VMs and the cluster on an existing vSphere cluster. There is the supported, non-free version called OCP (OpenShift Container Platform), but also the open-source upstream project called OKD. The main differences are that you obviously don't get support with OKD and also cannot use the enterprise repositories for apps and plugins. There are also some differences in which OS is used for the nodes etc., but these are not that important for this article and can be seen in detail here: https://www.redhat.com/en/topics/containers/red-hat-openshift-okd

For lab usage, however, this is pretty great to test some things out.

Prerequisites

In order to install OKD or OCP you need to consider the requirements for installing and managing your cluster. From a vSphere and hardware perspective these are the following:

You need vSphere 7.0 or later. The minimum CPU, memory, storage and IOPS requirements per machine are:

Machine | Operating System | CPU | RAM | Storage | IOPS
------- | ---------------- | --- | ----- | ------- | ----
Bootstrap | Fedora | 4 | 16 GB | 100 GB | 300
Control plane | FCOS | 4 | 16 GB | 100 GB | 300
Compute | FCOS | 2 | 8 GB | 100 GB | 300

In a production environment you would need 3 control plane VMs and at least 2 compute VMs. In a lab you can deploy with less, or even combine compute and control plane to limit the cluster to 3 nodes in total. The bootstrap VM is only used during installation and gets deleted again after the installation has finished.

In this example I have the resources, so I deploy this with 3 control plane and 3 compute nodes. All of these nodes have 4 vCPUs and 16 GB of RAM.

You also need an external load balancer to access your cluster and to balance traffic between the nodes. You have to set this up yourself. In this example I used the AVI load balancer, but any other that can be configured accordingly works as well.

Lastly you need a DHCP server to give IPs to those VMs. From experience this can be a bit tricky, as it needs to be a DHCP server that hands out IPs in sequence, because you need the load balancer configured before the install. There are options to configure the nodes with static IPs, but in the lab I just installed a Windows Server VM with the DHCP server role configured to assign the IPs of the nodes.

Configure Load Balancer

You need to create the following objects in your load balancer so that the cluster can install and run correctly. It is important that the IPs of the nodes and the Virtual Services are correct, as the installer connects over the VIPs to access the cluster. If this is not correct, the installation will fail.

Create Pools and VIPs for the following:

API Pool
Members: Master Nodes
Port: 6443
VIP: API

Machineconfig Pool
Members: Master Nodes
Port: 22623
VIP: API

Ingress Pool HTTP
Members: Worker Nodes
Port: 80
VIP: Ingress

Ingress Pool HTTPS
Members: Worker Nodes
Port: 443
VIP: Ingress

Create L4 Virtual Services for these pools.

You need to create 2 VIPs, one for HTTP(S) and one for the API. The API VIP will have the Virtual Services for API and machine config assigned, and the Ingress VIP will have the HTTP and HTTPS services assigned.

As you can see, you need to create the pools, and for that you need the IPs of the nodes. The installer creates a bootstrap node first, which gets deleted again after the installation of the master nodes has finished. However, as the installer creates the bootstrap node and the master nodes almost at the same time, it is very hard to determine which node gets which IP (it is a first come, first served scenario). Therefore you need to add the first 4 IPs to the API and machine config pools and the next 3 to the ingress pools.
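Once the nodes come up during the installation, you can quickly check that the Virtual Services actually forward traffic on the right ports. A minimal sketch using nc; the VIP addresses below are placeholders, replace them with the VIPs you configured:

# Hypothetical VIP addresses from the load balancer, adjust to your lab
API_VIP=10.177.165.10
INGRESS_VIP=10.177.165.11

nc -vz "$API_VIP" 6443      # API Virtual Service
nc -vz "$API_VIP" 22623     # machine config Virtual Service
nc -vz "$INGRESS_VIP" 80    # Ingress HTTP (only answers once the routers run)
nc -vz "$INGRESS_VIP" 443   # Ingress HTTPS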

Prepare DNS

You will need some DNS entries so that you can access the cluster (the installer also accesses it via the DNS name). The DNS names have to point to the Virtual Service VIPs that you created in the load balancer. The following DNS entries need to be created:

api.clustername.domain.local - API VIP
api-int.clustername.domain.local - API VIP
*.apps.clustername.domain.local - Ingress VIP
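
You can verify the records with dig before starting the installation. A small sketch using the example names above; substitute your own cluster name and domain:

dig +short api.clustername.domain.local        # should return the API VIP
dig +short api-int.clustername.domain.local    # should return the API VIP
dig +short test.apps.clustername.domain.local  # any name under *.apps should return the Ingress VIP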

Prepare DHCP

Create a DHCP server which can be configured accordingly. I used a Windows DHCP server, but a Linux one can also be used. However, with the NSX segment DHCP server I ran into some issues, as it doesn't assign the IPs in sequential order but instead picks random IPs from the whole pool. This makes it very difficult to create the pools on the load balancer.

The following options have to be set according to Red Hat:

Option | Value (Example) | Description
------ | --------------- | -----------
003 | 10.177.165.254 | Default gateway of the network used
004 | 10.24.0.10, 10.24.0.11 | Time servers
006 | 10.24.0.10, 10.24.0.11 | DNS servers
042 | 10.24.0.10, 10.24.0.11 | NTP servers
101 | Europe/Zurich | Time zone

I have indeed run into problems when options 042 and 101 were not set and the installer ran into a timeout when trying to set the time zone on the nodes.
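
If you prefer a Linux DHCP server over Windows, dnsmasq is one option that can also be forced to hand out addresses sequentially. A minimal sketch, assuming a Fedora/RHEL-style host; the range, gateway and server IPs are placeholders mirroring the table above, adjust them to your lab:

sudo tee /etc/dnsmasq.d/okd.conf <<'EOF'
# Assign addresses in sequence instead of hashing the client MAC address
dhcp-sequential-ip
dhcp-range=10.177.165.100,10.177.165.120,12h
dhcp-option=3,10.177.165.254            # option 003: default gateway
dhcp-option=4,10.24.0.10,10.24.0.11     # option 004: time servers
dhcp-option=6,10.24.0.10,10.24.0.11     # option 006: DNS servers
dhcp-option=42,10.24.0.10,10.24.0.11    # option 042: NTP servers
dhcp-option=101,"Europe/Zurich"         # option 101: time zone
EOF
sudo systemctl restart dnsmasq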

Download required files

Next we need to download the required files to a Linux machine (Windows could also work, but I prefer a Linux VM). I installed OKD in this example, so I downloaded the files directly from the OKD GitHub page. The newest version at the time of writing was 4.19.0-okd-scos.1, so I used this. You can get newer releases, if available, from the GitHub page and adjust the links accordingly: https://github.com/okd-project/okd/releases

On your Linux machine do the following:

mkdir okd-installer && cd okd-installer
wget https://github.com/okd-project/okd/releases/download/4.19.0-okd-scos.1/openshift-install-linux-4.19.0-okd-scos.1.tar.gz
wget https://github.com/okd-project/okd/releases/download/4.19.0-okd-scos.1/openshift-client-linux-4.19.0-okd-scos.1.tar.gz
tar xzvf openshift-client-linux-4.19.0-okd-scos.1.tar.gz
tar xzvf openshift-install-linux-4.19.0-okd-scos.1.tar.gz

This downloads the OpenShift installer and client to your machine and unpacks the compressed files.

After that we can copy the installer and the oc client to a directory in the PATH so that you can run those binaries from anywhere, like so:

sudo cp ./openshift-install /usr/local/bin/ && sudo cp ./oc /usr/local/bin/
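
To confirm that both binaries are found in the PATH and working, you can check their versions:

openshift-install version
oc version --client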

Next steps

After this your environment should be set up and ready for configuring the installer. But we are not done yet, not even close: there are still quite a few configurations to be done, like setting the vCenter, the VIPs, DNS names etc. in the installer config so that it knows where to deploy the cluster.
I will also download and adjust the deployment files for the Antrea CNI. As this is still a very VMware-focused blog, I will be using Antrea as my CNI. You could also use Calico, Flannel or another CNI. By default OKD installs the default OpenShift CNI; we will replace this with Antrea and integrate it into NSX in a future part of this blog series.