In Part 1 we discussed how to prepare your environment and install the necessary tools, and we went through the required DNS and DHCP settings.
In this part we will cover the actual stand-up of the cluster with the Antrea CNI.
Download Antrea
You need a Broadcom support account with entitlement to Antrea to get those files. Download the Antrea manifests from the Broadcom download portal.
You need the following:
VMware Container Networking with Antrea, K8s Operator Manifests, which should give you the "deploy.tar.gz" file.
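The archive unpacks into a deploy/ directory. A minimal way to do that, assuming you downloaded deploy.tar.gz into the folder you will run the installer from:
tar -xzf deploy.tar.gz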
After unpacking, adjust the images in the extracted files as stated below. These are examples for VMware Antrea 1.1.0 (which correlates to Antrea upstream 2.1.0). If you need a different version of Antrea, find the image references in the release notes on the Broadcom website.
deploy/openshift/operator.antrea.vmware.com_v1_antreainstall_cr.yaml: antreaAgentImage: projects.packages.broadcom.com/antreainterworking/antrea-agent-ubi:v2.1.0_vmware.3
deploy/openshift/operator.antrea.vmware.com_v1_antreainstall_cr.yaml: antreaControllerImage: projects.packages.broadcom.com/antreainterworking/antrea-controller-ubi:v2.1.0_vmware.3
deploy/openshift/operator.antrea.vmware.com_v1_antreainstall_cr.yaml: interworkingImage: projects.packages.broadcom.com/antreainterworking/interworking-ubi:1.1.0_vmware.1
deploy/openshift/operator.yaml: image: projects.packages.broadcom.com/antreainterworking/antrea-operator:v2.1.0_vmware.3
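If you prefer to script these edits, a rough sketch with sed (assuming GNU sed and the file paths from the extracted archive; adjust the image tags to your version):
sed -i 's|antreaAgentImage:.*|antreaAgentImage: projects.packages.broadcom.com/antreainterworking/antrea-agent-ubi:v2.1.0_vmware.3|' deploy/openshift/operator.antrea.vmware.com_v1_antreainstall_cr.yaml
# repeat the same pattern for antreaControllerImage, interworkingImage, and the image: line in deploy/openshift/operator.yaml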
Install OCP Cluster with Antrea CNI
In order to install a cluster with Antrea as the CNI, we need to use the installer-provisioned infrastructure (IPI) method.
Create install config
You have two options for creating the install config: either write it by hand (or copy an existing one), or let the installer generate it interactively and then update it with the values you need.
Run the following command in the folder you want to use for the config files.
./openshift-install create install-config --dir .
This will ask you for the following values, in this order:
- SSH Public Key: Add a key you want to use to ssh into the nodes should you need it
- Platform: Which platform you want to use, select vsphere here
- vCenter: Type the FQDN of your vCenter
- Username: The vCenter username; either Administrator or a service account with the privileges outlined here: https://docs.redhat.com/en/documentation/openshift_container_platform/4.17/html/installing_on_vmware_vsphere/installer-provisioned-infrastructure#installation-vsphere-installer-infra-requirements-account_ipi-vsphere-installation-reqs
- Password: Password of that user
- Default Datastore: The datastore where the nodes will be deployed to
- Network: The network to which the nodes will be attached
- Virtual IP Address for API: The IP you gave the VIP for the API as described in Part 1 of this series
- Virtual IP Address for Ingress: The IP you gave the VIP for HTTP in Part 1
- Base Domain: The domain of your environment used for this install, for example ocp.homelab.local. It needs to match the DNS entries you created.
- Cluster Name: The name of the cluster; it becomes part of the cluster's FQDN
- Pull Secret: Your pull secret to pull images from Red Hat. If you use OKD instead, use the fake pull secret they provide:
{"auths":{"fake":{"auth":"aWQ6cGFzcwo="}}}
After this, a new install-config.yaml will be created in the folder you specified. Save a copy to some other location, as the file is consumed during the installation.
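For example, keeping a backup copy in place (the backup name is arbitrary):
cp install-config.yaml install-config.yaml.bak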
Edit install-config.yaml with at least the following adjustments to get Antrea as your CNI:
...
...
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.10.0.0/16 # ensure reachable cidr for VMs and APIs
  networkType: antrea # Changed from OVNKubernetes to antrea
  serviceNetwork:
  - 172.30.0.0/16
platform:
  vsphere:
    apiVIPs:
    - 10.10.6.97 # Should be the API VIP created earlier
    ...
    ...
    ingressVIPs:
    - 10.10.6.98 # Should be the HTTP VIP created earlier
...
...
publish: External
pullSecret: |
  # should be the Red Hat secret, or the OKD secret if using OKD
sshKey: |
  # should show the SSH key selected when creating the file with the installer
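Before generating the manifests, it can help to confirm that the CNI setting is in place; a quick check:
grep networkType install-config.yaml
This should print networkType: antrea.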
Create manifests
In order to create cluster resources on the vSphere infrastructure, the installer needs manifests that describe the machines to create. Generate them by running the following command:
./openshift-install create manifests --dir .
The installer will now create several files and folders. The important one here is the manifests/ folder, because this is where we need to copy the Antrea deployment manifests so that they are installed as part of the cluster.
Copy the Antrea manifests that we prepared earlier into the manifests folder as follows:
cp ./deploy/openshift/* ./manifests/
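To double-check that the files landed next to the installer-generated ones, list the folder:
ls ./manifests/
You should see the Antrea operator files from deploy/openshift/ alongside the manifests the installer created.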
After that the cluster should be ready to be deployed.
Install the cluster
Now run the installer and install the cluster. This can take a very long time, so be patient. The installer prints a rough status of the installation; more detailed information can be found in the .openshift-install.log file. Refer to the troubleshooting notes below if something goes wrong.
./openshift-install create cluster --dir .
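To follow the detailed progress in a second terminal, you can tail the log file mentioned above:
tail -f .openshift-install.log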
The installer will now create a template in the specified vCenter and clone it. First it sets up 1 bootstrap node and 3 master nodes; once they are up, it creates the specified number of worker nodes.
The installer outputs the stage it is currently on and the deadline by which that stage must complete before it times out. You can follow the installation in vCenter (creation of the nodes) and in the AVI controller to see whether the API, the machine config server, and eventually the Ingress come up.
If your install fails at the step:
INFO Waiting up to 20m0s (until 4:10PM CET) for the Kubernetes API at https://api.oscp-test-01.homelab.local:6443...
ERROR Attempted to gather ClusterOperator status after installation failure: listing ClusterOperator objects: Get "https://api.oscp-test-01.homelab.local:6443/apis/config.openshift.io/v1/clusteroperators": dial tcp: lookup api.oscp-test-01.homelab.local on 127.0.0.53:53: no such host
ERROR Bootstrap failed to complete: Get "https://api.oscp-test-01.homelab.local:6443/version": dial tcp: lookup api.oscp-test-01.soultec.lab on 127.0.0.53:53: no such host
ERROR Failed waiting for Kubernetes API. This error usually happens when there is a problem on the bootstrap host that prevents creating a temporary control plane.
This is a timeout while waiting for the API to come up. If it is just taking longer than expected, you can run:
./openshift-install wait-for bootstrap-complete --dir .
This will wait for the API again with the same configuration.
Once this succeeds you can then run:
./openshift-install wait-for install-complete --dir .
which will then wait until the installation is complete and output the commands to access the cluster.
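Once it reports success, a quick sanity check with the kubeconfig the installer wrote to the auth/ folder (paths assume you run this from the install directory):
export KUBECONFIG=$(pwd)/auth/kubeconfig
./oc get nodes
./oc get clusteroperators
All nodes should be Ready, and the cluster operators should settle to Available over time.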
If it still fails, there is probably an error in your configuration. The most likely causes are:
- Incorrect VIP IPs set.
- Incorrect DHCP IP addresses assigned that do not line up with the pool created in the load balancer
- Incorrect vSphere settings
Verify that all your settings are correct and try again.
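A quick way to check DNS and the API VIP from the machine running the installer, using the example hostname from the log output above (replace it with your own):
nslookup api.oscp-test-01.homelab.local
curl -k https://api.oscp-test-01.homelab.local:6443/version
The first command must resolve to your API VIP, and the second should return a JSON version payload once the API is up.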
Access the cluster
If everything went well, you can access the cluster through the URL https://console-openshift-console.apps.cluster-name.base.domain. The password for the kubeadmin user is printed by the installer at the end of the run; if you cannot find it, it is also stored in the file ./auth/kubeadmin-password. If the install failed in the later stages, the console might just need a bit more time; try to log in and, if it is up, continue troubleshooting from there.
To log in with the oc tool, do the following:
./oc login <cluster-api-vip>:6443
You will then be prompted to authenticate. The easiest way is to open the web console, generate a login token there ("Copy login command" in the user menu), and paste the command you get from it.
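The pasted command typically has this shape (token and server are placeholders; copy the real values from the console):
./oc login --token=<token> --server=https://api.<cluster-name>.<base-domain>:6443
./oc whoami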
Conclusion
If you followed all these steps, you should now have a running OpenShift cluster, accessible through the dashboard and the oc tool, with Antrea as its CNI.
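To confirm Antrea is actually serving as the CNI, you can look for its pods; a quick check (searching cluster-wide, since the namespace can differ between operator versions):
./oc get pods -A | grep -i antrea
You should see the antrea-agent, antrea-controller, and operator pods in a Running state.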
In the next part we will run the Antrea-NSX interworking installation, so that we can see and create Antrea network policies in the cluster directly from the NSX UI.
Another point we will tackle is LDAP integration, so you can log in to the cluster and the dashboard with an LDAP account. Alternatively, you can integrate an OIDC provider and handle login through that.