- Reserve one IP for Bootstrap Node
- Reserve three IPs for Control-Plane Nodes
- Reserve two or three IPs for Infra-Nodes
- Check if Control-plane Load-Balancer VIP points to four IPs (Bootstrap + Control-Plane nodes)
- Check if Infra-Node Load-Balancer VIP points to the Infra-Node IPs
- Check created DNS entries (a dig sketch to verify them follows this checklist):
  - api.<cluster_name>.<base_domain> should point to the Control-Plane LB VIP
  - api-int.<cluster_name>.<base_domain> should point to the Control-Plane LB VIP
  - *.apps.<cluster_name>.<base_domain> should point to the Infra-Node LB VIP
  - etcd-0.<cluster_name>.<base_domain> should point to Control-Plane Node 0
  - etcd-1.<cluster_name>.<base_domain> should point to Control-Plane Node 1
  - etcd-2.<cluster_name>.<base_domain> should point to Control-Plane Node 2
  - _etcd-server-ssl._tcp.<cluster_name>.<base_domain> should be three SRV entries, one pointing to each Control-Plane node
- Check the DHCP server on the same VLAN is up
- Check the HTTP server on the same VLAN is up (this could be the install host)
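A quick sanity check of the DNS records from the install host; clustername and basedomain.local below are the placeholder values used later in install-config.yaml, and test.apps is just an arbitrary name under the wildcard:
CLUSTER=clustername; DOMAIN=basedomain.local
for rec in api api-int test.apps etcd-0 etcd-1 etcd-2; do
  dig +short ${rec}.${CLUSTER}.${DOMAIN}
done
dig +short _etcd-server-ssl._tcp.${CLUSTER}.${DOMAIN} SRV   # expect three SRV records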
On the installer host:
ssh-keygen -t rsa -b 4096 -N '' \
-f ~/.ssh/id_rsa
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa
SSH_PUB_KEY=$(cat ~/.ssh/id_rsa.pub)
Save both the id_rsa and id_rsa.pub files to be able to access the cluster in the future.
INSTALL_DIR=<installation_dir>   # choose a working directory for the install files
mkdir $INSTALL_DIR
Get the installer download link from https://cloud.redhat.com/openshift/install.
Get the pull secret from the same page and save it as a .txt file in the install dir.
Create the install-config.yaml in the install dir with content like the following:
apiVersion: v1
baseDomain: basedomain.local
compute:
- hyperthreading: Enabled
  name: worker
  replicas: 0
controlPlane:
  hyperthreading: Enabled
  name: master
  replicas: 3
metadata:
  name: clustername
networking:
  clusterNetwork:
  - cidr: 10.25.0.0/16
    hostPrefix: 23
  machineCIDR: 10.36.73.0/23
  networkType: OpenShiftSDN
  serviceNetwork:
  - 10.26.0.0/16
platform:
  vsphere:
    vcenter: your.vcenter.server
    username: username
    password: password
    datacenter: datacenter
    defaultDatastore: datastore
pullSecret: '{"auths": ...}' # Contents of the pull secret
sshKey: 'ssh-ed25519 AAAA...' # Contents from id_rsa.pub
If the datastore is part of a datastore cluster, use the full path (clusterdatastore/datastore).
If using a proxy, add the following to the config file above:
...
proxy:
  httpProxy: http://<username>:<pswd>@<ip>:<port>
  httpsProxy: http://<username>:<pswd>@<ip>:<port>
  noProxy: example.com,example2.com
...
Create a backup dir and make a copy of the install-config.yaml in it; the installer deletes the file during the next steps.
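For example (the backup path is just a suggestion):
mkdir -p $INSTALL_DIR/backup
cp $INSTALL_DIR/install-config.yaml $INSTALL_DIR/backup/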
./openshift-install create manifests --dir=$INSTALL_DIR
Make the control-plane nodes non-schedulable:
vi $INSTALL_DIR/manifests/cluster-scheduler-02-config.yml
# Set mastersSchedulable to false
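A non-interactive alternative, assuming the generated manifest contains mastersSchedulable: true:
sed -i 's/mastersSchedulable: true/mastersSchedulable: false/' \
  $INSTALL_DIR/manifests/cluster-scheduler-02-config.yml
grep mastersSchedulable $INSTALL_DIR/manifests/cluster-scheduler-02-config.yml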
Configure NetworkPolicy
touch $INSTALL_DIR/manifests/cluster-network-03-config.yml
vi $INSTALL_DIR/manifests/cluster-network-03-config.yml
Add content:
apiVersion: operator.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  defaultNetwork:
    type: OpenShiftSDN
    openshiftSDNConfig:
      mode: NetworkPolicy
      mtu: 1450
      vxlanPort: 4789
Ref. https://docs.openshift.com/container-platform/4.2/networking/configuring-networkpolicy.html
Generate ignition files:
./openshift-install create ignition-configs --dir $INSTALL_DIR
Copy the bootstrap.ign file to the HTTP server.
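For example, with scp to a hypothetical web server whose document root is /var/www/html:
scp $INSTALL_DIR/bootstrap.ign user@<http_server>:/var/www/html/
curl -I http://<http_server>/bootstrap.ign   # verify it is reachable; this URL goes into append-bootstrap.ign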
Create the append-bootstrap file locally with vi $INSTALL_DIR/append-bootstrap.ign, setting source to the bootstrap.ign URL on the HTTP server:
{
  "ignition": {
    "config": {
      "append": [
        {
          "source": "<bootstrap_ignition_config_url>",
          "verification": {}
        }
      ]
    },
    "timeouts": {},
    "version": "2.1.0"
  },
  "networkd": {},
  "passwd": {},
  "storage": {},
  "systemd": {}
}
Generate the base64 for the ignition files:
base64 -w0 $INSTALL_DIR/master.ign > $INSTALL_DIR/master.64
base64 -w0 $INSTALL_DIR/worker.ign > $INSTALL_DIR/worker.64
base64 -w0 $INSTALL_DIR/append-bootstrap.ign > $INSTALL_DIR/append-bootstrap.64
- Download the Red Hat CoreOS OVA from https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.2/4.2.0/rhcos-4.2.0-x86_64-vmware.ova
- On vSphere Client
- Create a folder to hold the templates, matching the cluster name from the install file
- Create a template from the OVA for each node type (Master, Worker, Bootstrap)
- Add the Configuration Parameters on each template (for a govc alternative, see the sketch after this list):
  - guestinfo.ignition.config.data = contents of the matching base64 ignition file (master.64, worker.64 or append-bootstrap.64)
  - guestinfo.ignition.config.data.encoding = base64
  - disk.EnableUUID = TRUE
- Set the parameter: VM Options → Advanced → Latency Sensitivity, to: High.
- Clone one Bootstrap from the Bootstrap template
- Clone three Master-nodes from the Master template
- Clone at least two Infra-Nodes from the Worker template
- Collect the MAC addresses from the Bootstrap, Master-Nodes and Infra-Nodes and update the DHCP server, binding these MAC addresses to the reserved IPs.
- Boot the bootstrap machine, the three masters and the workers, customizing resource sizes (CPU/MEM) if needed.
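If you prefer the govc CLI over the vSphere UI for the guestinfo parameters, a rough sketch (the template name is an assumption, and govc must already be configured to reach the vCenter):
govc vm.change -vm clustername-master-template \
  -e "guestinfo.ignition.config.data=$(cat $INSTALL_DIR/master.64)" \
  -e "guestinfo.ignition.config.data.encoding=base64" \
  -e "disk.EnableUUID=TRUE"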
./openshift-install --dir=$INSTALL_DIR wait-for bootstrap-complete \
--log-level=info
After the bootstrap process completes, remove the bootstrap node from the load balancer and then delete it from vCenter.
export KUBECONFIG=$INSTALL_DIR/auth/kubeconfig
oc whoami
oc get nodes
oc get csr # Will show certificates as Pending
Approve all CSRs (new CSRs appear as nodes register, so the command may need to be run more than once):
oc get csr --no-headers | awk '{print $1}' | xargs oc adm certificate approve
watch -n5 oc get clusteroperators
Configure storage for the image registry. With persistent storage:
Create an NFS export on the NFS server with these parameters:
# cat /etc/exports
/mnt/data *(rw,sync,no_wdelay,no_root_squash,insecure,fsid=0)
# exportfs -rv
exporting *:/mnt/data
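From the install host, the export can be verified with showmount (the NFS server address is a placeholder):
showmount -e <nfs_server_ip>   # should list /mnt/data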
Create a PV pointing to the created export dir and NFS server IP
apiVersion: v1
kind: PersistentVolume
metadata:
  name: registry-storage
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteOnce
  nfs:
    path: /mnt/data       # the export created above
    server: 172.17.0.2    # the NFS server IP
  persistentVolumeReclaimPolicy: Recycle
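Save the definition to a file (the name registry-pv.yaml is just an example) and create it:
oc create -f registry-pv.yaml
oc get pv registry-storage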
Edit the image registry operator config to use the created PV:
$ oc edit configs.imageregistry.operator.openshift.io
...
storage:
  pvc:
    claim:
...
Leave the claim value blank; the operator then creates a PVC (image-registry-storage) automatically and binds it to the PV.
Without persistent storage:
oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"storage":{"emptyDir":{}}}}'
./openshift-install --dir=$INSTALL_DIR wait-for install-complete
Tag nodes with infra labels
oc label node node1 node2 node-role.kubernetes.io/infra=
Patch the ingress controller to select the infra label:
oc patch ingresscontroller default -n openshift-ingress-operator --type=merge --patch='{"spec":{"nodePlacement":{"nodeSelector": {"matchLabels":{"node-role.kubernetes.io/infra":""}}}}}'
Change the number of router replicas:
oc patch ingresscontroller default -n openshift-ingress-operator --type=merge --patch='{"spec":{"replicas": 3}}'
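To confirm the router pods were rescheduled onto the infra nodes:
oc get pods -n openshift-ingress -o wide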
Move the registry and the monitoring stack to the infra nodes:
# Registry
oc patch configs.imageregistry.operator.openshift.io/cluster -n openshift-image-registry --type=merge --patch '{"spec":{"nodeSelector":{"node-role.kubernetes.io/infra":""}}}'
# Monitoring Stack
cat <<EOF > $HOME/monitoring-cm.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |+
    alertmanagerMain:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
    prometheusK8s:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
    prometheusOperator:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
    grafana:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
    k8sPrometheusAdapter:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
    kubeStateMetrics:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
    telemeterClient:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
EOF
oc create -f $HOME/monitoring-cm.yaml -n openshift-monitoring
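After a few minutes the monitoring pods should be rescheduled; verify with:
oc get pods -n openshift-monitoring -o wide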
As an alternative to external authentication such as LDAP, configure an htpasswd provider as a fallback in case the external server is unavailable.
# Create the password file
htpasswd -c -i users.htpasswd admin
# Create a secret with the htpasswd file contents
oc create secret generic htpass-secret --from-file=htpasswd=users.htpasswd -n openshift-config
# Create the htpasswd authentication manifest
cat <<EOF > htpasswd-auth.yaml
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - name: Usuario Local
    mappingMethod: claim
    type: HTPasswd
    htpasswd:
      fileData:
        name: htpass-secret
EOF
# Apply to cluster
oc apply -f htpasswd-auth.yaml
# Add cluster-admin role to user
oc adm policy add-cluster-role-to-user cluster-admin admin
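To verify the provider once the authentication operator has rolled out, log in with the new user (the API URL follows the placeholder names used in install-config.yaml):
oc login -u admin https://api.clustername.basedomain.local:6443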
For LDAP authentication, the OAuth cluster resource ends up combining the LDAP provider with the local htpasswd provider, for example:
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - ldap:
      attributes:
        email:
        - mail
        id:
        - cn
        name:
        - cn
        preferredUsername:
        - sAMAccountName
      bindDN: [ldap-User]
      bindPassword:
        name: [ldap-secret]
      insecure: true
      url: >-
        ldap://[ldap-ip]:389/OU=CTO,OU=[ou1],OU=[ou2],OU=[ou3],DC=[dc],DC=jm?sAMAccountName?sub?(objectClass=*)
    mappingMethod: claim
    name: Usuario AD
    type: LDAP
  - htpasswd:
      fileData:
        name: htpasswd-6w4j5
    mappingMethod: claim
    name: Usuario Local
    type: HTPasswd
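The bindPassword references a secret in the openshift-config namespace; it can be created like this (the secret name must match the [ldap-secret] placeholder above, and bindPassword is the key the LDAP identity provider expects):
oc create secret generic <ldap-secret> --from-literal=bindPassword=<password> -n openshift-config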
To scale up the cluster, clone additional machines from the worker template created in vCenter (a govc sketch follows below). Because the machine operator does not support the vCenter provider, node autoscaling is not available.
The newly booted VMs use the worker ignition file to join the existing cluster, and their IPs are obtained from the DHCP server.
These machines do not need IP reservations, and their MAC addresses do not need to be bound to specific IPs.
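A rough scale-up sketch with govc (the clone name is an assumption, and it is assumed the clone keeps the worker guestinfo ignition parameters set on the template):
govc vm.clone -vm clustername-worker-template clustername-worker-3
# Approve the CSRs the new node generates so it joins the cluster
oc get csr --no-headers | awk '{print $1}' | xargs oc adm certificate approve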
If the cluster needs to be reinstalled, the cloned VMs must be re-created and their new MAC addresses updated in the DHCP server.
The ignition files generated by the installer contain certificates that are valid for 24 hours. If the cluster needs to be redeployed more than 24 hours later, regenerate the ignition files.
Configure vSphere connection parameters:
oc edit cm cloud-config -n openshift-kube-apiserver
oc edit cm cloud-provider-config -n openshift-config