Deploying the Platform in a private vSphere cloud environment

Please contact your vendor to obtain the required vSphere installer.

1. Preparing the vSphere infrastructure for OKD installation

OKD is a Kubernetes distribution optimized for continuous application development and deploying multiple instances of isolated container environments. In our case, such environments are registry instances. For detailed information, please refer to the official source.

1.1. Configuring the trusted vCenter API interface

The installer requires access to the trusted vCenter API interface, which allows for the retrieval of trusted vCenter root CA certificates.

Before connecting to the API, vCenter root CA certificates must be added to the system from which the OKD installer will be launched.

1.2. Downloading CA Certificates

Certificates can be downloaded from the vCenter homepage.

By default, the certificate archive is available at https://<vCenter>/certs/download.zip. After downloading and unpacking it, a directory containing certificates for Linux, macOS, and Windows operating systems will be created.
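For example, the archive can be downloaded and unpacked from the command line (a hedged sketch; vcsa.example.com is a placeholder for your vCenter host, and -k is only needed while the CA is not yet trusted):

$ curl -kLO https://vcsa.example.com/certs/download.zip
$ unzip download.zip    # unpacks into the certs directory shown below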

1.2.1. Example of directory structure viewing

The directory structure with the certificates can be viewed using the following command:

$ tree certs

The resulting structure will be as follows:

certs
├── lin
│   ├── 108f4d17.0
│   ├── 108f4d17.r1
│   ├── 7e757f6a.0
│   ├── 8e4f8471.0
│   └── 8e4f8471.r0
├── mac
│   ├── 108f4d17.0
│   ├── 108f4d17.r1
│   ├── 7e757f6a.0
│   ├── 8e4f8471.0
│   └── 8e4f8471.r0
└── win
    ├── 108f4d17.0.crt
    ├── 108f4d17.r1.crl
    ├── 7e757f6a.0.crt
    ├── 8e4f8471.0.crt
    └── 8e4f8471.r0.crl

3 directories, 15 files

1.2.2. Example of adding certificates

You need to add the relevant certificates for your operating system.

Example for Fedora OS:

$ sudo cp certs/lin/* /etc/pki/ca-trust/source/anchors

$ sudo update-ca-trust extract
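If the installer is launched from a Debian- or Ubuntu-based workstation instead (for example, the Ubuntu 20.04 host described in section 7), a roughly equivalent sketch is shown below; it assumes ca-certificates is installed and only picks up files with a .crt extension:

$ for f in certs/lin/*.0; do sudo cp "$f" /usr/local/share/ca-certificates/"$(basename "$f" .0)".crt; done
$ sudo update-ca-certificates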

1.3. Standard installation resources

The standard installation (Installer-Provisioned Infrastructure) creates the following infrastructure resources:

  • one folder

  • one tag category

  • one tag

  • virtual machines:

    • one template

    • one temporary bootstrap node

    • three control-plane nodes for Platform management

    • three compute machines

1.3.1. Resource requirements

1.3.1.1. Data storage

Alongside the resources described above, the standard OKD deployment requires a minimum of 800 GB of datastore capacity.

1.3.1.2. DHCP

The deployment requires a DHCP server to provide network configuration to the cluster nodes.

2. Deploying and configuring DNS and DHCP components

2.1. IP addresses

Deployment of the vSphere infrastructure (Installer-provisioned vSphere) requires two static IP addresses:

  • API virtual IP address (API VIP): used for accessing the cluster’s API.

  • Ingress virtual IP address (Ingress VIP): used for cluster ingress traffic.

Virtual IP addresses for each of them must be defined in the install-config.yaml file.

2.2. DNS records

DNS records must be created for the two IP addresses on any DNS server designated for the environment. The records should contain the values described in the table:

Name                                    Value
api.${cluster-name}.${base-domain}      API VIP
*.apps.${cluster-name}.${base-domain}   Ingress VIP

${cluster-name} and ${base-domain} are variables taken from the respective values specified in the install-config.yaml file.
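Once created, the records can be checked from the installer workstation, for example with dig; the host names below follow the example values used in the install-config.yaml file in section 3 (cluster name mdtuddm, base domain eua.gov.ua):

$ dig +short api.mdtuddm.eua.gov.ua
10.9.1.110
$ dig +short test.apps.mdtuddm.eua.gov.ua
10.9.1.111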

3. Creating the install-config.yaml configuration file

Prerequisites
  1. Log in to your Red Hat account. If you don’t have one, you need to create it.

  2. Purchase a paid DockerHub subscription if you do not already have one.

  3. Generate and add an SSH key to your configuration file (see the example after this list). This is necessary for accessing your node consoles.
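For example, an SSH key pair can be generated as follows (a minimal sketch; the key file name is arbitrary):

$ ssh-keygen -t ed25519 -N '' -f ~/.ssh/okd_vsphere
$ cat ~/.ssh/okd_vsphere.pub    # paste this value into the sshKey field of install-config.yaml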

To create the install-config.yaml file required for deploying the OKD cluster, use the following command:

$ openshift-install create install-config

During creation, the installer prompts for the required parameters interactively. The created configuration file includes only the required parameters for a minimal cluster deployment. For customization, refer to the official documentation.

The install-config.yaml configuration example
apiVersion: v1
baseDomain: eua.gov.ua
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform: {}
  replicas: 3
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform: {}
  replicas: 3
metadata:
  creationTimestamp: null
  name: mdtuddm
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/16
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
platform:
  vsphere:
    apiVIP: 10.9.1.110
    cluster: HX-02
    datacenter: HXDP-02
    defaultDatastore: NCR_Data2
    ingressVIP: 10.9.1.111
    network: EPAM_OKD_Vlan9_EPG
    password: <password>
    username: epam_dev1@vsphere.local
    vCenter: vcsa.ncr.loc
publish: External
pullSecret: '{"auths":{"fake":{"auth":"aWQ6cGFzcwo="}}}'
sshKey: |
  <ssh_key>

  • During the creation of the configuration file, replace <password> with your password and <ssh_key> with your generated SSH key.

  • Also, copy the authentication parameters from your Red Hat account and insert them into the pullSecret field.

  • Please note that you may need to adjust some parameters to match your infrastructure and requirements.

4. Running the OKD4 installer and deploying an empty OKD4 cluster

After creating the install-config.yaml file, to deploy the OKD cluster, execute the following command:

$ openshift-install create cluster

The cluster deployment process typically takes up to 1.5 hours.
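While the deployment is running, progress can be followed in the installer log created in the working directory (assuming the installer's default log file name):

$ tail -f .openshift_install.log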

Upon successful deployment, the following cluster access parameters will be provided:

  • Login;

  • Password;

  • Link to the cluster’s web console.

In the directory where the command was executed, a set of files storing the cluster state will be created; these files are required later to uninstall the cluster.

Additionally, an auth folder will appear in this directory, containing two authentication files for working with the cluster through the OKD web console and the OKD CLI.
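For example, CLI access can be verified immediately after deployment using the generated kubeconfig:

$ export KUBECONFIG=$(pwd)/auth/kubeconfig
$ oc whoami
system:admin
$ oc get nodes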

After starting the cluster deployment process, the Installer removes the install-config.yaml file. Therefore, it is recommended to make a backup of this file if you plan to deploy multiple clusters.

5. Replacing self-signed certificates with trusted certificates

To replace self-signed certificates with trusted certificates, you first need to obtain these certificates.

In this section, we will discuss obtaining free Let’s Encrypt certificates and installing them on the server.

Acquiring Let’s Encrypt certificates is done using the acme.sh utility.

For detailed information on using Let’s Encrypt based on the ACME protocol, refer to the official source.

5.1. Preparation

Clone the acme.sh utility from the GitHub repository:

$ cd $HOME
$ git clone https://github.com/neilpang/acme.sh
$ cd acme.sh

5.2. Certificate request

  1. To simplify the certificate acquisition process, set two environment variables. The first one should point to the API endpoint. Make sure you are logged in to OKD as system:admin and use the OpenShift CLI (oc) to find the API endpoint URL:

    $ oc whoami --show-server
    Example response
    https://api.e954.ocp4.opentlc.com:6443
  2. Now, set the LE_API variable for the fully qualified API domain:

    $ export LE_API=$(oc whoami --show-server | cut -f 2 -d ':' | cut -f 3 -d '/' | sed 's/-api././')
  3. Set the LE_WILDCARD variable for your Wildcard Domain:

    $ export LE_WILDCARD=$(oc get ingresscontroller default -n openshift-ingress-operator -o jsonpath='{.status.domain}')
  4. Run the acme.sh script:

    $ ${HOME}/acme.sh/acme.sh --issue -d ${LE_API} -d \*.${LE_WILDCARD} --dns --yes-I-know-dns-manual-mode-enough-go-ahead-please
    Example response
    $  ./acme.sh --issue -d  ${LE_API} -d \*.${LE_WILDCARD} --dns --yes-I-know-dns-manual-mode-enough-go-ahead-please
    [Wed Jul 28 18:37:33 EEST 2021] Using CA: https://acme-v02.api.letsencrypt.org/directory
    [Wed Jul 28 18:37:33 EEST 2021] Creating domain key
    [Wed Jul 28 18:37:33 EEST 2021] The domain key is here: $HOME/.acme.sh/api.e954.ocp4.opentlc.com/api.e954.ocp4.opentlc.com.key
    [Wed Jul 28 18:37:33 EEST 2021] Multi domain='DNS:api.e954.ocp4.opentlc.com,DNS:*.apps.e954.ocp4.opentlc.com'
    [Wed Jul 28 18:37:33 EEST 2021] Getting domain auth token for each domain
    [Wed Jul 28 18:37:37 EEST 2021] Getting webroot for domain='cluster-e954-api.e954.ocp4.opentlc.com'
    [Wed Jul 28 18:37:37 EEST 2021] Getting webroot for domain='*.apps.cluster-e954-api.e954.ocp4.opentlc.com'
    [Wed Jul 28 18:37:38 EEST 2021] Add the following TXT record:
    [Wed Jul 28 18:37:38 EEST 2021] Domain: '_acme-challenge.api.e954.ocp4.opentlc.com'
    [Wed Jul 28 18:37:38 EEST 2021] TXT value: 'VZ2z3XUe4cdNLwYF7UplBj7ZTD8lO9Een0yTD7m_Bbo'
    [Wed Jul 28 18:37:38 EEST 2021] Please be aware that you prepend _acme-challenge. before your domain
    [Wed Jul 28 18:37:38 EEST 2021] so the resulting subdomain will be: _acme-challenge.api.e954.ocp4.opentlc.com
    [Wed Jul 28 18:37:38 EEST 2021] Add the following TXT record:
    [Wed Jul 28 18:37:38 EEST 2021] Domain: '_acme-challenge.apps.e954.ocp4.opentlc.com'
    [Wed Jul 28 18:37:38 EEST 2021] TXT value: 'f4KeyXkpSissmiLbIIoDHm5BJ6tOBTA0D8DyK5sl46g'
    [Wed Jul 28 18:37:38 EEST 2021] Please be aware that you prepend _acme-challenge. before your domain
    [Wed Jul 28 18:37:38 EEST 2021] so the resulting subdomain will be: _acme-challenge.apps.e954.ocp4.opentlc.com
    [Wed Jul 28 18:37:38 EEST 2021] Please add the TXT records to the domains, and re-run with --renew.
    [Wed Jul 28 18:37:38 EEST 2021] Please add '--debug' or '--log' to check more details.
    The DNS records mentioned in the response above should be added to the DNS server responsible for the e954.ocp4.opentlc.com zone (the zone value here is just an example). The TXT records should have the following format:
    TXT record 1
    _acme-challenge.api.e954.ocp4.opentlc.com TXT value: 'VZ2z3XUe4cdNLwYF7UplBj7ZTD8lO9Een0yTD7m_Bbo'
    TXT record 2
    _acme-challenge.apps.e954.ocp4.opentlc.com TXT value: 'f4KeyXkpSissmiLbIIoDHm5BJ6tOBTA0D8DyK5sl46g'
  5. After the TXT records have been added to DNS, run acme.sh again in renew mode.

    $ ${HOME}/acme.sh/acme.sh --renew -d ${LE_API} --yes-I-know-dns-manual-mode-enough-go-ahead-please
  6. Upon successful completion of the previous steps, install the certificates into a convenient directory.

    Usually, a good approach is to move the certificates from the default acme.sh path to a more convenient directory. You can use the --install-cert option of the acme.sh script to copy the certificates to $HOME/certificates, for example:

    $ export CERTDIR=$HOME/certificates
    
    $ mkdir -p ${CERTDIR}
    $ ${HOME}/acme.sh/acme.sh --install-cert -d ${LE_API} -d \*.${LE_WILDCARD} --cert-file ${CERTDIR}/cert.pem --key-file ${CERTDIR}/key.pem --fullchain-file ${CERTDIR}/fullchain.pem --ca-file ${CERTDIR}/ca.cer
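    The installed files can then be inspected to confirm the subject, issuer, and validity period, for example:

    $ openssl x509 -in ${CERTDIR}/cert.pem -noout -subject -issuer -dates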

5.2.1. Installing certificates for Router

  1. Create a TLS secret containing the issued certificate and key. Execute the following command:

    $ oc create secret tls router-certs --cert=${CERTDIR}/fullchain.pem --key=${CERTDIR}/key.pem -n openshift-ingress
  2. After completing the previous steps, update the Custom Resource:

    $ oc patch ingresscontroller default -n openshift-ingress-operator --type=merge --patch='{"spec": { "defaultCertificate": { "name": "router-certs" }}}'
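    Once the ingress controller has rolled out the new certificate, it can be checked from outside the cluster. A hedged example, using the default console route host and the example wildcard domain from section 5.2:

    $ echo | openssl s_client -connect console-openshift-console.apps.e954.ocp4.opentlc.com:443 -servername console-openshift-console.apps.e954.ocp4.opentlc.com 2>/dev/null | openssl x509 -noout -issuer -dates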

6. Creating a MachineSet for Ceph infrastructure

MachineSet resources are groups of machines. A MachineSet manages machines in the same way a ReplicaSet manages Pods: if you need more machines, or fewer, you adjust the replicas field at the MachineSet level to meet your computational needs. For detailed information on creating MachineSets, please refer to the official source.

To deploy the Platform, you must create a MachineSet for the Ceph data storage system. To do this, use the machine-set-ceph.yaml configuration file and modify the cluster name accordingly.

Example machine-set-ceph.yaml configuration file
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: mdtuddm-b86zw-ceph
  namespace: openshift-machine-api
  labels:
    machine.openshift.io/cluster-api-cluster: mdtuddm-b86zw
spec:
  replicas: 3
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: mdtuddm-b86zw
      machine.openshift.io/cluster-api-machineset: mdtuddm-b86zw-ceph
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: mdtuddm-b86zw
        machine.openshift.io/cluster-api-machine-role: worker
        machine.openshift.io/cluster-api-machine-type: worker
        machine.openshift.io/cluster-api-machineset: mdtuddm-b86zw-ceph
    spec:
      taints:
        - effect: NoSchedule
          key: node.ocs.openshift.io/storage
          value: 'true'
      metadata:
        labels:
          cluster.ocs.openshift.io/openshift-storage: ''
      providerSpec:
        value:
          numCoresPerSocket: 1
          diskGiB: 120
          snapshot: ''
          userDataSecret:
            name: worker-user-data
          memoryMiB: 73728
          credentialsSecret:
            name: vsphere-cloud-credentials
          network:
            devices:
              - networkName: EPAM_OKD_Vlan9_EPG
          metadata:
            creationTimestamp: null
          numCPUs: 16
          kind: VSphereMachineProviderSpec
          workspace:
            datacenter: HXDP-02
            datastore: NCR_Data2
            folder: /HXDP-02/vm/mdtuddm-b86zw
            resourcePool: /HXDP-02/host/HX-02/Resources
            server: vcsa.ncr.loc
          template: mdtuddm-b86zw-rhcos
          apiVersion: vsphereprovider.openshift.io/v1beta1

After editing the file according to the cluster name, execute the command to create the necessary MachineSet and the corresponding number of nodes for deploying the Ceph data storage.

In our case, the cluster name is defined in the .yaml file as mdtuddm-b86zw.
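The MachineSet can then be created and checked with the OpenShift CLI, for example:

$ oc apply -f machine-set-ceph.yaml
$ oc get machinesets -n openshift-machine-api
$ oc get machines -n openshift-machine-api | grep ceph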

7. Preparing and running the Installer for Platform deployment on the target OKD cluster

The Installer is a script (a set of commands) used for deploying the Platform.

To run the Installer, you need to meet several conditions for preparing the workstation from which the Installer will be launched. Below is an example of such preparation on Ubuntu 20.04 LTS.

7.1. Prerequisites

Install Docker by following the official documentation: https://docs.docker.com/engine/install/.

7.2. Deploying and updating the Platform

7.2.1. Deploying the Platform from scratch

7.2.1.1. Prerequisites
Ensure that the required packages are installed: docker, wget, unzip.
  1. Download the necessary version of the installer.

    cd /tmp
    wget -O mdtu-ddm-platform-<version>.zip https://nexus-public-mdtu-ddm-edp-cicd.apps.cicd2.mdtu-ddm.projects.epam.com/nexus/repository/edp-maven-releases/ua/gov/mdtu/ddm/infrastructure/mdtu-ddm-platform/<version>/mdtu-ddm-platform-<version>.zip
  2. Unpack the archive in the home directory.

    unzip /tmp/mdtu-ddm-platform-<version>.zip -d /home/<user>/workdir/installer-<version>
  3. Move the kubeconfig file after cluster installation:

    cd /home/<user>/workdir/installer-<version>
    cp /path/to/kubeconfig ./
  4. Move the certificates folder for DSO:

    cp -r /path/to/folder/certificates ./
7.2.1.2. Adding a separate configuration file for deployment in the vSphere environment
  1. Edit exports.list for vSphere.

    All values should be taken after the cluster installation. Also, ensure that you have up-to-date values for idgovuaClientId and idgovuaClientSecret.

    vi exports.list
    
    ### vSphere Credentials ###
    export VSPHERE_SERVER=""
    export VSPHERE_USER=""
    export VSPHERE_PASSWORD=""
    export VSPHERE_CLUSTER=""
    export VSPHERE_DATASTORE=""
    export VSPHERE_DATACENTER=""
    export VSPHERE_NETWORK=""
    export VSPHERE_NETWORK_GATEWAY=""
    export VSPHERE_RESOURCE_POOL="" # if a resource pool is not used, set this to "/"
    export VSPHERE_FOLDER=""
    
    ### Minio and Vault IPs ###
    export VSPHERE_VAULT_INSTANCE_IP=""
    export VSPHERE_MINIO_INSTANCE_IP=""
    
    ### id.gov.ua ###
    export idgovuaClientId=""
    export idgovuaClientSecret=""
  2. Edit install.sh, specifically after source ./functions.sh, add source ./exports.list.

    vi install.sh

    It should look like this:

    #!/usr/bin/env bash
    set -e
    #Include function file
    source ./functions.sh
    source ./exports.list
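    Before running the Installer, it may be useful to verify that all variables from exports.list are populated (an optional sanity check that prints only the variable names):

    source ./exports.list
    env | grep -E 'VSPHERE|idgovua' | cut -d= -f1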
7.2.1.3. Deploying the Installer
  1. Execute the following commands:

    IMAGE_CHECKSUM=$(sudo docker load -i control-plane-installer.img | sed -r "s#.*sha256:(.*)#\\1#" | tr -d '\n');
    echo $IMAGE_CHECKSUM
    sudo docker tag ${IMAGE_CHECKSUM} control-plane-installer:<version>;
  2. Deploy a new version of the Platform with images from scratch:

    sudo docker run --rm --name control-plane-installer-<version> --user root:$(id -g) --net host -v $(pwd):/tmp/installer --env KUBECONFIG=/tmp/installer/kubeconfig --env idgovuaClientId=mock --env idgovuaClientSecret=mock --env deploymentMode=development --entrypoint "/bin/bash" control-plane-installer:<version> -c "./install.sh -i"
    The deploymentMode parameter can be set either to development or production.

7.2.2. Updating the Platform

7.2.2.1. Prerequisites
Ensure that the required packages are installed: docker, wget, unzip.
  1. Download the necessary version of the installer.

    cd /tmp
    wget -O mdtu-ddm-platform-<version>.zip https://nexus-public-mdtu-ddm-edp-cicd.apps.cicd2.mdtu-ddm.projects.epam.com/nexus/repository/edp-maven-releases/ua/gov/mdtu/ddm/infrastructure/mdtu-ddm-platform/<version>/mdtu-ddm-platform-<version>.zip
  2. Unpack the archive in the home directory.

    unzip /tmp/mdtu-ddm-platform-<version>.zip -d /home/<user>/workdir/installer-<version>
  3. Move the kubeconfig file after cluster installation:

    cd /home/<user>/workdir/installer-<version>
    cp /path/to/kubeconfig ./
  4. Move the certificates folder for DSO.

    If the certificates haven’t changed, you can skip this step.
    cp -r /path/to/folder/certificates ./
7.2.2.2. Adding a separate configuration file for deployment in the vSphere environment
  1. Move exports.list from the previous release.

    cp /home/<user>/workdir/installer-<previous_version>/exports.list ./

    Also, ensure that you have up-to-date values for idgovuaClientId and idgovuaClientSecret.

  2. Edit install.sh, specifically after source ./functions.sh, add source ./exports.list.

    vi install.sh

    It should look as follows:

    #!/usr/bin/env bash
    set -e
    #Include function file
    source ./functions.sh
    source ./exports.list
7.2.2.3. Configuring the MinIO component during cluster update in the vSphere environment
  1. Transfer the tfstate for MinIO from the previous release for vSphere.

    cp /home/<user>/workdir/installer-<previous_version>/terraform/minio/vsphere/terraform.tfstate ./terraform/minio/vsphere/
  2. Transfer the tfstate for MinIO (Packer) from the previous release for vSphere.

    cp /home/<user>/workdir/installer-<previous_version>/terraform/minio/vsphere/packer/terraform.tfstate ./terraform/minio/vsphere/packer/
7.2.2.4. Configuring the vault component during cluster update in the vSphere environment
  1. Transfer the tfstate for Vault from the previous release.

    cp /home/<user>/workdir/installer-<previous_version>/terraform/vault/vsphere/terraform.tfstate ./terraform/vault/vsphere/
  2. Transfer the tfstate for Vault (Packer) from the previous release.

    cp /home/<user>/workdir/installer-<previous_version>/terraform/vault/vsphere/packer/terraform.tfstate ./terraform/vault/vsphere/packer/
7.2.2.5. Deploying the Installer
  1. Execute the following commands:

    IMAGE_CHECKSUM=$(sudo docker load -i control-plane-installer.img | sed -r "s#.*sha256:(.*)#\\1#" | tr -d '\n');
    echo $IMAGE_CHECKSUM
    sudo docker tag ${IMAGE_CHECKSUM} control-plane-installer:<version>;
  2. Update the Platform version with update images.

    sudo docker run --rm --name control-plane-installer-<version> --user root:$(id -g) --net host -v $(pwd):/tmp/installer --env KUBECONFIG=/tmp/installer/kubeconfig --env idgovuaClientId=mock --env idgovuaClientSecret=mock --env deploymentMode=development --entrypoint "/bin/bash" control-plane-installer:<version> -c "./install.sh -u"
    Where deploymentMode can be either development or production, depending on the previous run.

8. Managing Platform configuration

Cluster management follows the GitOps methodology. This means that any changes in the cluster configuration, cluster components, and Platform components are made through modifying the cluster configuration in the git branch of the corresponding component.

Metadata for all components managed through the GitOps approach is stored in the cluster-mgmt component.

Below is a list of components for which GitOps management is currently implemented:

  • catalog-source

  • monitoring

  • storage

  • logging

  • service-mesh

  • backup-management

  • user-management

  • control-plane-nexus

  • external-integration-mocks

  • cluster-kafka-operator

  • smtp-server

  • redis-operator

  • postgres-operator