Extending a VMware vSphere Cluster Deployment

This document explains how to extend a baseline VMware vSphere cluster deployment after the minimum single-datacenter workflow is running successfully.

Scenarios

Use this document in the following scenarios:

  • You need a second NIC on control plane or worker nodes.
  • You want to distribute nodes across multiple datacenters or deployment zones.
  • You want to add more data disks.
  • You want to scale out the worker pool.

Prerequisites

Before you begin, ensure that the minimum single-datacenter deployment is running successfully and that the baseline deployment manifests are available for editing.

Add a second NIC

When nodes require an additional management, storage, or service network, extend the following manifests:

  • 02-vsphereresourcepool-control-plane.yaml
  • 03-vsphereresourcepool-worker.yaml
  • 20-control-plane.yaml
  • 30-workers-md-0.yaml
  • 04-failure-domains.yaml (if failure domains are enabled)

Add the second NIC to each control plane node slot in the static allocation pool:

network:
- networkName: "<nic1_network_name>"
  deviceName: "<nic1_device_name>"
  ip: "<master_01_nic1_ip>/<nic1_prefix>"
  gateway: "<nic1_gateway>"
  dns:
  - "<nic1_dns_1>"
- networkName: "<nic2_network_name>"
  deviceName: "<nic2_device_name>"
  ip: "<master_01_nic2_ip>/<nic2_prefix>"
  gateway: "<nic2_gateway>"
  dns:
  - "<nic2_dns_1>"

Apply the same pattern to the worker node slots:

network:
- networkName: "<nic1_network_name>"
  deviceName: "<nic1_device_name>"
  ip: "<worker_01_nic1_ip>/<nic1_prefix>"
  gateway: "<nic1_gateway>"
  dns:
  - "<nic1_dns_1>"
- networkName: "<nic2_network_name>"
  deviceName: "<nic2_device_name>"
  ip: "<worker_01_nic2_ip>/<nic2_prefix>"
  gateway: "<nic2_gateway>"
  dns:
  - "<nic2_dns_1>"

Add the second NIC to the machine templates:

network:
  devices:
  - dhcp4: true
    networkName: "<nic1_network_name>"
  - dhcp4: true
    networkName: "<nic2_network_name>"

If failure domains are enabled, update the network list in VSphereFailureDomain.spec.topology.networks:

topology:
  networks:
  - <nic1_network_name>
  - <nic2_network_name>

When you define the second NIC, prepare the following IP address placeholders in the checklist and manifests:

  • <master_01_nic2_ip>
  • <master_02_nic2_ip>
  • <master_03_nic2_ip>
  • <worker_01_nic2_ip>
  • <worker_02_nic2_ip> (only when you also expand the worker pool)

When you move between one NIC and two NICs, apply the following rules:

Expand from one NIC to two NICs

Update all of the following fields together:

  1. VSphereResourcePool.spec.resources[].network
  2. VSphereMachineTemplate.spec.template.spec.network.devices
  3. VSphereFailureDomain.spec.topology.networks when failure domains are enabled

Revert from two NICs to one NIC

Remove the second NIC from all of the following locations:

  1. The second NIC entry in VSphereResourcePool.spec.resources[].network
  2. The second device entry in VSphereMachineTemplate.spec.template.spec.network.devices
  3. The second network name in VSphereFailureDomain.spec.topology.networks
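
After the revert, each manifest should list only the first NIC again. For example, the machine template device list reduces to:

```yaml
network:
  devices:
  - dhcp4: true
    networkName: "<nic1_network_name>"
```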

Enable multiple datacenters and failure domains

Use multiple datacenters and failure domains when you need node placement across different vCenter datacenters or compute clusters.

The following principles apply:

  • One cluster can define multiple VSphereFailureDomain objects.
  • Each VSphereDeploymentZone references one VSphereFailureDomain.
  • The control plane uses VSphereCluster.spec.failureDomainSelector.
  • A worker MachineDeployment uses spec.template.spec.failureDomain when it must target a specific deployment zone.

Prepare the following placeholders for the first datacenter:

  • <default_datacenter>
  • <compute_cluster_1>
  • <default_datastore_1>
  • <resource_pool_path_1>
  • <fd_name_1>
  • <dz_name_1>

Prepare the following placeholders for the second datacenter:

  • <dc_name_2>
  • <fd_name_2>
  • <dz_name_2>
  • <compute_cluster_2>
  • <default_datastore_2>
  • <resource_pool_path_2>

If you add a third datacenter, continue with the same placeholder pattern:

  • <dc_name_3>
  • <fd_name_3>
  • <dz_name_3>
  • <compute_cluster_3>
  • <default_datastore_3>
  • <resource_pool_path_3>

Create the failure-domain objects in 04-failure-domains.yaml. The first datacenter also needs a VSphereFailureDomain and VSphereDeploymentZone when failure domains are enabled:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereFailureDomain
metadata:
  name: <fd_name_1>
spec:
  region:
    name: region-a
    type: Datacenter
    tagCategory: k8s-region
    autoConfigure: true
  zone:
    name: zone-1
    type: ComputeCluster
    tagCategory: k8s-zone
    autoConfigure: true
  topology:
    datacenter: <default_datacenter>
    computeCluster: <compute_cluster_1>
    datastore: <default_datastore_1>
    networks:
    - <nic1_network_name>
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereDeploymentZone
metadata:
  name: <dz_name_1>
spec:
  server: <vsphere_server>
  failureDomain: <fd_name_1>
  controlPlane: true
  placementConstraint:
    resourcePool: <resource_pool_path_1>
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereFailureDomain
metadata:
  name: <fd_name_2>
spec:
  region:
    name: region-a
    type: Datacenter
    tagCategory: k8s-region
    autoConfigure: true
  zone:
    name: zone-2
    type: ComputeCluster
    tagCategory: k8s-zone
    autoConfigure: true
  topology:
    datacenter: <dc_name_2>
    computeCluster: <compute_cluster_2>
    datastore: <default_datastore_2>
    networks:
    - <nic1_network_name>
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereDeploymentZone
metadata:
  name: <dz_name_2>
spec:
  server: <vsphere_server>
  failureDomain: <fd_name_2>
  controlPlane: true
  placementConstraint:
    resourcePool: <resource_pool_path_2>

Enable control plane selection across the available failure domains:

failureDomainSelector: {}
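
The selector lives on the VSphereCluster object. As a sketch, an empty selector lets the control plane schedule across every defined failure domain:

```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereCluster
metadata:
  name: <cluster_name>
spec:
  # An empty selector matches all VSphereFailureDomain objects.
  failureDomainSelector: {}
```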

Set a worker deployment zone when a worker MachineDeployment must be pinned to one deployment target:

failureDomain: <worker_failure_domain>

Use a VSphereDeploymentZone name for <worker_failure_domain>, not a VSphereFailureDomain name.
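
For illustration, a pinned worker MachineDeployment carries the field under spec.template.spec; the metadata name below is an example and should match your 30-workers-md-0.yaml:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: <cluster_name>-md-0   # example name
spec:
  template:
    spec:
      # Use the VSphereDeploymentZone name, not the VSphereFailureDomain name.
      failureDomain: <dz_name_2>
```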

Recommendation: Before you enable multiple datacenters, confirm the following prerequisites:

  1. The template is already synchronized to every target datacenter.
  2. The network names are resolvable in every target datacenter.
  3. The datastore names are resolvable in every target datacenter.
  4. The vSphere CPI datacenter list covers every target datacenter.

Add or remove data disks

The baseline example includes dedicated data disks for control plane nodes and can include dedicated data disks for worker nodes.

In the baseline target scenario, the control plane data disk is part of the minimum deployment set. If the control plane design depends on an additional disk for /var/cpaas or another directory, do not remove it.

Worker data disks are optional and depend on workload requirements.

If a worker node does not need a data disk, remove the persistentDisks section entirely from the corresponding node slot:

persistentDisks:
- name: "<worker_01_disk_name>"
  sizeGiB: <worker_01_disk_size_gib>
  mountPath: "<worker_01_disk_mount_path>"
  fsFormat: "<worker_01_disk_fs>"

If no worker node requires a data disk, remove that block from every worker slot in the worker CAPV static allocation pool.

If a node needs multiple data disks, append more entries to the same persistentDisks list:

persistentDisks:
- name: "<disk_a_name>"
  sizeGiB: <disk_a_size_gib>
  mountPath: "<disk_a_mount_path>"
  fsFormat: "<disk_a_fs>"
- name: "<disk_b_name>"
  sizeGiB: <disk_b_size_gib>
  mountPath: "<disk_b_mount_path>"
  fsFormat: "<disk_b_fs>"

Scale out worker nodes

Worker scale-out depends on the relationship between MachineDeployment.spec.replicas and the available node slots in the worker CAPV static allocation pool, VSphereResourcePool.spec.resources[].

Apply the following rules:

  1. The number of node slots can be greater than replicas.
  2. Idle slots do not affect a running cluster.
  3. If replicas exceeds the number of available slots, CAPV cannot assign a node slot to the extra worker machines.
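
The slot-versus-replica rules above can be expressed as a small check. This Python sketch (the function name is illustrative) encodes the rule:

```python
def worker_scale_is_valid(slot_count: int, replicas: int) -> bool:
    """Return True when every requested replica has a node slot.

    Idle slots (slot_count > replicas) are harmless; replicas beyond
    the slot count are not, because CAPV has no slot left to assign.
    """
    return 0 <= replicas <= slot_count

# Three slots, two replicas: one idle slot, still valid.
print(worker_scale_is_valid(3, 2))  # True
# Two slots, three replicas: one worker has no slot.
print(worker_scale_is_valid(2, 3))  # False
```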

Use the following order when you scale out workers:

  1. Add new worker node slots to 03-vsphereresourcepool-worker.yaml.
  2. Increase MachineDeployment.spec.replicas in 30-workers-md-0.yaml.

The following example adds a new worker slot:

- hostname: "<worker_node_name_2>"
  datacenter: "<worker_02_datacenter>"
  network:
  - networkName: "<nic1_network_name>"
    ip: "<worker_02_nic1_ip>/<nic1_prefix>"
    gateway: "<nic1_gateway>"
    dns:
    - "<nic1_dns_1>"
  persistentDisks:
  - name: "<worker_02_disk_name>"
    sizeGiB: <worker_02_disk_size_gib>
    mountPath: "<worker_02_disk_mount_path>"
    fsFormat: "<worker_02_disk_fs>"

Then update the worker replicas:

replicas: <worker_replicas>

Verification

After each extension, validate the cluster state with the following commands. The first command runs against the management cluster; the second runs against the workload cluster:

kubectl -n <namespace> get cluster,vspherecluster,kubeadmcontrolplane,machinedeployment,machine,vspheremachine,vspherevm
kubectl --kubeconfig=/tmp/<cluster_name>.kubeconfig get nodes -o wide

Confirm the following results:

  • The new placement, NIC, or disk definitions are reflected in the target resources.
  • New worker nodes reach the Ready state.
  • Existing nodes remain healthy after the change.

Next Steps

Apply one extension at a time. Validate the result before you combine multiple changes in the same cluster.