Skip to content

vSphere Provider

The vSphere provider enables VirtRigaud to manage virtual machines on VMware vSphere environments, including vCenter Server and standalone ESXi hosts. This provider is designed for enterprise production environments with comprehensive support for vSphere features.

Overview

This provider implements the VirtRigaud provider interface to manage VM lifecycle operations on VMware vSphere:

  • Create: Create VMs from templates, content libraries, or OVF/OVA files
  • Delete: Remove VMs and associated storage (with configurable retention)
  • Power: Start, stop, restart, and suspend virtual machines
  • Describe: Query VM state, resource usage, guest info, and vSphere properties
  • Reconfigure: Hot-add CPU/memory, resize disks, modify network adapters (v0.2.3+)
  • Clone: Create full or linked clones from existing VMs or templates (v0.2.3+)
  • Snapshot: Create, delete, and revert VM snapshots with memory state
  • TaskStatus: Track asynchronous operations with progress monitoring (v0.2.3+)
  • ConsoleURL: Generate vSphere web client console URLs (v0.2.3+)
  • ImagePrepare: Import OVF/OVA, deploy from content library, or ensure template existence

Prerequisites

⚠️ IMPORTANT: Active vSphere Environment Required

The vSphere provider connects to VMware vSphere infrastructure and requires active vCenter Server or ESXi hosts.

Requirements:

  • vCenter Server 7.0+ or ESXi 7.0+ (running and accessible)
  • User account with appropriate privileges for VM management
  • Network connectivity from VirtRigaud to vCenter/ESXi (HTTPS/443)
  • vSphere infrastructure:
  • Configured datacenters, clusters, and hosts
  • Storage (datastores) for VM files
  • Networks (port groups) for VM connectivity
  • Resource pools for VM placement (optional)

Testing/Development:

For development environments: - Use VMware vSphere Hypervisor (ESXi) free version - vCenter Server Appliance evaluation license - VMware Workstation/Fusion with nested ESXi - EVE-NG or GNS3 with vSphere emulation

Authentication

The vSphere provider supports multiple authentication methods:

Username/Password Authentication (Common)

Standard vSphere user authentication:

apiVersion: infra.virtrigaud.io/v1beta1
kind: Provider
metadata:
  name: vsphere-prod
  namespace: default
spec:
  type: vsphere
  endpoint: https://vcenter.example.com/sdk
  credentialSecretRef:
    name: vsphere-credentials
  # Optional: Skip TLS verification (development only)
  insecureSkipVerify: false
  runtime:
    mode: Remote
    image: "ghcr.io/projectbeskar/virtrigaud/provider-vsphere:v0.3.3"
    service:
      port: 9443

Create credentials secret:

apiVersion: v1
kind: Secret
metadata:
  name: vsphere-credentials
  namespace: default
type: Opaque
stringData:
  username: "virtrigaud@vsphere.local"
  password: "SecurePassword123!"

Session Token Authentication (Advanced)

For environments using external authentication:

apiVersion: v1
kind: Secret
metadata:
  name: vsphere-token
  namespace: default
type: Opaque
stringData:
  token: "vmware-api-session-id:abcd1234..."

Create a dedicated service account with minimal required privileges:

# vSphere privileges for VirtRigaud service account:
# - Datastore: Allocate space, Browse datastore, Low level file operations
# - Network: Assign network  
# - Resource: Assign virtual machine to resource pool
# - Virtual machine: All privileges (or subset based on requirements)
# - Global: Enable methods, Disable methods, Licenses

Configuration

Connection Endpoints

Endpoint Type Format Use Case
vCenter Server https://vcenter.example.com/sdk Multi-host management (recommended)
vCenter FQDN https://vcenter.corp.local/sdk Internal domain environments
vCenter IP https://192.168.1.10/sdk Direct IP access
ESXi Host https://esxi-host.example.com Single host environments

Deployment Configuration

Using Helm Values

# values.yaml
providers:
  vsphere:
    enabled: true
    endpoint: "https://vcenter.example.com/sdk"
    insecureSkipVerify: false  # Set to true for self-signed certificates
    credentialSecretRef:
      name: vsphere-credentials
      namespace: virtrigaud-system

Production Configuration with TLS

# Create secret with credentials and TLS certificates
apiVersion: v1
kind: Secret
metadata:
  name: vsphere-secure-credentials
  namespace: virtrigaud-system
type: Opaque
stringData:
  username: "svc-virtrigaud@vsphere.local"
  password: "SecurePassword123!"
  # Optional: Custom CA certificate for vCenter
  ca.crt: |
    -----BEGIN CERTIFICATE-----
    # Your vCenter CA certificate here
    -----END CERTIFICATE-----

---
apiVersion: infra.virtrigaud.io/v1beta1
kind: Provider
metadata:
  name: vsphere-production
  namespace: virtrigaud-system
spec:
  type: vsphere
  endpoint: https://vcenter.prod.example.com/sdk
  credentialSecretRef:
    name: vsphere-secure-credentials
  insecureSkipVerify: false

Development Configuration

# For development with self-signed certificates
providers:
  vsphere:
    enabled: true
    endpoint: "https://esxi-dev.local"
    insecureSkipVerify: true  # Only for development!
    credentialSecretRef:
      name: vsphere-dev-credentials

Multi-vCenter Configuration

# Deploy multiple providers for different vCenters
apiVersion: infra.virtrigaud.io/v1beta1
kind: Provider
metadata:
  name: vsphere-datacenter-a
spec:
  type: vsphere
  endpoint: https://vcenter-a.example.com/sdk
  credentialSecretRef:
    name: vsphere-credentials-a

---
apiVersion: infra.virtrigaud.io/v1beta1
kind: Provider
metadata:
  name: vsphere-datacenter-b
spec:
  type: vsphere
  endpoint: https://vcenter-b.example.com/sdk
  credentialSecretRef:
    name: vsphere-credentials-b

vSphere Infrastructure Setup

Required vSphere Objects

The provider expects the following vSphere infrastructure to be configured:

Datacenters and Clusters

# Example vSphere hierarchy:
Datacenter: "Production"
├── Cluster: "Compute-Cluster"
   ├── ESXi Host: esxi-01.example.com
   ├── ESXi Host: esxi-02.example.com
   └── ESXi Host: esxi-03.example.com
├── Datastores:
   ├── "datastore-ssd"     # High-performance storage
   ├── "datastore-hdd"     # Standard storage
   └── "datastore-backup"  # Backup storage
└── Networks:
    ├── "VM Network"        # Default VM network
    ├── "DMZ-Network"       # DMZ port group
    └── "Management"        # Management network

Resource Pools (Optional)

# Create resource pools for workload isolation
Datacenter: "Production"
└── Cluster: "Compute-Cluster"
    └── Resource Pools:
        ├── "Development"    # Dev workloads (lower priority)
        ├── "Production"     # Prod workloads (high priority)
        └── "Testing"        # Test workloads (medium priority)

VM Configuration

VMClass Specification

Define CPU, memory, and vSphere-specific settings:

apiVersion: infra.virtrigaud.io/v1beta1
kind: VMClass
metadata:
  name: standard-vm
spec:
  cpu: 4
  memory: "8Gi"
  firmware: UEFI
  diskDefaults:
    type: thin
    size: "40Gi"
  guestToolsPolicy: install
  performanceProfile:
    cpuHotAddEnabled: true
    memoryHotAddEnabled: true
    latencySensitivity: normal
  securityProfile:
    secureBoot: true
    tpmEnabled: true
  resourceLimits:
    cpuReservation: 1000             # MHz
    cpuLimit: 4000                   # MHz
    memoryReservation: "2Gi"
  # vSphere-specific extra config
  extraConfig:
    "numvcpus.coresPerSocket": "2"   # CPU topology hint

VMImage Specification

Reference vSphere templates, content library items, or OVF files:

apiVersion: infra.virtrigaud.io/v1beta1
kind: VMImage
metadata:
  name: ubuntu-22-04-template
spec:
  # Template from vSphere inventory
  source:
    template: "ubuntu-22.04-template"
    datacenter: "Production"
    folder: "/vm/templates"

  # Or from content library
  # source:
  #   contentLibrary: "OS Templates"
  #   item: "ubuntu-22.04-cloud"

  # Or from OVF/OVA URL
  # source:
  #   ovf: "https://releases.ubuntu.com/22.04/ubuntu-22.04-server-cloudimg-amd64.ova"

  # Guest OS identification
  guestOS: "ubuntu64Guest"

  # Customization specification
  customization:
    type: "cloudInit"              # cloudInit, sysprep, or linux
    spec: "ubuntu-cloud-init"      # Reference to customization spec

Complete VM Example

# First, define the network attachments
apiVersion: infra.virtrigaud.io/v1beta1
kind: VMNetworkAttachment
metadata:
  name: app-network
spec:
  network:
    vsphere:
      portgroup: "VM Network"
  ipAllocation:
    type: Static
    address: "192.168.100.50/24"
    gateway: "192.168.100.1"
    dns: ["192.168.1.10", "8.8.8.8"]

---
apiVersion: infra.virtrigaud.io/v1beta1
kind: VMNetworkAttachment
metadata:
  name: mgmt-network
spec:
  network:
    vsphere:
      portgroup: "Management"
  ipAllocation:
    type: DHCP

---
apiVersion: infra.virtrigaud.io/v1beta1
kind: VirtualMachine
metadata:
  name: web-application
spec:
  providerRef:
    name: vsphere-prod
  classRef:
    name: standard-vm
  imageRef:
    name: ubuntu-22-04-template
  powerState: On

  # Disk configuration
  disks:
    - name: data
      sizeGiB: 500
      type: thin
      storageClass: "hdd-storage"
      scsi:
        controllerType: pvscsi

  # Network configuration (references VMNetworkAttachment resources)
  networks:
    - name: app-network
      networkRef:
        name: app-network
    - name: mgmt-network
      networkRef:
        name: mgmt-network

  # vSphere placement (no 'datacenter' field — datacenter is derived from the provider endpoint)
  placement:
    cluster: "Compute-Cluster"
    resourcePool: "Production"
    folder: "/vm/applications"
    datastore: "datastore-ssd"      # explicit datastore
    host: "esxi-01.example.com"     # pin to specific host (optional)

  # Guest customization
  userData:
    cloudInit:
      inline: |
        #cloud-config
        hostname: web-application
        users:
          - name: ubuntu
            sudo: ALL=(ALL) NOPASSWD:ALL
            ssh_authorized_keys:
              - "ssh-ed25519 AAAA..."
        packages:
          - nginx
          - docker.io
          - open-vm-tools          # VMware tools for guest integration
        runcmd:
          - systemctl enable nginx
          - systemctl enable docker
          - systemctl enable open-vm-tools

Advanced Features

VM Reconfiguration (v0.2.3+)

The vSphere provider supports online VM reconfiguration for CPU, memory, and disk resources:

# Reconfigure VM resources by changing classRef
apiVersion: infra.virtrigaud.io/v1beta1
kind: VirtualMachine
metadata:
  name: web-server
spec:
  classRef:
    name: medium    # Changed from small to medium
  powerState: On

Capabilities: - Online CPU Changes: Hot-add CPUs to running VMs (requires guest OS support) - Online Memory Changes: Hot-add memory to running VMs (requires guest OS support) - Disk Resizing: Expand disks online (shrinking not supported for safety) - Automatic Fallback: Falls back to offline changes if hot-add not supported - Intelligent Detection: Only applies changes when needed

Memory Format Support: - Standard units: 2Gi, 4096Mi, 2048MiB, 2GiB - Parser handles multiple memory unit formats

Limitations: - Disk shrinking prevented to avoid data loss - Some guest operating systems require special configuration for hot-add - BIOS firmware VMs have limited hot-add support (use EFI firmware)

VM Cloning (v0.2.3+)

Create full or linked clones of existing VMs and templates:

# Clone from existing VM via VMClone resource
apiVersion: infra.virtrigaud.io/v1beta1
kind: VMClone
metadata:
  name: web-server-02-clone
spec:
  source:
    vmRef:
      name: web-server-01
  target:
    name: web-server-02
    classRef:
      name: small
  options:
    type: LinkedClone   # or FullClone
    powerOn: true

Clone Types: - Full Clone: Independent copy with separate storage - Linked Clone: Space-efficient copy using snapshots - Automatically creates snapshot if none exists - Requires less storage and faster creation - Parent VM must remain available

Use Cases: - Rapid test environment provisioning - Development environment duplication - Template-based deployments - Disaster recovery scenarios

Task Status Tracking (v0.2.3+)

Monitor asynchronous vSphere operations in real-time:

# VirtRigaud automatically tracks long-running operations
# No manual configuration needed

# Task tracking provides:
# - Real-time task state (queued, running, success, error)
# - Progress percentage
# - Error messages for failed tasks
# - Integration with vSphere task manager

Features: - Automatic tracking of all async operations - Progress monitoring via govmomi task manager - Detailed error reporting - Task history visibility in vCenter

Console Access (v0.2.3+)

Generate direct vSphere web client console URLs:

# Access provided in VM status
kubectl get vm web-server -o yaml

status:
  consolURL: "https://vcenter.example.com/ui/app/vm;nav=h/urn:vmomi:VirtualMachine:vm-123:xxxxx/summary"
  phase: Running

Features: - Direct browser-based VM console access - No additional tools required - Works with vSphere web client - Includes VM instance UUID for reliable identification - Generated automatically in Describe operations

Template Management

Creating Templates

# Convert existing VM to template
apiVersion: infra.virtrigaud.io/v1beta1
kind: VMTemplate
metadata:
  name: create-ubuntu-template
spec:
  sourceVM: "ubuntu-base-vm"
  datacenter: "Production"
  targetFolder: "/vm/templates"
  templateName: "ubuntu-22.04-template"

  # Template metadata
  annotation: |
    Ubuntu 22.04 LTS Template
    Created: 2024-01-15
    Includes: cloud-init, open-vm-tools

  # Template customization
  powerOff: true                   # Power off before conversion
  removeSnapshots: true           # Clean up snapshots
  updateTools: true               # Update VMware tools

Content Library Integration

# Deploy from content library
apiVersion: infra.virtrigaud.io/v1beta1
kind: VMImage  
metadata:
  name: centos-stream-9
spec:
  source:
    contentLibrary: "OS Templates"
    item: "CentOS-Stream-9"
    datacenter: "Production"

  # Content library item properties
  properties:
    version: "9.0"
    provider: "CentOS"
    osType: "linux"

StoragePod Support (Datastore Cluster Auto-Selection)

VirtRigaud supports automatic datastore selection from a vSphere Datastore Cluster (StoragePod). When a StoragePod is configured, the provider picks the member datastore with the most free space at VM creation time. This is a lightweight alternative to Storage DRS and does not require Storage DRS to be enabled on the cluster.

Precedence Order

Priority Source When Used
1 (highest) placement.datastore on the VM Explicit datastore always wins
2 placement.storagePod on the VM Per-VM Datastore Cluster override
3 PROVIDER_DEFAULT_STORAGE_POD env var Provider-level default Datastore Cluster
4 (lowest) PROVIDER_DEFAULT_DATASTORE env var Final fallback to a fixed datastore

Configuring a Default StoragePod

Set the environment variable on the vSphere provider deployment:

providers:
  vsphere:
    env:
      - name: PROVIDER_DEFAULT_STORAGE_POD
        value: "DatastoreCluster-SSD"

Per-VM StoragePod Placement

Override at the VM level via the placement field:

apiVersion: infra.virtrigaud.io/v1beta1
kind: VirtualMachine
metadata:
  name: web-application
spec:
  providerRef:
    name: vsphere-prod
  classRef:
    name: standard-vm
  imageRef:
    name: ubuntu-22-04-template
  powerState: On
  placement:
    datacenter: "Production"
    cluster: "Compute-Cluster"
    storagePod: "DatastoreCluster-SSD"   # picks member with most free space
    folder: "/vm/applications"

How It Works

  1. VirtRigaud enumerates all datastores that are members of the named StoragePod via the vSphere API.
  2. It retrieves the FreeSpace summary for each member datastore.
  3. The datastore with the highest free space is selected.
  4. If multiple datastores are tied, the first one in the list is chosen (stable selection).

Note

If placement.datastore is also set on the same VM, it takes priority and the StoragePod is ignored.

Storage Policies

# VMClass with vSAN storage policy (use extraConfig for vSphere-specific storage policies)
apiVersion: infra.virtrigaud.io/v1beta1
kind: VMClass
metadata:
  name: high-performance
spec:
  cpu: 8
  memory: "32Gi"
  diskDefaults:
    type: eagerzeroedthick
    size: "50Gi"
  extraConfig:
    # vSAN storage policies (provider-specific keys)
    "storage.homePolicy": "VM Storage Policy - Performance"
    "storage.diskPolicy": "VM Storage Policy - SSD Only"

Network Advanced Configuration

# Distributed switch port group
apiVersion: infra.virtrigaud.io/v1beta1
kind: VMNetworkAttachment
metadata:
  name: frontend-network
spec:
  network:
    vsphere:
      portgroup: "DPG-Frontend-VLAN100"
      distributedSwitch:
        name: "DSwitch-Production"
      vlan:
        type: vlan
        vlanID: 100
    type: external

---
# Backend network attachment
apiVersion: infra.virtrigaud.io/v1beta1
kind: VMNetworkAttachment
metadata:
  name: backend-network
spec:
  network:
    vsphere:
      portgroup: "LS-Backend-App"
    type: external

---
# Reference them from the VM
apiVersion: infra.virtrigaud.io/v1beta1
kind: VirtualMachine
metadata:
  name: multi-nic-vm
spec:
  # ...
  networks:
    - name: frontend
      networkRef:
        name: frontend-network
    - name: backend
      networkRef:
        name: backend-network

High Availability

# VM with HA/DRS settings
apiVersion: infra.virtrigaud.io/v1beta1
kind: VirtualMachine
metadata:
  name: critical-application
spec:
  providerRef:
    name: vsphere-prod
  # ... other config ...

  # High availability configuration
  availability:
    # HA restart priority
    restartPriority: "high"          # disabled, low, medium, high
    isolationResponse: "powerOff"    # none, powerOff, shutdown
    vmMonitoring: "vmMonitoringOnly" # vmMonitoringDisabled, vmMonitoringOnly, vmAndAppMonitoring

    # DRS configuration
    drsAutomationLevel: "fullyAutomated"  # manual, partiallyAutomated, fullyAutomated
    drsVmBehavior: "fullyAutomated"       # manual, partiallyAutomated, fullyAutomated

    # Anti-affinity rules
    antiAffinityGroups: ["web-tier", "database-tier"]

    # Host affinity (pin to specific hosts)
    hostAffinityGroups: ["production-hosts"]

Snapshot Management

# Advanced snapshot configuration
apiVersion: infra.virtrigaud.io/v1beta1
kind: VMSnapshot
metadata:
  name: pre-upgrade-snapshot
spec:
  vmRef:
    name: web-application

  # Snapshot settings
  name: "Pre-upgrade snapshot"
  description: "Snapshot before application upgrade"
  memory: true                    # Include memory state
  quiesce: true                   # Quiesce guest filesystem

  # Retention policy
  retention:
    maxSnapshots: 3               # Keep max 3 snapshots
    maxAge: "7d"                  # Delete after 7 days

  # Schedule (optional)
  schedule: "0 2 * * 0"          # Weekly at 2 AM Sunday

Troubleshooting

Common Issues

❌ Connection Failed

Symptom: failed to connect to vSphere: connection refused

Causes & Solutions:

  1. Network connectivity:

    # Test connectivity to vCenter
    telnet vcenter.example.com 443
    
    # Test from Kubernetes pod
    kubectl run debug --rm -i --tty --image=curlimages/curl -- \
      curl -k https://vcenter.example.com
    

  2. DNS resolution:

    # Test DNS resolution
    nslookup vcenter.example.com
    
    # Use IP address if DNS fails
    

  3. Firewall rules: Ensure port 443 is accessible from Kubernetes cluster

❌ Authentication Failed

Symptom: Login failed: incorrect user name or password

Solutions:

  1. Verify credentials:

    # Test credentials manually
    kubectl get secret vsphere-credentials -o yaml
    
    # Decode and verify
    echo "base64-password" | base64 -d
    

  2. Check user permissions:

  3. Verify user exists in vCenter
  4. Check assigned roles and privileges
  5. Ensure user is not locked out

  6. Test login via vSphere Client: Verify credentials work in the GUI

❌ Insufficient Privileges

Symptom: operation requires privilege 'VirtualMachine.Interact.PowerOn'

Solution: Grant required privileges to the service account:

# Required privileges for VirtRigaud:
# - Datastore privileges:
#   * Datastore.AllocateSpace
#   * Datastore.Browse  
#   * Datastore.FileManagement
# - Network privileges:
#   * Network.Assign
# - Resource privileges:
#   * Resource.AssignVMToPool
# - Virtual machine privileges:
#   * VirtualMachine.* (all) or specific subset
# - Global privileges:
#   * Global.EnableMethods
#   * Global.DisableMethods

❌ Template Not Found

Symptom: template 'ubuntu-template' not found

Solutions:

# List available templates
govc ls /datacenter/vm/templates/

# Check template path and permissions
govc object.collect -s vm/templates/ubuntu-template summary.config.name

# Verify template is properly marked as template
govc object.collect -s vm/templates/ubuntu-template config.template

❌ Datastore Issues

Symptom: insufficient disk space or datastore not accessible

Solutions:

# Check datastore capacity
govc datastore.info datastore-name

# List accessible datastores
govc datastore.ls

# Check datastore cluster configuration
govc cluster.ls

❌ Network Configuration

Symptom: network 'VM Network' not found

Solutions:

# List available networks
govc ls /datacenter/network/

# Check distributed port groups
govc dvs.portgroup.info

# Verify network accessibility from cluster
govc cluster.network.info

Validation Commands

Test your vSphere setup before deploying:

# 1. Install and configure govc CLI tool
export GOVC_URL='https://vcenter.example.com'
export GOVC_USERNAME='administrator@vsphere.local'
export GOVC_PASSWORD='password'
export GOVC_INSECURE=1  # for self-signed certificates

# 2. Test connectivity
govc about

# 3. List datacenters
govc ls

# 4. List clusters and hosts
govc ls /datacenter/host/

# 5. List datastores
govc ls /datacenter/datastore/

# 6. List networks
govc ls /datacenter/network/

# 7. List templates
govc ls /datacenter/vm/templates/

# 8. Test VM creation (dry run)
govc vm.create -c 1 -m 1024 -g ubuntu64Guest -net "VM Network" test-vm
govc vm.destroy test-vm

Debug Logging

Enable verbose logging for the vSphere provider:

providers:
  vsphere:
    env:
      - name: LOG_LEVEL
        value: "debug"
      - name: GOVMOMI_DEBUG
        value: "true"
    endpoint: "https://vcenter.example.com"

Monitor vSphere tasks:

# Monitor recent tasks in vCenter
govc task.ls

# Get details of specific task
govc task.info task-123

Performance Optimization

Resource Allocation

# High-performance VMClass
apiVersion: infra.virtrigaud.io/v1beta1
kind: VMClass
metadata:
  name: performance-optimized
spec:
  cpu: 16
  memory: "64Gi"
  performanceProfile:
    cpuHotAddEnabled: false        # Disable for better performance
    memoryHotAddEnabled: false     # Disable for better performance
    latencySensitivity: high
    hyperThreadingPolicy: prefer
  resourceLimits:
    cpuReservation: 8000           # MHz — guarantee CPU resources
    memoryReservation: "64Gi"      # Guarantee full memory
  extraConfig:
    "numvcpus.coresPerSocket": "8" # Align with NUMA topology

Storage Optimization

# Storage-optimized configuration
spec:
  storage:
    diskFormat: "eagerZeroedThick"  # Best performance, more space usage
    controllerType: "pvscsi"        # Paravirtual SCSI for better performance
    multiwriter: false              # Disable unless needed

    # vSAN optimization
    storagePolicy: "Performance-Tier"
    cachingPolicy: "writethrough"   # or "writeback" for better performance

    # Multiple controllers for high IOPS
    scsiControllers:
      - type: "pvscsi"
        busNumber: 0
        maxDevices: 15
      - type: "pvscsi" 
        busNumber: 1
        maxDevices: 15

Network Optimization

# High-performance networking
networks:
  - name: high-performance
    portGroup: "DPG-HighPerf-SR-IOV"
    adapter: "vmxnet3"             # Best performance adapter
    sriov: true                    # SR-IOV for near-native performance
    bandwidth:
      reservation: 1000            # Guaranteed bandwidth (Mbps)
      limit: 10000                 # Maximum bandwidth (Mbps)
      shares: 100                  # Priority level

API Reference

For complete API reference, see the Provider API Documentation.

Contributing

To contribute to the vSphere provider:

  1. See the Provider Development Guide
  2. Check the GitHub repository
  3. Review open issues

Support