Provider Architecture¶
This document describes the provider architecture in VirtRigaud.
Overview¶
VirtRigaud uses a Remote Provider architecture where providers run as independent pods, communicating with the manager controller via gRPC. This design provides scalability, security, and reliability benefits.
Architecture¶
```
┌─────────────────┐      ┌───────────────────┐      ┌─────────────────┐
│ VirtualMachine  │      │     Provider      │      │ Provider Runtime│
│      CRD        │      │       CRD         │      │   Deployment    │
└─────────────────┘      └───────────────────┘      └─────────────────┘
         │                        │                          │
         │                        │                          │
         v                        v                          │
┌─────────────────┐      ┌───────────────────┐               │
│     Manager     │      │     Provider      │               │
│   Controller    │      │    Controller     │               │
│                 │      │                   │               │
│ ┌─────────────┐ │      │ - Creates Deploy  │               │
│ │ VM Reconcile│ │      │ - Creates Service │               │
│ │             │ │      │ - Updates Status  │               │
│ └─────────────┘ │      │                   │               │
│                 │      └───────────────────┘               │
│ ┌─────────────┐ │                                          │
│ │ gRPC Client │◄┼──────────────────────────────────────────┘
│ │             │ │   gRPC Connection
│ └─────────────┘ │   Port 9090
└─────────────────┘
```
Provider Components¶
1. Provider Runtime Deployments¶
Each Provider resource automatically creates the following (the generated objects are sketched after the list):
- Deployment: Runs provider-specific containers
- Service: ClusterIP service for gRPC communication
- ConfigMaps: Provider configuration
- Secret mounts: Credentials for hypervisor access
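For orientation, the generated Deployment and Service look roughly like the sketch below. This is illustrative, not literal controller output: the object name, labels, and container name are inferred from the debugging examples later in this page, and most fields are omitted.

```yaml
# Illustrative shape of the generated objects (not literal controller output).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: virtrigaud-provider-vsphere-datacenter   # naming inferred from examples below
  labels:
    app.kubernetes.io/instance: vsphere-datacenter
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/instance: vsphere-datacenter
  template:
    metadata:
      labels:
        app.kubernetes.io/instance: vsphere-datacenter
    spec:
      containers:
        - name: provider   # container name is an assumption
          image: ghcr.io/projectbeskar/virtrigaud/provider-vsphere:v0.2.0
          ports:
            - containerPort: 9090   # gRPC
            - containerPort: 8080   # Prometheus metrics
---
apiVersion: v1
kind: Service
metadata:
  name: virtrigaud-provider-vsphere-datacenter
spec:
  type: ClusterIP
  selector:
    app.kubernetes.io/instance: vsphere-datacenter
  ports:
    - name: grpc
      port: 9090
      targetPort: 9090
```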
Configuration Flow: Provider Resource → Provider Pod¶
The VirtRigaud Provider Controller automatically translates your Provider resource configuration into the appropriate command-line arguments and environment variables for the provider pod.
Command-Line Arguments¶
The controller generates these arguments from your Provider spec:
| Provider Field | Generated Argument | Example |
|---|---|---|
| `spec.type` | `--provider-type` | `--provider-type=vsphere` |
| `spec.endpoint` | `--provider-endpoint` | `--provider-endpoint=https://vcenter.example.com` |
| `spec.runtime.service.port` | `--grpc-addr` | `--grpc-addr=:9090` |
| (hardcoded) | `--metrics-addr` | `--metrics-addr=:8080` |
| (optional) | `--tls-enabled` | `--tls-enabled=false` |
Environment Variables¶
The controller also sets these environment variables:
| Provider Field | Environment Variable | Example |
|---|---|---|
| `spec.type` | `PROVIDER_TYPE` | `vsphere` |
| `spec.endpoint` | `PROVIDER_ENDPOINT` | `https://vcenter.example.com` |
| `metadata.namespace` | `PROVIDER_NAMESPACE` | `default` |
| `metadata.name` | `PROVIDER_NAME` | `vsphere-datacenter` |
| (optional) | `TLS_ENABLED` | `false` |
Secret Volume Mounts¶
Credentials from `spec.credentialSecretRef` are automatically mounted at:

- Mount Path: `/etc/virtrigaud/credentials/`
- Files Created: each secret key becomes a file:
    - `username` → `/etc/virtrigaud/credentials/username`
    - `password` → `/etc/virtrigaud/credentials/password`
    - `token` → `/etc/virtrigaud/credentials/token`
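In pod-spec terms, this is a standard Kubernetes secret volume. A minimal sketch, assuming the container and volume names used here (both are illustrative) and the `vsphere-credentials` Secret from the examples in this document:

```yaml
# Sketch of the secret projection added to the provider pod.
spec:
  containers:
    - name: provider              # container name is an assumption
      volumeMounts:
        - name: credentials
          mountPath: /etc/virtrigaud/credentials/
          readOnly: true
  volumes:
    - name: credentials
      secret:
        secretName: vsphere-credentials   # from spec.credentialSecretRef.name
```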
Complete Example¶
When you create this Provider resource:
```yaml
apiVersion: infra.virtrigaud.io/v1beta1
kind: Provider
metadata:
  name: vsphere-datacenter
  namespace: default
spec:
  type: vsphere
  endpoint: "https://vcenter.example.com:443"
  credentialSecretRef:
    name: vsphere-credentials
  runtime:
    mode: Remote
    image: "ghcr.io/projectbeskar/virtrigaud/provider-vsphere:v0.2.0"
    service:
      port: 9090
```
The controller automatically creates a deployment with:
Command-line arguments:
```bash
/provider-vsphere \
  --grpc-addr=:9090 \
  --metrics-addr=:8080 \
  --provider-type=vsphere \
  --provider-endpoint=https://vcenter.example.com:443 \
  --tls-enabled=false
```
Environment variables:
```bash
PROVIDER_TYPE=vsphere
PROVIDER_ENDPOINT=https://vcenter.example.com:443
PROVIDER_NAMESPACE=default
PROVIDER_NAME=vsphere-datacenter
TLS_ENABLED=false
```
Volume mounts:
```
/etc/virtrigaud/credentials/username   # Contains: admin@vsphere.local
/etc/virtrigaud/credentials/password   # Contains: your-password
```
✅ Key Point: You Don't Configure This Manually¶
The beauty of VirtRigaud's Remote Provider architecture is that you never need to manually configure command-line arguments or environment variables. Simply create the Provider resource, and the controller handles all the deployment details automatically!
2. Provider Images¶
Specialized images for each provider type:
- `ghcr.io/projectbeskar/virtrigaud/provider-vsphere`: vSphere provider built on govmomi
- `ghcr.io/projectbeskar/virtrigaud/provider-libvirt`: Libvirt provider via virsh commands
- `ghcr.io/projectbeskar/virtrigaud/provider-proxmox`: Proxmox VE provider
- `ghcr.io/projectbeskar/virtrigaud/provider-mock`: Mock provider for testing
3. gRPC Communication¶
- Protocol: gRPC with protocol buffers
- Security: Optional TLS encryption for manager-to-provider traffic
- Health: Built-in health checks and graceful shutdown
- Metrics: Prometheus metrics on port 8080
Provider Configuration¶
Basic Provider Setup¶
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: vsphere-credentials
  namespace: default
type: Opaque
stringData:
  username: "admin@vsphere.local"
  password: "your-password"
---
apiVersion: infra.virtrigaud.io/v1beta1
kind: Provider
metadata:
  name: vsphere-datacenter
  namespace: default
spec:
  type: vsphere
  endpoint: "https://vcenter.example.com:443"
  credentialSecretRef:
    name: vsphere-credentials
  runtime:
    mode: Remote
    image: "ghcr.io/projectbeskar/virtrigaud/provider-vsphere:v0.2.0"
    service:
      port: 9090
```
Advanced Configuration¶
```yaml
apiVersion: infra.virtrigaud.io/v1beta1
kind: Provider
metadata:
  name: libvirt-cluster
  namespace: production
spec:
  type: libvirt
  endpoint: "qemu+ssh://admin@kvm.example.com/system"
  credentialSecretRef:
    name: libvirt-credentials
  defaults:
    cluster: production
  rateLimit:
    qps: 20
    burst: 50
  runtime:
    mode: Remote
    image: "ghcr.io/projectbeskar/virtrigaud/provider-libvirt:v0.2.0"
    replicas: 3
    service:
      port: 9090
    resources:
      requests:
        cpu: "200m"
        memory: "256Mi"
      limits:
        cpu: "2"
        memory: "2Gi"

    # High availability setup
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app.kubernetes.io/instance: libvirt-cluster
            topologyKey: kubernetes.io/hostname

    # Node placement
    nodeSelector:
      workload-type: compute
    tolerations:
      - key: "compute-dedicated"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"

    # Environment variables
    env:
      - name: LIBVIRT_DEBUG
        value: "1"
      - name: PROVIDER_TIMEOUT
        value: "300s"
```
Security Model¶
Pod Security¶
- Non-root execution: All containers run as non-root users
- Read-only filesystem: Immutable container filesystem
- Minimal capabilities: Reduced Linux capabilities
- Security contexts: Enforced via deployment templates (approximated below)
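In container terms, these settings map to a standard Kubernetes security context. A rough approximation of what the deployment templates enforce, not the exact template:

```yaml
# Approximate container security context (illustrative).
securityContext:
  runAsNonRoot: true
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
```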
Credential Isolation¶
- Separated secrets: Each provider has dedicated credential secrets
- Scoped access: Providers only access their own hypervisor credentials
- RBAC isolation: Fine-grained RBAC per provider namespace
Network Security¶
- Service mesh ready: Compatible with Istio/Linkerd
- Network policies: Optional traffic restrictions (example below)
- TLS support: Secure gRPC communication (configurable)
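For example, a NetworkPolicy can limit ingress to a provider pod to the gRPC and metrics ports. This is a sketch: the pod label comes from the debugging examples in this page, and in practice you would also restrict the allowed sources (e.g., to the manager's namespace):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: provider-vsphere-datacenter
  namespace: default
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/instance: vsphere-datacenter
  policyTypes:
    - Ingress
  ingress:
    - ports:
        - protocol: TCP
          port: 9090   # gRPC from the manager
        - protocol: TCP
          port: 8080   # metrics scraping
```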
Communication Protocol¶
gRPC Service Definition¶
```protobuf
service Provider {
  rpc Validate(ValidateRequest) returns (ValidateResponse);
  rpc Create(CreateRequest) returns (CreateResponse);
  rpc Delete(DeleteRequest) returns (TaskResponse);
  rpc Power(PowerRequest) returns (TaskResponse);
  rpc Reconfigure(ReconfigureRequest) returns (TaskResponse);
  rpc Describe(DescribeRequest) returns (DescribeResponse);
  rpc TaskStatus(TaskStatusRequest) returns (TaskStatusResponse);
  rpc ListCapabilities(CapabilitiesRequest) returns (CapabilitiesResponse);
}
```
Error Handling¶
- Retry logic: Exponential backoff for transient failures
- Circuit breakers: Prevent cascade failures
- Timeout controls: Configurable per-operation timeouts
- Status reporting: Conditions reflected in Kubernetes status
Observability¶
Metrics¶
Provider pods expose Prometheus metrics on port 8080:
```
# Request metrics
provider_grpc_requests_total{method="Create",status="success"} 42
provider_grpc_request_duration_seconds{method="Create",quantile="0.95"} 2.5

# VM metrics
provider_vms_total{state="running"} 15
provider_vms_total{state="stopped"} 3

# Health metrics
provider_health_status{provider="vsphere-datacenter"} 1
provider_hypervisor_connection_status{endpoint="vcenter.example.com"} 1
```
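If the Prometheus Operator is installed, these endpoints can be scraped with a ServiceMonitor. A minimal sketch; it assumes the generated Service carries the `app.kubernetes.io/instance` label and names its 8080 port `metrics`, which you should verify against your deployment:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: virtrigaud-providers
  namespace: default
spec:
  selector:
    matchLabels:
      app.kubernetes.io/instance: vsphere-datacenter
  endpoints:
    - port: metrics   # assumes the Service exposes 8080 under this name
      interval: 30s
```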
Logging¶
- Structured logs: JSON format with correlation IDs
- Log levels: Configurable verbosity (debug, info, warn, error)
- Request tracing: Context propagation across gRPC calls
Health Checks¶
- Kubernetes probes: Liveness and readiness probes (sketched below)
- gRPC health protocol: Standard health check implementation
- Hypervisor connectivity: Validates connection to external systems
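The probes are conceptually equivalent to Kubernetes' native gRPC probes, which speak the same standard health protocol. A hand-written approximation; the controller generates the real values:

```yaml
# Approximate probe configuration for a provider container (illustrative).
livenessProbe:
  grpc:
    port: 9090
  initialDelaySeconds: 10
  periodSeconds: 20
readinessProbe:
  grpc:
    port: 9090
  periodSeconds: 10
```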
Deployment Patterns¶
Single Provider Setup¶
```yaml
# Simple development setup
apiVersion: infra.virtrigaud.io/v1beta1
kind: Provider
metadata:
  name: dev-vsphere
spec:
  type: vsphere
  endpoint: "https://vcenter-dev.example.com:443"
  credentialSecretRef:
    name: dev-credentials
  runtime:
    mode: Remote
    image: "ghcr.io/projectbeskar/virtrigaud/provider-vsphere:v0.2.0"
```
High Availability Setup¶
```yaml
# Production HA setup
apiVersion: infra.virtrigaud.io/v1beta1
kind: Provider
metadata:
  name: prod-vsphere
spec:
  type: vsphere
  endpoint: "https://vcenter-prod.example.com:443"
  credentialSecretRef:
    name: prod-credentials
  runtime:
    mode: Remote
    image: "ghcr.io/projectbeskar/virtrigaud/provider-vsphere:v0.2.0"
    replicas: 3
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app.kubernetes.io/instance: prod-vsphere
            topologyKey: kubernetes.io/hostname
```
Multi-Environment Setup¶
```yaml
# Development environment
apiVersion: infra.virtrigaud.io/v1beta1
kind: Provider
metadata:
  name: dev-libvirt
  namespace: development
spec:
  type: libvirt
  endpoint: "qemu+ssh://dev@libvirt-dev.example.com/system"
  runtime:
    mode: Remote
    image: "ghcr.io/projectbeskar/virtrigaud/provider-libvirt:v0.2.0"
    resources:
      requests:
        cpu: "100m"
        memory: "128Mi"
---
# Production environment
apiVersion: infra.virtrigaud.io/v1beta1
kind: Provider
metadata:
  name: prod-libvirt
  namespace: production
spec:
  type: libvirt
  endpoint: "qemu+ssh://prod@libvirt-prod.example.com/system"
  runtime:
    mode: Remote
    image: "ghcr.io/projectbeskar/virtrigaud/provider-libvirt:v0.2.0"
    replicas: 2
    resources:
      requests:
        cpu: "500m"
        memory: "512Mi"
      limits:
        cpu: "2"
        memory: "2Gi"
```
Benefits¶
Scalability¶
- Horizontal scaling: Multiple provider replicas per hypervisor
- Resource isolation: Independent resource allocation per provider
- Load distribution: gRPC load balancing across provider instances
Security¶
- Credential isolation: Hypervisor credentials isolated to provider pods
- Network segmentation: Providers can run in separate namespaces
- Least privilege: Manager runs without direct hypervisor access
Reliability¶
- Fault isolation: Provider failures don't affect the manager
- Independent updates: Provider images updated separately
- Circuit breaking: Automatic failure detection and recovery
Operational Excellence¶
- Rolling updates: Zero-downtime provider updates
- Health monitoring: Built-in health checks and metrics
- Debugging: Isolated provider logs and observability
Troubleshooting¶
Common Issues¶
- Image Pull Failures: the provider pod cannot pull its image; verify the `runtime.image` reference and any registry pull secrets.
- Network Connectivity: the manager cannot reach the provider's gRPC service on port 9090, or the provider cannot reach the hypervisor endpoint.
- Credential Issues: the credential secret is missing keys or contains invalid hypervisor credentials.
Debugging Commands¶
```bash
# Check provider status
kubectl describe provider vsphere-datacenter

# Check provider deployment
kubectl get deployment -l app.kubernetes.io/instance=vsphere-datacenter

# Check provider pods
kubectl get pods -l app.kubernetes.io/instance=vsphere-datacenter

# View provider logs
kubectl logs -l app.kubernetes.io/instance=vsphere-datacenter -f

# Check provider metrics
kubectl port-forward svc/virtrigaud-provider-vsphere-datacenter 8080:8080
curl http://localhost:8080/metrics
```
Performance Tuning¶
```yaml
# Optimize for high-volume workloads
spec:
  rateLimit:
    qps: 100          # Increase API rate limit
    burst: 200        # Allow burst capacity
  runtime:
    replicas: 5       # Scale out for throughput
    resources:
      requests:
        cpu: "1"      # Guarantee CPU resources
        memory: "1Gi"
      limits:
        cpu: "4"      # Allow burst CPU
        memory: "4Gi"
```
Best Practices¶
Resource Management¶
- Right-sizing: Start with small requests, monitor and adjust
- Limits: Always set memory limits to prevent OOM kills
- QoS: Use Guaranteed QoS for production workloads (see the sketch below)
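Guaranteed QoS requires every container's requests to equal its limits. Applied to a Provider's runtime, that looks roughly like this (values are illustrative):

```yaml
# requests == limits gives the provider pod Guaranteed QoS,
# which protects it from eviction under node pressure.
runtime:
  resources:
    requests:
      cpu: "1"
      memory: "1Gi"
    limits:
      cpu: "1"
      memory: "1Gi"
```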
Security¶
- Secrets rotation: Implement regular credential rotation
- Network policies: Restrict provider-to-hypervisor traffic
- RBAC: Use dedicated service accounts per provider
Monitoring¶
- Alerting: Set up alerts on provider health metrics (example rule below)
- Dashboards: Create Grafana dashboards for provider metrics
- Log aggregation: Centralize logs for debugging and auditing
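As a starting point for alerting, a Prometheus Operator PrometheusRule can fire on the `provider_health_status` metric shown earlier. A sketch; the threshold, duration, and labels are placeholders to adapt:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: virtrigaud-provider-alerts
  namespace: default
spec:
  groups:
    - name: virtrigaud-providers
      rules:
        - alert: ProviderUnhealthy
          expr: provider_health_status == 0
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "VirtRigaud provider {{ $labels.provider }} is unhealthy"
```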
Migration and Upgrades¶
Provider Image Updates¶
```bash
# Update provider image (custom resources need a merge patch)
kubectl patch provider vsphere-datacenter --type merge -p '
{
  "spec": {
    "runtime": {
      "image": "ghcr.io/projectbeskar/virtrigaud/provider-vsphere:v0.2.0"
    }
  }
}'

# Monitor rollout
kubectl rollout status deployment virtrigaud-provider-vsphere-datacenter
```