
Embedded etcd

Limited vCluster Tenancy Configuration Support

This feature is only available for the following:

  • Running the control plane as a container with the following worker node types:
    • Host Nodes
    • Private Nodes
  • Running the control plane as a binary for vCluster Standalone, which uses private nodes.
Enterprise-Only Feature

This feature is an Enterprise feature. See our pricing plans or contact our sales team for more information.

Upgrade Notice

An issue exists when upgrading etcd from a version between 3.5.1 and 3.5.19 to version 3.6. This upgrade path can fail and break the virtual cluster. etcd version 3.5.20 includes a fix that migrates membership data to the v3 data store, which prevents the issue when upgrading to version 3.6.

To avoid this issue, vCluster does not upgrade etcd to version 3.6 until vCluster version 0.29.0.

Any vCluster running a version earlier than 0.24.2 must first be upgraded to a version between 0.24.2 and 0.28.x before upgrading to version 0.29.0.

For more information, see the official etcd documentation.

When using this backing store option, etcd is deployed as part of the vCluster control plane pod to reduce the overall footprint.

controlPlane:
  backingStore:
    etcd:
      embedded:
        enabled: true

How embedded etcd works

Embedded etcd starts the etcd binary with the Kubernetes control plane inside the vCluster pod. This enables vCluster to run in high availability (HA) scenarios without requiring a separate StatefulSet or Deployment.

vCluster fully manages embedded etcd and provides these capabilities:

  • Dynamic scaling: Scales the etcd cluster up or down based on vCluster replica count.
  • Automatic recovery: Recovers etcd in failure scenarios such as corrupted members.
  • Seamless migration: Migrates from SQLite or deployed etcd to embedded etcd automatically.
  • Simplified deployment: Requires no additional StatefulSets or Deployments.
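
For example, a minimal vcluster.yaml sketch for a three-replica HA control plane with embedded etcd might look like the following. It assumes your vCluster version exposes the highAvailability.replicas setting under controlPlane.statefulSet; check your configuration reference if it differs:

controlPlane:
  backingStore:
    etcd:
      embedded:
        enabled: true
  statefulSet:
    highAvailability:
      replicas: 3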

Scaling behavior

vCluster dynamically builds the etcd cluster based on the number of desired replicas. For example, when you scale vCluster from 1 to 3 replicas, vCluster automatically adds the new replicas as members to the existing single-member cluster. Similarly, vCluster removes etcd members when you scale down the cluster.

When scaling down breaks quorum (such as scaling from 3 to 1 replicas), vCluster rebuilds the etcd cluster without data loss or interruption. This enables dynamic scaling up and down of vCluster.
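
As a sketch, assuming a vCluster StatefulSet named my-vcluster in the vcluster-my-team namespace, scaling up and back down could look like this:

# Scale from 1 to 3 replicas; vCluster joins the new pods as etcd members.
kubectl scale statefulset my-vcluster --replicas=3 -n vcluster-my-team

# Scale back down to 1 replica; vCluster removes the extra etcd members
# and rebuilds the single-member cluster without data loss.
kubectl scale statefulset my-vcluster --replicas=1 -n vcluster-my-team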

Disaster recovery

When embedded etcd encounters failures, vCluster provides both automatic and manual recovery options to restore cluster capabilities.

Automatic recovery

vCluster recovers the etcd cluster automatically in most failure scenarios by removing and re-adding the failing member. Automatic recovery occurs in these cases:

  • Unresponsive member: The etcd member is unresponsive for more than 2 minutes.
  • Detected issues: Corruption or another alarm is detected on the etcd member.

vCluster attempts to recover only a single replica at a time. If recovering an etcd member results in quorum loss, vCluster does not recover the member automatically.
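
To watch a recovery attempt as it happens, you can follow the control plane logs and filter for etcd messages. A sketch, using the pod naming from the rest of this page:

kubectl logs -f my-vcluster-0 -n vcluster-my-team | grep -i etcd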

Manual recovery

Recover a single replica

When a single etcd replica fails, vCluster can recover the replica automatically in most cases, including:

  • Replica database corruption
  • Replica database deletion
  • Replica PersistentVolumeClaim (PVC) deletion
  • Replica removal from the etcd cluster using etcdctl member remove <ID>
  • Replica stuck as a learner

If vCluster cannot recover the single replica automatically, wait at least 10 minutes before deleting the replica pod and PVC. This causes vCluster to re-add the member to the etcd cluster.
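
For example, assuming the failing replica is my-vcluster-1 and the PVCs follow the default StatefulSet naming scheme used elsewhere on this page, the deletion could look like this:

kubectl delete pod my-vcluster-1 -n vcluster-my-team
kubectl delete pvc data-my-vcluster-1 -n vcluster-my-team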

Recover the entire cluster

In rare cases, the entire etcd cluster requires manual recovery. This occurs when the majority of etcd member replicas become corrupted or deleted simultaneously (such as 2 of 3, 3 of 5, or 4 of 7 replicas). In this scenario, etcd fails to start and vCluster cannot recover automatically.

note

Normal pod restarts or terminations do not require manual recovery. These events trigger automatic leader election within the etcd cluster.

Recovery procedures depend on whether the first replica (the pod ending with -0) is among the failing replicas.

note

The recovery procedure for the first replica also depends on your StatefulSet's podManagementPolicy configuration (Parallel or OrderedReady). See the first replica recovery section for details on migrating between policies if needed.

Find your vCluster namespace

If using VirtualClusterInstance (platform), the vCluster StatefulSet runs in a different namespace than the VirtualClusterInstance itself. Find the StatefulSet namespace with:

kubectl get virtualclusterinstance <instance-name> -n <vci-namespace> -o jsonpath='{.spec.clusterRef.namespace}'

For example, if your VirtualClusterInstance is named my-vcluster in the p-default namespace, the StatefulSet might be in vcluster-my-vcluster-p-default.

If using Helm, the namespace is what you specified during installation (e.g., vcluster-my-team).

The commands on this page use my-vcluster as the vCluster name and vcluster-my-team as the namespace. Replace these values with your own.

Use the following procedure when some replicas are still functioning:


  1. Scale the StatefulSet to one replica:

    kubectl scale statefulset my-vcluster --replicas=1 -n vcluster-my-team

    Verify only one pod is running:

    kubectl get pods -l app=vcluster -n vcluster-my-team
  2. Monitor the rebuild process:

    kubectl logs -f my-vcluster-0 -n vcluster-my-team

    Watch for log messages indicating etcd is ready and the cluster is in good condition.

  3. Scale back up to your target replica count:

    kubectl scale statefulset my-vcluster --replicas=3 -n vcluster-my-team

    Verify all replicas are running:

    kubectl get pods -l app=vcluster -n vcluster-my-team
    kubectl logs my-vcluster-0 -n vcluster-my-team | grep "cluster is ready"

Complete data loss recovery

warning

This recovery method results in data loss up to the last backup point. Only proceed if you have verified that all etcd replicas are corrupted and no working replicas remain.

When the majority of etcd member replicas become corrupted or deleted simultaneously, the entire cluster requires recovery from backup.

Prerequisites

Before starting recovery, ensure you have:

  • Created a snapshot using vcluster snapshot create <vcluster-name> --include-volumes <storage-location>
  • The snapshot location URL (for example, s3://my-bucket/backup or oci://registry/repo:tag)
  • Access to the host cluster namespace where the vCluster is deployed

For detailed snapshot creation instructions, see Create snapshots.
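
For example, using the placeholder values from this page, a snapshot that includes volumes and is stored in an S3 bucket might be created like this (the -n flag is shown here as an assumption; exact flags may vary by CLI version):

vcluster snapshot create my-vcluster --include-volumes s3://my-bucket/backup -n vcluster-my-team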

  1. Verify all PVCs are corrupted or inaccessible:

    kubectl get pvc -l app=vcluster -n vcluster-my-team

    kubectl describe pvc data-my-vcluster-0 data-my-vcluster-1 data-my-vcluster-2 -n vcluster-my-team
  2. Stop all vCluster instances before beginning recovery:

    kubectl scale statefulset my-vcluster --replicas=0 -n vcluster-my-team

    Verify all pods have terminated:

    kubectl get pods -l app=vcluster -n vcluster-my-team
  3. Delete all corrupted PVCs.

    After scaling down, wait a few seconds to ensure pods have fully terminated before deleting PVCs. If a pod restarts immediately after PVC deletion, the PVC may get stuck in a "Terminating" state. If this happens, delete the pod again to allow the PVC deletion to complete.

    kubectl delete pvc data-my-vcluster-0 data-my-vcluster-1 data-my-vcluster-2 -n vcluster-my-team

    Verify PVCs are deleted:

    kubectl get pvc -l app=vcluster -n vcluster-my-team

    Expected output: No resources found

  4. Scale up to the desired number of replicas.

    The vCluster CLI requires an accessible vCluster instance to execute the restore command. Scaling up creates a new, empty vCluster that the CLI can connect to. The vcluster restore command then scales it back down automatically, restores the etcd data from the snapshot, and restarts the vCluster with the restored data.

    kubectl scale statefulset my-vcluster --replicas=3 -n vcluster-my-team

    Wait for pods to be running:

    kubectl get pods -l app=vcluster -n vcluster-my-team

    Expected output showing all replicas running:

    NAME            READY   STATUS    RESTARTS   AGE
    my-vcluster-0   1/1     Running   0          45s
    my-vcluster-1   1/1     Running   0          43s
    my-vcluster-2   1/1     Running   0          41s
  5. Use the vCluster CLI to restore from your snapshot. The restore process will:

    1. Pause the vCluster (scale down to 0)
    2. Delete the current PVCs
    3. Start a snapshot pod to restore etcd data
    4. Restore PVCs from volume snapshots
    5. Resume the vCluster (scale back up)

    vcluster restore my-vcluster s3://my-bucket/backup -n vcluster-my-team

    Expected output:

    16:16:38 info Pausing vCluster my-vcluster
    16:16:38 info Scale down statefulSet vcluster-my-team/my-vcluster...
    16:16:39 info Deleting vCluster pvc vcluster-my-team/data-my-vcluster-0
    16:16:39 info Deleting vCluster pvc vcluster-my-team/data-my-vcluster-1
    16:16:39 info Deleting vCluster pvc vcluster-my-team/data-my-vcluster-2
    16:16:39 info Starting snapshot pod for vCluster vcluster-my-team/my-vcluster...
    ...
    Successfully restored snapshot
    16:16:42 info Resuming vCluster my-vcluster

    Authentication for remote storage

    If using S3 or OCI registry, ensure you have the appropriate credentials configured:

    • S3: Use AWS CLI credentials or pass credentials in the URL
    • OCI: Use Docker login or pass credentials in the URL

    See Create snapshots for authentication details.

  6. Connect to the vCluster and verify your workloads are restored:

    vcluster connect my-vcluster -n vcluster-my-team

    Check that your resources are present:

    kubectl get pods -A
    kubectl get pvc -A

    If everything looks correct, disconnect:

    vcluster disconnect

Config reference

embedded required object

Embedded defines whether to use embedded etcd as a storage backend for the virtual cluster.

enabled required boolean false

Enabled defines whether embedded etcd should be used.

migrateFromDeployedEtcd required boolean false

MigrateFromDeployedEtcd signals that vCluster should migrate from the deployed external etcd to embedded etcd.
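
A minimal vcluster.yaml sketch that enables this migration:

controlPlane:
  backingStore:
    etcd:
      embedded:
        enabled: true
        migrateFromDeployedEtcd: true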

snapshotCount required integer

SnapshotCount defines the number of snapshots to keep for the embedded etcd. Defaults to 10000 if less than 1.

extraArgs required string[] []

ExtraArgs are additional arguments to pass to the embedded etcd.
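
Putting the reference together, a vcluster.yaml sketch that tunes snapshot retention and passes an extra flag to the embedded etcd (the flag and its value here are purely illustrative):

controlPlane:
  backingStore:
    etcd:
      embedded:
        enabled: true
        snapshotCount: 10000
        extraArgs:
          - --heartbeat-interval=200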