Embedded etcd
This feature is only available for the following:
- Host Nodes
- Private Nodes
This is an Enterprise feature. See our pricing plans or contact our sales team for more information.
An issue exists when upgrading etcd from a version between 3.5.1 and 3.5.19 (that is, earlier than 3.5.20) to version 3.6. This upgrade path can fail and break the virtual cluster. etcd version 3.5.20 includes a fix that migrates membership data to the v3 data store, which prevents the issue when upgrading to version 3.6.
To avoid this issue, vCluster does not upgrade etcd to version 3.6 until vCluster version 0.29.0.
Any vCluster running a version earlier than 0.24.2 must first be upgraded to a version between 0.24.2 and 0.28.x before upgrading to version 0.29.0.
For more information, see the official etcd documentation.
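For example, a two-step upgrade with Helm might look like the following sketch, using the release name and namespace conventions from the recovery examples later on this page. The chart versions shown are illustrative; substitute the releases you are actually targeting.
# Step 1: upgrade to a release between 0.24.2 and 0.28.x (version shown is illustrative)
helm upgrade my-vcluster vcluster --repo https://charts.loft.sh --namespace vcluster-my-team --reuse-values --version 0.28.1
# Step 2: upgrade to 0.29.0 or later
helm upgrade my-vcluster vcluster --repo https://charts.loft.sh --namespace vcluster-my-team --reuse-values --version 0.29.0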
When using this backing store option, etcd is deployed as part of the vCluster control plane pod to reduce the overall footprint.
controlPlane:
  backingStore:
    etcd:
      embedded:
        enabled: true
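One way to apply this configuration, assuming it is saved as vcluster.yaml and the virtual cluster is deployed with Helm under the same release name and namespace used in the examples below:
helm upgrade --install my-vcluster vcluster --repo https://charts.loft.sh --namespace vcluster-my-team --create-namespace -f vcluster.yaml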
How embedded etcd works
Embedded etcd starts the etcd binary with the Kubernetes control plane inside the vCluster pod. This enables vCluster to run in high availability (HA) scenarios without requiring a separate StatefulSet or Deployment.
vCluster fully manages embedded etcd and provides these capabilities:
- Dynamic scaling: Scales the etcd cluster up or down based on vCluster replica count.
- Automatic recovery: Recovers etcd in failure scenarios such as corrupted members.
- Seamless migration: Migrates from SQLite or deployed etcd to embedded etcd automatically (see the configuration sketch after this list).
- Simplified deployment: Requires no additional StatefulSets or Deployments.
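For example, to move a virtual cluster from a separately deployed etcd to embedded etcd, a minimal configuration sketch combines enabled with the migrateFromDeployedEtcd field documented in the config reference at the end of this page:
controlPlane:
  backingStore:
    etcd:
      embedded:
        enabled: true
        migrateFromDeployedEtcd: true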
Scaling behavior
vCluster dynamically builds the etcd cluster based on the number of desired replicas. For example, when you scale vCluster from 1 to 3 replicas, vCluster automatically adds the new replicas as members to the existing single-member cluster. Similarly, vCluster removes etcd members when you scale down the cluster.
When scaling down breaks quorum (such as scaling from 3 to 1 replicas), vCluster rebuilds the etcd cluster without data loss or interruption. This enables dynamic scaling up and down of vCluster.
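For example, assuming a vCluster named my-vcluster deployed as a StatefulSet in the vcluster-my-team namespace (the same names used in the recovery procedures below), scaling from 1 to 3 replicas is a single command, and vCluster joins the two new pods to the existing etcd member:
kubectl scale statefulset my-vcluster --replicas=3 -n vcluster-my-team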
Disaster recovery
When embedded etcd encounters failures, vCluster provides both automatic and manual recovery options to restore cluster capabilities.
Automatic recovery
vCluster recovers the etcd cluster automatically in most failure scenarios by removing and re-adding the failing member. Automatic recovery occurs in these cases:
- Unresponsive member: Etcd member is unresponsive for more than 2 minutes.
- Detected issues: Corruption or another alarm is detected on the etcd member.
vCluster attempts to recover only a single replica at a time. If recovering an etcd member results in quorum loss, vCluster does not recover the member automatically.
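To observe recovery activity on a replica, you can follow its logs; assuming the names used elsewhere on this page, something like the following works (the exact log wording varies between versions):
kubectl logs -f my-vcluster-0 -n vcluster-my-team | grep -i etcd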
Manual recovery
Recover a single replica
When a single etcd replica fails, vCluster can recover the replica automatically in most cases, including:
- Replica database corruption
- Replica database deletion
- Replica PersistentVolumeClaim (PVC) deletion
- Replica removal from the etcd cluster using etcdctl member remove ID
- Replica stuck as a learner
If vCluster cannot recover the single replica automatically, wait at least 10 minutes before deleting the replica pod and its PVC. Deleting them prompts vCluster to rejoin the member to the etcd cluster.
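For example, assuming the failing replica is my-vcluster-1 in the vcluster-my-team namespace and the default PVC naming used later on this page, the deletion would look like:
kubectl delete pod my-vcluster-1 -n vcluster-my-team
kubectl delete pvc data-my-vcluster-1 -n vcluster-my-team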
Recover the entire cluster
In rare cases, the entire etcd cluster requires manual recovery. This occurs when the majority of etcd member replicas become corrupted or deleted simultaneously (such as 2 of 3, 3 of 5, or 4 of 7 replicas). In this scenario, etcd fails to start and vCluster cannot recover automatically.
Normal pod restarts or terminations do not require manual recovery. These events trigger automatic leader election within the etcd cluster.
Recovery procedures depend on whether the first replica (the pod ending with -0) is among the failing replicas.
The recovery procedure for the first replica also depends on your StatefulSet's podManagementPolicy configuration (Parallel or OrderedReady). See the first replica recovery section for details on migrating between policies if needed.
If using VirtualClusterInstance (platform), the vCluster StatefulSet runs in a different namespace than the VirtualClusterInstance itself. Find the StatefulSet namespace with:
kubectl get virtualclusterinstance <instance-name> -n <vci-namespace> -o jsonpath='{.spec.clusterRef.namespace}'
For example, if your VirtualClusterInstance is named my-vcluster in the p-default namespace, the StatefulSet might be in vcluster-my-vcluster-p-default.
If using Helm, the namespace is what you specified during installation (e.g., vcluster-my-team).
Use the following procedures when some replicas are still functioning:
- First replica is not failing: scale the StatefulSet down to the healthy first replica and back up, as in the steps immediately below.
- First replica is failing: delete the first replica's pod and PVC so it rejoins the cluster, as described after the backup guidance below.
Scale the StatefulSet to one replica:
kubectl scale statefulset my-vcluster --replicas=1 -n vcluster-my-team
Verify only one pod is running:
kubectl get pods -l app=vcluster -n vcluster-my-team
Monitor the rebuild process:
kubectl logs -f my-vcluster-0 -n vcluster-my-team
Watch for log messages indicating etcd is ready and the cluster is in good condition.
Scale back up to your target replica count:
kubectl scale statefulset my-vcluster --replicas=3 -n vcluster-my-team
Verify all replicas are running:
kubectl get pods -l app=vcluster -n vcluster-my-team
kubectl logs my-vcluster-0 -n vcluster-my-team | grep "cluster is ready"
Before attempting any recovery procedure, create a backup of your virtual cluster using vcluster snapshot create --include-volumes. This ensures both the virtual cluster's etcd data and persistent volumes are backed up.
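For example, following the snapshot syntax used later on this page and the sample names from these procedures, a snapshot written to S3 might look like:
vcluster snapshot create my-vcluster --include-volumes s3://my-bucket/backup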
If the virtual cluster's etcd is in a bad state and the snapshot command fails, you can still back up from the host cluster (which has its own functioning etcd). Use your preferred backup solution (e.g., Velero, Kasten, or cloud-native backup tools) to back up the host cluster namespace containing the vCluster resources. Ensure the backup includes:
- All Kubernetes resources in the vCluster namespace (StatefulSet, Services, etc.)
- PersistentVolumeClaims and their associated volume data (contains the virtual cluster's etcd data)
- Secrets and ConfigMaps
When restored, the vCluster pods will restart and the virtual cluster will be recreated from the backed-up etcd data.
If using namespace syncing, back up all synced namespaces on the host cluster as well.
The recovery procedure depends on your StatefulSet podManagementPolicy configuration. vCluster version 0.20 and later use Parallel by default. Earlier versions used OrderedReady.
If more than one pod is down with podManagementPolicy: OrderedReady, you must first migrate to Parallel before attempting recovery.
Check your configuration:
kubectl get statefulset my-vcluster -n vcluster-my-team -o jsonpath='{.spec.podManagementPolicy}'
- Parallel (default): follow the steps immediately below.
- OrderedReady (legacy): migrate to Parallel first, as described further below.
First, identify the PVC for replica-0:
kubectl get pvc -l app=vcluster -n vcluster-my-team
The PVC name typically follows the pattern data-<vcluster-name>-0 but may vary if customized in your configuration. Note the exact name from the output above, then delete the corrupted pod and its PVC:
kubectl delete pod my-vcluster-0 -n vcluster-my-team
kubectl delete pvc data-my-vcluster-0 -n vcluster-my-team
The pod restarts with a new empty PVC. The initial attempts fail because the new member tries to join the existing etcd cluster but lacks the required data. After 1-3 pod restarts, vCluster's automatic recovery detects the empty member and properly adds it as a new learner, allowing it to sync data from healthy members and join the cluster.
Monitor the recovery process:
kubectl get pods -l app=vcluster -n vcluster-my-team -w
Check the logs to verify the pod rejoins successfully:
kubectl logs -f my-vcluster-0 -n vcluster-my-team
If more than one pod is down with podManagementPolicy: OrderedReady, migrate to Parallel first before attempting recovery.
Check that the StatefulSet retains PVCs on deletion:
kubectl get statefulset my-vcluster -n vcluster-my-team -o jsonpath='{.spec.persistentVolumeClaimRetentionPolicy}'
The policy should be Retain. This is the default but can be overridden by controlPlane.statefulSet.persistence.volumeClaim.retentionPolicy in your configuration.
Delete the StatefulSet without deleting the pods:
kubectl delete statefulset my-vcluster -n vcluster-my-team --cascade=orphan
Update your virtual cluster configuration to use the Parallel pod management policy.
If using a VirtualClusterInstance, edit the instance and update the podManagementPolicy:
kubectl edit virtualclusterinstance my-vcluster -n vcluster-my-team
Then add or update this section in the spec:
spec:
  template:
    helmRelease:
      values: |
        controlPlane:
          statefulSet:
            scheduling:
              podManagementPolicy: Parallel
If using Helm, update your values.yaml to set the pod management policy:
controlPlane:
  statefulSet:
    scheduling:
      podManagementPolicy: Parallel
Then apply the update:
helm upgrade my-vcluster vcluster --repo https://charts.loft.sh --namespace vcluster-my-team --reuse-values -f values.yaml
The StatefulSet is recreated with the Parallel policy and pods pick up the existing PVCs.
Now follow the same procedure as for Parallel mode.
First, identify the PVC for replica-0:
kubectl get pvc -l app=vcluster -n vcluster-my-team
The PVC name typically follows the pattern data-<vcluster-name>-0 but may vary if customized in your configuration. Note the exact name from the output above, then delete the corrupted pod and its PVC:
kubectl delete pod my-vcluster-0 -n vcluster-my-team
kubectl delete pvc data-my-vcluster-0 -n vcluster-my-team
The pod restarts with a new empty PVC. The initial attempts fail because the new member tries to join the existing etcd cluster but lacks the required data. After 1-3 pod restarts, vCluster's automatic recovery detects the empty member and properly adds it as a new learner, allowing it to sync data from healthy members and join the cluster.
Never clone PVCs from other replicas. Cloning PVCs causes etcd member ID conflicts and results in data loss.
Complete data loss recovery
This recovery method results in data loss up to the last backup point. Only proceed if you have verified that all etcd replicas are corrupted and no working replicas remain.
When the majority of etcd member replicas become corrupted or deleted simultaneously, the entire cluster requires recovery from backup.
Before starting recovery, ensure you have:
- Created a snapshot using vcluster snapshot create <vcluster-name> --include-volumes <storage-location>
- The snapshot location URL (for example, s3://my-bucket/backup or oci://registry/repo:tag)
- Access to the host cluster namespace where the vCluster is deployed
For detailed snapshot creation instructions, see Create snapshots.
Verify all PVCs are corrupted or inaccessible:
kubectl get pvc -l app=vcluster -n vcluster-my-team
kubectl describe pvc data-my-vcluster-0 data-my-vcluster-1 data-my-vcluster-2 -n vcluster-my-team
Stop all vCluster instances before beginning recovery:
kubectl scale statefulset my-vcluster --replicas=0 -n vcluster-my-team
Verify all pods have terminated:
kubectl get pods -l app=vcluster -n vcluster-my-team
PVC deletion timing
After scaling down, wait a few seconds to ensure pods have fully terminated before deleting PVCs. If a pod restarts immediately after PVC deletion, the PVC may get stuck in a "Terminating" state. If this happens, delete the pod again to allow the PVC deletion to complete.
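If a PVC does get stuck in Terminating, a sketch of the unblock step, using the sample names from this page (adjust the pod name to whichever replica restarted):
kubectl get pvc -l app=vcluster -n vcluster-my-team
kubectl delete pod my-vcluster-0 -n vcluster-my-team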
Delete all corrupted PVCs:
kubectl delete pvc data-my-vcluster-0 data-my-vcluster-1 data-my-vcluster-2 -n vcluster-my-team
Verify PVCs are deleted:
kubectl get pvc -l app=vcluster -n vcluster-my-team
Expected output:
No resources found
Why scale up before restore?
The vCluster CLI requires an accessible vCluster instance to execute the restore command. Scaling up creates a new, empty vCluster that the CLI can connect to. The vcluster restore command will then scale it back down automatically, restore the etcd data from the snapshot, and restart the vCluster with restored data.
Scale up to the desired number of replicas:
kubectl scale statefulset my-vcluster --replicas=3 -n vcluster-my-team
Wait for pods to be running:
kubectl get pods -l app=vcluster -n vcluster-my-team
Expected output showing all replicas running:
NAME READY STATUS RESTARTS AGE
my-vcluster-0 1/1 Running 0 45s
my-vcluster-1 1/1 Running 0 43s
my-vcluster-2 1/1 Running 0 41s
Use the vCluster CLI to restore from your snapshot. The restore process will:
- Pause the vCluster (scale down to 0)
- Delete the current PVCs
- Start a snapshot pod to restore etcd data
- Restore PVCs from volume snapshots
- Resume the vCluster (scale back up)
vcluster restore my-vcluster s3://my-bucket/backup -n vcluster-my-team
Expected output:
16:16:38 info Pausing vCluster my-vcluster
16:16:38 info Scale down statefulSet vcluster-my-team/my-vcluster...
16:16:39 info Deleting vCluster pvc vcluster-my-team/data-my-vcluster-0
16:16:39 info Deleting vCluster pvc vcluster-my-team/data-my-vcluster-1
16:16:39 info Deleting vCluster pvc vcluster-my-team/data-my-vcluster-2
16:16:39 info Starting snapshot pod for vCluster vcluster-my-team/my-vcluster...
...
Successfully restored snapshot
16:16:42 info Resuming vCluster my-vcluster
Authentication for remote storage
If using S3 or OCI registry, ensure you have the appropriate credentials configured:
- S3: Use AWS CLI credentials or pass credentials in the URL
- OCI: Use Docker login or pass credentials in the URL
See Create snapshots for authentication details.
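As a minimal sketch (the exact credential mechanisms are covered in Create snapshots), assuming AWS CLI credentials for S3 and a registry that accepts Docker logins for OCI:
aws configure        # configure AWS CLI credentials for S3 access
docker login ghcr.io # log in to the OCI registry (ghcr.io is an example)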
Connect to the vCluster and verify your workloads are restored:
vcluster connect my-vcluster -n vcluster-my-team
Check that your resources are present:
kubectl get pods -A
kubectl get pvc -A
If everything looks correct, disconnect:
vcluster disconnect
Config reference
embedded required object
Embedded defines to use embedded etcd as a storage backend for the virtual cluster.
enabled required boolean false
Enabled defines if the embedded etcd should be used.
migrateFromDeployedEtcd required boolean false
MigrateFromDeployedEtcd signals that vCluster should migrate from the deployed external etcd to embedded etcd.
snapshotCount required integer
SnapshotCount defines the number of snapshots to keep for the embedded etcd. Defaults to 10000 if less than 1.
extraArgs required string[] []
ExtraArgs are additional arguments to pass to the embedded etcd.