@caruccio
Last active August 14, 2024 15:23
Migrate an EBS-based PVC to another cluster and/or availability zone
#!/bin/bash
#
# 05-08-2024: Complete rewrite
#
usage()
{
cat <<-EOF
Usage: $0 <REQUIRED-PARAMETERS...> [OPTIONAL-PARAMETERS...]
This script can clone a PV/PVC to:
- A new PV/PVC in the same namespace (same cluster)
- Another namespace (same or new cluster)
- A new Availability Zone (same or new cluster)
- A new cluster
Both clusters and AZs must be in the same AWS region and account.
Required parameters
-------------------
--source-namespace=''
Namespace to clone PVC from.
--source-pvc=''
PVC name to clone from.
Optional parameters
-------------------
--aws-profile=''
AWS profile name to use. AWS auth methods work as usual.
--delete-snapshot
Delete the intermediate snapshot after the migration completes.
--dry-run
Only print changes.
--kubectl-bin-source='kubectl'
kubectl binary used for the source cluster. Useful when cluster versions are too far skewed.
--kubectl-bin-target='kubectl'
kubectl binary used for the target cluster. Useful when cluster versions are too far skewed.
--source-aws-zone=''
AWS zone of the volume. Auto discovered if unspecified.
--source-cluster=''
Kubeconfig context to read PVC/PV from. Defaults to current context.
--source-snapshot-id=''
Snapshot ID to create target volume from.
--source-workload=''
Kind/Name of the workload to stop/start during the cloning.
--target-aws-zone=''
AWS zone to restore the snapshot. Defaults to same as source.
--target-cluster=''
Kubeconfig context to create the target PVC/PV. Defaults to same cluster.
--target-cluster-id=''
AWS cluster ID for target volume. You can find it in AWS tag 'kubernetes.io/cluster/{ID}'.
--target-fs-type=''
Filesystem type for target volume. Defaults to same as source.
--target-namespace=''
Namespace to clone PVC to.
--target-pv=''
PV name for the clone. Defaults to a CSI-style name with a random UUID (pvc-<uuid>).
--target-pvc=''
PVC name for the clone. Defaults to source PVC name.
--target-storage-class=''
Storage Class name for the target PVC. This is required if the target cluster
doesn't have a storageClass with the same name.
--target-volume-type=''
EBS volume type for target volume. Usually 'gp2' or 'gp3'.
If --source-cluster and --target-cluster are the same, --source-pvc and --target-pvc can't have the
same name if they are in the same namespace.
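Example (illustrative values only; adjust namespaces, contexts and zones to your environment):
$0 --source-namespace=my-namespace --source-pvc=my-data --source-workload=deployment/my-app --target-cluster=my-target-context --target-aws-zone=us-east-1b --dry-run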
EOF
exit ${1:-0}
}
if [ $# -eq 0 ]; then
usage
fi
set -a # export all vars
SOURCE_KUBECTL=kubectl
TARGET_KUBECTL=kubectl
SOURCE_SNAPSHOT_ID=''
DELETE_SNAPSHOT=false
DRY_RUN=false
SOURCE_DRY_RUN=''
TARGET_DRY_RUN=''
DRY_RUN_CMD=''
declare -A BOOL_FLAGS
BOOL_FLAGS[--dry-run]=true
BOOL_FLAGS[--delete-snapshot]=true
while [ $# -gt 0 ]; do
opt=$1
val=""
if [[ "$opt" =~ --[a-z0-9]+(-[a-z0-9]+){0,}= ]]; then
val=${opt#*=}
opt=${opt%%=*}
shift
else
if ${BOOL_FLAGS[$opt]:-false}; then
# boolean flag: takes no value
val="true"
shift
else
# option value given as the next argument: --opt value
val="${2:-}"
shift 2 || usage 1
fi
fi
case $opt in
--aws-profile) AWS_PROFILE=$val ;;
--delete-snapshot) DELETE_SNAPSHOT=true ;;
--dry-run) DRY_RUN=true DRY_RUN_CMD=debug ;;
--kubectl-bin-source) SOURCE_KUBECTL=$val ;;
--kubectl-bin-target) TARGET_KUBECTL=$val ;;
--source-aws-zone) SOURCE_AWS_ZONE=$val ;;
--source-cluster) SOURCE_CLUSTER=$val ;;
--source-namespace) SOURCE_NAMESPACE=$val ;; # required
--source-pvc) SOURCE_PVC=$val ;; # required
--source-snapshot-id) SOURCE_SNAPSHOT_ID=$val ;;
--source-workload) SOURCE_WORKLOAD=$val ;;
--target-aws-zone) TARGET_AWS_ZONE=$val ;;
--target-cluster) TARGET_CLUSTER=$val ;;
--target-cluster-id) TARGET_CLUSTER_ID=$val ;;
--target-fs-type) TARGET_FS_TYPE=$val ;;
--target-namespace) TARGET_NAMESPACE=$val ;; # required
--target-pv) TARGET_PV=$val ;;
--target-pvc) TARGET_PVC=$val ;;
--target-storage-class) TARGET_STORAGE_CLASS=$val ;;
--target-volume-type) TARGET_VOLUME_TYPE=$val ;;
*) usage
esac
done
function info()
{
echo -e $'\E[1;96m+' "$*" $'\E[0m'
}
function debug()
{
echo -e $'\E[2;96m+' "$*" $'\E[0m'
}
function err()
{
echo -e $'\E[1;31m+' "$*" $'\E[0m'
}
if $DRY_RUN; then
debug 'Dry-run: nothing will be created/changed'
SOURCE_KUBECTL_VERSION_MINOR=$($SOURCE_KUBECTL version -o json --client | jq -r '.clientVersion.minor')
TARGET_KUBECTL_VERSION_MINOR=$($TARGET_KUBECTL version -o json --client | jq -r '.clientVersion.minor')
[ "$SOURCE_KUBECTL_VERSION_MINOR" -ge 18 ] && SOURCE_DRY_RUN='--dry-run=client -o yaml' || SOURCE_DRY_RUN='--dry-run=true -o yaml'
[ "$TARGET_KUBECTL_VERSION_MINOR" -ge 18 ] && TARGET_DRY_RUN='--dry-run=client -o yaml' || TARGET_DRY_RUN='--dry-run=true -o yaml'
fi
for required in SOURCE_NAMESPACE SOURCE_PVC; do
if ! [ -v $required ]; then
param=${required,,}
param=${param//_/-}
err "Missing required parameter: $param"
exit 1
fi
done
if ! [ -v SOURCE_CLUSTER ]; then
SOURCE_CLUSTER=$($SOURCE_KUBECTL config current-context)
fi
if ! [ -v TARGET_CLUSTER ]; then
TARGET_CLUSTER="$SOURCE_CLUSTER"
fi
if ! [ -v TARGET_NAMESPACE ]; then
TARGET_NAMESPACE=$SOURCE_NAMESPACE
fi
if ! [ -v TARGET_PVC ]; then
TARGET_PVC=$SOURCE_PVC
fi
if [ "$SOURCE_CLUSTER" == "$TARGET_CLUSTER" ]; then
if [ "$SOURCE_NAMESPACE" == "$TARGET_NAMESPACE" ]; then
if [ "$SOURCE_PVC" == "$TARGET_PVC" ]; then
err "Can't clone PVC to same cluster/namespace/name"
exit 1
fi
fi
fi
info "Checking if can reach cluster(s)"
for context in "$SOURCE_CLUSTER" "$TARGET_CLUSTER"; do
if ! $SOURCE_KUBECTL config get-contexts -o name | grep -q "^$context\$"; then
err "Cluster not found: $context"
exit 1
fi
if ! $SOURCE_KUBECTL version --context="$context" &>/dev/null; then
err "Unable to reach cluster: $context"
err "Please try it with: kubectl version --context=\"$context\""
exit 1
fi
done
SOURCE_KUBECTL+=" --context=$SOURCE_CLUSTER"
TARGET_KUBECTL+=" --context=$TARGET_CLUSTER"
if ! [ -v TARGET_PV ]; then
TARGET_PV=pvc-$(</proc/sys/kernel/random/uuid)
fi
if [ -v SOURCE_WORKLOAD ]; then
if ! [[ $SOURCE_WORKLOAD =~ .*/.* ]]; then
err "Invalid workload name: Expecting kind/name. Got: $SOURCE_WORKLOAD"
exit 1
else
info "Reading workload replicas"
WORKLOAD_REPLICAS=$($SOURCE_KUBECTL get -n $SOURCE_NAMESPACE $SOURCE_WORKLOAD --template={{.spec.replicas}})
fi
else
SOURCE_WORKLOAD=""
WORKLOAD_REPLICAS=0
fi
set -eu
info "Reading source PVC"
DATA_SOURCE_PVC="$($SOURCE_KUBECTL get -n $SOURCE_NAMESPACE pvc $SOURCE_PVC -o json)"
SOURCE_PV=$(jq -r '.spec.volumeName' <<<$DATA_SOURCE_PVC)
SOURCE_STORAGE=$(jq -r '.spec.resources.requests.storage // empty' <<<$DATA_SOURCE_PVC)
SOURCE_STORAGE_CLASS=$(jq -r '
.spec.storageClassName
// .metadata.annotations["volume.kubernetes.io/storage-class"]
// .metadata.annotations["volume.beta.kubernetes.io/storage-class"]
// empty' <<<$DATA_SOURCE_PVC)
if ! [ -v TARGET_STORAGE_CLASS ]; then
DATA_TARGET_STORAGE_CLASS=$($TARGET_KUBECTL get sc -o json | jq -r '.items[] | select(.metadata.annotations["storageclass.kubernetes.io/is-default-class"]=="true") // empty')
TARGET_STORAGE_CLASS=$(jq -r '.metadata.name' <<<$DATA_TARGET_STORAGE_CLASS)
fi
if ! [ -v TARGET_STORAGE_CLASS ] || [ -z "$TARGET_STORAGE_CLASS" ]; then
err "Unable to find default target storageclass. Please specify one with --target-storage-class"
exit 1
fi
DATA_SOURCE_STORAGE_CLASS="$($SOURCE_KUBECTL get sc $SOURCE_STORAGE_CLASS -o json)"
DATA_TARGET_STORAGE_CLASS="$($TARGET_KUBECTL get sc $TARGET_STORAGE_CLASS -o json)"
if [ -z "$DATA_SOURCE_STORAGE_CLASS" ]; then
err "Source storage class not found: $SOURCE_STORAGE_CLASS"
exit 1
elif [ -z "$DATA_TARGET_STORAGE_CLASS" ]; then
err "Target storage class not found: $TARGET_STORAGE_CLASS"
exit 1
fi
if ! [ -v TARGET_VOLUME_TYPE ]; then
TARGET_VOLUME_TYPE=$(jq -r '.parameters.type // empty' <<<$DATA_TARGET_STORAGE_CLASS)
fi
if [ -z "$TARGET_VOLUME_TYPE" ]; then
err "Unable to determine target EBS volume type"
err "Please check field .parameters.type from target's storageclass/$TARGET_STORAGE_CLASS"
exit 1
fi
info "Reading source PV"
DATA_SOURCE_PV="$($SOURCE_KUBECTL get -n $SOURCE_NAMESPACE pv $SOURCE_PV -o json)"
SOURCE_VOLUME_ID=$(jq -r '.spec.csi.volumeHandle // .spec.awsElasticBlockStore.volumeID // empty' <<<$DATA_SOURCE_PV | awk -F/ '{print $NF}')
SOURCE_FS_TYPE=$(jq -r '.spec.csi.fsType // .spec.awsElasticBlockStore.fsType // empty' <<<$DATA_SOURCE_PV)
SOURCE_VOLUME_MODE=$(jq -r '.spec.volumeMode // "Filesystem"' <<<$DATA_SOURCE_PV)
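# Discover the source AZ: prefer the CSI nodeAffinity zone selector (topology.ebs.csi.aws.com/zone),
# falling back to the zone embedded in a legacy aws://<zone>/<volume-id> volumeID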
SOURCE_AWS_ZONE=$(jq -r 'try(
.spec.nodeAffinity.required.nodeSelectorTerms[] |
.matchExpressions[] |
select(.key=="topology.ebs.csi.aws.com/zone" and .operator=="In") |
.values[0]) // (.spec.awsElasticBlockStore.volumeID | split("/")[2])' <<<$DATA_SOURCE_PV)
if [ -z "$SOURCE_AWS_ZONE" ] || ! [[ "$SOURCE_AWS_ZONE" =~ [a-z]+-[a-z]+-[0-9][a-z] ]]; then
err "Unable to discover AWS Zone for source PV $SOURCE_PV"
err "Please specify one with --source-aws-zone"
err "Found zone: '$SOURCE_AWS_ZONE'"
exit 1
fi
TARGET_STORAGE=$SOURCE_STORAGE
TARGET_VOLUME_MODE=$SOURCE_VOLUME_MODE
if ! [ -v TARGET_FS_TYPE ]; then
TARGET_FS_TYPE=$SOURCE_FS_TYPE
fi
if ! [ -v TARGET_AWS_ZONE ]; then
TARGET_AWS_ZONE=$SOURCE_AWS_ZONE
fi
export AWS_DEFAULT_REGION=${SOURCE_AWS_ZONE:0:-1}
info "Checking if can reach AWS"
DATA_SOURCE_VOLUME=$(aws ec2 describe-volumes --volume-ids $SOURCE_VOLUME_ID | jq -r '.Volumes[0] // empty' 2>/dev/null)
if [ -z "$DATA_SOURCE_VOLUME" ]; then
err "Unable to read volume $SOURCE_VOLUME_ID from AWS Zone $SOURCE_AWS_ZONE"
err "Maybe credentials are unset or invalid?"
err "You can use flags --aws-profile and to specify your credentials"
exit 1
fi
SOURCE_CLUSTER_ID=$(jq -r '.Tags[] | select(.Key|startswith("kubernetes.io/cluster/")) | .Key | split("/")[2] // empty' <<<$DATA_SOURCE_VOLUME)
if [ "$SOURCE_CLUSTER" == "$TARGET_CLUSTER" ]; then
TARGET_CLUSTER_ID="$SOURCE_CLUSTER_ID"
fi
if ! [ -v TARGET_CLUSTER_ID ]; then
err "Missing required parameter --target-cluster-id={ID}"
exit 1
fi
if $TARGET_KUBECTL get -n $TARGET_NAMESPACE pvc $TARGET_PVC &>/dev/null; then
err "Target PVC already exists: $TARGET_NAMESPACE/$TARGET_PVC"
err "Please delete it first and try again"
exit 1
fi
if $TARGET_KUBECTL get -n $TARGET_NAMESPACE pv $TARGET_PV &>/dev/null; then
err "Target PV already exists: $TARGET_NAMESPACE/$TARGET_PV"
err "Please delete it first and try again"
exit 1
fi
TARGET_RETAIN_POLICY=$(jq -r '.reclaimPolicy // "Retain"' <<<$DATA_TARGET_STORAGE_CLASS)
TARGET_PROVISIONER=$(jq -r '.provisioner // empty' <<<$DATA_TARGET_STORAGE_CLASS)
TARGET_CSI_PROVISIONER_IDENTITY=''
case "$TARGET_PROVISIONER" in
ebs.csi.aws.com)
# it doesn't need to be unique for each volume, only for each provisioner
# https://github.com/kubernetes-csi/external-provisioner/blob/1194963/cmd/csi-provisioner/csi-provisioner.go#L283
#TARGET_CSI_PROVISIONER_IDENTITY=1722555589472-9999-ebs.csi.aws.com
TARGET_CSI_PROVISIONER_IDENTITY=$($TARGET_KUBECTL get leases --all-namespaces -o json \
| jq -r '.items[] | select(.metadata.name=="ebs-csi-aws-com") | .spec.holderIdentity // empty' \
| sed 's/ebs-csi-aws-com/ebs.csi.aws.com/')
;;
kubernetes.io/aws-ebs) :
;;
*)
err "Unable to determine storageclass provisioner for target volume"
exit 1
esac
BOLD='\E[1m'
RESET='\E[0m'
echo -e $"
Summary:
-------------------------------------------------------------------------------
AWS:
Profile: $BOLD${AWS_PROFILE:-(none)}$RESET
Region: $BOLD$AWS_DEFAULT_REGION$RESET
Source:
Cluster: $BOLD$SOURCE_CLUSTER (ID=$SOURCE_CLUSTER_ID)$RESET
Namespace: $BOLD$SOURCE_NAMESPACE$RESET
PV/PVC: $BOLD$SOURCE_PV/$SOURCE_PVC ($SOURCE_STORAGE)$RESET
VolumeID: $BOLD$SOURCE_VOLUME_ID$RESET
VolumeMode: $BOLD$SOURCE_VOLUME_MODE$RESET
SnapshotID: $BOLD${SOURCE_SNAPSHOT_ID:-auto}$RESET
AWS Zone: $BOLD$SOURCE_AWS_ZONE$RESET
Workload: $BOLD${SOURCE_WORKLOAD:-(none)} (Replicas: $WORKLOAD_REPLICAS)$RESET
StorageClass: $BOLD${SOURCE_STORAGE_CLASS:-(none)}$RESET
Target:
Cluster: $BOLD$TARGET_CLUSTER (ID=$TARGET_CLUSTER_ID)$RESET
Namespace: $BOLD$TARGET_NAMESPACE$RESET
PV/PVC: $BOLD$TARGET_PV/$TARGET_PVC$RESET
AWS Zone: $BOLD$TARGET_AWS_ZONE$RESET
ProvisionerID: $BOLD${TARGET_CSI_PROVISIONER_IDENTITY:-(none)}$RESET
StorageClass: $BOLD${TARGET_STORAGE_CLASS:-(none)}$RESET
-------------------------------------------------------------------------------
"
read -p 'Press [ENTER] to start '
echo
if [ "$($SOURCE_KUBECTL get pv $SOURCE_PV -o jsonpath={.spec.persistentVolumeReclaimPolicy})" != "Retain" ]; then
info "Setting reclaimPolicy=Retain for source PV in orther to avoid accidental deletion"
$DRY_RUN_CMD $SOURCE_KUBECTL patch pv $SOURCE_PV -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
fi
if [ -n "$SOURCE_WORKLOAD" ]; then
info "Refreshing workload replicas"
WORKLOAD_REPLICAS=$($SOURCE_KUBECTL get -n $SOURCE_NAMESPACE $SOURCE_WORKLOAD --template={{.spec.replicas}})
if [ $WORKLOAD_REPLICAS -gt 0 ]; then
info "Scaling down $SOURCE_WORKLOAD: $WORKLOAD_REPLICAS -> 0"
$DRY_RUN_CMD $SOURCE_KUBECTL scale -n $SOURCE_NAMESPACE $SOURCE_WORKLOAD --replicas=0
while ! $DRY_RUN; do
replicas="$($SOURCE_KUBECTL get -n $SOURCE_NAMESPACE $SOURCE_WORKLOAD --template={{.status.replicas}})"
[ "$replicas" == "0" ] && break || true
[ "$replicas" == "<no value>" ] && break || true
debug "Waiting for pod(s) to terminate: $replicas remaining"
sleep 1
done
fi
fi
function create_target_volume()
{
if $DRY_RUN; then
debug 'Dry-run: Skipping target volume creation'
TARGET_VOLUME_ID='vol-00000000000000000'
return
fi
DESCRIPTION="Cloned from cluster=$SOURCE_CLUSTER ns=$SOURCE_NAMESPACE, pvc=$SOURCE_PVC, pv=$SOURCE_PV, volumeId=$SOURCE_VOLUME_ID"
info "Waiting for volume $SOURCE_VOLUME_ID to become available"
debug "Tip: to force detach, execute in a separated terminal: aws ec2 detach-volume --volume-id $SOURCE_VOLUME_ID"
aws ec2 wait volume-available --volume-id $SOURCE_VOLUME_ID
if [ -n "$SOURCE_SNAPSHOT_ID" ]; then
info "Using existing snapshot $SOURCE_SNAPSHOT_ID"
else
info "Creating snapshot from $SOURCE_VOLUME_ID"
SOURCE_SNAPSHOT_ID=$(aws ec2 create-snapshot --volume-id $SOURCE_VOLUME_ID --description "$DESCRIPTION" --output text --query SnapshotId)
SOURCE_SNAPSHOT_PROGRESS=''
while [ "$SOURCE_SNAPSHOT_PROGRESS" != "100%" ]; do
sleep 3
SOURCE_SNAPSHOT_PROGRESS=$(aws ec2 describe-snapshots --snapshot-ids $SOURCE_SNAPSHOT_ID --query "Snapshots[*].Progress" --output text)
info "Snapshot ID: $SOURCE_SNAPSHOT_ID $SOURCE_SNAPSHOT_PROGRESS"
done
fi
aws ec2 wait snapshot-completed --filter Name=snapshot-id,Values=$SOURCE_SNAPSHOT_ID
info "Creating volume from snapshot $SOURCE_SNAPSHOT_ID"
TAG_SPEC="
ResourceType=volume,
Tags=[
{ Key=ebs.csi.aws.com/cluster, Value=true },
{ Key=kubernetes.io/cluster/$TARGET_CLUSTER_ID, Value=owned },
{ Key=CSIVolumeName, Value=$TARGET_PV },
{ Key=kubernetesCluster, Value=$TARGET_CLUSTER_ID },
{ Key=Name, Value=$TARGET_CLUSTER_ID-dynamic-$TARGET_PV },
{ Key=kubernetes.io/created-for/pv/name, Value=$TARGET_PV },
{ Key=kubernetes.io/created-for/pvc/name, Value=$TARGET_PVC },
{ Key=kubernetes.io/created-for/pvc/namespace, Value=$TARGET_NAMESPACE }
]"
TARGET_VOLUME_ID=$(aws ec2 create-volume \
--availability-zone $TARGET_AWS_ZONE \
--snapshot-id $SOURCE_SNAPSHOT_ID \
--volume-type $TARGET_VOLUME_TYPE \
--output text \
--query VolumeId \
--tag-specifications "${TAG_SPEC//[[:space:]]/}")
}
create_target_volume
info "Created target volume: $TARGET_VOLUME_ID"
info "Creating target PVC $TARGET_NAMESPACE/$TARGET_PVC"
function yamlobj()
{
local filename="$2"
case "$1" in
create) cat > "$filename" ;;
append) cat >> "$filename" ;;
*) err "yamlobj: invalid parameter: $1"; exit 1
esac
}
FILE_TARGET_PVC_YAML=$TARGET_PVC.yaml
yamlobj create $FILE_TARGET_PVC_YAML <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: $TARGET_PVC
  namespace: $TARGET_NAMESPACE
  annotations:
    volume.beta.kubernetes.io/storage-provisioner: $TARGET_PROVISIONER
    volume.kubernetes.io/storage-provisioner: $TARGET_PROVISIONER
  finalizers:
  - kubernetes.io/pvc-protection
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: $TARGET_STORAGE
  storageClassName: $TARGET_STORAGE_CLASS
  volumeMode: $TARGET_VOLUME_MODE
  volumeName: $TARGET_PV
EOF
$TARGET_KUBECTL apply -f $FILE_TARGET_PVC_YAML $TARGET_DRY_RUN
if $DRY_RUN; then
DATA_TARGET_PVC='{"metadata":{"uid":"00000000-0000-0000-0000-000000000000"}}'
else
DATA_TARGET_PVC=$($TARGET_KUBECTL get -n $TARGET_NAMESPACE pvc $TARGET_PVC -o json)
fi
TARGET_PVC_UID=$(jq -r '.metadata.uid' <<<$DATA_TARGET_PVC)
info "Creating target PV $TARGET_NAMESPACE/$TARGET_PV"
FILE_TARGET_PV_YAML=$TARGET_PV.yaml
yamlobj create $FILE_TARGET_PV_YAML <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: $TARGET_PV
  labels:
    failure-domain.beta.kubernetes.io/region: $AWS_DEFAULT_REGION
    failure-domain.beta.kubernetes.io/zone: $TARGET_AWS_ZONE
EOF
if [ "$TARGET_PROVISIONER" == 'ebs.csi.aws.com' ]; then
yamlobj append $FILE_TARGET_PV_YAML <<EOF
  annotations:
    pv.kubernetes.io/provisioned-by: $TARGET_PROVISIONER
    volume.kubernetes.io/provisioner-deletion-secret-name: ""
    volume.kubernetes.io/provisioner-deletion-secret-namespace: ""
EOF
fi
yamlobj append $FILE_TARGET_PV_YAML <<EOF
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: $TARGET_STORAGE
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: $TARGET_PVC
    namespace: $TARGET_NAMESPACE
    uid: $TARGET_PVC_UID
  persistentVolumeReclaimPolicy: $TARGET_RETAIN_POLICY
  storageClassName: $TARGET_STORAGE_CLASS
  volumeMode: $TARGET_VOLUME_MODE
EOF
if [ "$TARGET_PROVISIONER" == 'ebs.csi.aws.com' ]; then
yamlobj append $FILE_TARGET_PV_YAML <<EOF
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.ebs.csi.aws.com/zone
          operator: In
          values:
          - $TARGET_AWS_ZONE
  csi:
    driver: ebs.csi.aws.com
    fsType: $TARGET_FS_TYPE
    volumeAttributes:
      storage.kubernetes.io/csiProvisionerIdentity: $TARGET_CSI_PROVISIONER_IDENTITY
    volumeHandle: $TARGET_VOLUME_ID
EOF
elif [ "$TARGET_PROVISIONER" == 'kubernetes.io/aws-ebs' ]; then
yamlobj append $FILE_TARGET_PV_YAML <<EOF
  awsElasticBlockStore:
    fsType: $TARGET_FS_TYPE
    volumeID: aws://$TARGET_AWS_ZONE/$TARGET_VOLUME_ID
EOF
fi
$TARGET_KUBECTL apply -f $FILE_TARGET_PV_YAML $TARGET_DRY_RUN
if $DRY_RUN; then
debug "Dry-run: Skipping test pod creation"
exit 0
fi
TEST_POD=test-$TARGET_PV
FILE_TEST_POD_YAML=$TEST_POD.yaml
yamlobj create $FILE_TEST_POD_YAML <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: $TEST_POD
  namespace: $TARGET_NAMESPACE
spec:
  containers:
  - name: alpine
    image: alpine
    command:
    - /bin/cat
    tty: true
    stdin: true
    volumeMounts:
    - name: data
      mountPath: /data
      readOnly: true
  tolerations:
  - effect: NoSchedule
    operator: Exists
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: $TARGET_PVC
  restartPolicy: Never
  terminationGracePeriodSeconds: 0
EOF
info "Creating test pod $TEST_POD"
$TARGET_KUBECTL create -f $FILE_TEST_POD_YAML
$TARGET_KUBECTL wait -n $TARGET_NAMESPACE pod $TEST_POD --for=condition=Ready --timeout=180s
$TARGET_KUBECTL exec -n $TARGET_NAMESPACE -it $TEST_POD -- ls -la /data
info "Deleting test pod $TEST_POD"
$TARGET_KUBECTL delete pod -n $TARGET_NAMESPACE $TEST_POD
if ${DELETE_SNAPSHOT}; then
info "Deleting created snapshot: $SOURCE_SNAPSHOT_ID"
$DRY_RUN_CMD aws ec2 delete-snapshot --snapshot-id $SOURCE_SNAPSHOT_ID
fi
info Finished
echo
$TARGET_KUBECTL -n $TARGET_NAMESPACE get pv/$TARGET_PV pvc/$TARGET_PVC
@derjohn commented Jan 12, 2024

Thx, that saved me time and headaches.
In my case it is EKS + the CSI plugin; there the volumeId is at a different path in the object:

SOURCE_VOLUMEID=$(kubectl -n $SOURCE_NAMESPACE get pv $SOURCE_PVNAME '--template={{.spec.csi.volumeHandle}}')

@caruccio (Author)

Nice! Will add this to the script.

@caruccio (Author)

@derjohn please validate this works for you.
Thanks

@derjohn commented Jan 13, 2024

Hello @caruccio ,
great, it works for me (TM) :-) ... I made two further changes.
1.) Auto-generate the target PV name in the style the CSI driver uses itself.
2.) Auto-delete the target PVC. So, if you set source-pvc and target-pvc to the same value, it will delete the PVC and re-create it with the same name. (In my case the pods are already in Pending state, so the PVC can be deleted and re-created and the pods start immediately. Cool!)
I use it like this:
./migrate-pv-to-zone.sh kubecost kubecost-cost-analyzer kubecost kubecost-cost-analyzer eu-central-1b

Here is my diff proposal:

--- migrate-pv-to-zone.sh.orig  2024-01-13 12:03:59.522382481 +0100
+++ migrate-pv-to-zone.sh       2024-01-13 12:03:51.123353050 +0100
@@ -2,4 +2,4 @@
 
-if [ $# -lt 6 ]; then
-    echo "Usage: $0 [source-namespace] [source-pvc-name] [target-namespace] [target-pvc-name] [target-aws-zone] [target-pv-name] [kind/workload=None]"
+if [ $# -lt 5 ]; then
+    echo "Usage: $0 <source-namespace> <source-pvc-name> <target-namespace> <target-pvc-name> <target-aws-zone> [<target-pv-name>] [<kind/workload=None>]"
     echo "Clone EBS, PV and PVC from source to target. Will stop kind/workload if defined."
@@ -15,3 +15,3 @@
 TARGET_ZONE=$5
-TARGET_PVNAME=$6
+TARGET_PVNAME=${6:-"pvc-$(cat /proc/sys/kernel/random/uuid)"}
 
@@ -104,2 +104,3 @@
 echo Creating new PV/PVC...
+kubectl delete -n $TARGET_NAMESPACE pvc $TARGET_PVCNAME ||:
 kubectl apply -f - <<EOF

@rlanore commented Mar 29, 2024

Thank you so much for this. I have made some updates to fit my use case:

  • Add source zone name info
  • Add a cluster name variable, used in some volume tags
  • Update the AWS volume tags applied to the new volume
  • Add a waiter on snapshot progress to avoid the max-attempts error
  • Switch to gp3
  • Set the source PV reclaim policy to Retain

my diff:

--- migrate-pv-to-zone.sh.orig  2024-03-29 11:08:26.755378764 +0100
+++ migrate-pv-to-zone.sh       2024-03-29 11:12:55.651384780 +0100
@@ -1,7 +1,7 @@
 #!/bin/bash

-if [ $# -lt 6 ]; then
-    echo "Usage: $0 [source-namespace] [source-pvc-name] [target-namespace] [target-pvc-name] [target-aws-zone] [target-pv-name] [kind/workload=None]"
+if [ $# -lt 5 ]; then
+    echo "Usage: $0 <source-namespace> <source-pvc-name> <target-namespace> <target-pvc-name> <target-aws-zone> [<target-pv-name>] [<kind/workload=None>]"
     echo "Clone EBS, PV and PVC from source to target. Will stop kind/workload if defined."
     exit
 fi
@@ -13,7 +13,9 @@
 TARGET_NAMESPACE=$3
 TARGET_PVCNAME=$4
 TARGET_ZONE=$5
-TARGET_PVNAME=$6
+#TARGET_PVNAME=$6
+# Make target pvc name compliant with csi drivers if not defined
+TARGET_PVNAME=${6:-"pvc-$(cat /proc/sys/kernel/random/uuid)"}

 if [ $# -gt 6 ]; then
     DEPLOYMENTOBJ=$7
@@ -25,9 +27,11 @@

 ## No need to change anything below this point

+CLUSTER_NAME=`kubectl config current-context`
 SOURCE_PVNAME=$(kubectl -n $SOURCE_NAMESPACE get pvc $SOURCE_PVCNAME --template={{.spec.volumeName}})
 SOURCE_VOLUMEID=$(kubectl -n $SOURCE_NAMESPACE get pv $SOURCE_PVNAME --template='{{or .spec.awsElasticBlockStore.volumeID ""}}{{or .spec.csi.volumeHandle ""}}' | awk -F/ '{print $NF}')
 SOURCE_STORAGE=$(kubectl -n $SOURCE_NAMESPACE get pvc $SOURCE_PVCNAME --template={{.spec.resources.requests.storage}})
+SOURCE_ZONE=$(kubectl -n $SOURCE_NAMESPACE get pv $SOURCE_PVNAME --template='{{ (index (index (index .spec.nodeAffinity.required.nodeSelectorTerms 0).matchExpressions 0).values 0)}}')
 SOURCE_VOLUMEMODE=$(kubectl -n $SOURCE_NAMESPACE get pv $SOURCE_PVNAME --template={{.spec.volumeMode}})

 if [ -z "$SOURCE_VOLUMEMODE" ]; then
@@ -43,6 +47,7 @@
     ${SOURCE_PVCNAME@A}
     ${SOURCE_PVNAME@A}
     ${SOURCE_VOLUMEID@A}
+    ${SOURCE_ZONE@A}
     ${SOURCE_STORAGE@A}
     ${SOURCE_VOLUMEMODE@A}

@@ -87,21 +92,30 @@

 echo "Creating snapshot from $SOURCE_VOLUMEID... "
 SNAPSHOTID=$(aws ec2 create-snapshot --volume-id $SOURCE_VOLUMEID --description "$DESCRIPTION" --output text --query SnapshotId)
+SNAPSHOTPROGRESS=$(aws ec2 describe-snapshots --snapshot-ids $SNAPSHOTID --query "Snapshots[*].Progress" --output text)
+while [ $SNAPSHOTPROGRESS != "100%"  ]
+do
+    sleep 15
+    echo "Snapshot ID: $SNAPSHOTID $SNAPSHOTPROGRESS"
+    SNAPSHOTPROGRESS=$(aws ec2 describe-snapshots --snapshot-ids $SNAPSHOTID --query "Snapshots[*].Progress" --output text)
+done
 aws ec2 wait snapshot-completed --filter Name=snapshot-id,Values=$SNAPSHOTID
 echo ${SNAPSHOTID@A}

 echo "Creating volume from snapshot $SNAPSHOTID... "
-TAGSPEC="ResourceType=volume,Tags=[{Key=Name,Value=$TARGET_NAMESPACE-$TARGET_PVNAME},{Key=kubernetes.io/created-for/pv/name,Value=$TARGET_PVNAME},{Key=kubernetes.io/created-for/pvc/name,Value=$TARGET_PVCNAME},{Key=kubernetes.io/created-for/pvc/namespace,Value=$TARGET_NAMESPACE}]"
+TAGSPEC="ResourceType=volume,Tags=[{Key=ebs.csi.aws.com/cluster,Value=true},{Key=kubernetes.io/cluster/$CLUSTER_NAME,Value=owned},{Key=CSIVolumeName,Value=$TARGET_PVNAME},{Key=kubernetesCluster,Value=$CLUSTER_NAME},{Key=Name,Value=$CLUSTER_NAME-dynamic-$TARGET_PVNAME},{Key=kubernetes.io/created-for/pv/name,Value=$TARGET_PVNAME},{Key=kubernetes.io/created-for/pvc/name,Value=$TARGET_PVCNAME},{Key=kubernetes.io/created-for/pvc/namespace,Value=$TARGET_NAMESPACE}]"
 TARGET_VOLUMEID=$(aws ec2 create-volume \
     --availability-zone $TARGET_ZONE \
     --snapshot-id $SNAPSHOTID \
-    --volume-type gp2 \
+    --volume-type gp3 \
     --output text \
     --query VolumeId \
     --tag-specifications "$TAGSPEC")
 echo ${TARGET_VOLUMEID@A}

 echo Creating new PV/PVC...
+kubectl patch pv $SOURCE_PVNAME -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
+kubectl delete -n $TARGET_NAMESPACE pvc $TARGET_PVCNAME ||:
 kubectl apply -f - <<EOF
 ---
 apiVersion: v1
 

@caruccio (Author)

A good alternative is this tool: https://github.com/utkuozdemir/pv-migrate

@laiminhtrung1997

Dear @caruccio
I have a use case for moving a PV to another availability zone. Can this script or the tool pv-migrate do this?

@caruccio (Author) commented Jul 24, 2024

Hi @laiminhtrung1997. This script was built for this exact use case: move a PVC from zone-X to zone-Y.
It works by creating a snapshot of the source EBS volume, then creating a new volume, PV and PVC in the target zone.

**This works only for AWS EBS-backed PVCs.**

Please, before anything, make a backup of your volumes and read the script carefully.

pv-migrate, on the other hand, makes a local dump (tar.gz) of the files from the source PVC and sends it to a new one. It works for any PVC from any cloud provider (including on-premises).
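
For reference, this is roughly the EBS-side sequence the script automates (a minimal sketch; the volume ID, snapshot ID and zone below are placeholders):

    aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --query SnapshotId --output text
    aws ec2 wait snapshot-completed --snapshot-ids snap-0123456789abcdef0
    aws ec2 create-volume --availability-zone us-east-1b --snapshot-id snap-0123456789abcdef0 --volume-type gp3 --query VolumeId --output text

The resulting volume is then referenced from a hand-built PV/PVC pair in the target cluster.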

@caruccio (Author)

Hey @derjohn, @rlanore. Thanks for your suggestions. I've just applied them to the script.

@caruccio (Author) commented Aug 5, 2024

Complete rewrite.
