@msteffen
Last active October 15, 2021 08:51
Deploy a 1-node minio cluster in a GKE cluster, and then run a Pachyderm cluster on top of it

Step 1: Create a GKE cluster

$ CLUSTER_NAME=msteffen-cluster-$(date +%Y%m%d)
$ GCP_ZONE=us-west1-a
$ STORAGE_NAME=pach-disk
$ STORAGE_SIZE=10
$ gcloud config set container/cluster ${CLUSTER_NAME}
$ gcloud config set compute/zone ${GCP_ZONE}
$ gcloud container clusters create ${CLUSTER_NAME} --scopes storage-rw --machine-type n1-standard-4 --num-nodes=3
$ gcloud compute disks create --size=${STORAGE_SIZE}GB ${STORAGE_NAME}

Step 2: Deploy minio in the cluster and create a minio bucket

First, create the minio deployment and get its credentials from the pod's logs (kc here is presumably a shell alias for kubectl)

$ kc run minio --image=minio/minio -l suite=pachyderm,app=minio --port=9000 -- server /export
deployment "minio" created

$ kc get all
NAME                       READY     STATUS    RESTARTS   AGE
po/minio-395231453-jmg8l   1/1       Running   0          3s
...

$ kc logs po/minio-395231453-jmg8l
Created minio configuration file successfully at /root/.minio

Endpoint:  http://10.0.2.5:9000  http://127.0.0.1:9000
AccessKey: 76RXXXXXXXXXXXXXX2VB 
SecretKey: A1/otTxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxPEH 
Region:    us-east-1
SQS ARNs:  <none>

Browser Access:
   http://10.0.2.5:9000  http://127.0.0.1:9000
...

# from the logs, we can set these variables (note that MINIO_IP has no http:// prefix)
$ MINIO_ID=76RXXXXXXXXXXXXXX2VB
$ MINIO_SECRET=A1/otTxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxPEH
$ MINIO_IP=10.0.2.5:9000

# forward local port 9000 to the 1-node minio cluster that now exists
$ kc port-forward minio-395231453-jmg8l 9000:9000 &
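
Copying the three values out of the logs by hand is easy to get wrong, so they could also be parsed. The following is a minimal sketch, assuming the log keeps the "Label: value" layout shown above (other minio versions may print something different). parse_minio_log and MINIO_ENDPOINT are hypothetical names introduced for illustration, and the sample text stands in for the real output of kc logs.

```shell
# Hypothetical helper: print the first value that follows a given label
# (e.g. "AccessKey:") in minio's startup log, read from stdin.
parse_minio_log() {
  awk -v key="$1" '$1 == key { print $2; exit }'
}

# Sample input copied from the log output above. In practice, pipe in
# the output of `kc logs po/<minio-pod-name>` instead.
sample_log='Endpoint:  http://10.0.2.5:9000  http://127.0.0.1:9000
AccessKey: 76RXXXXXXXXXXXXXX2VB
SecretKey: A1/otTxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxPEH
Region:    us-east-1'

MINIO_ID=$(printf '%s\n' "${sample_log}" | parse_minio_log AccessKey:)
MINIO_SECRET=$(printf '%s\n' "${sample_log}" | parse_minio_log SecretKey:)
MINIO_ENDPOINT=$(printf '%s\n' "${sample_log}" | parse_minio_log Endpoint:)

# Strip the scheme so that MINIO_IP has no http:// prefix.
MINIO_IP=${MINIO_ENDPOINT#http://}
```

Only the first address on the Endpoint line is used, which is the pod IP in the log above.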

Now, download the minio client and create the minio bucket (note that MINIO_BUCKET holds only the bucket name, without the gke-minio/ host prefix, because that is the form Pachyderm needs)

# download minio client (mc)
$ curl https://dl.minio.io/client/mc/release/linux-amd64/mc -o ~/bin/mc && chmod +x ~/bin/mc

# Add host for minio running in k8s
$ mc config host add gke-minio http://localhost:9000 ${MINIO_ID} ${MINIO_SECRET}

$ MINIO_BUCKET=minio-pach-$(uuid | cut -d- -f1)
$ mc mb gke-minio/${MINIO_BUCKET}
Handling connection for 9000
Bucket created successfully `gke-minio/minio-pach-b90ea95e`.
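
The uuid command used for the bucket suffix is not installed everywhere. As a rough alternative, assuming only od and tr are available, a random 8-character hex suffix can be read from /dev/urandom. gen_bucket_name is a hypothetical helper name, not part of mc or Pachyderm.

```shell
# Hypothetical helper: print a bucket name of the form minio-pach-<8 hex chars>.
gen_bucket_name() {
  # Read 4 random bytes, print them as hex, and drop od's spaces and newline.
  suffix=$(od -An -N4 -tx1 /dev/urandom | tr -d ' \n')
  printf 'minio-pach-%s\n' "${suffix}"
}

MINIO_BUCKET=$(gen_bucket_name)
echo "${MINIO_BUCKET}"
```

The result has the same shape as the name in the output above and can be passed to mc mb in the same way.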

Step 3: Run Pachyderm

$ pachctl deploy custom \
  --persistent-disk google \
  --object-store s3 \
  --static-etcd-volume=${STORAGE_NAME} \
  ${STORAGE_NAME} ${STORAGE_SIZE} \
  ${MINIO_BUCKET} ${MINIO_ID} ${MINIO_SECRET} ${MINIO_IP}
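
The deploy command takes its arguments by position, so an empty variable would silently shift the ones after it. A small pre-flight check, sketched below, may help catch that. check_deploy_vars is a hypothetical helper, and its rules (every variable non-empty, no URL scheme on MINIO_IP) are assumptions drawn from the notes in this walkthrough, not from pachctl itself.

```shell
# Hypothetical helper: return non-zero if any variable used by the
# deploy command is empty, or if MINIO_IP still carries a URL scheme.
check_deploy_vars() {
  for _name in STORAGE_NAME STORAGE_SIZE MINIO_BUCKET MINIO_ID MINIO_SECRET MINIO_IP; do
    # Look up the variable whose name is stored in _name.
    eval "_value=\${${_name}:-}"
    if [ -z "${_value}" ]; then
      echo "missing: ${_name}" >&2
      return 1
    fi
  done
  case "${MINIO_IP}" in
    http://*|https://*)
      echo "MINIO_IP must not include a scheme: ${MINIO_IP}" >&2
      return 1
      ;;
  esac
  return 0
}
```

One way to use it is to run check_deploy_vars first and only run the pachctl deploy custom command if it succeeds.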