Erik's Thoughts and Musings

Apple, DevOps, Technology, and Reviews

CKS Exams - Failed

About 3 weeks ago I took the CKS exam twice and failed both times. I failed the first attempt by 1 point. I picked myself up, studied a bunch for 3 days, and took it again, then failed by 2 points. I was faster the second time, but I was tripped up by two things that took a lot of my time. I can't talk too much about it because of NDA reasons, but needless to say I was crushed after all of the studying I did. It is a practical exam, and while I feel like I knew all of the subject matter, the time pressure is no joke. One of the questions that I felt I messed up was on both tests.

After I failed the retake, I sent a feedback email to the Linux Foundation saying that I was disappointed in my results. After a few days, they sent back a message saying that I need better time management so I can double check questions, which I figured. I also called out one of the questions that I felt was too hard for a 2-hour test. I was sure that I got it wrong. They told me that I got it right! I still can't believe that part.

The silver lining is that they gave me a 50% discount for a CKS retake. Now that I am back from summer vacation, I am back to studying. I am hoping to take it next weekend.

CKS Practice Exam Soon

I think I am almost ready to take the Certified Kubernetes Security Specialist (CKS) practice exam soon on killer.sh. I made it through the 11-hour YouTube series that comes widely recommended. I also did this nice set of videos that go through the killer.sh questions with answers from the Cloudastic channel.

I am penciling in taking the practice exam this Saturday or Sunday, but coincidentally it also is at the same time as the Pokemon Go Fest Global. I will be spending the morning catching Pokemon with my kids and hoping I won't be too wiped out to do the exam in the afternoon.

I am setting aside 4 hours to get through the exam even though it will only grade you for 2 to mimic the real exam. That is what I did for the Certified Kubernetes Administrator (CKA) exam. The practice exam is meant to be tougher than the real thing.

CKS Studying Back on Track

My Certified Kubernetes Security Specialist (CKS) studying is back on track after weeks of distraction. I am posting all of my notes to GitHub:

https://github.com/elrikose/tutorials/tree/main/k8s/cks

I have been following this free 11-hour course on YouTube:

https://www.youtube.com/watch?v=d9xfB5qaOfg

It is a little old (Kubernetes 1.19 and 1.22 are referenced), but there is some good theory and practice in the course. I am over halfway through.

I got an email from the Linux Foundation a couple weeks ago suggesting that I take the exam before September 15th. Apparently October through December are the busiest exam taking months. I have penciled in that I want to take the exam before the end of the summer.

GitHub Actions Certification

Last week I went out to the Microsoft Build 2024 Conference in Seattle. I saw a lot of interesting technology at the event, most of it centered around AI.

One of the perks of going was that GitHub (owned by Microsoft) was giving attendees their choice of a $99 certification. I picked the GitHub Actions certification since the Foundations course looked a little too easy. Yesterday I started studying for it via GitHub's study courses. I am hoping that I can take the test sometime in early summer.

Studying for CKS (Update)

I have been studying for the Certified Kubernetes Security Specialist mostly on the weekends.

The CKS is a practical test, but so far I have been spending most of my time on theory and only a little on practice. I want to see the forest for the trees. I read the book Certified Kubernetes Security Specialist (CKS) Study Guide by Benjamin Muschko.

On the practical side, I have made it through all of the exercises on Killercoda in a very superficial way. I plan to revisit after switching gears.

I also have two good open source Github repositories that have CKS sample questions, including Muschko's:

Next step is to go through and understand all of the questions and answers.

Also there is a cool Network Policy visual editor that I plan to understand.

https://networkpolicy.io

Studying for CKS

For Cyber Monday, I signed up to take the Certified Kubernetes Security Specialist (CKS) certification to help me with my career. Last year I took the Certified Kubernetes Administrator (CKA) exam, so this is the next logical progression of that learning path.

I started strong with the studying by reading CKS literature and watching videos. The holidays slowed my pace considerably. The issue is that I am also interested in other things at the moment that are distracting me, including:

  • I am enjoying some YouTube channels that are teaching basic electronics. I was awful at Electrical Engineering 201 in college and want to remedy that by self-learning.
  • I binged all of Star Trek: The Original Series and Star Trek: The Animated Series. I had never seen either one end-to-end.
  • My wife got me a continuous glucose monitor (CGM) for a present. I have been researching my glucose spikes. While I don't have diabetes, I have been interested in improving my health lately. Understanding my lows and highs and how it relates to carbs has been enlightening.
  • I have been trying to work out more. I have been doing that by playing Pokemon Go. It gets me out of the house to either bike or walk.

Tonight I plan to crack open the CKS book! Expect future blog posts as I dive deeper into security. Some things I am already familiar with, like kube-bench and network policies. For more Linux-specific things like AppArmor, I have my work cut out for me.

Github Usage 2023

Yesterday I was looking through Github and noticed my contributions fell off right around the time that I switched jobs last May.

The new company uses Azure DevOps for source code and CI/CD, so now the only changes I make on GitHub are to my personal code. Most often I only do a single commit here and there that doesn't even register. I should go and do a cleanup of old forks.

Edit: Derp. I just realized that my .gitconfig still points to my old company's email address which is no longer connected to my Github account. When I commit using the old company's email address it doesn't register as a contribution anymore. Fixed it, committed 4 changes today, and now things are magically showing up for 2024. 🙌
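
For anyone hitting the same thing, here is a rough sketch of the check and the fix. It assumes the address is set in the global ~/.gitconfig rather than in a per-repository config, and you@example.com is only a placeholder for an address that is verified on your GitHub account:

```shell
# Show the email git currently stamps on new commits.
# Falls back to a message if nothing is configured globally.
git config --global user.email || echo "(not set)"

# Replace it with an address that is verified on the GitHub account.
# you@example.com is a placeholder, not a real address.
git config --global user.email "you@example.com"

# Confirm the change took effect.
git config --global user.email
```

Only commits made after the change pick up the new address. Older commits keep whatever email they were created with.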

Using kubectl with Velero

Velero is an open source Kubernetes backup tool. Many of the commands in the velero CLI can also be invoked straight from kubectl. The reason this works is that Velero installs Custom Resource Definitions (CRDs) that extend the Kubernetes cluster. You can query items right from the cluster or via a tool like k9s.

Here are some examples:

# Get all schedules in the cluster
kubectl get schedule -A

# Get schedules in the velero namespace
kubectl get schedule -n velero

# Get the yaml configuration of the `my-aks-cluster` schedule
kubectl get schedule my-aks-cluster -n velero -o yaml

# Get all backups
kubectl get backup -n velero

# Describe an individual backup
kubectl describe backup my-aks-cluster-20231218120035 -n velero

# Get all restores
kubectl get restore -n velero

# Get the yaml configuration of an individual restore
kubectl get restore test-backup-20231020134940 -n velero -o yaml

Using the Velero CLI

Velero comes with a pretty handy command-line interface (CLI) for pretty much anything you want to do regarding backups and restores:

  • Scheduling / Creating backups
  • Backup information
  • Deleting backups
  • Full restores from backups
  • Selective restores from backups

To manually schedule, back up, or restore, the CLI tool is the most direct route. If you simply want to review the state of backups and restores in the cluster, you can use kubectl via the CRDs that Velero installs. I'll do a follow-up post about how to use Velero with kubectl.

To install Velero CLI, follow the instructions on the Basic Install page:

https://velero.io/docs/v1.8/basic-install/

All of the following examples use velero to invoke actions with the Velero agent. If you are going to do a lot of interaction with the backups, I recommend you set the default namespace via either of the two commands, otherwise you need to add -n velero to all commands below. To set the default namespace:

# Third party tool via https://github.com/ahmetb/kubectx
kubens velero

# More verbose, using only kubectl
kubectl config set-context --current --namespace velero

Scheduling / Creating Backups

The first thing that you will want to do with Velero is to create a scheduled backup. To create a backup schedule named my-aks-cluster that runs every 6 hours and expires each backup after 30 days (720 hours):

velero create schedule my-aks-cluster -n velero --schedule="0 */6 * * *" --ttl 720h0m0s

This will back up all namespaces and all disk volumes in the cluster. At the present time, we don't have any persistent disk volumes in our development or production clusters.
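
The --ttl value in the schedule command is written as a duration in hours, so a retention period in days has to be converted first. Here is a minimal sketch of that arithmetic. The ttl_from_days helper is a made-up name for illustration and is not part of Velero:

```shell
# Hypothetical helper: convert a retention period in days into the
# hours-based duration string that is passed to --ttl.
ttl_from_days() {
  echo "$(( $1 * 24 ))h0m0s"
}

# 30 days of retention
ttl_from_days 30   # prints 720h0m0s
```

The result can be dropped straight into the command, for example --ttl "$(ttl_from_days 30)".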

If you want to list all of the schedules that have been configured:

$ velero schedule get
NAME             STATUS    CREATED                         SCHEDULE      BACKUP TTL   LAST BACKUP   SELECTOR   PAUSED
my-aks-cluster   Enabled   2023-11-24 15:47:00 -0500 EST   0 */6 * * *   720h0m0s     1h ago        <none>     false

If you want to do a one-off backup named ad-hoc-backup, you can use the scheduled one as a template:

velero backup create ad-hoc-backup --from-schedule my-aks-cluster

Or you can simply create one from scratch by specifying what to include or exclude:

# Create a backup including only the nginx namespace.
velero backup create nginx-backup --include-namespaces nginx

# Create a backup excluding the velero and default namespaces.
velero backup create selective-backup --exclude-namespaces velero,default

Backup Information

To list all backups that have been done on the cluster:

$ velero backup get
NAME                            STATUS      ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
my-aks-cluster-20231218180035   Completed   0        0          2023-12-18 13:00:35 -0500 EST   29d       default            <none>
my-aks-cluster-20231218120035   Completed   0        0          2023-12-18 07:00:35 -0500 EST   29d       default            <none>
my-aks-cluster-20231218060035   Completed   0        0          2023-12-18 01:00:35 -0500 EST   29d       default            <none>
my-aks-cluster-20231218000034   Completed   0        0          2023-12-17 19:00:34 -0500 EST   29d       default            <none>
...
my-aks-cluster-20231120180030   Completed   0        0          2023-11-20 13:00:30 -0500 EST   1d        default            <none>
my-aks-cluster-20231120120029   Completed   0        0          2023-11-20 07:00:29 -0500 EST   1d        default            <none>
my-aks-cluster-20231120060029   Completed   0        0          2023-11-20 01:00:29 -0500 EST   1d        default            <none>
my-aks-cluster-20231120000029   Completed   0        0          2023-11-19 19:00:29 -0500 EST   1d        default            <none>
my-aks-cluster-20231119180029   Completed   0        0          2023-11-19 13:00:29 -0500 EST   22h       default            <none>
my-aks-cluster-20231119120028   Completed   0        0          2023-11-19 07:00:28 -0500 EST   16h       default            <none>
my-aks-cluster-20231119060028   Completed   0        0          2023-11-19 01:00:28 -0500 EST   10h       default            <none>
my-aks-cluster-20231119000028   Completed   0        0          2023-11-18 19:00:28 -0500 EST   4h        default            <none>
my-aks-cluster-20231023180022   Completed   0        0          2023-10-23 14:00:22 -0400 EDT   33d       default            <none>

You can describe an individual backup by using the describe command and choosing the name of the backup:

$ velero backup describe my-aks-cluster-20231218180035
Name:         my-aks-cluster-20231218180035
Namespace:    velero
...

Phase:  Completed


Namespaces:
  Included:  *
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

...

TTL:  720h0m0s

CSISnapshotTimeout:    10m0s
ItemOperationTimeout:  4h0m0s

...

Started:    2023-12-18 13:00:35 -0500 EST
Completed:  2023-12-18 13:00:52 -0500 EST

Expiration:  2024-01-17 13:00:35 -0500 EST

Total items to be backed up:  1262
Items backed up:              1262

Velero-Native Snapshots: <none included>

You can get the logs of an individual backup by using the logs command:

$ velero backup logs my-aks-cluster-20231218180035
time="2023-12-18T18:00:35Z" level=info msg="Setting up backup temp file" backup=velero/my-aks-cluster-20231218180035 logSource="pkg/controller/backup_controller.go:617"
time="2023-12-18T18:00:35Z" level=info msg="Setting up plugin manager" backup=velero/my-aks-cluster-20231218180035 logSource="pkg/controller/backup_controller.go:624"
time="2023-12-18T18:00:35Z" level=info msg="Getting backup item actions" backup=velero/my-aks-cluster-20231218180035 logSource="pkg/controller/backup_controller.go:628"
time="2023-12-18T18:00:35Z" level=info msg="Setting up backup store to check for backup existence" backup=velero/my-aks-cluster-20231218180035 logSource="pkg/controller/backup_controller.go:633"
time="2023-12-18T18:00:36Z" level=info msg="Writing backup version file" backup=velero/my-aks-cluster-20231218180035 logSource="pkg/backup/backup.go:197"
time="2023-12-18T18:00:36Z" level=info msg="Including namespaces: *" backup=velero/my-aks-cluster-20231218180035 logSource="pkg/backup/backup.go:203"
time="2023-12-18T18:00:36Z" level=info msg="Excluding namespaces: <none>" backup=velero/my-aks-cluster-20231218180035 logSource="pkg/backup/backup.go:204"
time="2023-12-18T18:00:36Z" level=info msg="Including resources: *" backup=velero/my-aks-cluster-20231218180035 logSource="pkg/util/collections/includes_excludes.go:506"
time="2023-12-18T18:00:36Z" level=info msg="Excluding resources: <none>" backup=velero/my-aks-cluster-20231218180035 logSource="pkg/util/collections/includes_excludes.go:507"
time="2023-12-18T18:00:36Z" level=info msg="Backing up all volumes using pod volume backup: false" backup=velero/my-aks-cluster-20231218180035 logSource="pkg/backup/backup.go:222"
...

Deleting Backups

Deleting a backup should not be necessary with the TTL set to 30 days, but here is the mechanism to delete it.

$ velero backup delete my-aks-cluster-20231023180022
Are you sure you want to continue (Y/N)? y
Request to delete backup "my-aks-cluster-20231023180022" submitted successfully.
The backup will be fully deleted after all associated data (disk snapshots, backup files, restores) are removed.

You can also simply delete the backup from the storage account manually.

Full restores

To fully restore all items from a backup into a cluster, supply the backup name to --from-backup:

velero restore create --from-backup my-aks-cluster-20231023180022

All restores can be listed in the same way backups can be listed:

$ velero restore get
NAME                               BACKUP              STATUS      STARTED                         COMPLETED                       ERRORS   WARNINGS   CREATED                         SELECTOR
nginx-test-backup-20231218173505   nginx-test-backup   Completed   2023-12-18 17:35:06 -0500 EST   2023-12-18 17:35:09 -0500 EST   0        1          2023-12-18 17:35:06 -0500 EST   <none>

An individual restore can also be described:

$ velero restore describe nginx-test-backup-20231218173505
Name:         nginx-test-backup-20231218173505
Namespace:    velero
Labels:       <none>
Annotations:  <none>

Phase:                       Completed
Total items to be restored:  10
Items restored:              10

Started:    2023-12-18 17:35:06 -0500 EST
Completed:  2023-12-18 17:35:09 -0500 EST
...

As well as logs retrieved:

$ velero restore logs nginx-test-backup-20231218173505
time="2023-12-18T22:35:07Z" level=info msg="starting restore" logSource="pkg/controller/restore_controller.go:523" restore=velero/nginx-test-backup-20231218173505
time="2023-12-18T22:35:07Z" level=info msg="Starting restore of backup velero/nginx-test-backup" logSource="pkg/restore/restore.go:423" restore=velero/nginx-test-backup-20231218173505
time="2023-12-18T22:35:07Z" level=info msg="Resource 'serviceaccounts' will be restored into namespace 'nginx-test'" logSource="pkg/restore/restore.go:2293" restore=velero/nginx-test-backup-20231218173505
time="2023-12-18T22:35:07Z" level=info msg="Resource 'configmaps' will be restored into namespace 'nginx-test'" logSource="pkg/restore/restore.go:2293" restore=velero/nginx-test-backup-20231218173505
time="2023-12-18T22:35:07Z" level=info msg="Resource 'pods' will be restored into namespace 'nginx-test'" logSource="pkg/restore/restore.go:2293" restore=velero/nginx-test-backup-20231218173505
time="2023-12-18T22:35:07Z" level=info msg="Resource 'replicasets.apps' will be restored into namespace 'nginx-test'" logSource="pkg/restore/restore.go:2293" restore=velero/nginx-test-backup-20231218173505
time="2023-12-18T22:35:07Z" level=info msg="Skipping restore of resource because it cannot be resolved via discovery" logSource="pkg/restore/restore.go:2206" resource=clusterclasses.cluster.x-k8s.io restore=velero/nginx-test-backup-20231218173505
time="2023-12-18T22:35:07Z" level=info msg="Resource 'endpoints' will be restored into namespace 'nginx-test'" logSource="pkg/restore/restore.go:2293" restore=velero/nginx-test-backup-20231218173505
time="2023-12-18T22:35:07Z" level=info msg="Resource 'services' will be restored into namespace 'nginx-test'" logSource="pkg/restore/restore.go:2293" restore=velero/nginx-test-backup-202312181735

Selective restore

Often you will want to restore just a namespace or a set of namespaces from a backup. This example restores only the test namespace:

velero restore create --from-backup test-backup-20231020134940 --include-namespaces test --include-resources "*"

With --include-resources you can also choose which Kubernetes resources to restore. This example only restores pods in the test namespace:

velero restore create --from-backup test-backup-20231020134940 --include-namespaces test --include-resources "pods"

Installing Velero in AKS

Velero is an open source Kubernetes backup service. The Velero service runs within the cluster in the velero namespace. It can back up all of the Kubernetes configuration manifests (including Custom Resource Definitions - CRDs) as well as any persistent volumes (PVs) that are attached to pods.

Velero Setup

The Velero setup, install, and configuration are completed using a Helm chart. To download the chart, clone it from GitHub:

git clone https://github.com/vmware-tanzu/helm-charts.git
cd helm-charts/charts/velero/

In Microsoft Entra ID, create an App Registration that will be used as the service account for the backups. I created "AKS Velero Backup" that works across the tenant so I could potentially back up clusters in any of my subscriptions. This service account will back up the configuration and any persistent volumes (PVs) to a storage account.

$ az ad sp list --display-name "AKS Velero Backup" -o table
DisplayName           Id                                    AppId                                 CreatedDateTime
--------------------  ------------------------------------  ------------------------------------  --------------------
AKS Velero Backup     <redacted>                            <redacted>                            2023-12-28T16:22:09Z

Add Contributor IAM privileges to the app ID of "AKS Velero Backup":

az role assignment create --assignee <App ID> --role "Contributor" --scope /subscriptions/<Subscription ID>

The Helm chart also requires a credentials file that Velero uses to back up to the Azure storage account:

cat << EOF  > ./credentials-velero
AZURE_SUBSCRIPTION_ID=<Azure Subscription ID>
AZURE_TENANT_ID=<Azure Tenant ID>
AZURE_CLIENT_ID=<App ID>
AZURE_CLIENT_SECRET=<App ID Secret>
AZURE_RESOURCE_GROUP=<Resource Group of the AKS nodes>
AZURE_CLOUD_NAME=AzurePublicCloud
EOF

With the credentials created, it is just a matter of setting the Helm chart variables using the cloud credential file as a parameter.

  • ${SUBSCRIPTION_ID} is the Azure subscription ID of where the storage account (Azure bucket) lives.
  • savelerobackups is the storage account name to save the backups
  • rg-velero-backups is the resource group for the storage account
  • --set-file credentials.secretContents.cloud is where you set the credentials for the Azure subscription

helm upgrade --install velero velero \
     --repo https://vmware-tanzu.github.io/helm-charts \
     --create-namespace --namespace velero \
     --set configuration.backupStorageLocation[0].name=velero.io/azure \
     --set configuration.backupStorageLocation[0].bucket="my-aks-cluster" \
     --set configuration.backupStorageLocation[0].config.subscriptionId=${SUBSCRIPTION_ID} \
     --set configuration.backupStorageLocation[0].config.storageAccount=savelerobackups \
     --set configuration.backupStorageLocation[0].config.resourceGroup=rg-velero-backups \
     --set configuration.volumeStorageLocation[0].name=velero.io/azure \
     --set configuration.volumeSnapshotLocation[0].config.resourceGroup=rg-velero-backups \
     --set configuration.volumeSnapshotLocation[0].config.subscriptionId=${SUBSCRIPTION_ID} \
     --set initContainers[0].name=velero-plugin-for-microsoft-azure \
     --set initContainers[0].image=velero/velero-plugin-for-microsoft-azure:master \
     --set initContainers[0].volumeMounts[0].mountPath=/target \
     --set initContainers[0].volumeMounts[0].name=plugins \
     --set image.repository=velero/velero \
     --set image.pullPolicy=Always \
     --set backupsEnabled=true \
     --set snapshotsEnabled=true \
     --set-file credentials.secretContents.cloud=./credentials-velero

This should install the chart, deploying the Kubernetes Deployment, CRDs, and any other dependencies needed by Velero. The end result is that you should have the velero deployment and service in the velero namespace:

$ kubectl get all -n velero
NAME                         READY   STATUS    RESTARTS   AGE
pod/velero-79b6f59d6-hv46x   1/1     Running   0          4d15h

NAME             TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
service/velero   ClusterIP   10.0.13.58   <none>        8085/TCP   23d

NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/velero   1/1     1            1           23d

NAME                               DESIRED   CURRENT   READY   AGE
replicaset.apps/velero-69b6f59d6   1         1         1       4d15h

Now that you have the backup agent installed, the next step is to create a backup schedule. Velero uses cron syntax for scheduled backups. Using the velero CLI tool, here is how you create a schedule that runs every 6 hours and keeps each backup for 30 days (720 hours):

velero create schedule my-aks-cluster --schedule="0 */6 * * *" --ttl 720h0m0s -n velero
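
If cron syntax is not second nature, the hour field */6 matches every sixth hour starting from 0, and the leading 0 pins the minute. Here is a small sketch that lists the resulting run times. The cron_step_hours function is a made-up helper for illustration and only models a simple "*/N" hour field with minute 0:

```shell
# Hypothetical helper: list the times matched by a cron expression of
# the form "0 */STEP * * *" (minute 0 of every STEP-th hour, 0 to 23).
cron_step_hours() {
  step="$1"
  hour=0
  while [ "$hour" -le 23 ]; do
    printf '%02d:00\n' "$hour"
    hour=$(( hour + step ))
  done
}

# Run times for the schedule "0 */6 * * *"
cron_step_hours 6
```

For a step of 6 that is four runs per day: 00:00, 06:00, 12:00, and 18:00.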

A more Kubernetes-native approach is to create a Kubernetes manifest configuration file (using Velero's Schedule CRD):

apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: my-aks-cluster
  namespace: velero
spec:
  schedule: 0 */6 * * *
  template:
    csiSnapshotTimeout: 0s
    hooks: {}
    includedNamespaces:
    - '*'
    itemOperationTimeout: 0s
    metadata: {}
    ttl: 720h0m0s
  useOwnerReferencesInBackup: false

Then apply the configuration with kubectl. Within the next 6 hours, the cluster should be backed up to the storage account set in the Helm chart configuration variable configuration.backupStorageLocation[0].config.storageAccount.

The next blog post will cover how to use Velero's CLI to back up and restore.
