Erik's Thoughts and Musings

Apple, DevOps, Technology, and Reviews

Studying for CKS (Update)

I have been studying for the Certified Kubernetes Security Specialist mostly on the weekends.

The CKS is a practical test, but for the most part I have been spending time on theory and only a little on practice. I want to see the forest for the trees. I read the book Certified Kubernetes Security Specialist (CKS) Study Guide by Benjamin Muschko.

On the practical side, I have made it through all of the exercises on Killercoda in a very superficial way. I plan to revisit after switching gears.

I also have two good open source GitHub repositories that have CKS sample questions, including Muschko's:

Next step is to go through and understand all of the questions and answers.

Also there is a cool Network Policy visual editor that I plan to understand.

https://networkpolicy.io

Studying for CKS

For Cyber Monday, I signed up to take the Certified Kubernetes Security Specialist (CKS) certification to help me with my career. Last year I took the Certified Kubernetes Administrator (CKA) exam, so this is the next logical step on that learning path.

I started strong with the studying by reading CKS literature and watching videos. The holidays slowed my pace considerably. The issue is that I am also interested in other things at the moment that are distracting me, including:

  • I am enjoying some YouTube channels that are teaching basic electronics. I was awful at Electrical Engineering 201 in college and want to remedy that by self-learning.
  • I binged and watched all of Star Trek: The Original Series and Star Trek: The Animated Series. I had never seen either series end-to-end.
  • My wife got me a continuous glucose monitor (CGM) for a present. I have been researching my glucose spikes. While I don't have diabetes, I have been interested in improving my health lately. Understanding my lows and highs and how it relates to carbs has been enlightening.
  • I have been trying to work out more. I have been doing that by playing Pokemon Go. It gets me out of the house to either bike or walk.

Tonight I plan to crack open the CKS book! Expect future blog posts as I dive deeper into security. Some things I am already familiar with, like kube-bench and network policies. With more Linux-specific things like AppArmor, I have my work cut out for me.

Github Usage 2023

Yesterday I was looking through GitHub and noticed my contributions fell off right around the time that I switched jobs last May.


The new company uses Azure DevOps for source code and CI/CD, so now the only changes I push to GitHub are to my personal code. Most often I only do a single commit here and there that doesn't even register. I should go and do a cleanup of old forks.

Edit: Derp. I just realized that my .gitconfig still points to my old company's email address, which is no longer connected to my GitHub account. When I commit using the old company's email address, it doesn't register as a contribution anymore. Fixed it, committed 4 changes today, and now things are magically showing up for 2024. 🙌

Using kubectl with Velero

Velero is an open source Kubernetes backup tool. Many of the commands in the velero CLI can also be run straight from kubectl. This works because Velero installs Custom Resource Definitions (CRDs) that extend the Kubernetes API, so you can query its schedules, backups, and restores right from the cluster or via a tool like k9s.

Here are some examples:

# Get all schedules in the cluster
kubectl get schedule -A

# Get schedules in the velero namespace
kubectl get schedule -n velero

# Get the yaml configuration of the `my-aks-cluster` schedule
kubectl get schedule my-aks-cluster -n velero -o yaml

# Get all backups
kubectl get backup -n velero

# Describe an individual backup
kubectl describe backup my-aks-cluster-20231218120035 -n velero

# Get all restores
kubectl get restore -n velero

# Get the yaml configuration of an individual restore
kubectl get restore test-backup-20231020134940 -n velero -o yaml
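Because backups are just custom resources, kubectl can create them as well as read them. Here is a minimal sketch, assuming the velero.io/v1 Backup kind that the Velero CRDs install. The backup_manifest helper and the nginx-backup name are made up for illustration:

```shell
# Hypothetical helper: print a Velero Backup manifest for a single namespace.
# $1 = backup name, $2 = namespace to include, $3 = TTL (defaults to 30 days)
backup_manifest() {
  cat <<EOF
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: $1
  namespace: velero
spec:
  includedNamespaces:
  - $2
  ttl: ${3:-720h0m0s}
EOF
}

# Print the manifest so it can be reviewed before it is applied.
backup_manifest nginx-backup nginx
```

Piping the output to kubectl apply -f - would submit the backup to the cluster, after which it should show up in kubectl get backup -n velero like any other.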

Using the Velero CLI

Velero comes with a pretty handy command-line interface (CLI) for pretty much anything you want to do regarding backups and restores:

  • Scheduling / Creating backups
  • Backup information
  • Deleting backups
  • Full restores from backups
  • Selective restores from backups

The CLI is the most convenient way to manually schedule, back up, or restore. If you simply want to review the state of backups and restores in the cluster, you can use kubectl with the CRDs that Velero installs. I'll do a follow-up post about how to use Velero with kubectl.

To install Velero CLI, follow the instructions on the Basic Install page:

https://velero.io/docs/v1.8/basic-install/

All of the following examples use velero to invoke actions with the Velero agent. If you are going to do a lot of interaction with the backups, I recommend you set the default namespace via either of the two commands, otherwise you need to add -n velero to all commands below. To set the default namespace:

# Third party tool via https://github.com/ahmetb/kubectx
kubens velero

# More verbose alternative using plain kubectl
kubectl config set-context --current --namespace velero

Scheduling / Creating Backups

The first thing that you will want to do with Velero is to create a scheduled backup. To create a backup schedule named my-aks-cluster that runs every 6 hours and expires each backup after 30 days (720 hours):

velero create schedule my-aks-cluster -n velero --schedule="0 */6 * * *" --ttl 720h0m0s

This will back up all namespaces and all disk volumes in the cluster. At the present time, we don't have any persistent disk volumes in our development or production clusters.

If you want to list all of the schedules that have been configured:

$ velero schedule get
NAME             STATUS    CREATED                         SCHEDULE      BACKUP TTL   LAST BACKUP   SELECTOR   PAUSED
my-aks-cluster   Enabled   2023-11-24 15:47:00 -0500 EST   0 */6 * * *   720h0m0s     1h ago        <none>     false
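The SCHEDULE and BACKUP TTL columns above can be sanity-checked with a little shell arithmetic. This is only a sketch: ttl_from_days and cron_step_hours are hypothetical helper names, and the hour list assumes a plain step expression such as */6 in the hour field of the cron string.

```shell
# Convert a retention period in days to the duration format shown in the TTL column.
# $1 = number of days
ttl_from_days() {
  echo "$(( $1 * 24 ))h0m0s"
}

# List the times of day at which an hour-field step such as */6 fires.
# $1 = step in hours
cron_step_hours() {
  hour=0
  while [ "$hour" -lt 24 ]; do
    printf '%02d:00\n' "$hour"
    hour=$(( hour + $1 ))
  done
}

ttl_from_days 30     # prints 720h0m0s
cron_step_hours 6    # prints 00:00, 06:00, 12:00 and 18:00 on separate lines
```

So "0 */6 * * *" with a 720h0m0s TTL means four backups a day, each kept for 30 days.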

If you want to do a one-off ad-hoc backup named ad-hoc-backup, you can use the scheduled one as a template:

velero backup create ad-hoc-backup --from-schedule my-aks-cluster

Or you can simply create one from scratch by specifying what to include or exclude:

# Create a backup including only the nginx namespace.
velero backup create nginx-backup --include-namespaces nginx

# Create a backup excluding the velero and default namespaces.
velero backup create selective-backup --exclude-namespaces velero,default

Backup Information

To list all backups that have been done on the cluster:

$ velero backup get
NAME                            STATUS      ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
my-aks-cluster-20231218180035   Completed   0        0          2023-12-18 13:00:35 -0500 EST   29d       default            <none>
my-aks-cluster-20231218120035   Completed   0        0          2023-12-18 07:00:35 -0500 EST   29d       default            <none>
my-aks-cluster-20231218060035   Completed   0        0          2023-12-18 01:00:35 -0500 EST   29d       default            <none>
my-aks-cluster-20231218000034   Completed   0        0          2023-12-17 19:00:34 -0500 EST   29d       default            <none>
...
my-aks-cluster-20231120180030   Completed   0        0          2023-11-20 13:00:30 -0500 EST   1d        default            <none>
my-aks-cluster-20231120120029   Completed   0        0          2023-11-20 07:00:29 -0500 EST   1d        default            <none>
my-aks-cluster-20231120060029   Completed   0        0          2023-11-20 01:00:29 -0500 EST   1d        default            <none>
my-aks-cluster-20231120000029   Completed   0        0          2023-11-19 19:00:29 -0500 EST   1d        default            <none>
my-aks-cluster-20231119180029   Completed   0        0          2023-11-19 13:00:29 -0500 EST   22h       default            <none>
my-aks-cluster-20231119120028   Completed   0        0          2023-11-19 07:00:28 -0500 EST   16h       default            <none>
my-aks-cluster-20231119060028   Completed   0        0          2023-11-19 01:00:28 -0500 EST   10h       default            <none>
my-aks-cluster-20231119000028   Completed   0        0          2023-11-18 19:00:28 -0500 EST   4h        default            <none>
my-aks-cluster-20231023180022   Completed   0        0          2023-10-23 14:00:22 -0400 EDT   33d       default            <none>
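Judging from the listing above, each scheduled backup name is the schedule name followed by a 14-digit timestamp that lines up with the CREATED column once it is converted to UTC. That is an observation from this output rather than something checked against the Velero documentation. Under that assumption, a name can be split with plain shell; both helper names are made up:

```shell
# Everything before the last dash is the schedule name.
backup_schedule() { echo "${1%-*}"; }

# Everything after the last dash is the timestamp; add separators for readability.
backup_timestamp() {
  echo "${1##*-}" | sed -E 's/^(.{4})(.{2})(.{2})(.{2})(.{2})(.{2})$/\1-\2-\3 \4:\5:\6 UTC/'
}

backup_schedule  my-aks-cluster-20231218180035   # prints my-aks-cluster
backup_timestamp my-aks-cluster-20231218180035   # prints 2023-12-18 18:00:35 UTC
```

18:00:35 UTC is 13:00:35 EST, which matches the first row of the listing.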

You can describe an individual backup by using the describe command and choosing the name of the backup:

$ velero backup describe my-aks-cluster-20231218180035
Name:         my-aks-cluster-20231218180035
Namespace:    velero
...

Phase:  Completed


Namespaces:
  Included:  *
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

...

TTL:  720h0m0s

CSISnapshotTimeout:    10m0s
ItemOperationTimeout:  4h0m0s

...

Started:    2023-12-18 13:00:35 -0500 EST
Completed:  2023-12-18 13:00:52 -0500 EST

Expiration:  2024-01-17 13:00:35 -0500 EST

Total items to be backed up:  1262
Items backed up:              1262

Velero-Native Snapshots: <none included>

You can get the logs of an individual backup by using the logs command:

$ velero backup logs my-aks-cluster-20231218180035
time="2023-12-18T18:00:35Z" level=info msg="Setting up backup temp file" backup=velero/my-aks-cluster-20231218180035 logSource="pkg/controller/backup_controller.go:617"
time="2023-12-18T18:00:35Z" level=info msg="Setting up plugin manager" backup=velero/my-aks-cluster-20231218180035 logSource="pkg/controller/backup_controller.go:624"
time="2023-12-18T18:00:35Z" level=info msg="Getting backup item actions" backup=velero/my-aks-cluster-20231218180035 logSource="pkg/controller/backup_controller.go:628"
time="2023-12-18T18:00:35Z" level=info msg="Setting up backup store to check for backup existence" backup=velero/my-aks-cluster-20231218180035 logSource="pkg/controller/backup_controller.go:633"
time="2023-12-18T18:00:36Z" level=info msg="Writing backup version file" backup=velero/my-aks-cluster-20231218180035 logSource="pkg/backup/backup.go:197"
time="2023-12-18T18:00:36Z" level=info msg="Including namespaces: *" backup=velero/my-aks-cluster-20231218180035 logSource="pkg/backup/backup.go:203"
time="2023-12-18T18:00:36Z" level=info msg="Excluding namespaces: <none>" backup=velero/my-aks-cluster-20231218180035 logSource="pkg/backup/backup.go:204"
time="2023-12-18T18:00:36Z" level=info msg="Including resources: *" backup=velero/my-aks-cluster-20231218180035 logSource="pkg/util/collections/includes_excludes.go:506"
time="2023-12-18T18:00:36Z" level=info msg="Excluding resources: <none>" backup=velero/my-aks-cluster-20231218180035 logSource="pkg/util/collections/includes_excludes.go:507"
time="2023-12-18T18:00:36Z" level=info msg="Backing up all volumes using pod volume backup: false" backup=velero/my-aks-cluster-20231218180035 logSource="pkg/backup/backup.go:222"
...

Deleting Backups

Deleting a backup should not be necessary with the TTL set to 30 days, but here is the mechanism to delete it.

$ velero backup delete my-aks-cluster-20231023180022
Are you sure you want to continue (Y/N)? y
Request to delete backup "my-aks-cluster-20231023180022" submitted successfully.
The backup will be fully deleted after all associated data (disk snapshots, backup files, restores) are removed.

You can also simply delete the backup from the storage account manually.

Full restores

To fully restore all items from a backup into a cluster, supply the backup name to --from-backup:

velero restore create --from-backup my-aks-cluster-20231023180022

All restores can be listed in the same way backups can be listed:

$ velero restore get
NAME                               BACKUP              STATUS      STARTED                         COMPLETED                       ERRORS   WARNINGS   CREATED                         SELECTOR
nginx-test-backup-20231218173505   nginx-test-backup   Completed   2023-12-18 17:35:06 -0500 EST   2023-12-18 17:35:09 -0500 EST   0        1          2023-12-18 17:35:06 -0500 EST   <none>

An individual restore can also be described:

$ velero restore describe nginx-test-backup-20231218173505
Name:         nginx-test-backup-20231218173505
Namespace:    velero
Labels:       <none>
Annotations:  <none>

Phase:                       Completed
Total items to be restored:  10
Items restored:              10

Started:    2023-12-18 17:35:06 -0500 EST
Completed:  2023-12-18 17:35:09 -0500 EST
...

As well as logs retrieved:

$ velero restore logs nginx-test-backup-20231218173505
time="2023-12-18T22:35:07Z" level=info msg="starting restore" logSource="pkg/controller/restore_controller.go:523" restore=velero/nginx-test-backup-20231218173505
time="2023-12-18T22:35:07Z" level=info msg="Starting restore of backup velero/nginx-test-backup" logSource="pkg/restore/restore.go:423" restore=velero/nginx-test-backup-20231218173505
time="2023-12-18T22:35:07Z" level=info msg="Resource 'serviceaccounts' will be restored into namespace 'nginx-test'" logSource="pkg/restore/restore.go:2293" restore=velero/nginx-test-backup-20231218173505
time="2023-12-18T22:35:07Z" level=info msg="Resource 'configmaps' will be restored into namespace 'nginx-test'" logSource="pkg/restore/restore.go:2293" restore=velero/nginx-test-backup-20231218173505
time="2023-12-18T22:35:07Z" level=info msg="Resource 'pods' will be restored into namespace 'nginx-test'" logSource="pkg/restore/restore.go:2293" restore=velero/nginx-test-backup-20231218173505
time="2023-12-18T22:35:07Z" level=info msg="Resource 'replicasets.apps' will be restored into namespace 'nginx-test'" logSource="pkg/restore/restore.go:2293" restore=velero/nginx-test-backup-20231218173505
time="2023-12-18T22:35:07Z" level=info msg="Skipping restore of resource because it cannot be resolved via discovery" logSource="pkg/restore/restore.go:2206" resource=clusterclasses.cluster.x-k8s.io restore=velero/nginx-test-backup-20231218173505
time="2023-12-18T22:35:07Z" level=info msg="Resource 'endpoints' will be restored into namespace 'nginx-test'" logSource="pkg/restore/restore.go:2293" restore=velero/nginx-test-backup-20231218173505
time="2023-12-18T22:35:07Z" level=info msg="Resource 'services' will be restored into namespace 'nginx-test'" logSource="pkg/restore/restore.go:2293" restore=velero/nginx-test-backup-202312181735

Selective restore

Often you will want to just restore a namespace or a set of namespaces from a backup. In this example it will only restore the test namespace:

velero restore create --from-backup test-backup-20231020134940 --include-namespaces test --include-resources "*"

With --include-resources you can also choose which Kubernetes resources to restore. This example only restores pods in the test namespace:

velero restore create --from-backup test-backup-20231020134940 --include-namespaces test --include-resources "pods"

Installing Velero in AKS

Velero is an open source Kubernetes backup service. The Velero service runs within the cluster in the velero namespace. It can back up all of the Kubernetes configuration manifests (including Custom Resource Definitions, or CRDs) as well as any persistent volumes (PVs) that are attached to pods.

Velero Setup

The Velero setup, install, and configuration are done using a Helm chart. To download the chart, clone it from GitHub:

git clone https://github.com/vmware-tanzu/helm-charts.git
cd helm-charts/charts/velero/

In Microsoft Entra ID, create an App Registration that will be used as the service account for the backups. I created "AKS Velero Backup" that works across the tenant so I could potentially back up clusters in any of my subscriptions. This service account will back up the configuration and any persistent volumes (PVs) to a storage account.

$ az ad sp list --display-name "AKS Velero Backup" -o table
DisplayName           Id                                    AppId                                 CreatedDateTime
--------------------  ------------------------------------  ------------------------------------  --------------------
AKS Velero Backup     <redacted>                            <redacted>                            2023-12-28T16:22:09Z

Add Contributor IAM privileges to the app ID of "AKS Velero Backup":

az role assignment create --assignee <App ID> --role "Contributor" --scope /subscriptions/<Subscription ID>

The Helm chart also requires a credentials file that is used to back up to the Azure storage account:

cat << EOF  > ./credentials-velero
AZURE_SUBSCRIPTION_ID=<Azure Subscription ID>
AZURE_TENANT_ID=<Azure Tenant ID>
AZURE_CLIENT_ID=<App ID>
AZURE_CLIENT_SECRET=<App ID Secret>
AZURE_RESOURCE_GROUP=<Resource Group of the AKS nodes>
AZURE_CLOUD_NAME=AzurePublicCloud
EOF
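A missing line in this file tends to surface only later as a failed backup, so it can be worth checking the file before handing it to Helm. The sketch below is a hypothetical helper, not part of Velero or the chart. It only confirms that each key from the file above is present with a non-empty value, so placeholder text would still pass:

```shell
# Hypothetical helper: report any expected key missing from a credentials file.
# Returns 0 when every key is present with a value, 1 otherwise.
check_credentials() {
  file=$1
  missing=0
  for key in AZURE_SUBSCRIPTION_ID AZURE_TENANT_ID AZURE_CLIENT_ID \
             AZURE_CLIENT_SECRET AZURE_RESOURCE_GROUP AZURE_CLOUD_NAME; do
    # A key counts as present only if something follows the equals sign.
    if ! grep -q "^${key}=." "$file"; then
      echo "missing or empty: ${key}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Only run the check if the file from the previous step exists.
if [ -f ./credentials-velero ]; then
  check_credentials ./credentials-velero && echo "credentials-velero looks complete"
fi
```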

With the credentials created, it is just a matter of setting the Helm chart variables using the cloud credential file as a parameter.

  • ${SUBSCRIPTION_ID} is the Azure subscription ID of where the storage account (Azure bucket) lives.
  • savelerobackups is the storage account name to save the backups
  • rg-velero-backups is the resource group for the storage account
  • --set-file credentials.secretContents.cloud is where you set the credentials for the Azure subscription

helm upgrade --install velero velero \
     --repo https://vmware-tanzu.github.io/helm-charts \
     --create-namespace --namespace velero \
     --set configuration.backupStorageLocation[0].provider=velero.io/azure \
     --set configuration.backupStorageLocation[0].bucket="my-aks-cluster" \
     --set configuration.backupStorageLocation[0].config.subscriptionId=${SUBSCRIPTION_ID} \
     --set configuration.backupStorageLocation[0].config.storageAccount=savelerobackups \
     --set configuration.backupStorageLocation[0].config.resourceGroup=rg-velero-backups \
     --set configuration.volumeSnapshotLocation[0].provider=velero.io/azure \
     --set configuration.volumeSnapshotLocation[0].config.resourceGroup=rg-velero-backups \
     --set configuration.volumeSnapshotLocation[0].config.subscriptionId=${SUBSCRIPTION_ID} \
     --set initContainers[0].name=velero-plugin-for-microsoft-azure \
     --set initContainers[0].image=velero/velero-plugin-for-microsoft-azure:master \
     --set initContainers[0].volumeMounts[0].mountPath=/target \
     --set initContainers[0].volumeMounts[0].name=plugins \
     --set image.repository=velero/velero \
     --set image.pullPolicy=Always \
     --set backupsEnabled=true \
     --set snapshotsEnabled=true \
     --set-file credentials.secretContents.cloud=./credentials-velero

This should install the chart deploying the Kubernetes Deployment, CRDs, and any other dependencies needed by Velero. The end result is you should have the velero deployment and service in the velero namespace:

$ kubectl get all -n velero
NAME                         READY   STATUS    RESTARTS   AGE
pod/velero-79b6f59d6-hv46x   1/1     Running   0          4d15h

NAME             TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
service/velero   ClusterIP   10.0.13.58   <none>        8085/TCP   23d

NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/velero   1/1     1            1           23d

NAME                               DESIRED   CURRENT   READY   AGE
replicaset.apps/velero-69b6f59d6   1         1         1       4d15h

Now that you have the backup agent installed, the next step is to create a backup schedule. Velero uses cron syntax for scheduled backups. Using the velero CLI tool, here is how you create a schedule that runs every 6 hours and keeps each backup for 30 days (720 hours):

velero create schedule my-aks-cluster --schedule="0 */6 * * *" --ttl 720h0m0s -n velero

A more Kubernetes-native approach is to create a Kubernetes manifest configuration file (using Velero's Schedule CRD):

apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: my-aks-cluster
  namespace: velero
spec:
  schedule: 0 */6 * * *
  template:
    csiSnapshotTimeout: 0s
    hooks: {}
    includedNamespaces:
    - '*'
    itemOperationTimeout: 0s
    metadata: {}
    ttl: 720h0m0s
  useOwnerReferencesInBackup: false

And then apply the configuration with kubectl. Within the next 6 hours, the cluster should be backed up to the storage account set in the Helm chart configuration variable configuration.backupStorageLocation[0].config.storageAccount.

The next blog post will be about how to use Velero's CLI to back up and restore.


Windows Container Image - RabbitMQ

After years of using Docker, today was my first day debugging a Docker container build for Windows. In fact, I was shocked that it wasn't a Linux-based container. I was like a deer in headlights about how to debug it when it was causing problems. It was a RabbitMQ image. Fortunately the container has both cmd.exe and powershell.exe. It took a little web searching to find out how to do certain things like cat and tail -f in PowerShell, but before long I was looking at the logs:

docker exec -it rabbitmq powershell.exe
cd \Users\ContainerAdministrator\AppData\Roaming\RabbitMQ\log
Get-Content .\rabbit@localhost.log -tail 100 -Wait

The log:

=WARNING REPORT==== 28-Dec-2023::06:00:39 ===
closing AMQP connection <0.25045.3> (10.0.83.5:60070 -> 172.25.205.23:5672, vhost: '/', user: 'admin'):
client unexpectedly closed TCP connection
=WARNING REPORT==== 28-Dec-2023::06:00:40 ===
closing AMQP connection <0.15059.3> (10.0.83.5:63367 -> 172.25.205.23:5672, vhost: '/', user: 'admin'):
client unexpectedly closed TCP connection
=INFO REPORT==== 28-Dec-2023::06:01:51 ===
accepting AMQP connection <0.25498.3> (10.0.83.5:60211 -> 172.25.205.23:5672)
=INFO REPORT==== 28-Dec-2023::06:01:51 ===
connection <0.25498.3> (10.0.83.5:60211 -> 172.25.205.23:5672): user 'admin' authenticated and granted access to vhost '/'

And then a little more searching turned up how to get the command-line history when I wanted to re-run commands:

doskey /history

Azure Workload Identity Federation

I started working on switching our Azure DevOps service connections over to use federated workload identities. There is a good page and video from Microsoft about how Workload Identity Federation works:

https://learn.microsoft.com/en-us/entra/workload-id/workload-identity-federation

Basically, the way it works is that a trust is set up between Azure DevOps and our Azure subscriptions using these parameters (shown with example values):

  • Issuer URL: https://vstoken.dev.azure.com/abcdefc4-ffff-fff-...
  • Subject: sc://org/product/test-emartin-federated
  • Audience: api://AzureADTokenExchange (always)
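To make the three parameters concrete, here is a sketch that assembles them into the JSON shape that, as far as I know, az ad app federated-credential create accepts via --parameters. The helper names and the example-credential name are made up, the subject layout (organization, project, service connection name) is inferred from the example above, and the issuer is left as a placeholder:

```shell
# Build the subject string that Azure DevOps presents for a service connection.
# $1 = organization, $2 = project, $3 = service connection name
federated_subject() {
  echo "sc://$1/$2/$3"
}

# Print the JSON body describing the trust.
# $1 = credential name, $2 = issuer URL, $3 = subject
federated_credential_json() {
  cat <<EOF
{
  "name": "$1",
  "issuer": "$2",
  "subject": "$3",
  "audiences": ["api://AzureADTokenExchange"]
}
EOF
}

subject=$(federated_subject org product test-emartin-federated)
federated_credential_json example-credential "https://vstoken.dev.azure.com/<organization-id>" "$subject"
```

The output could be saved to a file and passed to the az command with --parameters, once the real issuer URL from the service connection has been filled in.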

You then tie that to a service principal in AD that will be used as the identity for doing actions in the subscription. Here is another resource about how to set it up using Terraform:

https://techcommunity.microsoft.com/t5/azure-devops-blog/introduction-to-azure-devops-workload-identity-federation-oidc/ba-p/3908687

Why use Workload identity federation? Up until now the only way to avoid storing service principal secrets for Azure DevOps pipelines was to use a self-hosted Azure DevOps agents with managed identities. Now with Workload identity federation we remove that limitation and enable you to use short-lived tokens for authenticating to Azure. This significantly improves your security posture and removes the need to figure out how to share and rotate secrets. Workload identity federation works with many Azure DevOps tasks, not just the Terraform ones we are focussing on in this article, so you can use it for deploying code and other configuration tasks. I encourage you to learn more about the supported tasks here.

What is Workload identity federation and how does it work? Workload identity federation is an OpenID Connect implementation for Azure DevOps that allow you to use short-lived credential free authentication to Azure without the need to provision self-hosted agents with managed identity. You configure a trust between your Azure DevOps organisation and an Azure service principal. Azure DevOps then provides a token that can be used to authenticate to the Azure API.

Here is the Terraform:

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">=3.0.0"
    }
    azuredevops = {
      source  = "microsoft/azuredevops"
      version = ">= 0.9.0"
    }
  }
}

provider "azurerm" {
  features {}
}

resource "azuredevops_project" "example" {
  name               = "Example Project"
  visibility         = "private"
  version_control    = "Git"
  work_item_template = "Agile"
  description        = "Managed by Terraform"
}

resource "azurerm_resource_group" "identity" {
  name     = "identity"
  location = "UK South"
}

resource "azurerm_user_assigned_identity" "example" {
  location            = azurerm_resource_group.identity.location
  name                = "example-identity"
  resource_group_name = azurerm_resource_group.identity.name
}

resource "azuredevops_serviceendpoint_azurerm" "example" {
  project_id                             = azuredevops_project.example.id
  service_endpoint_name                  = "example-federated-sc"
  description                            = "Managed by Terraform"
  service_endpoint_authentication_scheme = "WorkloadIdentityFederation"
  credentials {
    serviceprincipalid = azurerm_user_assigned_identity.example.client_id
  }
  azurerm_spn_tenantid      = "00000000-0000-0000-0000-000000000000"
  azurerm_subscription_id   = "00000000-0000-0000-0000-000000000000"
  azurerm_subscription_name = "Example Subscription Name"
}

resource "azurerm_federated_identity_credential" "example" {
  name                = "example-federated-credential"
  resource_group_name = azurerm_resource_group.identity.name
  parent_id           = azurerm_user_assigned_identity.example.id
  audience            = ["api://AzureADTokenExchange"]
  issuer              = azuredevops_serviceendpoint_azurerm.example.workload_identity_federation_issuer
  subject             = azuredevops_serviceendpoint_azurerm.example.workload_identity_federation_subject
}

AWS - Reserved Instances

I have been using AWS at work for over 3 years to varying degrees. While I feel comfortable using and administering most things, I realized it is time for me to get serious and fill the gaps in my knowledge. AWS has so many bells and whistles that it is daunting to think you can learn everything and keep that knowledge relevant when day-to-day you probably only use 5% of the features.

To fix these gaps, last month I started taking an AWS course on Udemy that will prepare me for one of the lower-level AWS DevOps certifications. Due to distractions with kids and life happening, I am still in the first 10% of the course, going over the basics. I am using my main AWS account as a sandbox for trying things out in the course. Today, when my class got to the EC2 section, I realized that I have not been smart when it comes to saving money on my own AWS workloads. For the last 9 months, this blog has been running in AWS on On-Demand pricing rather than a Reserved Instance. I can save 30% of the cost of the server by getting a Reserved Instance for 1 year, and roughly 60% for 3 years. I plunked down the money to pay for 1 year.
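To put numbers on those percentages, here is a small worked example. The $10 per month on-demand price is hypothetical, not my actual bill, and the discounts are the rounded 30% and 60% figures from above rather than current AWS pricing:

```shell
# Total cost in whole dollars for a term, after a percentage discount.
# $1 = on-demand dollars per month, $2 = months in the term, $3 = discount percent
term_cost() {
  echo $(( $1 * $2 * (100 - $3) / 100 ))
}

on_demand=10   # hypothetical monthly on-demand price in dollars

term_cost "$on_demand" 12 0     # 1 year on demand: prints 120
term_cost "$on_demand" 12 30    # 1 year reserved at 30% off: prints 84
term_cost "$on_demand" 36 60    # 3 years reserved at about 60% off: prints 144
```

At these rates, three reserved years cost only a little more than one year on demand.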

AWS - Modifying EC2 DeleteOnTermination

Delete on Termination

I created my web server late last year on an EC2 instance on AWS. While I built the instance with Terraform, I didn't set the "Delete on Termination" flag on the instance's EBS volume to false. That means that if I terminated the instance instead of stopping it, my main EBS volume would just disappear. That's not that big of a deal, because I built the web server with automation and could regenerate it quickly, but I didn't necessarily want to lose things like server logs.

I started poking around the console looking for how to switch the flag and was perplexed about how to set it after the fact. I went poking around the web and found there was no way to do it from the console! You have to use the aws ec2 modify-instance-attribute CLI command to change it.

Parameters for the CLI

You need two things to be able to use the AWS CLI command:

  • EC2 instance ID
  • Storage device name

The instance ID was easy to get either by using the console or in a roughshod way using the AWS CLI:

$ aws ec2 describe-instances --output yaml | grep Instance
  Instances:
...
    InstanceId: i-04753
    InstanceType: t2.micro
...

The device name is also easy to find in the console by going to the Storage tab, but can also be found via the CLI:

$ aws ec2 describe-instances --output yaml | grep -A 6 BlockDeviceMappings
    BlockDeviceMappings:
    - DeviceName: /dev/xvda
      Ebs:
        AttachTime: '2021-11-28T03:03:28+00:00'
        DeleteOnTermination: true
        Status: attached
        VolumeId: vol-0e40

That would mean our two parameters would be:

  • EC2 instance ID: i-04753
  • Storage device name: /dev/xvda

Running the CLI

First you need to create a JSON file (named storage.json here) that specifies the device name and the DeleteOnTermination flag:

[
  {
    "DeviceName": "/dev/xvda",
    "Ebs": {
      "DeleteOnTermination": false
    }
  }
]

And then you invoke the command:

aws ec2 modify-instance-attribute --instance-id i-04753 --block-device-mappings file://storage.json

There is no output on a successful change, but you can confirm that the change was made with the same command as above:

$ aws ec2 describe-instances --output yaml | grep -A 6 BlockDeviceMappings
    BlockDeviceMappings:
    - DeviceName: /dev/xvda
      Ebs:
        AttachTime: '2021-11-28T03:03:28+00:00'
        DeleteOnTermination: false
        Status: attached
        VolumeId: vol-0e40

Notice DeleteOnTermination is now set to false.
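If there is more than one volume or instance to change, the JSON file can be generated instead of written by hand. A minimal sketch, where block_device_mapping is a made-up helper name and the device name is the one found above:

```shell
# Print the block device mapping JSON for one device.
# $1 = device name, $2 = true or false for DeleteOnTermination
block_device_mapping() {
  cat <<EOF
[
  {
    "DeviceName": "$1",
    "Ebs": {
      "DeleteOnTermination": $2
    }
  }
]
EOF
}

# Same content as the hand-written file in the steps above.
block_device_mapping /dev/xvda false
```

Redirecting the output to storage.json gives the file that the modify-instance-attribute command reads through file://storage.json.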

(HT to Pete Wilcock)