Have you ever been doing some threat research and thought “huh, that’s weird?” In my experience, many great discoveries started with a similar sentiment. I think of the recent example of the xz utils supply chain attack where a lone engineer at Microsoft realized his instance of SSH was using too many CPU cycles and was having memory issues while running Valgrind.
Recently I was teaching a workshop on how to do security research in Google Cloud Platform (GCP). Near the end of the class I was demonstrating how you can remove a service account’s access to an environment by deleting the service account key. Only, it didn’t work. After a few failed attempts to get the expected behavior, I decided to chalk it up to strange cloud weirdness and wrapped up the workshop. However, I couldn’t stop thinking about the problem. As soon as the conference was over, I dove deep into what was really going on. It was not at all what I expected.
Quick aside
Throughout this blog post, I’ll be walking through the story of how I uncovered what was going on. Along the way, I want to highlight what I see as common patterns in security research and root cause analysis. In this case, I observed a behavior that I didn’t understand and that did not seem to align with the expected outcome. At a high level, here are the steps I take to understand such a behavior and reach a conclusion about why it occurs. While the main example in this blog takes place in GCP, this methodology applies to threat research outside of the cloud as well.
Step 1: Recreate the problem
Whether you found an issue yourself or are trying to replicate someone else’s research, it is crucial to be able to quickly and accurately reproduce the behavior. My first task was to replicate the behavior in GCP and ensure what happened wasn’t just a fluke. Sure enough, within a few minutes I had recreated the issue, which led me to conclude that the behavior was being caused directly by something I was doing.
With the behavior easily recreated, I now had the difficult task of trying to figure out what exactly was going on. A few questions immediately come to mind:
- Is this expected behavior? Said another way, “Is the system working as designed?”
- If so, what are the security implications of this behavior?
- If not, is this an unknown issue that Google is not aware of?
- Why is this occurring?
- Is there something unique about how I set this up that causes it to occur or would this happen in anyone’s GCP environment?
With some of those questions in mind, let’s walk through how I came to understand exactly what was occurring and how Google responded after I reported it.
Here are the steps I used to recreate the behavior, as gcloud and curl commands:
$ gcloud iam service-accounts create test-account
$ gcloud projects add-iam-policy-binding <project> --member=serviceAccount:test-account@<project-id>.iam.gserviceaccount.com --role=roles/editor
$ gcloud iam service-accounts keys create access_key.json --iam-account=test-account@<project-id>.iam.gserviceaccount.com
$ gcloud auth activate-service-account test-account@<project-id>.iam.gserviceaccount.com --key-file=access_key.json
$ gcloud projects list
$ gcloud iam service-accounts keys delete <key-id> --iam-account=test-account@<project-id>.iam.gserviceaccount.com
$ gcloud projects list
$ curl -X POST \
-H "Authorization: Bearer <access-token>" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://iam.googleapis.com/v1/projects/<project>/serviceAccounts/test-account@<project-id>.iam.gserviceaccount.com/keys" > response.json
$ cat response.json | jq -r '.privateKeyData' | base64 -d > access.json
Step 2: Narrow down the search space
With so many questions, where do we begin? This is where the science of research becomes more of an art. There are many reasons this might be occurring. How can we effectively narrow it down? Furthermore, how can we know we are correct in what we ultimately find? Let’s first consider the core question we want to answer: why does this service account still have access even though I deleted its service account key? With that in mind, let’s reason about which of the more than 200 services GCP offers are relevant to that question. We need to determine the relevant components of granting a service account access to a GCP environment.
First we need to understand how a request from a service account is evaluated and ultimately granted or denied. At a high level, it follows these three steps:
- Authentication – verifying the identity of the requestor
- Authorization – ensuring the requestor is allowed to perform the action
- Validation – checking the action against policy/verification of the request
Since the service account still has access, it must be passing all three of these checks. Service accounts and service account keys are part of the IAM service in GCP. This is where I decided to focus my research.
Step 3: Investigate
Now that we have narrowed down the search space to be more manageable, we can begin investigating. There are a number of ways a service account can gain access to a GCP environment: workload identity federation, service account impersonation, attaching a service account, and service account keys, to name a few. In the workshop I was using service account keys, so this is where I focused my initial investigation. I first asked myself: “How does a service account key grant the service account access?” I knew from experience teaching the workshop that you can run the following command to create a service account key file:
gcloud iam service-accounts keys create access.json --iam-account=<svcacct-email>
Here is what a service account key file looks like:
{
"type": "service_account",
"project_id": "<your-project-id>",
"private_key_id": "0123456789012345678901234567890123456789",
"private_key": "-----BEGIN PRIVATE KEY-----\nMIIE...<REDACTED>...A81g==\n-----END PRIVATE KEY-----\n",
"client_email": "some-name@project-id.iam.gserviceaccount.com",
"client_id": "012345678901234567890",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://oauth2.googleapis.com/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/<svc-acct-email>",
"universe_domain": "googleapis.com"
}
I then went through and tried to figure out as much as I could about each field to determine what role they played in providing the service account access to the GCP environment. Not all fields were well documented, or documented at all.
Field | Meaning |
type | The type of access file, for example, a service account key file |
project_id | The ID of the project the service account key exists in |
private_key_id | The ID of the key itself |
private_key | A PEM-encoded private key. Google stores the associated public key and publishes it as an X.509 certificate at the URL given in the client_x509_cert_url field |
client_email | The email address of the service account |
client_id | The ID of the service account. This can be used to uniquely identify the service account |
auth_uri | The authentication URI for the Google Cloud account |
token_uri | The token URI for the Google Cloud account |
auth_provider_x509_cert_url | The authentication provider X.509 URL for the Google Cloud account |
client_x509_cert_url | A publicly accessible URL to check the public certificate for the service account key |
universe_domain | Defaults to googleapis.com, though it seems to be something you can change |
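To make the table easier to check against a real file, here is a minimal Python sketch that loads a key file and returns its fields with the private key redacted. The function name `summarize_key_file` and the redaction format are hypothetical choices for illustration, not part of any Google tooling.

```python
import json

# Assumption: private_key is the only field in the key file that must stay secret.
SENSITIVE_FIELDS = {"private_key"}


def summarize_key_file(raw_json: str) -> dict:
    """Return the key file's fields with sensitive values replaced by a placeholder."""
    key = json.loads(raw_json)
    summary = {}
    for field, value in key.items():
        if field in SENSITIVE_FIELDS:
            # Keep only the length so the secret never reaches a terminal or log.
            summary[field] = f"<redacted, {len(value)} chars>"
        else:
            summary[field] = value
    return summary
```

Passing it the contents of access.json, for example `summarize_key_file(open("access.json").read())`, gives a dictionary that is safe to print while comparing fields against the table.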
When considering the various fields in the file, I decided that the private_key field is one of the most interesting. It holds the private half of a key pair whose public half Google publishes as an X.509 certificate used for verification. I figured this must be part of the authentication portion of the flow. I was also aware that JWTs (JSON Web Tokens) can be signed with keys like this. I also noticed that when I visited the URL listed in the client_x509_cert_url field, it had the private_key_id listed there next to a public X.509 certificate:
{
"<private-key-id>": "-----BEGIN CERTIFICATE-----\nMIID<REDACTED>JM01o/\n-----END CERTIFICATE-----\n",
"<private-key-id>": "-----BEGIN CERTIFICATE-----\nMIIC<REDACTED>Pp\n-----END CERTIFICATE-----\n",
"<private-key-id>": "-----BEGIN CERTIFICATE-----\nMIID<REDACTED>3sSQ==\n-----END CERTIFICATE-----\n"
}
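The lookup I did by hand in the browser can be sketched in Python. This is only an illustration: `fetch_published_certs` and `key_is_published` are hypothetical helper names, and the sketch assumes the endpoint returns the flat key ID to certificate map shown above.

```python
import json
import urllib.request


def fetch_published_certs(cert_url: str) -> dict:
    """Download the JSON map of key ID -> PEM certificate from client_x509_cert_url."""
    with urllib.request.urlopen(cert_url) as resp:
        return json.load(resp)


def key_is_published(key_file: dict, published_certs: dict) -> bool:
    """True if the key file's private_key_id appears in the published certificate map."""
    return key_file["private_key_id"] in published_certs
```

A typical call would be `key_is_published(key, fetch_published_certs(key["client_x509_cert_url"]))`, where `key` is the parsed key file. The sketch makes no assumption about how quickly the endpoint reflects a deleted key.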
Step 4: Test the hypothesis
My first thought was that the creation of the access file was granting me access to the environment. However, this turned out to be false. When you run the gcloud command to create the service account key, you don’t immediately have access to the environment. You can test this by running the command to create the key then trying to run a command as the service account. It will not work. That then led me to look at the next command in the sequence:
gcloud auth activate-service-account --key-file=access_key.json
This command is what actually configures the gcloud tool to access the GCP environment as the service account. It also needs the access file as input, so it must be using that information to set up access. I now have a bunch of new questions to answer:
- What does that command do that allows me to access GCP as the service account?
- How does it generate credentials to access the environment?
- What information is it using from the access_key.json file to do that?
- Where is the credential stored on my local system?
Luckily, the gcloud tool is open source… sort of. It’s written in Python, and when you install it you can browse all the Python files to see what they do. After some rummaging through the source code I determined that it was using the info in the access_key.json file I had created to sign a JWT and then use that to get an access token. It then stores this token locally at ~/.config/gcloud/access_tokens.db. This is a SQLite database that stores access tokens used by gcloud. To see if this really was the access token the service account was using, I ran the following command to list all of the tokens it had stored:
sqlite3 ~/.config/gcloud/access_tokens.db 'select * from access_tokens'
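The same query can be run from Python with the standard sqlite3 module. This is a sketch under assumptions: it relies only on the table name access_tokens from the command above and discovers the column names at runtime instead of hard-coding a schema. The helper names `list_cached_tokens` and `redact` are hypothetical.

```python
import sqlite3
from pathlib import Path

# Location described above; adjust if your gcloud configuration lives elsewhere.
DEFAULT_DB = Path.home() / ".config" / "gcloud" / "access_tokens.db"


def list_cached_tokens(db_path) -> list:
    """Return each row of the access_tokens table as a dict keyed by column name."""
    conn = sqlite3.connect(str(db_path))
    try:
        conn.row_factory = sqlite3.Row  # rows become addressable by column name
        rows = conn.execute("SELECT * FROM access_tokens").fetchall()
        return [dict(row) for row in rows]
    finally:
        conn.close()


def redact(token: str, keep: int = 8) -> str:
    """Show only the first few characters of a token so it is safe to print."""
    return token[:keep] + "..." if len(token) > keep else token
```

Calling `list_cached_tokens(DEFAULT_DB)` returns one dictionary per cached account. Running token values through `redact` before printing keeps live credentials out of terminal history and screenshots.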
I then found the entry that had the name of my service account. I ran a simple curl command and sure enough, it worked!
curl -X POST \
-H "Authorization: Bearer <access-token>" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://iam.googleapis.com/v1/projects/<project>/serviceAccounts/test-account@<project-id>.iam.gserviceaccount.com/keys"
I then verified in GCP’s Logs Explorer that the request was made from the service account. You can see in the authenticationInfo section a reference to the key ID that was used to authenticate the service account:
"authenticationInfo": {
"principalEmail": "test-account@project.iam.gserviceaccount.com",
"serviceAccountKeyName": "//iam.googleapis.com/projects/<project>/serviceAccounts/test-account@project.iam.gserviceaccount.com/keys/0123456789012345678901234567890123456789",
"principalSubject": "serviceAccount:test-account@project.iam.gserviceaccount.com"
},
So it seems that the key file is used to sign a JWT that is then exchanged for an access token. That access token is stored in a local database. The access token is all that is really needed to make a request to the GCP environment. So in theory I should just be able to revoke that access token and I am good to go, right? Well, it’s not that simple. Here are some of the things I tried to revoke the token:
- I looked in the google-auth library to see if there was a way to revoke a token
- I looked at the token info endpoint documentation
- I looked at service account key credential documentation
- I tried deleting the service account key with gcloud iam service-accounts keys delete <key-id>
- I tried gcloud auth revoke <svc_acct>. This just deletes the entries in ~/.config/gcloud/credentials.db and ~/.config/gcloud/access_tokens.db. If you have the raw access token you can still create a new key by saving the access token and running a command like the following:
curl -X POST \
-H "Authorization: Bearer <access-token>" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://iam.googleapis.com/v1/projects/<project>/serviceAccounts/test-account@<project-id>.iam.gserviceaccount.com/keys" > response.json
Finally, I found the answer in the documentation: GCP service account access tokens are not revocable. This is why deleting the service account key does nothing to a token that has already been issued!
Step 5: Summarize your findings
At this point I was confident that I understood the process used to get an access token for the service account. I also felt confident that there wasn’t a way to revoke a service account’s access token.
Now it’s time to bring these insights from the realm of research to practical application.
The service account key file is used to generate an access token
The service account key file contains all the info needed to get an access token. This means that anyone with this file has the ability to obtain an access token as the service account. This is the main reason why Google discourages the use of service account keys.
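To show how little is needed beyond the key file, here is a sketch of the unsigned portion of the JWT assertion. It follows the OAuth 2.0 JWT bearer flow that Google documents for service accounts. That gcloud builds its assertion in exactly this way is an assumption on my part, and the function name `build_signing_input` and the one-hour lifetime are illustrative. Signing is left out because RS256 needs a third-party library.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_signing_input(key_file: dict, scope: str, now: int) -> str:
    """Build the unsigned '<header>.<claims>' part of the JWT assertion."""
    header = {"alg": "RS256", "typ": "JWT", "kid": key_file["private_key_id"]}
    claims = {
        "iss": key_file["client_email"],  # the service account is the issuer
        "scope": scope,                   # space-delimited OAuth scopes being requested
        "aud": key_file["token_uri"],     # the token endpoint is the audience
        "iat": now,
        "exp": now + 3600,                # assumed one-hour assertion lifetime
    }

    def encode(obj: dict) -> str:
        return b64url(json.dumps(obj, separators=(",", ":")).encode("utf-8"))

    return encode(header) + "." + encode(claims)
```

The remaining steps are to sign this string with the private_key from the file using RS256, append the base64url signature as a third segment, and POST the result to token_uri as the `assertion` parameter with `grant_type=urn:ietf:params:oauth:grant-type:jwt-bearer`. Every input comes from the key file except the scope, for example `https://www.googleapis.com/auth/cloud-platform`.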
The access token is stored locally on the machine
The access token is trivially accessible for those on the machine. This means it could be targeted by adversaries who gain access to a victim’s computer. With a lifetime of one hour, the tokens are short lived. The service account key credentials are also on the machine and can be used to generate new access tokens. They are located at ~/.config/gcloud/credentials.db.
The access token cannot be revoked
This means that if an access token is compromised, there is not a good path to remediation aside from disabling the service account and waiting up to an hour for the token to expire. Disabling a service account will likely cause a serious disruption to business operations as the services that leverage that account will not be able to function. This is amplified if the service account happens to be a default service account.
Next steps
Consider potential adversary tradecraft
Putting this research in the context of what is already known about service accounts raises several security concerns. Certain services in GCP, such as Compute Engine and App Engine, generate default service accounts. Up until May 2024, default service accounts in an organization were assigned the primitive editor role by default. This role allows service accounts to create service account keys. Why is this important? Well, this means that deleting the service account key and waiting for the access token to expire is no longer an option. The service account can periodically check if it still has its service account key. If not, it can just create a new one for itself and authenticate with that.
This means the only option to remove an adversary’s access to your account is to disable/delete the service account. Unfortunately, that is usually not a practical solution. Here is what Google has to say about it:
“You can disable or delete this service account from your project, but doing so might cause any applications that depend on the service account’s credentials to fail.”
Take steps to mitigate this behavior
So what can you do to protect your GCP environment? Be sure to narrowly scope the permissions of a service account. It should only have the minimum set of permissions needed to perform its actions within the GCP environment. Also, as Google recommends, don’t use service account keys if at all possible. You can enforce this using constraints at the organization level. Below is Google’s guidance regarding proper use of service accounts:
“Depending on your organization policy configuration, the default service account might automatically be granted the Editor role on your project. We strongly recommend that you disable the automatic role grant by enforcing the iam.automaticIamGrantsForDefaultServiceAccounts organization policy constraint. If you created your organization after May 3, 2024, this constraint is enforced by default.
If you disable the automatic role grant, you must decide which roles to grant to the default service accounts, and then grant these roles yourself. If the default service account already has the Editor role, we recommend that you replace the Editor role with less permissive roles. To safely modify the service account’s roles, use Policy Simulator to see the impact of the change, and then grant and revoke the appropriate roles.”
Create detection logic for this behavior
It is possible to detect this type of behavior by looking for a service account creating a service account key for itself. This is highly unusual behavior as most of the time a service account key is created by some other granting authority such as a different service account or a user. A search for this behavior across all of our data yielded no instances of it occurring.
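As a starting point, the check described above can be sketched as a Python function over a single parsed audit log entry. The field names come from the sample audit log in this post. The function name `is_self_key_creation` is hypothetical, and the fallback to resource.labels.email_id is my own assumption for entries where request.name carries a numeric ID instead of an email address.

```python
CREATE_KEY_METHOD = "google.iam.admin.v1.CreateServiceAccountKey"


def is_self_key_creation(entry: dict) -> bool:
    """Return True if a service account created a service account key for itself."""
    payload = entry.get("protoPayload", {})
    if payload.get("methodName") != CREATE_KEY_METHOD:
        return False
    principal = payload.get("authenticationInfo", {}).get("principalEmail", "")
    if not principal:
        return False
    # request.name ends with the target account: .../serviceAccounts/<email or ID>
    target = payload.get("request", {}).get("name", "").rsplit("/", 1)[-1]
    # Assumption: resource.labels.email_id also names the target account.
    labelled = entry.get("resource", {}).get("labels", {}).get("email_id", "")
    return principal in (target, labelled)
```

Treat this as a sketch to adapt to your own log pipeline, not a finished detection rule. It only looks at one entry at a time and does not account for allowlisted automation.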
Below is a sample log of this activity occurring. You can see that the email address in the authenticationInfo section is the same as in the request.name field.
Sample gcloud log
[
  {
    "protoPayload": {
      "@type": "type.googleapis.com/google.cloud.audit.AuditLog",
      "status": {},
      "authenticationInfo": {
        "principalEmail": "svc-acct@project.iam.gserviceaccount.com",
        "serviceAccountKeyName": "//iam.googleapis.com/projects/project/serviceAccounts/svc-acct@project.iam.gserviceaccount.com/keys/0123456789012345678901234567890123456789",
        "principalSubject": "serviceAccount:svc-acct@project.iam.gserviceaccount.com"
      },
      "requestMetadata": {
        "callerIp": "0.0.0.0",
        "callerSuppliedUserAgent": "google-cloud-sdk gcloud/500.0.0 command/gcloud.iam.service-accounts.keys.create invocation-id/4ecd39c540d5451199b2c2d11f93ec5c environment/None environment-version/None client-os/MACOSX client-os-ver/24.1.0 client-pltf-arch/arm interactive/True from-script/False python/3.9.6 term/xterm-256color (Macintosh; Intel Mac OS X 24.1.0),gzip(gfe)",
        "requestAttributes": {
          "time": "2024-11-25T15:28:28.083838296Z",
          "auth": {}
        },
        "destinationAttributes": {}
      },
      "serviceName": "iam.googleapis.com",
      "methodName": "google.iam.admin.v1.CreateServiceAccountKey",
      "authorizationInfo": [
        {
          "resource": "projects/-/serviceAccounts/012345678901234567890",
          "permission": "iam.serviceAccountKeys.create",
          "granted": true,
          "resourceAttributes": {
            "name": "projects/-/serviceAccounts/012345678901234567890",
            "type": "iam.googleapis.com/ServiceAccountKey"
          },
          "permissionType": "ADMIN_WRITE"
        }
      ],
      "resourceName": "projects/-/serviceAccounts/012345678901234567890",
      "request": {
        "@type": "type.googleapis.com/google.iam.admin.v1.CreateServiceAccountKeyRequest",
        "private_key_type": 2,
        "name": "projects/-/serviceAccounts/svc-acct@project.iam.gserviceaccount.com"
      },
      "response": {
        "key_type": 1,
        "valid_before_time": {
          "seconds": 253402300799
        },
        "private_key_type": 2,
        "name": "projects/project/serviceAccounts/svc-acct@project.iam.gserviceaccount.com/keys/0123456789012345678901234567890123456789",
        "@type": "type.googleapis.com/google.iam.admin.v1.ServiceAccountKey",
        "key_algorithm": 2,
        "key_origin": 2,
        "valid_after_time": {
          "seconds": 1732548508
        }
      }
    },
    "insertId": "1xeeyede4w9rg",
    "resource": {
      "type": "service_account",
      "labels": {
        "unique_id": "012345678901234567890",
        "project_id": "project",
        "email_id": "svc-acct@project.iam.gserviceaccount.com"
      }
    },
    "timestamp": "2024-11-25T15:28:28.052086668Z",
    "severity": "NOTICE",
    "logName": "projects/project/logs/cloudaudit.googleapis.com%2Factivity",
    "receiveTimestamp": "2024-11-25T15:28:28.762918442Z"
  }
]
If you see something, say something
Ultimately, I decided to report this issue to Google. I was aware that this issue was not going to make front-page news as the latest and greatest GCP vulnerability. I did however feel like the combination of weak defaults, no good remediation strategy, and the ease with which an adversary may use this technique warranted bringing it to their attention and letting them decide the severity of the issue.
We had some back and forth about the potential for this to be abused. Ultimately, Google decided that this issue was properly addressed by recently (May 2024) updating the default constraints in an organization to not grant default service accounts the primitive editor role. Individual accounts’ default service accounts still get the editor role. They also have documented some best practices for managing service account keys that discuss the possibility of something like this occurring. All that said, I still think it is a worthwhile issue to be aware of for the following reasons:
- Many organizations may not be aware of the policy change and haven’t updated their environment.
- Any service account with the following built-in roles can leverage this persistence technique, not just default service accounts. As of this writing, these roles all contain the permissions needed to create service account keys. It is possible that even though the default role isn’t editor anymore, the service account could still get assigned a role that grants it the ability to create service account keys.
roles/assuredoss.admin
roles/securitycenter.admin
roles/iam.serviceAccountKeyAdmin
roles/editor
roles/owner
- If this does occur in a GCP environment, there is no “good” way to fix the problem that won’t likely cause serious service disruption.
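The role list above can be turned into a quick audit of a project’s IAM policy. The sketch below assumes the policy has been exported as JSON, for example with `gcloud projects get-iam-policy <project> --format=json`, and has the usual `bindings` list of `role` and `members`. The function name is hypothetical. It only covers the built-in roles listed in this post, so custom roles and bindings inherited from folders or the organization would need separate handling.

```python
# Built-in roles from the list above that include the permission to create keys.
KEY_CREATING_ROLES = {
    "roles/assuredoss.admin",
    "roles/securitycenter.admin",
    "roles/iam.serviceAccountKeyAdmin",
    "roles/editor",
    "roles/owner",
}


def service_accounts_that_can_create_keys(policy: dict) -> dict:
    """Map each service account member to the key-creating roles it holds."""
    findings = {}
    for binding in policy.get("bindings", []):
        role = binding.get("role")
        if role not in KEY_CREATING_ROLES:
            continue
        for member in binding.get("members", []):
            # Only service accounts are relevant to this persistence technique.
            if member.startswith("serviceAccount:"):
                findings.setdefault(member, set()).add(role)
    return findings
```

Any service account that shows up in the result is a candidate for a narrower role.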
Next time you find yourself saying “huh that’s weird,” take some time to apply this research methodology. It will undoubtedly lead to new understanding.
References
https://cloud.google.com/docs/authentication/token-types#access
https://cloud.google.com/iam/docs/service-account-creds#short-lived-credentials