How a malicious actor can own all your VMs

An attacker can use the default service account in Google Cloud to move laterally across compute engine instances. In this blog post, we will review all the required permissions and access scopes that allow an attacker to move laterally from one compute engine instance to all instances in the same Google Cloud project scope. We will also point you to a script that detects and analyzes the misconfiguration that enables this attack.

What is a service account?

A service account enables both applications and compute engine instances (VMs) to interact with the different resources and APIs in Google Cloud. Service accounts are created under your different projects, and you can create as many as you need to represent different levels of access.

[Figure: example of different service accounts for each resource]

Service account types

  • User-managed service accounts – As the name implies, this type of service account is created and maintained by the user. The user manually creates the service account and assigns the specific roles it needs.
  • Google-managed service accounts – This type of service account is created by Google automatically to access various cloud services that need to act on your behalf.
  • Default service accounts – This type of service account is created by Google Cloud automatically when you enable certain services, such as compute engine, and can execute cloud API calls to various resources in the project. By default, default service accounts are automatically granted the Editor role when created, unless an organization policy disables that grant. We will focus on this type of service account later.

Roles

A role is simply a collection of permissions. Because many actions require several permissions, those permissions are bundled into a role, and the role is granted to the principals that need to execute these actions.

[Figure: example of the compute.instanceAdmin role]

There are three types of Basic (also known as Primitive) roles:

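  • The Owner role has complete control over the project. In addition to viewing, creating, and modifying all resources, it can manage roles and permissions for the project.
  • The Editor role can view, create, modify, and delete most resources, but it cannot manage roles and permissions.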
  • The Viewer role has read-only permissions.

If you recall, the default service account is granted the editor role, which gives the default service account broad permissions over the project.

Cloud API access scope

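Cloud API access scope defines the API calls a compute engine instance can run. Access scopes limit the OAuth token that the instance obtains for its service account, so an API call succeeds only if both the access scope and the roles granted to the service account allow it. When creating a compute engine instance, there are three types of cloud API access scopes that can be assigned: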

      • Allow full access to all cloud APIs
        Contains all cloud API calls, displayed as: ‘https://www.googleapis.com/auth/cloud-platform’
      • Allow default access
        Contains the following authorized API calls, displayed as:
        ‘https://www.googleapis.com/auth/devstorage.read_only’
        ‘https://www.googleapis.com/auth/logging.write’
        ‘https://www.googleapis.com/auth/monitoring.write’
        ‘https://www.googleapis.com/auth/service.management.readonly’
        ‘https://www.googleapis.com/auth/servicecontrol’
        ‘https://www.googleapis.com/auth/trace.append’
      • Set access for each API
        Contains only the API calls manually chosen by the user.
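To make the three options concrete, here is a minimal Python sketch that labels a list of scope URLs as full, default, or custom. The function name classify_scopes and the labels are our own, chosen for illustration; the default set is simply the six URLs listed above.

```python
CLOUD_PLATFORM = "https://www.googleapis.com/auth/cloud-platform"

# The six scope URLs listed above under "Allow default access".
DEFAULT_SCOPES = frozenset({
    "https://www.googleapis.com/auth/devstorage.read_only",
    "https://www.googleapis.com/auth/logging.write",
    "https://www.googleapis.com/auth/monitoring.write",
    "https://www.googleapis.com/auth/service.management.readonly",
    "https://www.googleapis.com/auth/servicecontrol",
    "https://www.googleapis.com/auth/trace.append",
})


def classify_scopes(scopes):
    """Label a list of scope URLs (hypothetical helper, illustrative labels)."""
    scope_set = set(scopes)
    if not scope_set:
        return "none"      # no scopes at all: the instance cannot call cloud APIs
    if CLOUD_PLATFORM in scope_set:
        return "full"      # "Allow full access to all cloud APIs"
    if scope_set == DEFAULT_SCOPES:
        return "default"   # "Allow default access"
    return "custom"        # "Set access for each API"
```

Only the "full" label matters for the attack described in this post; the other labels are there to show how the remaining options differ.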

Putting it all together

Now that we understand all the various components that can allow an attacker to conduct lateral movement, let’s see how it’s done.

By combining the following three items, an attacker can move laterally:

Default service account + Editor role + All cloud API access scope
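Read as a condition, the combination above is a logical AND: remove any one of the three items and the lateral movement path described in this post closes. The sketch below models only that combination; the class and function names are hypothetical and not part of any Google Cloud library.

```python
from dataclasses import dataclass

CLOUD_PLATFORM = "https://www.googleapis.com/auth/cloud-platform"


@dataclass(frozen=True)
class InstanceIdentity:
    """Hypothetical summary of how one instance authenticates to cloud APIs."""
    uses_default_service_account: bool
    has_editor_role: bool
    scopes: tuple


def can_move_laterally(identity):
    """True only when all three items from the combination above are present."""
    return (
        identity.uses_default_service_account
        and identity.has_editor_role
        and CLOUD_PLATFORM in identity.scopes
    )
```

This deliberately ignores other risky combinations (for example, a user-managed service account with an equally broad role), since the post focuses on the default service account.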

Our starting point for this scenario is a compute engine instance that has already been compromised, or Google Cloud credentials that have already been stolen. In other words, we assume we are in the post-breach phase.

First, let’s choose a project:

Now, let’s see if we have the default service account within the project scope:

Our project has the default service account enabled.
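One way to run this check programmatically is to parse the JSON printed by gcloud iam service-accounts list --format=json and look for the Compute Engine default service account, whose email follows the pattern PROJECT_NUMBER-compute@developer.gserviceaccount.com. The sketch below assumes that output has already been loaded into a Python list; the project number and project ID in the sample data are made up.

```python
import re

# Compute Engine default service account email:
#   <project-number>-compute@developer.gserviceaccount.com
DEFAULT_COMPUTE_SA = re.compile(r"^\d+-compute@developer\.gserviceaccount\.com$")


def find_default_compute_sa(accounts):
    """Return the email of the enabled default compute service account, or None.

    `accounts` is the parsed JSON list from:
        gcloud iam service-accounts list --format=json
    """
    for account in accounts:
        email = account.get("email", "")
        if DEFAULT_COMPUTE_SA.match(email) and not account.get("disabled", False):
            return email
    return None


# Made-up sample data, shaped like the gcloud output.
sample_accounts = [
    {"email": "app-backend@example-project.iam.gserviceaccount.com"},
    {"email": "123456789012-compute@developer.gserviceaccount.com"},
]
default_sa = find_default_compute_sa(sample_accounts)
```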

Let’s see if the default service account has the editor role (although it is granted automatically, it can still be revoked at any time):

The editor role is in place for our “infamous” default service account.
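The same check can be scripted against the project IAM policy, which gcloud projects get-iam-policy PROJECT_ID --format=json prints as a JSON object with a list of role bindings. The following sketch assumes that JSON has been parsed into a dictionary; the helper name and the sample policy are hypothetical, and conditional bindings are not evaluated.

```python
def has_role(policy, member, role="roles/editor"):
    """Return True if `member` appears in a binding for `role`.

    `policy` is the parsed JSON from:
        gcloud projects get-iam-policy PROJECT_ID --format=json
    Note: IAM conditions on a binding are ignored in this sketch.
    """
    for binding in policy.get("bindings", []):
        if binding.get("role") == role and member in binding.get("members", []):
            return True
    return False


# Made-up sample policy, shaped like the gcloud output.
sample_policy = {
    "bindings": [
        {
            "role": "roles/editor",
            "members": [
                "serviceAccount:123456789012-compute@developer.gserviceaccount.com"
            ],
        },
        {"role": "roles/owner", "members": ["user:admin@example.com"]},
    ]
}
default_sa_member = "serviceAccount:123456789012-compute@developer.gserviceaccount.com"
is_editor = has_role(sample_policy, default_sa_member)
```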

Let’s move on and enumerate the instances in this project to check whether any of them has the cloud API access scope this attack needs (full access to all cloud APIs):

We list the instances:

Let’s take a closer look at the instance “debian1”.

Listing the access scope for the instance “debian1”:

We can see that the instance “debian1” is configured with the default cloud API access scope, which means we can’t conduct lateral movement from this instance.

Let’s try another instance.

Listing the access scope for the instance “debian2”:

We can see here that the instance “debian2” has the ‘all cloud API access’ scope, which means we can expand our foothold within the project by gaining access to additional instances. Unfortunately, so could an attacker. 
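The per-instance check can be automated by parsing the output of gcloud compute instances list --format=json, where each instance carries a serviceAccounts list with an email and its scopes. The sketch below mirrors the two instances from this walkthrough; the zone, project ID, and project number are made up for illustration.

```python
CLOUD_PLATFORM = "https://www.googleapis.com/auth/cloud-platform"


def full_scope_instances(instances):
    """Return (name, zone, service account email) for instances with full API scope.

    `instances` is the parsed JSON list from:
        gcloud compute instances list --format=json
    """
    found = []
    for inst in instances:
        # The zone field is a URL; keep only its last path segment.
        zone = inst.get("zone", "").rsplit("/", 1)[-1]
        for sa in inst.get("serviceAccounts", []):
            if CLOUD_PLATFORM in sa.get("scopes", []):
                found.append((inst["name"], zone, sa.get("email", "")))
    return found


# Made-up sample data, shaped like the gcloud output.
ZONE_URL = "https://www.googleapis.com/compute/v1/projects/example-project/zones/us-central1-a"
SA_EMAIL = "123456789012-compute@developer.gserviceaccount.com"
sample_instances = [
    {
        "name": "debian1",
        "zone": ZONE_URL,
        "serviceAccounts": [{
            "email": SA_EMAIL,
            "scopes": [
                "https://www.googleapis.com/auth/devstorage.read_only",
                "https://www.googleapis.com/auth/logging.write",
            ],
        }],
    },
    {
        "name": "debian2",
        "zone": ZONE_URL,
        "serviceAccounts": [{"email": SA_EMAIL, "scopes": [CLOUD_PLATFORM]}],
    },
]
exposed = full_scope_instances(sample_instances)
```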

Many other cloud API calls are available at this point, but for this particular lateral movement demonstration, let’s see the “gcloud compute ssh” command in action.

We will try to ssh from the “debian2” instance to the “debian1” instance:

And it worked. Once we gain an initial foothold on the compute engine instance “debian2”, we can move laterally to all instances in the project scope.
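For reference, the command used in this step has a simple shape. As we understand it, gcloud compute ssh handles key setup itself, creating a key pair and publishing the public key through project or instance metadata (or through OS Login), which is why the Editor role combined with the full access scope is enough. The sketch below only builds the argument list; the zone and project ID are hypothetical, and the optional --command flag runs a single remote command instead of opening a shell.

```python
def build_ssh_command(target, zone, project, remote_command=None):
    """Build the argument list for `gcloud compute ssh` (nothing is executed)."""
    argv = [
        "gcloud", "compute", "ssh", target,
        f"--zone={zone}",
        f"--project={project}",
    ]
    if remote_command is not None:
        argv.append(f"--command={remote_command}")
    return argv


# Hypothetical values: hop from the compromised instance to "debian1".
ssh_argv = build_ssh_command("debian1", "us-central1-a", "example-project", "hostname")
```

Passing the list to subprocess.run on the compromised instance would execute it with that instance's service account credentials.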

Mitigation and suggestions

Now that you understand how the default service account misconfiguration can be used to enable lateral movement in Google Cloud, address the misconfiguration as soon as possible. Do the following:

  • Follow the principle of least privilege and revoke the editor role from the default service account.
  • After revoking the editor role, grant the default service account only the specific roles it needs.
  • Better still, we strongly recommend that you avoid the default service account altogether and create and manage dedicated service accounts for your compute engine instances.
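The first two steps map onto two gcloud commands: gcloud projects remove-iam-policy-binding and gcloud projects add-iam-policy-binding. The sketch below only assembles the argument lists so they can be reviewed before anything is run; the project ID, project number, and replacement roles are examples, and the right replacement roles depend on what your workloads actually do.

```python
def remediation_commands(project, sa_email, replacement_roles):
    """Return gcloud argument lists that swap roles/editor for narrower roles."""
    member = f"serviceAccount:{sa_email}"
    # Step 1: revoke the Editor role from the default service account.
    commands = [[
        "gcloud", "projects", "remove-iam-policy-binding", project,
        f"--member={member}", "--role=roles/editor",
    ]]
    # Step 2: grant only the specific roles the workloads need.
    for role in replacement_roles:
        commands.append([
            "gcloud", "projects", "add-iam-policy-binding", project,
            f"--member={member}", f"--role={role}",
        ])
    return commands


# Example values only; choose roles that match your own workloads.
plan = remediation_commands(
    "example-project",
    "123456789012-compute@developer.gserviceaccount.com",
    ["roles/logging.logWriter", "roles/monitoring.metricWriter"],
)
```

Revoking the role first and then adding narrower roles can briefly interrupt workloads that depend on the old permissions, so you may prefer to grant the replacement roles before removing Editor.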

Bonus: We wrote a script that detects and analyzes this misconfiguration and can help you review your Google Cloud compute engine instances and take action to eliminate this risk. You can get the script here.

The default service account misconfiguration script iterates through all the projects in your Google Cloud account that your token can access and reports the compute engine instances that can access all other compute engine instances in the same project (for example, by using the command gcloud compute ssh to access other VMs in that project).
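To illustrate the kind of logic such a scan involves, here is a rough sketch. It is not the script referenced above, and every name in it is our own. It flags instances that run as the Compute Engine default service account, where that account holds roles/editor and the instance has the cloud-platform scope. Data fetching is passed in as a function so the analysis can be exercised without calling gcloud.

```python
import json
import re
import subprocess

CLOUD_PLATFORM = "https://www.googleapis.com/auth/cloud-platform"
DEFAULT_COMPUTE_SA = re.compile(r"^\d+-compute@developer\.gserviceaccount\.com$")


def gcloud_fetch(kind, project_id):
    """Fetch parsed JSON from gcloud; `kind` is 'policy' or 'instances'."""
    if kind == "policy":
        argv = ["gcloud", "projects", "get-iam-policy", project_id, "--format=json"]
    else:
        argv = ["gcloud", "compute", "instances", "list",
                f"--project={project_id}", "--format=json"]
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return json.loads(result.stdout)


def scan_projects(project_ids, fetch=gcloud_fetch):
    """Return (project, instance, service account) for each exposed instance."""
    findings = []
    for project_id in project_ids:
        # Collect every member that holds roles/editor on the project.
        editors = set()
        for binding in fetch("policy", project_id).get("bindings", []):
            if binding.get("role") == "roles/editor":
                editors.update(binding.get("members", []))
        # Flag instances that combine all three risk factors.
        for inst in fetch("instances", project_id):
            for sa in inst.get("serviceAccounts", []):
                email = sa.get("email", "")
                if (DEFAULT_COMPUTE_SA.match(email)
                        and f"serviceAccount:{email}" in editors
                        and CLOUD_PLATFORM in sa.get("scopes", [])):
                    findings.append((project_id, inst["name"], email))
    return findings
```

Calling scan_projects(["example-project"]) with the default fetch function would shell out to gcloud using whatever credentials are active; a project where the Compute Engine API is not enabled would make the instance listing fail, which this sketch does not handle.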