Remediating an issue like today’s outage on Windows machines running the CrowdStrike Falcon Sensor can be particularly time-consuming and labor-intensive at cloud scale, which may prolong outages and the associated losses. Given our experience with Orca’s first-in-the-industry SideScanning, which is based on cloud snapshots, we wondered whether a similar approach could automate an organization’s fix for a problem like this.

Spoiler: It is.

Recap

While it’s been almost impossible to miss coverage of this issue, a short recap of what resolving it involves may be helpful before we proceed. Published guidance instructs administrators to boot the machine into Safe Mode, delete a specific file, and reboot into normal mode.

Obviously, this isn’t a viable resolution for virtual machines hosted in the public cloud, as there is no way to get to Safe Mode. Instead, the manual resolution involves attaching the VM’s disk to another VM, deleting the file, then reattaching the disk to the original VM. This process is even more time-consuming and carries an inherent risk of human error.

A Different Approach

Today, we’ve published a sample Python script that automates the process of remediating a problem like this in AWS. We found that it was both quicker and simpler to do it without snapshots; instead, the script:

  1. Stops the affected VM and detaches its disk.
  2. Attaches the disk to a Linux VM.
  3. Makes the necessary change (deleting the defective file).
  4. Detaches the disk from the Linux VM, reattaches it to the original VM, and restarts that VM.

This enables organizations to accelerate their remediation efforts and reduce the possibility of human error.
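
To make the flow concrete, here is a minimal sketch of how those four steps might map onto EC2 API calls. It is not the published script. The function names, the /dev/sdf helper device, and the fix_volume callback (which stands in for step 3) are all assumptions made for illustration. The ec2 argument is expected to be a boto3 EC2 client, for example boto3.client("ec2").

```python
def find_root_volume(ec2, instance_id):
    """Return (root device name, volume ID) for an instance."""
    resp = ec2.describe_instances(InstanceIds=[instance_id])
    instance = resp["Reservations"][0]["Instances"][0]
    root_device = instance["RootDeviceName"]
    for mapping in instance["BlockDeviceMappings"]:
        if mapping["DeviceName"] == root_device:
            return root_device, mapping["Ebs"]["VolumeId"]
    raise RuntimeError(f"no root volume found for {instance_id}")


def move_volume(ec2, volume_id, from_instance, to_instance, device):
    """Detach a volume from one instance and attach it to another."""
    ec2.detach_volume(VolumeId=volume_id, InstanceId=from_instance)
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])
    ec2.attach_volume(VolumeId=volume_id, InstanceId=to_instance, Device=device)
    # This waiter watches the volume state; the OS on the target instance
    # may need a moment longer before the block device appears.
    ec2.get_waiter("volume_in_use").wait(VolumeIds=[volume_id])


def remediate(ec2, windows_id, helper_id, fix_volume, helper_device="/dev/sdf"):
    """Run the four steps for one Windows instance.

    fix_volume is a hypothetical callback that performs step 3 on the
    Linux helper; helper_device is an assumed free device name.
    """
    root_device, volume_id = find_root_volume(ec2, windows_id)

    # Step 1: stop the affected VM, then (inside move_volume) detach its disk.
    ec2.stop_instances(InstanceIds=[windows_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[windows_id])

    # Step 2: attach the disk to the Linux VM.
    move_volume(ec2, volume_id, windows_id, helper_id, helper_device)
    try:
        # Step 3: make the change.
        fix_volume(helper_device)
    finally:
        # Step 4: always hand the disk back, even if the fix failed.
        move_volume(ec2, volume_id, helper_id, windows_id, root_device)

    # Only restart the VM when the fix completed without raising.
    ec2.start_instances(InstanceIds=[windows_id])
    return volume_id
```

In this sketch the disk goes back to the original VM even when the fix fails, but the VM is restarted only on success, so a failed run leaves the machine stopped for manual inspection rather than boot-looping again.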

Disclaimer

While we’ve tested this in our own environments, any operation that involves deleting files is inherently risky, and users accept all associated risks. Please test the script thoroughly in the environments where you intend to use it before deploying it widely.

If you find or fix gaps, or extend the script’s functionality, we encourage you to submit those changes via a pull request.

Usage

Full details are in the README.md in the GitHub repository. The script requires a Linux VM in the same Availability Zone as the Windows VMs to be remediated (we tested with Ubuntu). It also requires the ntfs-3g and ec2metadata packages, or their equivalents on a different distribution. Finally, be sure to install the Python requirements via pip or poetry.
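
To show where ntfs-3g comes in, the sketch below mounts a Windows partition on the Linux VM and removes the defective file. It is an illustration under stated assumptions, not an excerpt from the repository. The file pattern (C-00000291*.sys under Windows\System32\drivers\CrowdStrike) is taken from CrowdStrike’s public guidance rather than from this article, and the function names, the dry_run flag, and the case-insensitive path lookup are our own choices.

```python
import subprocess
from pathlib import Path

# Assumed from CrowdStrike's public guidance, compared in lowercase.
FILE_PREFIX = "c-00000291"
FILE_SUFFIX = ".sys"
DRIVER_DIR = ("Windows", "System32", "drivers", "CrowdStrike")


def mount_ntfs(partition, mount_point):
    """Mount an NTFS partition read-write using ntfs-3g (needs root)."""
    Path(mount_point).mkdir(parents=True, exist_ok=True)
    subprocess.run(["mount", "-t", "ntfs-3g", partition, mount_point], check=True)


def find_child(parent, name):
    """Find a directory entry by name, ignoring case."""
    for child in parent.iterdir():
        if child.name.lower() == name.lower():
            return child
    return None


def delete_channel_files(mount_point, dry_run=True):
    """Return matching files; delete them only when dry_run is False."""
    current = Path(mount_point)
    # Walk the path one component at a time, because NTFS names may
    # differ in case from the canonical spelling.
    for part in DRIVER_DIR:
        current = find_child(current, part)
        if current is None:
            return []
    matches = sorted(
        p for p in current.iterdir()
        if p.is_file()
        and p.name.lower().startswith(FILE_PREFIX)
        and p.name.lower().endswith(FILE_SUFFIX)
    )
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

Defaulting to a dry run lets you confirm which files would be removed before anything is deleted. Remember to unmount the partition (for example with umount) before detaching the disk from the Linux VM. The partition path passed to mount_ntfs depends on the instance type and is deliberately left to the caller.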

As the story and specifics evolve, please continue to follow the Orca Research Pod for updates and reach out with any specific questions.