Sep 17, 2021
When security professionals talk about risk, especially with business executives, we often use metaphors rooted in the physical world. We might talk about coverage, and compare it to the length of a wall that surrounds a group of assets. Perhaps we talk about the height of the wall, to consider how comprehensive our defenses are. To make sure we’re focused on the right defenses, we think about the context of the asset behind the wall.
Three-dimensional analogies are helpful, but the most important dimension – time – is all too often left out of risk conversations. Consider the temporal continuity of risk: at some point in the future, something bad will happen. You don’t know when that will be, so you assign some probability to the risk. As time passes, that probability presumably increases until it becomes certain: the risk leaves the future for the present, and now you have an incident. After you correct the incident, hopefully you put in place a control to keep that specific incident from recurring, and the risk moves from the present into the past. Problem solved, right?
Not quite so fast, unfortunately. Controls instituted at incident tempo are often overly specific, and may miss out on underlying hazards. Let’s consider a wood fence. Wood decays over time, so one hazard could be written as “the fence parts will weaken and break over time.” One day, the fence section right next to the gate breaks. After corralling all of your horses, you reinforce that fence section, perhaps with some metal cross braces. While that specific incident is unlikely to happen again, every other fence section is still at risk from the uncontrolled hazard of decaying wood.
A good control framework doesn’t try to prevent a specific incident from recurring. Instead, it aims to identify an underlying hazard that appears in many risk scenarios, and to put in place controls that keep any of those scenarios from happening. That allows you to move entire categories of risk from the future straight into the past, skipping the unpleasantness of incidents hitting you in the present. In the wooden fence model, perhaps you monitor the strength of the wood, and institute a regular maintenance and replacement process to ensure that wood doesn’t have enough time to decay.
Security Risk Management, then, works on four different time horizons: past, present, near-term, and future. Risk Management in the past is the domain of the Compliance team: ensuring that control frameworks that were established to manage a risk continue to do so. The present is the domain of the Security Operations or Incident Response teams: solving problems in real time that were triggered by unmanaged risks. Future risk is, unfortunately, often not as clearly owned. Executive teams and boards are often focused on the near-term risks, seeking to identify the most predictable risks. This approach often extends the incident response model of “fix it as it breaks” to a just-in-time “fix it right before it breaks” risk reduction plan.
But future risk is actually where risk management processes can have the greatest impact, because a well-designed control structure can take whole swathes of risks and replace them with a strong control framework. It requires more discipline. Instead of fixing problems one at a time, an organization needs to remain focused on a mission, perhaps improving the state of vulnerability management or identity and access management (IAM).
Obviously, incident management needs to take as much of your time as it needs; that’s one of the definitions of incident, of course. There is a temptation to use the incident tempo of work to also address near-term risks. Unless you believe the risk is really about to happen (ask yourself, “Am I surprised it hasn’t happened yet?” as a way to test that belief), you will need to identify the right blend of work across near-term and further-out risk. That may seem counterintuitive. Shouldn’t you prioritize near-term risk over further-out risk items?
The benefit of prioritizing further-out risk items is that you can more carefully create control structures that mitigate entire classes of risk, rather than focusing on the narrow slices that seem likely to happen in the next year. Consider IAM. There are a number of near-term risks around specific assets that too many users have access to, but the aggregate set of risks you can address with a robust IAM program is much greater – and it addresses risks that you might not otherwise prioritize, but which nonetheless will, at some point, cause you trouble.
A robust Security Operations function restores you in the present. A great Compliance function ensures your past remains safe. Thoughtful Risk Management can protect you from the future.
Moving to the cloud – or building outright in the cloud – brings a different spin on these challenges, because unlike in the legacy data center deployment model, there isn’t a physical network that constrains your connections between systems. You can’t just count physical machines to know you have complete coverage; instead, you have to have a robust asset identification system. You can’t trace wires to determine the context of which front-end server is connected to which backend server; instead, you’ll need to trace authentication and data flows to see which machines have access to each other. And layering stacks of agents to achieve comprehensive coverage is a Sisyphean task when developers can deploy new systems with a call to an API; you’ll need an agentless approach that covers your entire cloud estate.
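To make the idea of tracing access instead of wires a little more concrete, here is a minimal sketch in Python. It assumes you have already exported an inventory in which each asset lists the other assets its credentials or roles can reach; the asset names, the ACCESS_EDGES structure, and both helper functions are hypothetical and exist only for illustration, not as any particular product's data model.

```python
from collections import deque

# Hypothetical inventory: every name here is made up for illustration.
# An entry "A": ["B"] means "A holds a credential or role that grants access to B".
ACCESS_EDGES = {
    "web-frontend": ["orders-api"],
    "orders-api": ["orders-db", "payments-queue"],
    "batch-worker": ["orders-db"],
    "payments-queue": [],
    "orders-db": [],
}


def reachable_from(start, edges):
    """Return every asset reachable from `start` by following access edges."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in edges.get(node, []):
            # Skip assets already visited (and the start itself) so cycles terminate.
            if neighbor not in seen and neighbor != start:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def untracked_assets(edges):
    """Return assets that are access targets but have no inventory entry of their own."""
    known = set(edges)
    targets = {target for targets_list in edges.values() for target in targets_list}
    return targets - known


if __name__ == "__main__":
    # Context: what could someone reach after compromising the front end?
    print(sorted(reachable_from("web-frontend", ACCESS_EDGES)))
    # Coverage: is anything being accessed that the inventory does not know about?
    print(sorted(untracked_assets(ACCESS_EDGES)))
```

The first helper stands in for context (what sits behind a given asset), and the second for coverage (whether the inventory is complete). A real cloud estate would need the edges to be derived from actual identity and network configuration, which this sketch does not attempt.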
Put together, these capabilities give you a robust program that not only tackles the imminent hazards in your near future, but also sets you up for long-term success with low-cost security risk management.