Stop Color-Coding Vulnerabilities, Start Remediating Them

May 18, 2021

Most organizations are still struggling to optimize their vulnerability management programs. We often see statistics showing that many organizations take well over 30 days to move from patch release to patch deployment. Unless security and IT teams take a hard look, in the near term, at how and why they operate with a continual lag in mean time to remediate (MTTR), we expect response times to keep increasing as the pace of vulnerability discovery overwhelms legacy remediation processes.

 

The frameworks in place at many organizations for change management and “risk-based” vulnerability management were likely developed several years ago, and in some cases over a decade ago. We’ve all heard the story of the patch that caused a major outage, after which rigorous processes were put in place so that it would never happen again.

 

Fast-forward to 2021 and we have systems that are far more resilient and patch releases that are less likely to cause total outages. However, it is unlikely most organizations would adopt a model that accepts system updates without some type of assurance that the update will not cause an issue. Thus, most organizations remain in a state of limbo where the standard operating procedure has not been challenged and a backlog of patches remains the norm.

 

Organizations that have tried to advance their current state and reduce MTTR have been led down a path of sourcing ever-increasing context around any individual vulnerability discovered. On the surface this should help with better decision making and prioritization of remediation activities, but it has likely only resulted in marginal gains.

 

Taking a step back and surveying both the causes for stagnation and probable paths forward, Optiv developed a strategy to address the common issues that organizations encounter when grappling with vulnerability remediation. Broadly, organizations are not moving forward because of:

 

  • Dated and complex processes
  • Legacy technologies
  • Reactive risk management

 

As a continuous process, remediation has been difficult to fully analyze because half of the process lives in the security organization and the other half exists as an IT function. Adding to that are common attitudes about where each side of the organization sees its job as beginning and ending. Security teams may think their task is complete as soon as IT is informed, and IT may think it doesn’t need to worry about security patching until security sends a ticket. This lack of end-to-end process visibility makes it difficult to review the current workflow and identify ways to improve it.

 

To counter this, the first action we recommend is an in-depth review of the vulnerability remediation process, starting when a vulnerability is discovered and ending when the vulnerability has been validated as fixed. To support this review, Optiv has developed a streamlined remediation pipeline model that can be adapted to any organization.

 

 

[Figure 1: Example Windows remediation pipeline]
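
Reviewing the process from discovery to validated fix is easier when each stage leaves a timestamp. The following is a minimal Python sketch, not Optiv's pipeline model: the stage names in STAGES and the sample records are hypothetical, chosen only to show how MTTR and per-stage delays could be measured across the security and IT halves of the process.

```python
from datetime import datetime
from statistics import mean

# Hypothetical stage names (an assumption for illustration only).
STAGES = ["discovered", "ticketed", "patch_tested", "deployed", "validated"]

SECONDS_PER_DAY = 86400


def stage_durations(record):
    """Return days spent between each consecutive pair of recorded stages."""
    durations = {}
    for start, end in zip(STAGES, STAGES[1:]):
        if start in record and end in record:
            delta = record[end] - record[start]
            durations[f"{start}->{end}"] = delta.total_seconds() / SECONDS_PER_DAY
    return durations


def mean_time_to_remediate(records):
    """Mean days from discovery to validated fix; open items are ignored."""
    closed = [
        (r["validated"] - r["discovered"]).total_seconds() / SECONDS_PER_DAY
        for r in records
        if "discovered" in r and "validated" in r
    ]
    return mean(closed) if closed else None


# Made-up sample data: two closed vulnerabilities and one still open.
records = [
    {
        "discovered": datetime(2021, 5, 1),
        "ticketed": datetime(2021, 5, 3),
        "patch_tested": datetime(2021, 5, 10),
        "deployed": datetime(2021, 5, 20),
        "validated": datetime(2021, 5, 21),
    },
    {
        "discovered": datetime(2021, 5, 2),
        "ticketed": datetime(2021, 5, 2),
        "patch_tested": datetime(2021, 5, 6),
        "deployed": datetime(2021, 5, 11),
        "validated": datetime(2021, 5, 12),
    },
    {"discovered": datetime(2021, 5, 4), "ticketed": datetime(2021, 5, 9)},
]

if __name__ == "__main__":
    print("MTTR (days):", mean_time_to_remediate(records))
    print("Stage delays for first record:", stage_durations(records[0]))
```

Breaking the total into per-stage durations is the point of the exercise: it shows whether the lag sits in ticketing, testing or deployment, rather than reporting a single number that neither team owns.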

 

 

When we look at the technologies organizations have in place to discover vulnerabilities and deploy updates, we find scanners and systems management tools that aren’t being used effectively. For example, the configuration management database (CMDB) isn’t connected to the vulnerability scanner, and scan results don’t directly inform the systems management tool. Fortunately, many existing scan and management platforms have some base level of integration capability. Building on those capabilities, a new class of tooling has emerged that we have named remediation orchestration and automation platforms (ROAPs). These provide better correlation of CVE to patch, along with the ability to perform rules-based deployment of the fix through an array of systems management solutions.
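
To make the correlation idea concrete, here is a rough Python sketch under stated assumptions. It does not represent any specific ROAP product or vendor API: the CMDB extract, the CVE-to-patch mapping, the placeholder CVE and patch identifiers, and the ring names returned by choose_ring are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One scanner result: a CVE observed on a host."""
    host: str
    cve: str


# Hypothetical CMDB extract keyed by hostname.
CMDB = {
    "web-01": {"environment": "production", "criticality": "high"},
    "dev-07": {"environment": "development", "criticality": "low"},
}

# Hypothetical CVE-to-patch mapping; all identifiers are placeholders.
CVE_TO_PATCH = {
    "CVE-0000-0001": "PATCH-A",
    "CVE-0000-0002": "PATCH-A",
    "CVE-0000-0003": "PATCH-B",
}


def choose_ring(asset):
    """Pick a deployment ring from CMDB attributes using simple rules."""
    if asset is None:
        return "manual-review"  # host is unknown to the CMDB
    if asset["environment"] != "production":
        return "ring-0-immediate"
    if asset["criticality"] == "high":
        return "ring-2-after-tests"
    return "ring-1-fast"


def plan_deployments(findings):
    """Group findings into one action per (host, patch) pair.

    Several CVEs fixed by the same patch collapse into a single action.
    Findings with no known patch are returned separately.
    """
    plan = {}
    unmapped = []
    for finding in findings:
        patch = CVE_TO_PATCH.get(finding.cve)
        if patch is None:
            unmapped.append(finding)
            continue
        entry = plan.setdefault(
            (finding.host, patch),
            {"ring": choose_ring(CMDB.get(finding.host)), "cves": []},
        )
        entry["cves"].append(finding.cve)
    return plan, unmapped


findings = [
    Finding("web-01", "CVE-0000-0001"),
    Finding("web-01", "CVE-0000-0002"),
    Finding("dev-07", "CVE-0000-0003"),
    Finding("lab-99", "CVE-0000-0003"),
    Finding("web-01", "CVE-0000-9999"),
]

if __name__ == "__main__":
    plan, unmapped = plan_deployments(findings)
    for (host, patch), action in plan.items():
        print(host, patch, action["ring"], action["cves"])
    print("No patch mapping:", unmapped)
```

In this sketch the two CVEs on web-01 resolve to a single patch action, which is the kind of consolidation that tends to shrink a backlog measured in vulnerabilities into a shorter list measured in deployments.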

 

Lastly, risk tolerance is important to address in a way that can be supported with real-world testing and measurable results. While our goal is to move faster, we want to do so safely. In our methodology this means incorporating an automated test cycle into the patch process.

 

If you surveyed a typical software development team and asked what they do to make sure a product update works before it goes to production, you would hear about a variety of quality assurance techniques, such as user acceptance testing (UAT) phases and other measures that can be done at scale and at speed. Ask the team that deploys desktop patches the same question, and the typical answer is that they deploy patches to a pilot group for a couple of weeks, and then if no one complains they roll the updates out to the rest of the organization. Or, in some instances, they simply wait a month for the rest of the world to test the patch.

 

Instead, our objective is to take the philosophy of test automation before production deployment and apply that to servers and workstations. Now concerns about the update causing problems can be met with test cycle output that provides a comprehensive set of testing conditions and outcomes. This is far more effective than leaving testing up to random user actions in a pilot group. Unit testing provides faster and more accurate results to determine if an update is safe.
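
As an illustration of gating a patch on automated checks, the Python sketch below runs a list of named checks against a freshly patched test machine and only reports the patch as promotable if every check passes. It is a simplified assumption of how such a gate might look, not the proof of concept from the white paper. The three check functions are hypothetical stubs; real checks would start services, drive applications or call health endpoints.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str = ""


def run_test_cycle(checks):
    """Run each (name, callable) check and record pass or fail.

    A check passes if it returns a truthy value; an exception counts
    as a failure and its message is kept for the report.
    """
    results = []
    for name, check in checks:
        try:
            ok = bool(check())
            detail = "" if ok else "check returned False"
        except Exception as exc:
            ok = False
            detail = f"{type(exc).__name__}: {exc}"
        results.append(CheckResult(name, ok, detail))
    return results


def safe_to_promote(results):
    """Promote only if at least one check ran and all checks passed."""
    return bool(results) and all(r.passed for r in results)


# Hypothetical stub checks standing in for real post-patch tests.
def service_is_running():
    return True


def app_login_works():
    return True


def report_export_works():
    raise RuntimeError("export timed out")


if __name__ == "__main__":
    checks = [
        ("service running", service_is_running),
        ("application login", app_login_works),
        ("report export", report_export_works),
    ]
    results = run_test_cycle(checks)
    for r in results:
        print(f"{'PASS' if r.passed else 'FAIL'}  {r.name}  {r.detail}")
    print("Promote to production:", safe_to_promote(results))
```

The recorded results double as the evidence a change board typically asks for: a fixed set of conditions and outcomes for each patch, instead of the absence of complaints from a pilot group.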

 

For all the details on how to build remediation pipelines, what capabilities to look for when selecting remediation tooling and a proof-of-concept example of how to use test automation before production deployments, download our white paper: Accelerating Vulnerability Remediation.

Woodrow Brown
Vice President, Research & Development | Optiv
Woodrow Brown has over twenty years of leadership, service delivery and research experience. As vice president of research and development at Optiv, Brown leads a team that analyzes market and technical trends, providing continuous input into Optiv’s solution roadmap. Cutting through industry spin, Brown delivers research that provides an accessible understanding of how security technologies can provide optimal business outcomes.