Improving Patch Management: A Measured Approach
February 19, 2015
Organizations face a daily barrage of new vulnerabilities identified in a host of applications and operating systems. By deploying software updates, organizations can reduce the likelihood of a system being compromised by a threat agent and directly reduce risk. With deployment of these updates mandated in multiple regulatory frameworks, the development of a Software Update Management (SUM) program allows organizations to meet regulatory obligations while improving application and operating system performance and reducing risk.
Development and execution of a SUM program is not just best practice. With today’s threat landscape, it’s a necessity. Numerous breaches have been tied to vulnerabilities, and the recent fine levied by the Office of Civil Rights (OCR) against Anchorage Community Mental Health underscores the need for a mature, robust program.
Improving the State
Depending on the maturity of an organization’s Information Technology (IT) practice, a SUM program will either be in place or be on the roadmap for development. For organizations seeking to develop a SUM program, a structured approach to program development is recommended. One such approach is DMADV (Define, Measure, Analyze, Design, Verify), which is often used for the development of new processes or programs. This tool relies heavily on defined phases to ensure that the end product meets expectations.
Because most organizations already have a patch management program, albeit one facing a multitude of challenges, DMADV will not be the appropriate approach. In these situations, DMAIC (Define, Measure, Analyze, Improve, Control) can be leveraged as a tool designed to improve existing processes and programs. With clearly defined phases that focus on analyzing existing processes, this tool is well suited to organizations with a pre-existing SUM program.
Executing changes based on a hunch or theory can lead to unnecessary changes being implemented, wasting resources and precluding long-term buy-in from stakeholders. By following the DMAIC approach, organizations avoid going down “rabbit holes” and implementing process changes based on a hunch.
By separating problem solving into multiple phases, a measured approach is taken in which cause and effect are proven before improvement occurs. This allows each improvement to be measured and sustained. Throughout this series of blog posts, each phase of DMAIC will be visited, with examples of how an organization can apply this tool to the improvement of a SUM program.
To properly initiate the DMAIC approach in process improvement, we must start with the definition of what we are attempting to improve. In the “Define” phase you must be sure to ignore what you learned in high school chemistry. We are not following the scientific method here; we are improving a software update management program. Be careful not to jump to conclusions and create a hypothesis. Instead, create a simple problem statement such as “Ineffective software update management program.” Utilizing this problem statement, you can create a charter. This charter will serve as a reference point throughout the DMAIC. A typical charter for this type of process would include:
- Problem Statement
  - Simple statement providing an overview of the problem; not a hypothesis.
- Scope
  - Defined scope for improvement, such as business units, applications, endpoint types, etc.
- Objective
  - Using the problem statement as a starting point, clearly define the objective of the improvement. Be careful not to define metric requirements at this point, as analysis has not yet been done to show the current state.
- Stakeholders
  - Define the key business partners that will be involved in the program. Application owners in other business verticals, change management staff, and support desk personnel are common stakeholders that should be defined.
- Authority
  - The authority of the team to execute across business verticals is a critical point, as key stakeholders may exist in different reporting structures that may not align with IT.
- Target Benefits
  - Explain why SUM is important for the organization. Note that the benefits will be used to sell the program to key stakeholders that may have apprehension regarding their involvement in the program.
In the “Measure” phase of the DMAIC, the current state is measured and a baseline is established. Try not to fall into a trap here and get too technical. There is no need to produce a report for every host in the organization detailing each vulnerability; leadership will not understand or use a report like this. Reporting on individual vulnerabilities can be reserved for technical troubleshooting and host auditing. Focus should instead be given to the overall performance of the SUM program. Key Performance Indicators (KPIs) covering four distinct categories (processes, servers, clients, and other, such as network devices) are one example of overall program measurement.
- Processes
  - Has the process been executed in the defined cycle? This technology-agnostic question is focused on process execution and should not be measured by tools such as vulnerability scanners. The measure of this category can be as simple as a checklist to ensure that major components of the process, such as execution of the communication plan, occurred in the order and time defined.
  - Additional technical measures of process success, for static devices such as servers and network equipment, can be based on measuring uptime. Excluding unplanned outages, one of the primary reasons production servers and network equipment are rebooted is patch deployment during a maintenance window. By measuring uptime, a snapshot can be taken to determine whether the target device was rebooted during the last scheduled maintenance window. A failure to reboot during the maintenance window can indicate a process failure, such as an inaccurate scope. This simple measure is often overlooked and can be one of the most insightful.
- Servers, Clients & Other
  - It’s time to break out that vulnerability scanner! Take a measured approach, though, to determine the success of the SUM program. You do not need to report on every vulnerability. Additionally, careful consideration should be given to not reporting on configuration vulnerabilities. In SUM there will always be a new patch, and reconciling this while still providing metrics and showing improvement can be a challenge.
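The reboot-during-maintenance-window check described above can be sketched in a few lines. This is a minimal sketch, not a definitive implementation: the host names, reboot timestamps, and window times are illustrative assumptions, and the last-reboot data is assumed to come from a monitoring tool export or SNMP sysUpTime.

```python
from datetime import datetime

# Assumed inventory: last reboot time per device, e.g. exported from a
# monitoring tool or derived from SNMP sysUpTime (hypothetical data).
last_reboot = {
    "web01": datetime(2015, 2, 15, 2, 13),
    "db01": datetime(2015, 1, 4, 3, 40),   # has not rebooted since January
}

# Last scheduled maintenance window (assumed values for illustration).
window_start = datetime(2015, 2, 15, 1, 0)
window_end = datetime(2015, 2, 15, 5, 0)

def missed_window(reboot_time, start, end):
    # A last reboot outside the window suggests the host skipped the patch cycle.
    return not (start <= reboot_time <= end)

flagged = [host for host, rebooted in last_reboot.items()
           if missed_window(rebooted, window_start, window_end)]
print(flagged)  # -> ['db01']: candidates for scope or deployment failures
```

Hosts flagged by a check like this become the starting point for investigating process failures such as an inaccurate scope.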
One approach is to create a Shewhart chart, also known as a control chart. This type of line graph sets upper and lower control limits. The chart should measure average host risk, not missing updates, and separate charts should be created for each category of host type being measured (servers, clients, and other).
The upper control limit should be set based on the organization’s risk tolerance, derived from the objectives set in the charter. As patch management is a cyclical process, with new patches constantly being introduced, the lower control limit is used to account for variation. If a patch deployment cycle ends and new patches are released while vulnerability scans are occurring, the lower control limit serves as a buffer to account for this change. By setting the lower control limit, organizations can define the acceptable risk that will occur between patch release, deployment, and measurement.
To determine the value for each cycle in the control chart, organizations will need to agree on a formula where each host and vulnerability is measured to create an average. The most basic formula is risk = impact * likelihood; however, if organizations are unable to measure impact, a simpler formula utilizing the CVSS score of each vulnerability is a reasonable starting point. For large organizations, a tool-based approach for generating this data should be used. A basic example of using risk data to populate a control chart is shown below.
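The CVSS-based variant of the formula can be sketched as follows: sum each host’s CVSS scores, average across hosts in the category, and compare the result against the charter-defined control limits. The scanner data, host names, and limit values are all illustrative assumptions.

```python
# Illustrative scanner export: CVSS base scores of missing-patch findings per host.
host_vulns = {
    "client01": [7.5, 5.0, 4.3],
    "client02": [9.8],
    "client03": [],  # fully patched this cycle
}

def average_host_risk(vulns_by_host):
    # Sum each host's CVSS scores, then average across all hosts in the category.
    per_host_risk = [sum(scores) for scores in vulns_by_host.values()]
    return sum(per_host_risk) / len(per_host_risk)

cycle_value = average_host_risk(host_vulns)  # one data point on the control chart
print(round(cycle_value, 2))  # -> 8.87

# Control limits taken from the charter's risk tolerance (assumed values).
UCL, LCL = 12.0, 2.0
in_control = LCL <= cycle_value <= UCL
print(in_control)  # -> True
```

Plotting one such value per deployment cycle, with the UCL and LCL drawn as horizontal lines, produces the control charts shown in the figures below.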
Figure 1. Host Risk Chart
Figure 2. Client Risk Control Chart
In my next post in this series, we will explore the Analyze phase of the DMAIC. In this phase we will develop cause-and-effect theories for major factors in SUM.