Quantifying Risk Through Probabilistic Risk Assessments

By Capt. Spencer Figge, CEM, M.SAME, USAF, Capt. Melissa Sallberg, USAF, and 1st Lt. Zachary Allen, USAF

A Probabilistic Risk Assessment conducted with facility condition data from Buckley SFB, Colo., sought to demonstrate a method to quantify risk of mission failure and inform future investment decisions. U.S. SPACE FORCE PHOTO BY TECH. SGT. JT ARMSTRONG

Deciding how best to prioritize investments to make optimal use of limited financial resources is critical to managing a large-scale real property portfolio at the enterprise level. For the U.S. Air Force, the decision framework in recent years has largely centered around a scoring matrix that considers a facility’s criticality to the overall mission, measured by the Mission Dependency Index, and the probability of system or facility failure, which is derived from the current condition assessment. While no prioritization system is perfect, this simple approach has provided a quantitative, data-driven starting point for mission owners and decision-makers to discuss funding prioritization. However, a new approach to quantifying resiliency to aid in prioritizing facility and infrastructure investments is needed if the Air Force expects to have a meaningful and data-driven process guiding its facility investments.

In an attempt to overcome some of the limitations of the current approach, students from the Air Force Institute of Technology, in conjunction with engineers from Space Operations Command and Headquarters Air Force, sought to test the feasibility of using existing facility work task and condition index data to conduct a Probabilistic Risk Assessment (PRA) of a unique and mission-critical system. Their goal was to quantify the risk of mission failure due to infrastructure failure, then use the PRA to inform future infrastructure investments to improve resiliency and mission assurance.

For the study, which analyzed current data from Buckley SFB, Colo., the first step was to construct a system diagram consisting of all the critical infrastructure systems and key system components supporting the mission of interest. The research team received a mission briefing to solidify its understanding of the requirements, followed by an extensive walkthrough of the facilities and interviews with local operations and maintenance personnel. Data collected from the site visit was then combined with an extensive review of the facility as-built and system one-line drawings.

The resulting system diagram was constructed as a fault tree to simplify calculating the probability of mission failure.
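As an illustration of why a fault tree simplifies the calculation, the sketch below combines component failure probabilities through AND and OR gates. It assumes component failures are independent, and the component names and probabilities are hypothetical placeholders, not values from the Buckley SFB study.

```python
from math import prod

def or_gate(probs):
    """Output event occurs if ANY input fails (components in series)."""
    return 1.0 - prod(1.0 - p for p in probs)

def and_gate(probs):
    """Output event occurs only if ALL inputs fail (redundant components)."""
    return prod(probs)

# Hypothetical annual failure probabilities (illustrative only)
p_utility_feed = 0.05
p_backup_generator = 0.10
p_chiller = 0.02

# Power is lost only if the utility feed AND the backup generator both fail.
p_power_loss = and_gate([p_utility_feed, p_backup_generator])

# The mission fails if power is lost OR the single chiller fails.
p_mission_failure = or_gate([p_power_loss, p_chiller])

print(f"Probability of mission failure: {p_mission_failure:.4f}")  # 0.0249
```

In this toy example the single chiller contributes far more to mission risk than the redundant power supply, even though each power source is individually less reliable.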

After the diagram was completed and all key components identified, databases were mined for relevant downtime, cost, condition, and work task data. The cost and condition data for the critical components were easily obtained from BUILDER. However, retrieving work task data from NexGen IT that could potentially support reliability calculations required several iterations of filtering through the work task reports. The effort ultimately yielded 92 relevant work tasks dating back to 2017, when the data began. These were each assessed individually to determine whether they could be tied to a specific critical component in the system diagram, and whether they contained enough information to make a component reliability calculation.

DECISION SUPPORT

A PRA is a quantitative tool that can be used to estimate the failure rate of a system based on the reliability of the individual components within it. In turn, component reliability can be calculated based on the downtime or number of failures in a given time period.
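A minimal sketch of those two component-level calculations follows. It is not the study's method: the constant failure rate (exponential) model is a common simplifying assumption, and the failure count, downtime, and observation period are hypothetical.

```python
import math

HOURS_PER_YEAR = 8760

def availability(downtime_hours, period_hours):
    """Fraction of the observation period the component was operational."""
    return 1.0 - downtime_hours / period_hours

def failure_rate(n_failures, period_hours):
    """Average failures per hour over the observation period."""
    return n_failures / period_hours

def reliability(rate, mission_hours):
    """Probability of no failure during mission_hours, assuming a constant
    failure rate (exponential model). This is an assumption, not a given."""
    return math.exp(-rate * mission_hours)

# Hypothetical component history: 3 failures and 36 hours of downtime in 5 years
period = 5 * HOURS_PER_YEAR
a = availability(36, period)
lam = failure_rate(3, period)
r_one_year = reliability(lam, HOURS_PER_YEAR)

print(f"Availability: {a:.5f}")
print(f"One-year reliability: {r_one_year:.4f}")
```

Both calculations depend on knowing how often a specific component failed and for how long it was down, which is why the content of the underlying maintenance records matters so much.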

This approach for analyzing risk is particularly common in fields where a system failure would lead to exceptionally catastrophic consequences, such as in nuclear reactor management or space flight operations.

Ultimately, PRAs are used to help owners and operators make better decisions regarding levels of mission risk and quantify the benefits of investing in more robust, redundant, or reliable systems that can reduce that risk.
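One way to picture that comparison is to rank candidate investments by how much failure probability each removes per dollar. The sketch below is purely illustrative: the options, costs, and probabilities are hypothetical, and the redundancy case assumes two identical units that fail independently.

```python
def risk_reduction_per_dollar(p_before, p_after, cost):
    """Reduction in failure probability bought per dollar invested."""
    return (p_before - p_after) / cost

# Hypothetical annual failure probability of a single existing chiller
p_baseline = 0.02

# Hypothetical investment options (illustrative costs and outcomes)
options = {
    # Two independent, identical units must both fail: p * p
    "add redundant chiller": {"p_after": p_baseline ** 2, "cost": 400_000},
    "replace with more reliable unit": {"p_after": 0.01, "cost": 150_000},
}

# Rank options from most to least risk reduction per dollar
ranked = sorted(
    options,
    key=lambda name: risk_reduction_per_dollar(
        p_baseline, options[name]["p_after"], options[name]["cost"]
    ),
    reverse=True,
)

for name in ranked:
    opt = options[name]
    score = risk_reduction_per_dollar(p_baseline, opt["p_after"], opt["cost"])
    print(f"{name}: {score:.2e} risk reduction per dollar")
```

With these made-up numbers the redundant unit removes more total risk, but the replacement removes more risk per dollar. A quantified comparison makes that kind of trade-off visible.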

Study Limitations

The most successful aspect of the study came from the construction of the system diagram. From this analysis, the team was able to identify several single points of failure that would be good candidates for investment to improve overall system reliability.
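Single points of failure can be read directly off a fault tree: they are the components whose failure alone triggers the top-level event. A minimal sketch, assuming a tree built only from AND and OR gates and using hypothetical component names:

```python
def fails(node, failed):
    """Return True if the event at `node` occurs given the set of failed components."""
    if isinstance(node, str):  # leaf: a single component
        return node in failed
    gate, children = node
    results = [fails(child, failed) for child in children]
    return any(results) if gate == "OR" else all(results)

def components(node):
    """Collect every component name that appears in the tree."""
    if isinstance(node, str):
        return {node}
    _, children = node
    return set().union(*(components(child) for child in children))

# Hypothetical fault tree: the mission fails if ANY branch below occurs
tree = ("OR", [
    ("AND", ["utility_feed", "backup_generator"]),  # redundant power sources
    "main_switchgear",                              # no backup
    ("AND", ["chiller_1", "chiller_2"]),            # redundant cooling
    "cooling_pump",                                 # no backup
])

# A single point of failure causes the top event when it is the only failure.
single_points = sorted(c for c in components(tree) if fails(tree, {c}))
print(single_points)  # ['cooling_pump', 'main_switchgear']
```

This structural check needs no reliability data at all, which helps explain why the diagram produced useful results even where the probability calculations could not be completed.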

However, when it came to quantifying risk, the data contained in the work tasks and condition assessments was insufficient for calculating component reliability or probability of failure.

The data was insufficient in two main ways. First, it was difficult to tie work tasks from NexGen IT to individual system components in the system diagram or in BUILDER. Second, the work tasks did not contain information that indicated actual system or component downtimes needed to perform the necessary reliability calculations. As a result, the team concluded that the data collected under existing guidance and for current needs does not contain enough detail to make PRAs a feasible approach for quantifying critical infrastructure system reliability without substantial changes to existing practices. Additionally, while the compilation of a fault tree-style system diagram yielded meaningful insights into potential mission impacts, the process was manual and time-consuming and could not easily be conducted on an enterprise-wide scale.

Although the data gathered from work tasks and condition assessments was insufficient for the research study’s calculations, the preliminary system diagram succeeded in highlighting single points of failure. U.S. AIR FORCE PHOTO BY AIRMAN 1ST CLASS LUKE W. NOWAKOWSKI

Promising Steps

Although the Buckley SFB study did not achieve its desired goals, the team was able to develop a set of recommendations in the event that PRAs are pursued on an enterprise-wide scale. First, efforts are already ongoing to correlate the inventory and condition data from BUILDER with NexGen IT, which will address one of the roadblocks encountered with this approach. The other issue, the lack of downtime duration and failure mode data, would demand an extensive overhaul of data collection practices at the installation level. It would require recording the failure mode, duration, and actual system or mission impact with each work task to allow reliability to be calculated. On the positive side, the framework for recording this type of information already exists within the NexGen IT database. However, substantial changes to data collection guidance at the higher headquarters level would need to be instituted along with data collection practices at the installation level.
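To make the recommendation concrete, the sketch below shows the kind of per-task record that would support reliability calculations, along with a simple roll-up by component. The field names and sample records are hypothetical and do not represent the actual NexGen IT schema.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class WorkTask:
    """Hypothetical work task record; not the NexGen IT data model."""
    component_id: str      # ties the task to a component in the system diagram
    failure_mode: str      # what failed and how
    downtime_hours: float  # how long the component was out of service
    mission_impact: bool   # whether the outage actually affected the mission

def summarize(tasks):
    """Roll up failure counts, downtime, and mission impacts per component."""
    summary = defaultdict(
        lambda: {"failures": 0, "downtime_hours": 0.0, "mission_impacts": 0}
    )
    for task in tasks:
        entry = summary[task.component_id]
        entry["failures"] += 1
        entry["downtime_hours"] += task.downtime_hours
        entry["mission_impacts"] += int(task.mission_impact)
    return dict(summary)

# Illustrative records only
tasks = [
    WorkTask("chiller_1", "compressor seized", 12.0, False),
    WorkTask("chiller_1", "refrigerant leak", 6.5, False),
    WorkTask("main_switchgear", "breaker trip", 2.0, True),
]

for component, stats in summarize(tasks).items():
    print(component, stats)
```

With failure counts and downtime recorded per component, the inputs needed for a component reliability calculation would be available directly from the work task history.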

While the decision to focus on risk and reliability instead of resiliency may seem like a semantic issue, the shift allows for a more concrete description of what the data would be used for and provides a quantitative methodology for a more accurate comparison of investment opportunities.

Existing data collection policies and practices have done an excellent job in achieving the first aspects of a successful asset management program: identifying what we have, what condition it is in, and how important it is to the mission. If the Air Force wants to continue to advance its practice of asset management, then expectations for data collection and data use need to evolve.

QUANTIFYING CHALLENGES

Since it was released in 2019, the Infrastructure Investment Strategy (I2S) has been a foundational document for Air Force Civil Engineering. The strategy outlines three lines of effort to address the ever-increasing facility and infrastructure maintenance backlog: restore readiness to power projection platforms, cost-effective modernization of infrastructure, and innovation in installation management.

A major focus in each line of effort has been the collection and utilization of data to make data-driven decisions.

The Air Force relies on several databases for the collection and organization of facility and infrastructure data. BUILDER is used to record the condition of various facility system and sub-system components. PAVER is used exclusively for installation and airfield pavements. NexGen IT functions as a work task management tool, capital project planning tool, and real property record. The data in these systems are critical to the planning and prioritization of Air Force facility and infrastructure maintenance and are key tools in advocating for project funding.

The I2S and other documents, such as the more recently released Air Force Climate Action Plan, also stress the need for improving the resiliency of installations and infrastructure to reduce the risk to the mission. Actually quantifying installation resiliency, however, is challenging, as is estimating potential improvements, because resiliency actions are tailored to specific, anticipated disruptive events. This makes comparing and prioritizing resiliency investments across a geographically diverse enterprise, such as the Air Force, exceedingly difficult. Additionally, resiliency investments almost always involve adding more robust components or redundant systems. This makes them counterintuitive in an enterprise that has limited financial resources and has relied on “lowest cost, technically acceptable” as the decision criterion for so long.


Capt. Spencer Figge, CEM, M.SAME, USAF, is Engineering Flight Commander, Ellsworth AFB, S.D.; spencer.figge@us.af.mil.

Capt. Melissa Sallberg, USAF, is Emergency Management Flight Commander, Joint Base Charleston, S.C.; melissa.sallberg.1@us.af.mil.

1st Lt. Zachary Allen, USAF, is Requirements & Optimization Officer in Charge, Buckley SFB, Colo.; zachary.allen.25@us.af.mil.

