Bridging the Gap Between Systems Engineering and Program Management via a Risk Aware Framework

Complex engineering systems are prone to schedule slips, budget overruns, and a variety of challenges that compromise delivered value. These challenges are a sign of failure on the part of both management and technical roles, but can be overcome through a Risk Aware Framework that integrates the roles into a cohesive systemic and systematic approach to delivering high-value business outcomes.

This article shows how your organization can become more effective, more efficient, more responsive, and enjoy better business outcomes by bridging the gap between Systems Engineering and Program Management via a Risk Aware Framework. Beginning with an overview of key concepts, this article details the challenges faced by Systems Engineering and Program Management practitioners every day. The practical framework that follows describes how a principled process can be integrated successfully to streamline project workflow. A case study details how a real-world company implemented a risk framework to improve cost and schedule performance, and suggests how the same approach can improve the likelihood of success of your organization’s own strategy.

This article describes a proven framework to:

  • Overcome challenges and improve cost, schedule, and business performance
  • Assess current capabilities and build to the level your organization needs
  • Manage risk throughout all stages of a Project Life Cycle
  • Deploy best practices for teams and systems

Introduction

How’s this for a system engineering truism? “The best system engineers possess the superior judgment to avoid situations requiring their superior skills to survive.” While arguably more true than a whole wealth of truisms, it doesn’t provide much guidance in our quest to become one of those wiser and more-capable system engineers, especially when we find ourselves in a position of leadership. Which raises an obvious question: How does one develop such profound judgment?

Regardless of our degree of skill and experience, the trick to safely nudging the envelope comes in knowing as much about what we don’t know as what we think we do know and weighing those factors wisely before we venture forth. Evaluating those elements before taking action is known as “risk management.”

This is an era in which risk-taking is rewarded, leaving companies that run away from risk as plunder to be divided up by the others. Risk is inherent in all activities. It is a normal condition of existence. Every day, companies are exposed to various types of risk. They can be connected to property, third-party liability, staff, or decisions; risk is the usual companion of every business and directly influences results. But what is a useful way to think about risks and risk-taking in today’s environment?

Risk is the potential for loss resulting from a given action, activity, and/or inaction. Usually we have a choice that influences the outcome. A precise definition of risk is given in ISO/IEC 16085:2006 (adopted as IEEE Std 16085-2006), the standard on systems and software engineering risk management.

Risk is not a problem. It is an understanding of the level of threat due to potential problems. A problem is a consequence that has already occurred. Risk is defined by two characteristics of a possible negative future event: probability of occurrence (whether something will happen) and consequences of occurrence (how catastrophic if it happens). If the probability of occurrence is not known, then one has uncertainty and the risk is undefined. 
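The two characteristics above can be made concrete with a minimal sketch. It assumes the common convention of multiplying probability by consequence to get an "exposure" figure; the text itself only says that risk is defined by both. The class name, field names, and sample figures are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PotentialEvent:
    """A possible negative future event (names are illustrative only)."""
    name: str
    probability: Optional[float]  # chance of occurrence in [0, 1]; None if unknown
    consequence: float            # impact if it occurs, e.g. cost in dollars


def exposure(event: PotentialEvent) -> Optional[float]:
    """Return probability x consequence, or None when probability is unknown.

    The product is an assumed convention for this sketch, not a definition
    taken from the article.
    """
    if event.probability is None:
        return None  # uncertainty: the risk is undefined
    if not 0.0 <= event.probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return event.probability * event.consequence


# Hypothetical examples: one quantified risk, one with unknown probability.
late_supplier = PotentialEvent("supplier delivers late", 0.25, 80_000.0)
novel_sensor = PotentialEvent("new sensor fails qualification", None, 500_000.0)

print(exposure(late_supplier))  # 20000.0
print(exposure(novel_sensor))   # None
```

Returning None rather than zero for the second event keeps an unquantified risk visibly different from a negligible one.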

The computerization of the workplace and the levels of IT dependency that now exist mean that the risks associated with the failure of IT systems are among the most potent sources of operational risk within any organization. Systems engineering management risks can relate to the system products or to the process of developing the system. Risk management in a quickly changing environment is a requirement, for it contributes to a company’s strategic advantage. Inadequate attention to risk, especially at the early stages of a systems engineering project, is often the reason behind cost overruns, schedule delays, and poor technical performance.

Consider the Future of Your Present Decision

The purpose of risk management is to make decisions, not to admire the risks. No behavior goes more to the core or soul of a company than how it makes decisions. 

Making the right decision means performing risk analysis. Risk analysis is the systematic use of available information to determine how often specified events may occur and the magnitude of their consequences. The goal of any risk analysis method is to help the decision-maker choose a course of action, given a better understanding of the possible outcomes that could occur. By exploring the full space of possible outcomes for a given situation, a good risk analysis can both identify pitfalls and uncover new opportunities. Risk analysis can be performed qualitatively or quantitatively. Qualitative risk analysis generally involves assessing a situation by instinct or feeling and is characterized by descriptive statements. Quantitative risk analysis attempts to assign numeric values to risks, either by using empirical data or by quantifying qualitative assessments.
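As one illustration of a quantitative analysis, the sketch below enumerates every combination of a few risks occurring or not, which is one simple way to explore the full space of outcomes. It assumes the risks are independent, which real projects often violate, and the risk names, probabilities, and costs are invented for the example.

```python
from itertools import product

# Hypothetical risks: (name, probability of occurrence, cost if it occurs).
RISKS = [
    ("key engineer leaves", 0.5, 40_000),
    ("vendor part slips", 0.25, 100_000),
]


def outcome_space(risks):
    """Enumerate every combination of risks occurring or not occurring.

    Assumes independence between risks (a simplification for this sketch).
    Returns a list of (probability, total_cost, names_that_occurred).
    """
    outcomes = []
    for flags in product([False, True], repeat=len(risks)):
        prob, cost, occurred = 1.0, 0, []
        for (name, p, c), happened in zip(risks, flags):
            prob *= p if happened else 1.0 - p
            if happened:
                cost += c
                occurred.append(name)
        outcomes.append((prob, cost, occurred))
    return outcomes


outcomes = outcome_space(RISKS)
expected_cost = sum(prob * cost for prob, cost, _ in outcomes)
worst = max(outcomes, key=lambda o: o[1])

# List the scenarios from cheapest to most expensive.
for prob, cost, occurred in sorted(outcomes, key=lambda o: o[1]):
    print(f"{prob:.3f}  {cost:>7}  {occurred or ['nothing goes wrong']}")
print("expected cost:", expected_cost)
print("worst case:", worst[1], "with probability", worst[0])
```

Exhaustive enumeration doubles in size with each added risk, so it only suits short lists; for larger registers a sampling approach is the usual substitute.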

Three Essential Elements for Success

Bringing risk management to an organization means changing its language and attitude towards risk and the link between risk and decision making. Behavior must change at all organizational levels, and that requires at least three elements:

  • A repeatable process with defined steps and artifacts supported by applicable methods and tools.
  • Widespread access to adequate knowledge sources to fuel the process.
  • Functional behavior including human interactions, motivators, perceptions, communication, decision making processes and risk tolerance.

These are not independent elements; there are strong interactions that must be accounted for in implementing and sustaining risk aware management. Process and knowledge sources, while necessary, cannot by themselves change behavior. The last element is the key and yet it has received little attention. Although change management is a discipline in its own right, there are special considerations for risk management.

Introducing effective risk management practice in an organization requires that the roles of all three elements (process, knowledge, and behavior) be understood. However, it is the issues involving functional behavior that will determine whether a risk management practice can be successfully sustained.

Case Study: Rockwell Collins

One company that has thrived with risk is Rockwell Collins (now Collins Aerospace). An independent audit revealed that, thanks to its risk management practice, Collins achieved double-digit improvement in Cost Performance Index and Schedule Performance Index. Enterprise Risk Management (ERM) at Rockwell Collins comprises two dominant threads that wind around a central decision process, as well as each other, forming a strong organizational risk culture. The central decision process at Collins is a phase gate model that covers the entire lifecycle of a business opportunity. The decision process directs a compelling set of business questions at key points in the lifecycle of each endeavor. It ensures productive interrogation and structuring of the company’s discretionary investments. As an organization, Rockwell Collins minimizes surprises, provides relevant, objective and timely information to decision-makers, and focuses on asking the right questions. Seven key questions that are regularly and rigorously answered are:

  1. What are today’s risks – are they higher or lower than before?
  2. Are the risks likely to get higher or lower in the future?
  3. What is being done to reduce risks, to monitor risks and to prevent risks in the future?
  4. Who is responsible for the aversion measures – who can I call if things are not correct?
  5. How will I know the aversion measures are being put into place?
  6. What is the timetable for the aversion measures?
  7. How and what should I communicate concerning risk internally, to suppliers, and to the customer?
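One way to keep these seven questions in front of a team is to make each one a field in a risk register entry. The sketch below is only an illustration of that idea, not a description of the Collins tooling; every class, field, and sample value is hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List, Optional


class Trend(Enum):
    LOWER = "lower"
    UNCHANGED = "unchanged"
    HIGHER = "higher"


@dataclass
class RiskEntry:
    """One row of a hypothetical risk register; field names are illustrative."""
    description: str
    versus_before: Trend                                 # Q1: higher or lower than before?
    outlook: Trend                                       # Q2: likely to rise or fall?
    measures: List[str] = field(default_factory=list)    # Q3: what is being done?
    owner: str = ""                                      # Q4: who is responsible?
    evidence: str = ""                                   # Q5: how will I know?
    due: Optional[date] = None                           # Q6: what is the timetable?
    audiences: List[str] = field(default_factory=list)   # Q7: who must be told?


def unanswered(entry: RiskEntry) -> List[int]:
    """Return the numbers of the questions this entry leaves blank.

    Questions 1 and 2 are always answered because the Trend fields are required.
    """
    checks = {
        3: bool(entry.measures),
        4: bool(entry.owner.strip()),
        5: bool(entry.evidence.strip()),
        6: entry.due is not None,
        7: bool(entry.audiences),
    }
    return [number for number, answered in checks.items() if not answered]


entry = RiskEntry(
    description="Flight software integration may slip past the next gate",
    versus_before=Trend.HIGHER,
    outlook=Trend.HIGHER,
    measures=["add a second integration rig"],
    evidence="rig acceptance report attached to the gate review",
    audiences=["program manager", "customer"],
)
print(unanswered(entry))  # [4, 6]
```

A gate review could refuse to close while unanswered() returns anything, which turns the questions into a check rather than a reminder.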

Focused through the phase-gate decision process, the complementary decision processes of management of risk and risk management allow Collins to address risk in a systemic, multidisciplinary manner that weaves business strategy, finance, program and project management, etc., into a comprehensive and unified whole.

Communication of risks is one of the most challenging tasks in risk management. The people in the best position to recognize many of the risks are typically the system engineers working on the project. For a variety of reasons these people may not be willing to communicate the risk. Collins’ success is due in part to creating an environment that generates information pull. System engineers and their managers must learn that merely identifying a risk and placing it into a risk register doesn’t mean it will occur (and ignoring it doesn’t mean it won’t).

At Collins, one means of creating information pull is to condition people at all levels to ask questions that elicit risk causes and characteristics. For example, my long-time associate at Rockwell, the late Art Gemmer, described[1] how his organization coached managers to consider certain questions in response to cues, some of which are listed in the table below (from “Risk Management: Moving Beyond Process” by Art Gemmer in IEEE Computer, May 1997).


Four Types of Risk 

Successfully answering the key risk aware management questions demands widespread access to adequate knowledge sources to fuel the risk aware process. It involves four types of risk: Programmatic, Organizational, Economic, and Technical. Classifying risks groups those with similar characteristics and is a prerequisite for evaluating them in any engineering system.

Product risks include both end product risks, which relate to the basic performance and cost of the system, and enabling product risks, which relate to the products that produce, maintain, support, test, train, and dispose of the system. Risks relating to the management of the development effort can be technical management risks or risks caused by external influences. Risks dealing with internal technical management include those associated with schedules, resources, work flow, on-time deliverables, availability of appropriate personnel, potential bottlenecks, critical path operations, and the like. Risks dealing with external influences include resource availability, higher authority delegation, level of program visibility, regulatory requirements, and the like.

Programmatic Risk:  This is the risk that a major change initiative could fail or the benefits expected of it might not materialize. With an increasing use of projects and programs to drive through change within organizations, this type of risk is often closely associated with strategic risk, as failure can have significant impacts on the organization. Moreover, with the increasing complexity of organizations, managing this type of risk is fast becoming an essential skill.

Risk drivers include: poorly defined project purpose and need; poor or incomplete project scope definition; project scope, schedule, objectives, cost, and deliverables that are not clearly defined or understood; no control over staff priorities; too many projects; consultant or contractor delays; estimating and/or scheduling errors; unplanned work that must be accommodated; communication breakdown with the project team; pressure to deliver the project on an accelerated schedule; lack of coordination/communication; lack of upper management support; change in key staffing throughout the project; inexperienced workforce, inadequate staff, or limited resource availability; local agency issues; public awareness/support; agreements.

Organizational Risk: Risk drivers include: inexperienced staff assigned; losing critical staff at a crucial point of the project; insufficient time to plan; unanticipated project manager workload; internal “red tape” that delays approvals and decisions; functional units unavailable or overloaded; lack of understanding of complex internal funding procedures; priorities changing on an existing program; a new priority project inserted into the program; inconsistent cost, time, scope, and quality objectives.

Economic Risk: This covers those risks that can affect the business in terms of its general financial viability. It includes risks associated with the market in which the organization operates (market risk), as well as the ability to finance growth through loans (credit risk). These risks are generally well understood, with a large number of financial instruments and techniques available to the risk manager.

Technical Risk: This is different from operational risk in that it is associated with bringing new technology products to market and introducing new technology into the organizational setting, both of which are high-risk ventures. Risk drivers include: incomplete design; environmental analysis incomplete or in error; unexpected geotechnical issues; change requests because of errors; inaccurate assumptions on technical issues in the planning stage; surveys late and/or in error; materials/geotechnical/foundation in error; structural designs incomplete or in error; hazardous waste site analysis incomplete or in error; need for design exceptions; consultant design not up to department standards; context-sensitive solutions; fact sheet requirements (exceptions to standards).

A Repeatable Process With Defined Steps 

A widely recognized risk management framework (see below) depicts the different activities involved in the risk aware management associated with system engineering. The framework represents a dynamic, continuous, and highly iterative process, while the arrows show the logical flow of information between the activities. From this framework, a project may structure a risk management practice that best fits its system engineering project management structure.

Dysfunctional Behavior: Risk Arrogance

Francis Bacon is quoted as saying, “Man prefers to believe what he prefers to be true.”

Organizations may think that effective risk management will follow merely from a repeatable process and widespread access to adequate information about risk management. However, as my associate the late Art Gemmer said, “Following a repeatable process may mean we are just systematically managing risk poorly. Likewise, having adequate sources of knowledge doesn’t necessarily motivate people to use them correctly.”

Companies that seem to understand the necessity of risk-taking are sometimes prone to the following strange behavior: They try to emphasize positive thinking by ignoring the possible unfortunate consequences of the risk they’re taking. This is an extreme variant of the can-do attitude. After all, risk awareness involves at least a bit of can’t-do thinking, they reason, so it can’t be good. In order to stay positive, they steadfastly refuse to consider much of the downside. If there are things that could go wrong and make your project a total fiasco, for example, they would just have you not think about those things at all.

Denial is a major reason risk management is not usually done as part of project management. As my associate Dr. Robert N. Charette said, “software project success is based upon minimizing the thought of possible failure.” Typical organizations are focused on the success of a project. Owning up to risks is all too often considered defeatism. The problems created by a “can-do” attitude, paradoxically, increase with the exposure and difficulty of a risk.

It’s not only managers who are subject to such hubris. If you’re a younger and less experienced system engineer, not only might you be unaware of the risks confronting you, but you may truly believe that any such risks that might emerge can be overcome, because you’re real smart and you’re willing to work real hard.

Many organizations in fact foster such attitudes as part of the can-do ethic. Risk-averse and arrogant attitudes lead to system engineering dominated by crisis management and heroics. The organizational incentives are often structured to reward heroics and “can-do” employees. This positive reinforcement further ingrains these destructive attitudes.

Among aviators there is a saying, “There are old pilots and there are bold pilots, but there are no old bold pilots.” Few experienced system engineers are so foolish as to ignore all risk. When people ignore risk, they do it selectively. The way it typically works is, they take elaborate care to list and analyze and monitor all the minor risks (the ones they can hope to counteract through managerial action) and only ignore the really ugly ones.

Create A New Functional Behavior

Here’s a credo that describes the risk-related functional behaviors for which we should strive.

Manage risk as an asset. We choose the types of risks we face to match our business needs. We understand and anticipate our customers’ and competition’s opportunities and risks as well as their problems. We manage this knowledge as a strategic competitive advantage.

Treat decision making as a skill. Decision-making is a critical skill that we teach, practice, and constantly strive to improve.

Create a pull for risk information. We ask the right questions to obtain risk information. We actively seek it. We conduct meaningful discussions of our risks and act on the results.

Seek diversity in perspectives and information sources. We seek information from the political, cultural, economic, environmental, and technical realms. We involve multiple disciplines. We listen for and learn from divergent viewpoints.

Minimize uncertainty in time, control, and information. We systematically search for uncertainty wherever it may be. This search is the heart of a learning organization.

Recognize and minimize bias in perceiving risk. We make decisions based on sound information that is derived from an adequate analysis of the situation.

Plan for multiple futures. We plan for the best case, worst case, and several most likely scenarios.

Be proactive. We act before things go wrong. We attack root causes. We look for and address systemic risks.

Make timely, well-informed decisions and commitments. The purpose of risk management is to make decisions, not just identify risks. We understand when decisions must be made. We manage the risks and understand the chances of success before we make commitments.

Reward those who identify and manage risks early, even if the risks become problems. Even prudent risk takers will realize some problems. Our heroes are not just those people who solve problems, but also those who intelligently avoid them.

Be Prepared to Slay a Sacred Cow

A huge obstacle to risk aware management is organizational memory. The memory of “how we’ve always done things around here” is an ad hoc truth that substitutes for “doing things the right way around here.” The root is fear. People don’t want to make mistakes, and the best way to avoid making a mistake is to continue doing things exactly as they’ve always been done.

Organizations get trapped in a kind of circular logic: “We do what we do because it’s the best thing to do. And it’s the best thing to do because it’s what we’ve always done.” Unfortunately, this is a comforting fantasy for too many. What you end up with are sacred cows — things that you take for granted; pro forma (“tick in the box”) risk management processes that you have come to believe will help you get things done. In reality, all they do is get in the way of getting things done. You must be prepared to slay a sacred cow. The greatest challenge may be finding the courage to candidly answer the question, “Are we ready to hear the ruthless truth about the risks of our system engineering decisions?”

The truth is that significant, lasting performance improvement in risk aware management may require a courageous change in your organizational culture even before driving ongoing, institution-wide initiatives to optimize performance. Progress will occur in your organization when (and only when) a positive vision for the future multiplied by dissatisfaction with the status quo is greater than the natural human resistance to change (and resistance to the truth). Be prepared to challenge yourself to commit to evidence-based risk aware management as a way of organizational life.

Conclusion

Risk aware management is a vital part of successful systems engineering and project management. Although most system engineering managers know what to do, sometimes they just don’t do it. Some of the factors that contribute to this behavior include deficient system engineering processes, failure to adopt a risk aware management process, risk-averse or reckless attitudes, and failure to consider organizational context. 

While there are many possible steps to improve risk management in an organization, some of those that appear to have the most potential for success include training managers to elicit risk information through checklists and improved communication methods, aligning rewards and incentives with risk management activities, examining risks in the context of the organization, and managing risks across an organization. These techniques are not typically part of risk management processes. And therein lies perhaps the most important lesson: Risk Aware Management is more than a process; it requires the right information and the right behavior to bridge the gap between Systems Engineering and Program Management.


Scott Stribrny

An internationally acknowledged authority in project management, information systems/technology, systems engineering, and lean development, Scott is interested in the intersections of business, technology, and organizational risks.