MORSS 2015: Optimizing Resource-Informed Metrics
TRANSCRIPT
FROM QUALITATIVE TO QUANTITATIVE AND BACK AGAIN
MAJ Kristin Saling / MAJ David Kerns
USPACOM J83, Strategic Assessments, Camp Smith, HI
Optimizing Resource-Informed Metrics for the Strategic Decision Maker
This briefing is: UNCLASSIFIED
INTRODUCTION
Most analysis assumes the problem with assessments lies in the metrics and the data, when those are only symptoms of the problem. If the framework and parameters are badly designed, there can be no good results.
Image © http://dilbert.com/
AGENDA
• WHAT IS ASSESSMENT?
• PROBLEMS WITH ASSESSMENT
• ELEMENTS OF THE PROBLEM
• ADDRESSING THE PROBLEM
  – DEFINING THE WIN
  – DEVELOPING THE FRAMEWORK
  – ANALYZING THE FRAMEWORK
  – LEADERSHIP ENGAGEMENT
• RESOURCES
• DISCUSSION
WHAT IS ASSESSMENT?
Assessment is a process that evaluates changes in the environment and measures progress of the joint force toward mission accomplishment.
-Joint Pub 3-0
FUNCTIONS OF ASSESSMENT
• Provide a picture of the current state of the operational environment
• Provide context for stakeholder decisions involving the most effective application of forces, resources and policy
• Provide insight on whether current policies, forces, and resources are effective
PROBLEMS WITH ASSESSMENT
• Not providing the right level of insight and context to the commander’s decision cycle
• Measuring too many things
• Measuring the wrong things
• Showing too many numbers
• Not showing enough background numbers
• Numbers without proper context
• False appearance of mathematical rigor
• Lack of linkage between objectives & metrics
• Failure to integrate performance measures
PROBLEMS WITH ASSESSMENT
• Guidance for assessment stops at telling staffs to conduct assessment, with some differentiation between measures of effectiveness (MOEs) and measures of performance (MOPs)
• We have a problem with metrics, but the metrics are not the only problem
• If we don’t better codify how to create a total campaign framework, tying in environmental indicators, capabilities, and resources needed, we will not have a good assessment
ELEMENTS OF THE PROBLEM
Leadership: Contradictory guidance and interest from leaders on what they expect the assessment to deliver
Defining the Win: Incomplete mission analysis prevents creation of linkages between objectives, metrics, and success
Framework: The current framework focuses on the wrong elements: colors and numbers rather than insights and analysis
LEADERSHIP
• Every paper highlighting the failure of assessment indicates the critical need for command emphasis
• Commands and agencies need to integrate assessment training and discussions into commander training and staff orientation
• Assessment teams need to develop their own calendars of key staff engagements to provide the commander and key staff with necessary assessment familiarization
DEFINING THE WIN
• Metric development begins in mission analysis
• We cannot measure the achievement of an objective or attainment of an endstate if we cannot define what it means to successfully achieve it
• How do we know we are winning if we don’t know how to define winning?
MODELING THE SYSTEM
[Diagram: system inputs (the beginning state of the system) pass through the system process (operations, actions, activities), subject to controls and to exogenous variables (error variables, random actions), producing system outputs: measurable outcomes and indicators that indicate the state of the system.]
Modeling the environment as a semi-black box system prevents analysts from drawing unwarranted conclusions from actions and keeps them focused on measurable outcomes indicative of the state of the system.
The Integration Definition for Function Modeling (IDEF0) model is a way to understand the relationships among a system's inputs and outputs and the controls and external factors that affect them.
ADDRESSING THE PROBLEM
• Focus groups and interviews with analysts, leaders, staff members, and components
• Going “back to basics” with the science of systems analysis
• Developing documents and procedures that train analysts and staff members on, and codify, better campaign framework development procedures
• Developing the “right” metrics for assessment
• Defining the “win” throughout the process
• Engaging key leaders on assessment
BODIES OF ANALYSIS THAT CAN HELP
• Current assessment processes have many echoes in technical operations analysis
• By reexamining the parent theories, we can see what is missing in current assessment practices
[Diagram: COMPLEXITY SCIENCE • SYSTEMS ANALYSIS • DECISION ANALYSIS • QUANTITATIVE VALUE MODELING • SENSITIVITY ANALYSIS]
STARTING WITH MISSION ANALYSIS
[Diagram: the mission analysis process (higher HQ mission; intelligence preparation of the battlefield; specified, implied, and essential tasks; review of assets; constraints; facts and assumptions; risk assessment; CCIR and EEFI; ISR; timeline; mission and tasks; approval) informs current and necessary environmental conditions (objectives and effects), current and necessary performance conditions (objectives and effects), and success and threshold criteria for objectives.]
ONCE SUCCESS CRITERIA ARE DEFINED FOR THE OBJECTIVES, DEFINING THE NECESSARY CONDITIONS AND THE METRICS TO MEASURE THEM IS FAIRLY STRAIGHTFORWARD.
SUCCESS CRITERIA
• During mission analysis, planners and analysts define the critical conditions that must be met for the objective to be considered achieved
• From these success criteria, planners and analysts derive sub-objectives, necessary conditions, or effects
• The same procedure is repeated with the conditions or sub-objectives to develop metrics
ENVIRONMENT VS. PERFORMANCE
• Analysts generally measure environmental indicators as performance outputs
• There are also performance indicators that show success in terms of an output
  – Force capability
  – Capacity
  – Posture
• There is a difference between task assessment and performance assessment
[Diagram: within the strategic environment, operations, actions, and activities, together with the actions of allies, partners, and other actors and natural events, transform the initial environment and capabilities into the end environment and capabilities.]
BASIC CAMPAIGN FRAMEWORK
[Diagram: the basic campaign framework links the following elements, bottom to top.]
• Operations, Actions, and Activities (OAAs): efforts and actions by OPRs with stated achievable and measurable objectives to support the accomplishment of key (strategic-level) tasks, the improvement of environmental indicators, or the application of resources toward service-specific objectives.
• Environmental Metrics: measurable items that indicate the presence of environmental conditions necessary for, or indicative of, the objective's success; can be direct or proxy.
• Tasks / Performance Metrics: measurable items that indicate the achievement of the capabilities and resources necessary for, or indicative of, the objective's success; generally direct.
• Critical Conditions (Environment): environmental conditions necessary for the success of the objective.
• Critical Conditions (Performance): DoD conditions and requirements necessary for the achievement of the objective.
• Intermediate Military Objectives and Higher Objectives or Endstates: a clearly defined, decisive, and attainable goal toward which every military operation should be directed.
DEVELOPING THE FRAMEWORK
[Diagram: a value hierarchy with the FUNDAMENTAL OBJECTIVE at the top, decomposed into FUNCTION 1 (OBJECTIVE 1.1 with VALUE MEASURE 1.1.1 and VALUE MEASURE 1.1.2; OBJECTIVE 1.2), FUNCTION 2 (OBJECTIVE 2.1; OBJECTIVE 2.2), and FUNCTION 3.]
The value hierarchy is a pictorial representation of a value model, starting with the fundamental objective or endstate at the top and decomposing the system into sub-systems or sub-functions, subordinate objectives for those functions, and associated value measures.
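The decomposition above can be sketched in code. A minimal sketch in Python, with hypothetical objective names and scores (none of these come from an actual campaign plan), rolling leaf value measures up the hierarchy with equal weights:

```python
# Hypothetical value hierarchy: functions -> objectives -> value measures.
# Leaf numbers are illustrative scored value measures on a 0-1 scale.
hierarchy = {
    "FUNDAMENTAL OBJECTIVE": {
        "FUNCTION 1": {
            "OBJECTIVE 1.1": {"VALUE MEASURE 1.1.1": 0.6, "VALUE MEASURE 1.1.2": 0.8},
            "OBJECTIVE 1.2": {"VALUE MEASURE 1.2.1": 0.5},
        },
        "FUNCTION 2": {
            "OBJECTIVE 2.1": {"VALUE MEASURE 2.1.1": 0.9},
            "OBJECTIVE 2.2": {"VALUE MEASURE 2.2.1": 0.4},
        },
    }
}

def roll_up(node):
    """Average leaf value measures up through the hierarchy."""
    if isinstance(node, (int, float)):   # leaf: a scored value measure
        return float(node)
    children = [roll_up(child) for child in node.values()]
    return sum(children) / len(children)  # equal weights, for simplicity

print(round(roll_up(hierarchy), 3))  # -> 0.625
```

In practice the rollup would use elicited swing weights rather than equal ones, but the structure is the same.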
DEVELOPING THE METRIC
• What is a “bad” metric? A bad metric:
  – Does not provide context for objective completion
  – Is overly vague
  – Is unnecessarily precise
  – Does not link to conditions and objectives
  – Is measured just for the sake of measuring something
• What makes a “good” metric? A good metric:
  – Allows data collectors or subject matter experts to answer questions relating to the accomplishment of an objective
  – Can be objective or subjective (objective metrics may require additional metrics to provide context for objective accomplishment)
  – May have strong links to decision triggers, CCIR, or other important decision factors
QUALITATIVE AND QUANTITATIVE
• Analysts should compare subjective measurements throughout the assessment
• Objective metrics provide good data, but not an assessment – they provide no context
• Objective metrics can be given subjective context either through an additional calculation against a set standard or by obtaining subject matter expertise
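The "calculation against a set standard" can be as simple as a banded comparison. A sketch in Python; the function name, standard, and tolerance here are illustrative assumptions, not values from the brief:

```python
def contextualize(value, standard, tolerance):
    """Turn a raw objective measurement into a subjective-style rating
    by comparing it against a set standard (hypothetical thresholds)."""
    if value >= standard:
        return "Met"
    if value >= standard - tolerance:
        return "Concerns"
    return "Did Not Meet"

# e.g., 12 of 15 required events completed, with a 4-event tolerance band
print(contextualize(12, standard=15, tolerance=4))  # -> Concerns
```

Subject matter expertise would then confirm or override the banded rating.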
VALUE FUNCTIONS
• Current assessment strategy assumes a linear return to scale (RTS) where all responses are valued equally
• Value functions measure return to scale on the value measure
• These are useful in determining points of diminishing returns or losses
• Value functions can also be discrete, with value given for certain ratings and none for others
[Chart: value function curves showing linear, decreasing, and increasing returns to scale.]
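The three shapes can be sketched as simple functions on a 0-to-1 measure. The specific curves below (an exponential for decreasing RTS, a power function for increasing RTS) are illustrative choices, not the brief's:

```python
import math

def linear_vf(x):
    """Constant returns to scale: every increment is worth the same."""
    return x

def decreasing_rts_vf(x, c=3.0):
    """Concave (diminishing returns): early gains are worth the most."""
    return (1 - math.exp(-c * x)) / (1 - math.exp(-c))

def increasing_rts_vf(x, p=3.0):
    """Convex (increasing returns): value accrues mostly near the top."""
    return x ** p

x = 0.5  # a response halfway up the measure's natural scale
print(round(linear_vf(x), 3), round(decreasing_rts_vf(x), 3), round(increasing_rts_vf(x), 3))
```

At the halfway point the concave curve already returns most of its value while the convex curve returns little, which is exactly the diminishing-returns behavior the slide describes.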
RATINGS WITH AND WITHOUT VALUE FUNCTIONS
The change in average created by value functions is not always as significant as changes in the individual rating, but it can account for a more accurate description of how a stakeholder assigns value and priority.
DISCRETE VALUE FUNCTION
RATING  POINTS
1       8
2       7.5
3       6
4       3
5       1

SAMPLE RATINGS  VALUE
5               1
1               8
4               3
5               1
1               8
4               3
2               7.5

AVG (no value function):       5.714286
AVG (discrete value function): 4.5
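The averages above can be reproduced in a few lines. The discrete value function is from the table; the "no value function" figure is assumed here to come from a linear mapping of the 1-5 rating onto the same point scale (rating 1 maps to 10 points, rating 5 to 2), which reproduces the slide's 5.714286:

```python
# Discrete value function from the table (rating -> points)
discrete_vf = {1: 8, 2: 7.5, 3: 6, 4: 3, 5: 1}
ratings = [5, 1, 4, 5, 1, 4, 2]  # the seven sample ratings

# Assumed linear mapping for the "no value function" case: points = 12 - 2*rating
avg_linear = sum(12 - 2 * r for r in ratings) / len(ratings)
avg_discrete = sum(discrete_vf[r] for r in ratings) / len(ratings)

print(round(avg_linear, 6))    # -> 5.714286
print(round(avg_discrete, 1))  # -> 4.5
```

The gap between the two averages is the point of the slide: the shape of the value function, not just the raw ratings, drives the rollup.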
WHAT DOES “GREEN” MEAN?
The “stoplight” method has been used extremely ineffectively, but it can be made effective by defining success, partial success, or failure for each metric
GREEN Conditions are ranging from the ideal state to the lower limit of the commander’s risk tolerance; no additional resources or policy changes required at this time to improve progress toward the objective; maintain
YELLOW Conditions are outside the commander’s risk tolerance but not at a state deemed critical or dangerous; additional resources or policy changes may be required and can be addressed by amendment to the current plan
RED Conditions are at a state deemed critical or dangerous; branch or contingency plans may need to be enacted, additional resources and policy changes needed to address the environment
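Once those definitions exist, the color assignment is mechanical. A sketch; the 3.4 and 2.75 threshold values are illustrative, borrowed from the natural-thresholds example elsewhere in this brief, and in practice should reflect the commander's stated risk tolerance:

```python
def stoplight(score, green_floor=3.4, yellow_floor=2.75):
    """Map an averaged rating to a stoplight color using defined floors."""
    if score >= green_floor:
        return "GREEN"   # within risk tolerance; maintain
    if score >= yellow_floor:
        return "YELLOW"  # outside tolerance but not critical; amend plan
    return "RED"         # critical; branch/contingency plans, new resources

print(stoplight(3.5), stoplight(3.0), stoplight(2.5))
```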
INPUT OPTIONS
• What options provide information in the best context to a decision maker?
• What options provide the best context and clarity to the subject matter expert?
• How much “precision” do you need on a subjective measurement?
5-Point Scale Options
Scale  A. 5 Point        B. Mix
5      Met               Exceeded
4      Favorable         Met
3      Concerns          Concerns
2      Serious Concerns  Did Not Meet
1      Did Not Meet      Failed
10-Range Scale Options
[Table: five candidate schemes for a 0-10 range. A. 3 Bins groups the range into Favorable, Concerns, and Serious Concerns; B. Thresholds runs from Met at the top through Will Not Meet; C. Bins and Ends adds Exceeded and Failed endpoints around the Favorable / Concerns / Serious Concerns bins; an alternate 5-point column and an alternate mix column spread the 5-point labels (Exceeded, Met, Concerns, Did Not Meet, Failed) across the 10-point range.]
In strategic assessment, which is inherently subjective, the number and its precision are only as important as what it can communicate to a decision maker and how intuitive it is to the respondent.
Confounded Option
Scale  Confidence      Bins
9      Strongly Agree  Favorable
8      Agree           Favorable
7      Somewhat Agree  Favorable
6      Strongly Agree  Concerns
5      Agree           Concerns
4      Somewhat Agree  Concerns
3      Somewhat Agree  Serious Concerns
2      Agree           Serious Concerns
1      Strongly Agree  Serious Concerns
RATING SYSTEMS AND THRESHOLDS
• Defining an intuitive rating system that lets subject matter experts best answer the questions is essential
• It can be difficult to translate a more detailed rating system into three-color stoplight bins
• Two separate rating systems can be used in concert if the right thresholds are established
STEPS TO IDENTIFYING THRESHOLDS
• STEP 1: Obtain sample data
• STEP 2: Enter subjective assessment
• STEP 3: Create averages
• STEP 4: Sort and identify natural thresholds
Planner Executor Intel Client PA AVE Result
5 5 5 2 5 4.4 G
3 4 3 4 4 3.6 G
4 4.5 4 3 2 3.5 G
4 4 4 4 1 3.4 G
4 4 4 3 2 3.4 G
3 4 3 4 3 3.4 Y
2 4 3 3 4.5 3.3 Y
3 4 3 4 2 3.2 Y
2.5 3.5 3 3 4 3.2 Y
3 4 3.5 4 1 3.1 G
3 4 3 4 1 3.0 Y
1 4 2 3 4 2.8 Y
2 2 2.5 5 2 2.7 R
2 2 2 5 2 2.6 R
2 3 1 4 3 2.6 Y
2 2 1 5 3 2.6 R
3 2.5 3 2 2 2.5 R
2 2 1 4.5 3 2.5 R
3 2 3 2 2 2.4 R
Natural thresholds emerge in the sorted averages near 3.4 (green/yellow) and 2.75 (yellow/red).
A good assessment should average stakeholder data to a value that makes intuitive sense to subject matter experts and leaders.
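Steps 3 and 4 can be sketched directly. The rows below are three of the sample rows from the table above, and the 3.4 and 2.75 thresholds are the natural breaks read off the sorted averages:

```python
# Three sample rows of stakeholder ratings (Planner, Executor, Intel, Client, PA)
rows = [
    [5, 5, 5, 2, 5],
    [3, 4, 3, 4, 4],
    [2, 2, 1, 5, 3],
]

def color(avg, green=3.4, yellow=2.75):
    """Assign a stoplight color using the identified natural thresholds."""
    return "G" if avg >= green else "Y" if avg >= yellow else "R"

# Step 3: average each row; Step 4: sort and apply thresholds
averages = sorted((sum(r) / len(r) for r in rows), reverse=True)
for avg in averages:
    print(round(avg, 1), color(avg))  # -> 4.4 G / 3.6 G / 2.6 R
```

As the table shows, real data straddles the thresholds (a 3.1 rated G, a 3.0 rated Y), which is why the thresholds should be checked against expert judgment rather than applied blindly.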
REFINE AND ANALYZE
• Colors get you close. Discussions add quality and clarity.
  – For Cases 1-2: Why did PA perceive more failure in the metrics?
  – For Cases 3-4: Do the Planner and Intel reps know something the others do not?
  – For Cases 5-6: If the Client is happy, is that all that matters?
Case  Planner  Executor  Intel  Client  PA  AVE  Result
1     4        4.5       4      3       2   3.5  G
2     4        4         4      4       1   3.4  G
3     3        4         3      4       3   3.4  Y
4     1        4         2      3       4   2.8  Y
5     2        3         1      4       3   2.6  R
6     2        2         2      5       2   2.6  R
WEIGHTING METRICS + RATERS
• Most metrics in assessment are currently prioritized equally, but not all measures have or should have an equal effect on an outcome
• Weighting is a method of discriminating between metrics in terms of priority
• We can determine the impact of weighting metrics and respondents on the outcome of the objective’s rating through the use of sensitivity analysis
Rater  Score  Points  Weight  Weighted Score
1      5      1       0.0278  0.1389
2      5      3       0.0833  0.4167
3      3      2       0.0556  0.1667
4      4      8       0.2222  0.8889
5      3      7       0.1944  0.5833
6      5      4       0.1111  0.5556
7      5      6       0.1667  0.8333
8      4      5       0.1389  0.5556
Weighted score: 4.1389
Unweighted:     4.25
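The weighted score in the table can be reproduced directly: each rater's weight is that rater's point allocation divided by the 36 total points, and the objective's score is the weight-scaled sum:

```python
# Scores and point allocations for the eight raters, from the table
scores = [5, 5, 3, 4, 3, 5, 5, 4]
points = [1, 3, 2, 8, 7, 4, 6, 5]

total = sum(points)  # 36
weighted = sum(s * p / total for s, p in zip(scores, points))
unweighted = sum(scores) / len(scores)

print(round(weighted, 4))    # -> 4.1389
print(round(unweighted, 2))  # -> 4.25
```

Here the weighting barely moves the result, which is itself useful information: it says the heavily weighted raters roughly agree with the group.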
SENSITIVITY ANALYSIS
• Determine the impact of weighting metrics and raters with sensitivity analysis
• This can be done either using simulation or rough calculations in Excel
RATER 1 WEIGHT  SCORE
0.1             4.22
0.2             4.31
0.3             4.4
0.4             4.48
0.5             4.57
0.6             4.66
0.7             4.74
0.8             4.83
0.9             4.91
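One way to produce such a sweep is below. It assumes the remaining raters share the leftover weight in proportion to their original point allocations (an assumption about how the table was built, so small rounding differences from the slide's values are expected):

```python
# Scores and point allocations from the weighting example
scores = [5, 5, 3, 4, 3, 5, 5, 4]
points = [1, 3, 2, 8, 7, 4, 6, 5]

# Point-weighted average of all raters except rater 1
rest = sum(s * p for s, p in zip(scores[1:], points[1:])) / sum(points[1:])

def swept_score(w1):
    """Objective score when rater 1 carries weight w1 and the rest share 1 - w1."""
    return w1 * scores[0] + (1 - w1) * rest

for w in [i / 10 for i in range(1, 10)]:
    print(w, round(swept_score(w), 2))
```

A steep slope in the sweep means the outcome is sensitive to that rater's weight, which is exactly the case worth flagging to the commander.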
APPLYING METRICS TO RESOURCES
• Tracking resources and applying them to subjective ratings is a start, but as with the ratings it is only useful when subject matter experts provide context
• Often this provides a starting point for questions and discussions with experts as to whether the resources spent are necessary to maintain/hold ground or whether the effort is ineffective
RATINGS
       QTR 1  QTR 2  QTR 3  QTR 4
OBJ 1  2      2      2      3
OBJ 2  1      2      1      1
OBJ 3  3      4      3.5    4
OBJ 4  4      4      4      4

DOLLARS IN $M
       QTR 1  QTR 2  QTR 3  QTR 4
OBJ 1  $4.00  $4.00  $4.00  $4.00
OBJ 2  $2.00  $2.00  $2.00  $2.50
OBJ 3  $2.00  $2.00  $2.00  $2.00
OBJ 4  $4.00  $3.00  $3.00  $3.00

[Chart: quarterly ratings for OBJ 1 through OBJ 4 plotted on a 0 to 4.5 scale.]
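A simple starting calculation for that discussion is dollars spent per rating point, using the data above; the metric itself is an illustrative choice, not one the brief prescribes:

```python
# Quarterly ratings and spend ($M) per objective, from the tables above
ratings = {"OBJ 1": [2, 2, 2, 3], "OBJ 2": [1, 2, 1, 1],
           "OBJ 3": [3, 4, 3.5, 4], "OBJ 4": [4, 4, 4, 4]}
dollars = {"OBJ 1": [4.0, 4.0, 4.0, 4.0], "OBJ 2": [2.0, 2.0, 2.0, 2.5],
           "OBJ 3": [2.0, 2.0, 2.0, 2.0], "OBJ 4": [4.0, 3.0, 3.0, 3.0]}

# Dollars per rating point, by objective and quarter
cost_per_point = {obj: [round(d / r, 2) for d, r in zip(dollars[obj], ratings[obj])]
                  for obj in ratings}
for obj, vals in cost_per_point.items():
    print(obj, vals)
```

A flat or rising spend against a flat low rating (as with OBJ 2) is precisely the pattern that prompts the hold-ground-or-ineffective-effort question for the experts.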
CREATE A LASTING NARRATIVE
• The most useful tool in an analyst’s assessment arsenal is a lasting narrative that codifies the following
– State of the system/objective
– Expert opinion and analysis as to the reason
– Resources applied to changing the system
– Recommended changes to forces, posture, policy, or resource application
FUTURE WORK
• Finalize development of campaign plan structure to include incorporation of performance and capability elements and performance metrics (resources)
• Incorporate multi-objective decision analysis methods and analyze tradeoffs
• Upgrade existing systems in use to incorporate more robust Gantt and resource tracking functions for more thorough analysis
• Build more focus groups and follow-on analysis time into the assessment; the assessment only begins with a data call and finishes with thorough analysis of the data
REFERENCES
• Armstrong, J. Scott (2001) Principles of Forecasting: A Handbook for Researchers and Practitioners. Springer: New York
• Campbell, Jason, Michael O’Hanlon, and Jeremy Shapiro (2009). “How to Measure the War.” Policy Review, n. 157, 15-30.
• Downes-Martin, Stephen (2011). “Operations Assessment in Afghanistan is Broken: What is to be Done?” Naval War College Review, Autumn, 103-125.
• Kilcullen, David (2010). “Measuring Progress in Afghanistan.” Counterinsurgency, 51-83.
• Kramlich, Gary (2013). “Assessment vs. Decision Support: Crafting Assessments the Commander Needs.” White Paper. www.milsuite.mil/book/broups/fa49-orsa.
• Parnell, Gregory, Patrick J. Driscoll, Dale L. Henderson (2011). Decision Making in Systems Engineering and Management, 2d ed. Wiley: Hoboken, NJ.
• Schroden, Jonathan (2013). “A New Paradigm for Assessment in Counter-Insurgency.” Military Operations Research, v. 18, n. 3, 5-20.
• Schroden, Jonathan (2013). “Why Operations Assessments Fail: It’s Not Just the Metrics.” Naval War College Review, Autumn, 89-102.
• US Joint Staff (2011). Commander’s Handbook for Assessment Planning and Execution, Version 1.0.
• US Joint Staff (2011). Joint Publication 3-0, Joint Operations. www.dtic.mil
• US Joint Staff (2011). Joint Publication 5-0, Joint Operation Planning. www.dtic.mil